Drop global `pytest-timeout` cap from `pyproject.toml`

`timeout = 200` was firing via SIGALRM (the default `method='signal'`) which synchronously raises `Failed` in trio's main thread mid-`epoll.poll()`, abandoning trio's runner mid-flight and leaving `GLOBAL_RUN_CONTEXT` half- installed. EVERY subsequent `trio.run()` in the same pytest session then bails with `RuntimeError: Attempted to call run() from inside a run()`. Empirical impact: a session that hits a single 200s hang cascades into 30-40 false-positive failures across every downstream test file that uses `trio.run`. Recent UDS run saw 1 real timeout (`test_unregistered_err_still_relayed`) poison 38 sibling tests with cascade-fails — a debugging nightmare. Same architectural bug we already documented in `tests/test_advanced_streaming.py::test_dynamic_pub_sub` (see its module-level NOTE) — both `pytest-timeout` enforcement modes are incompatible with trio under fork- based spawn backends. Now scoped session-wide. For tests that legitimately need a wall-clock cap, the canonical pattern is `with trio.fail_after(N):` INSIDE the test — trio's own `Cancelled` machinery cleanly unwinds the actor nursery without disturbing global state. For CI: rely on job-level wall-clock timeouts (e.g. GitHub Actions `timeout-minutes`) to abort genuinely-stuck suites. `pyproject.toml` comment block spells this all out so a future contributor doesn't reach back for `timeout =` and re-introduce the bug. ALSO, bump `xonsh` to at least `0.23.0` release. (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit 3c366cac13) (factored: the xonsh pin/editable-source hunks already landed with the devenv segment)
2026-04-28 16:00:16 -04:00 · 2026-04-28 16:00:16 -04:00 · 9a844b91f3
parent dcb00e5a8f
commit 9a844b91f3
1 changed files with 29 additions and 1 deletions
--- a/pyproject.toml
+++ b/pyproject.toml
@ -209,7 +209,35 @@ all_bullets = true

 [tool.pytest.ini_options]
 minversion = '6.0'
-timeout = 200  # per-test hard limit
+# NOTE: `pytest-timeout`'s global per-test cap is intentionally
+# NOT set — both of its enforcement methods break trio's
+# runtime under our fork-based spawn backends:
+#
+# - `method='signal'` (the default; SIGALRM) raises `Failed`
+#   synchronously from the signal handler in trio's main
+#   thread, which leaves `GLOBAL_RUN_CONTEXT` half-installed
+#   ("Trio guest run got abandoned"). EVERY subsequent
+#   `trio.run()` in the same pytest session then bails with
+#   `RuntimeError: Attempted to call run() from inside a
+#   run()` — full-session poison: a single 200s hang
+#   cascades into 30+ false-positive failures across
+#   downstream test files.
+#
+# - `method='thread'` calls `_thread.interrupt_main()` which
+#   can let the resulting `KeyboardInterrupt` escape trio's
+#   `KIManager` under fork-cascade teardown races, killing
+#   the whole pytest session.
+#
+# For tests that legitimately need a wall-clock cap, use
+# `with trio.fail_after(N):` INSIDE the test — trio's own
+# Cancelled machinery handles the timeout cleanly through
+# the actor nursery without disturbing global state. See
+# `tests/test_advanced_streaming.py::test_dynamic_pub_sub`'s
+# module-level NOTE for the canonical pattern.
+#
+# CI environments should rely on job-level wall-clock
+# timeouts (e.g. GitHub Actions `timeout-minutes`) for an
+# escape hatch on genuinely-stuck suites.
 # https://docs.pytest.org/en/stable/reference/reference.html#configuration-options
 testpaths = [
  'tests'