Complete the xontrib-side slimming from "Mv core impl
`tractor_diag.xsh` to `_testing.trace`" (its xontrib hunk
couldn't ride with the testing-harness segment since the
xontrib lands in this one) and sync a couple `_reap.py`
issue/doc-ref comment strings to their final form.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Make the sub-actor proc-title prefix a single
authoritative constant (`_proctitle._def_prefix`) so
the reap-recognition markers and `xontrib` banner pick
it up automatically — one place to flip the prefix
shape going fwd.
Deats,
- `_proctitle._def_prefix: str = '_subactor'`. New
module-level const consumed by everything that needs
to know the prefix.
- `set_actor_proctitle(actor, prefix=_def_prefix)`:
takes an explicit `prefix` arg (default = the const)
so callers can override per-spawn if they want.
- Default proc-title format:
`'tractor[<reprol>]'` → `f'{prefix}[<reprol>]'`
i.e. `_subactor[<reprol>]` by default.
- `_testing/_reap.py`: cmdline + comm markers source
the prefix from `_proctitle._def_prefix` instead of
the hardcoded `'tractor['`. So
`_is_tractor_subactor()` tracks the const
automatically.
- `xontrib/tractor_diag.xsh`: `acli.reap` orphan-mode
banner now interpolates the
`_TRACTOR_PROC_CMDLINE_MARKERS` tuple directly so
the human-readable mode line stays in sync if the
prefix shape changes again.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 3a45dbd503)
Rework reap/diag tooling to identify tractor sub-actors via
intrinsic proc signals — cmdline/comm markers from `setproctitle` —
instead of env-var or cwd matching.
Deats,
- new `_is_tractor_subactor()` checks cmdline for `tractor[` /
`tractor._child` markers, falls back to `/proc/<pid>/comm` for
zombie-resilient detection (kernel preserves `comm` past exit
until reap)
- `_read_comm()` reads kernel per-task name set by `setproctitle()`
— the zombie-safe ID signal
- `_read_status_state()` reads single-letter proc state from
`/proc/<pid>/status` (`Z` = zombie)
- `find_orphans()` drops `repo_root` requirement, uses
`_is_tractor_subactor()` for intrinsic sub-actor ID instead of
cwd coincidence-matching
- new `find_zombies()` with optional `parent_pid` filter for
zombie-state sub-actors
Also,
- rename `pytree` -> `ptree` throughout xontrib
- add `_which_cgroup_slice()` — reads `/proc/<pid>/cgroup` to
distinguish `system.slice` services vs `user.slice` desktop apps
from genuinely leaked orphans
- `_ptree` classifies `ppid==1` procs into `system-slice`,
`user-slice`, and `orphans` buckets with per-section output
- `_tractor_reap` drops `git rev-parse` / `sys.path` hack — assumes
tractor importable from active venv
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 522b57570b)
Only flag `tractor._child` procs as cross-test ghosts of
THIS run if `ppid==1` (init-adopted real leak) or `ppid`
is in the walk's `seen` set (descendant we missed via
race).
Previously, procs whose `ppid` points to some OTHER live non-`pytest`
(in the use of `acli.ptree pytest`) process belong to a different
tractor app (`piker`, another `pytest` shell, a long-running tractor
daemon) and were being falsely flagged as cross-test ghosts.
Deats,
- post-cmdline-match check via `_ppid_from_proc(pid)`,
short-circuit on `None` (proc died in-flight).
- expand module docstring to spell out the ownership
filter rule + its rationale.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit a6d4ac3aac)
Post-yield now also reaps init-adopted (`ppid==1`) tractor procs
that appeared during the test — leaked subactors whose mid-tier
parent died during cascade teardown, reparenting them to init.
Pre-yield snapshot of existing orphans scopes reap to THIS test's
leaks only, avoiding reap of unrelated tractor uses (piker, etc.)
on the box.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 01ce2857ea)
(factored: also default `find_orphans(repo_root=None)` -> cwd so the new bare call sites work ahead of the later intrinsic-identity rewrite)
`reap(include_descendants=True)` now expands each orphan-root pid
into its full psutil subtree before delivering SIGINT, so a
multi-level leaked actor-tree gets torn down in a single pass
instead of requiring repeated calls (each pass kills the current
`ppid==1` level, the level below becomes init-adopted, etc.).
Falls back to the original flat `pids` list when `psutil` is
unavailable. Emits a log line when expansion adds descendant pids.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 8de684f5de)
- `_testing/trace.py`: add `_SNAPSHOT_INDEX` session- scoped list
populated by `_do_capture_snapshot()` on each successful dump;
add TODO for future `TRACTOR_TRACE_HOLD=1` pause-on-hang mode
- `_testing/pytest.py`: add `pytest_terminal_summary` hook that
prints all captured snapshot dirs at end-of-session so paths
don't get buried in scrollback
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit fb87c36263)
Deats,
- `_find_tractor_strays()`: scan `/proc/*/cmdline` for
`tractor._child` procs NOT in the walk's `seen` set — surfaces
ghost subactor trees from prior test runs (cross-test launchpad
contamination).
- `dump_proc_tree(include_strays=True)`: refactor classification
into `_classify_walk()` closure, walk stray roots as additional
trees, emit stray-root summary in header. Also: `tractor._child`
procs reparented to init are now always classified as orphans
regardless of cgroup-slice (leaked subactor ≠ desktop-launched
app).
- `_do_capture_snapshot()`: use `sys.__stderr__` to bypass pytest
`--capture=sys` redirection so snapshot paths always land on the
real terminal
- `fail_after_w_trace()`: capture diag snapshot on
non-`TooSlowError` exceptions when the `fail_after` scope's
cancel had already fired (e.g. nursery wraps `Cancelled` into a
`BaseExceptionGroup` that escapes before `TooSlowError` can be
raised).
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 3a243a1fd4)
Extract all pure-Python diagnostic helpers (`dump_proc_tree`,
`dump_hung_state`, `scan_bindspace`, `dump_all`, `resolve_pids`,
`ensure_sudo_cached`, etc.) from the xonsh xontrib into a new
`tractor/_testing/trace.py` module so the same logic is callable
from both the `acli.*` terminal aliases AND in-test capture-on-hang
fixtures.
Deats,
- `_testing/trace.py`: new module (1171 lines) — proc-tree walker,
hung-state dumper, bindspace scanner, `dump_all()` snapshot
archiver, `AFKAlarmTimeout` exc, `fail_after_w_trace()` async CM
(trio `fail_after` + auto-snapshot on `TooSlowError`),
`afk_alarm_w_trace()` sync CM (`signal.alarm` + snapshot on
`SIGALRM`), plus pytest fixture wrappers for both.
- `_testing/pytest.py`: re-export the two fixtures via `from .trace
import` so pytest plugin-discovery picks them up.
- `tractor_diag.xsh`: thin terminal wrappers that import from
`_testing.trace` — drops ~627 lines of inline impl. Add
`acli.dump_all` alias for full snapshot-bundle CLI access.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 7509e313ff)
(factored: the xontrib-side move-out hunk rides with the diag-xontrib segment)
Refactor `pytest_load_initial_conftests()` to split
the fork-spawn × capture-mode check into two policies:
- CI (`CI` env-var set): `pytest.exit(rc=2)` on
mismatch — forces every matrix-row to declare
`--capture=sys` explicitly.
- local: `warnings.warn()` + continue — lets devs
experiment with `--capture=fd` to validate fixes.
Deats,
- drop `_cap_fd_set` global; add
`_CAPSYS_REQUIRED_SPAWNERS` frozenset for the
spawner-name lookup
- move inline comment wall → proper docstring w/
Background, Trade-off, Validation-policy sections
- `maybe_xfail_for_spawner()` now takes
`request: pytest.FixtureRequest` and reads
`request.config.option.capture` instead of the
`_cap_sys_passed_as_flag` global
- recognize `tee-sys` as fork-safe (only `fd`-level
capture deadlocks)
- `set_fork_aware_capture()` returns the actual
capture mode str from config, not a hardcoded
`'sys'`
- lift `import warnings` to module level (was duped
inside `pytest_configure`)
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 255c9c3a7c)
Rename `_track_orphaned_uds_per_test` and
`_detect_runaway_subactors_per_test` to public names (drop `_` prefix),
drop `autouse=True`. Tests that need per-test reap blame now opt in via
`pytestmark = pytest.mark.usefixtures(...)`.
Also,
- reduce `sample_interval` from 0.5 -> 0.05s so the CPU probe is cheaper
per pid.
- add empty-`only_pids` fast-path in `find_runaway_subactors` to skip
psutil import when no descendants were spawned.
- extract `new_pids` intermediate var for clarity.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit e4953851de)
New `find_runaway_subactors()` helper + autouse
`_detect_runaway_subactors_per_test` fixture that
samples `psutil.cpu_percent()` on descendants to
catch tight-loop bugs (e.g. #452-class `recvfrom`
on a closed socket). Checks both at setup
(leftovers from a prior hung test) and teardown
(spawned by this test).
Intentionally does NOT kill the runaway — emits
a loud warning with diag commands (`strace`,
`lsof`, `ss`, `kill`) so the pid stays alive for
hands-on investigation. Session-end reaper still
SIGINT/SIGKILL survivors on normal exit.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 5cf0312c78)
Extend the pytest plugin with helpers that detect
and adapt to `--capture=sys` under fork-based
spawners (`main_thread_forkserver`, `mp_forkserver`)
where fd-capture causes hangs.
Deats,
- track `_cap_sys_passed_as_flag` + `_cap_fd_set`
globals in `pytest_load_initial_conftests()`.
- add `@pytest.hookimpl(tryfirst=True)` + re-parse
args after appending `--capture=sys`.
- `_is_forking_spawner()` predicate + fixture.
- `maybe_xfail_for_spawner()` — enalbes skipping tests that need capsys
but weren't passed `--capture=sys`.
- `set_fork_aware_capture` fixture — returns the appropriate capture
fixture per spawner backend based on `start_method: str` set via CLI.
- wire `set_fork_aware_capture` into `tractor_test`
wrapper's fixture injection.
Also,
- add `alert_on_finish` session fixture (terminal
bell on completion; tho not sure it works fully..)
- add `ids=` to `start_method` parametrize.
- restore `default=False` on `--enable-stackscope`.
- drop commented-out `--ll` option block; we will likely factor it to
our plugin eventually however..
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit d549c72052)
Move `--capture=sys` enforcement from a static ini
flag to a `pytest_load_initial_conftests()` bootstrap
hook that dynamically flips capture mode only when a
fork-based spawner (like `main_thread_forkserver`) is
detected; non-fork backends keep `--capture=fd`.
Also,
- load `tractor._testing.pytest` via `-p` in ini
(bc bootstrapping hooks must register before
conftest `pytest_plugins` runs).
- register `_reap` as sub-plugin via `pytest_plugins`
tuple in `._testing.pytest`.
- drop now-duplicate reap fixtures (already in `_reap`
per 1cdc7fb3).
- rename `tractor_enable_stackscope` dest -> `enable_stackscope`
and pop env var on disable.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 61d4525137)
Extend the `_testing._reap` mod with UDS sock-file leak detection +
cleanup, complementing the existing shm and subactor-process
reaping:
- `get_uds_dir()`, `_parse_uds_name()`, `find_orphaned_uds()`,
`reap_uds()` — detect `<name>@<pid>.sock` files under
`${XDG_RUNTIME_DIR}/tractor/` whose binder pid is dead (including
the `1616` registry sentinel).
- `_reap_orphaned_subactors` session-scoped autouse fixture: SIGINT
lingering subactors, wait, SIGKILL survivors, then sweep orphaned
UDS files.
- `_track_orphaned_uds_per_test` fn-scoped autouse fixture:
snapshot sock-file dir before/after each test, warn + reap new
orphans to prevent cascade flakiness under `--tpt-proto=uds`.
- `reap_subactors_per_test` opt-in fn-scoped fixture for modules
with known-leaky teardown.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 1cdc7fb302)
Function-scoped, NON-autouse zombie-subactor reaper for
modules whose teardown is known-leaky enough to cascade-
fail every following test in a session.
Sibling to the autouse session-scoped `_reap_orphaned_subactors`. The
session-scoped one fires at session end — too late to save tests that
follow a hung/leaky test in the suite. The new fixture, opted into via
`pytestmark = pytest.mark.usefixtures(...)`, runs between tests in
a problem-module so a leftover subactor from test N can't squat on
registrar ports / UDS paths / shm segments needed by tests N+1,
N+2, ...
Intentionally NOT autouse — the fixture's presence on a module signals
"this module's teardown leaks; please root-cause instead of relying
forever on cleanup". A visibility-vs-convenience trade picked in favor
of the former.
Apply to `tests/test_infected_asyncio.py` since both recent full-suite
runs (parallel-tpt-proto + TCP-only) showed the cascade originating in
this file's KBI- and SIGINT-flavored tests under
`main_thread_forkserver`. Module-comment names the specific offenders so
future de-flake work has a starting point.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit b376eb0332)
Previously the random port was a default-arg expression
(`_rando_port: str = random.randint(1000, 9999)`) — evaluated
ONCE at module import time, making it a per-process singleton.
Two parallel pytest sessions had a 1/9000 birthday-pair chance
of picking the same port; when it hit, every `reg_addr`-using
test in BOTH runs would cascade-fail with "Address already in
use".
Switch to per-call `random.randint()` salted with `os.getpid()`
so:
- within one session: two calls return distinct ports — e.g.
`test_tpt_bind_addrs::bind-subset-reg` now actually gets two
different reg addrs on the TCP backend (it was silently
duplicating before),
- across parallel sessions: pid salt biases each process's
port choices apart, making cross-run collisions
vanishingly rare.
Drop the bogus `: str` annotation (was always `int`). UDS already gets
per-process isolation via `UDSAddress.get_random()`'s `@<pid>`
socket-path suffix, so no change needed there.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 7c5dd4d033)
Since `tractor.ipc._mp_bs.disable_mantracker()` turns off
`mp.resource_tracker` entirely (see the conc-anal doc
`subint_forkserver_mp_shared_memory_issue.md`), a
hard-crashing actor can leave `/dev/shm/<key>` segments
that nothing else GCs. New `tractor-reap` phase 2 sweeps
them.
Deats,
- `tractor/_testing/_reap.py`: add `find_orphaned_shm()`
+ `reap_shm()` helpers. Match criteria: regular file
under `/dev/shm`, owned by current uid, AND no live
proc has it open (mmap'd or fd-held). In-use
enumeration via `psutil.Process.memory_maps()` +
`.open_files()` — xplatform, kernel-canonical (same
answer `lsof` would give), no reliance on
tractor-specific shm-key naming.
- `_ensure_shm_supported()` guard: helpers raise
`NotImplementedError` outside Linux/FreeBSD bc macOS
POSIX shm has no fs-visible path (`shm_open` only)
and Windows is a different story.
- `scripts/tractor-reap`: new `--shm` (run after
process reap) and `--shm-only` (skip process phase)
flags. `-n` dry-runs both phases. Exit code is `1`
if either phase had survivors/errors.
- `pyproject.toml` + `uv.lock`: add `psutil>=7.0.0` to
the `testing` dep group; lazy-imported in `_reap.py`
so the process-reap path stays import-clean without
it.
Also,
- doc `--shm` in `.claude/skills/run-tests/SKILL.md`
(new section 10c) — covers match criteria + the
preservation guarantee for unrelated apps.
- flip mitigation status in
`subint_forkserver_mp_shared_memory_issue.md` from
"could extend `tractor-reap`" to "implemented", with
a note that callers should still UUID-pin shm keys to
avoid cross-session collisions.
Verified locally vs 81 in-use segments held by `piker`,
`lttng-ust-*`, `aja-shm-*` — all preserved; only the
genuinely-orphaned tractor segments got unlinked.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 4f12d69b41)
(factored: dropped subint_forkserver conc-anal doc update)
Zombie-subactor cleanup for the test suite, SC-polite discipline
(`SIGINT` first, bounded grace, `SIGKILL` only on survivors). Two parts:
a shared reaper module + an autouse session-end fixture that runs it.
Deats,
- new `tractor/_testing/_reap.py` (+230 LOC) — Linux- only reaper using
`/proc/<pid>/{status,cwd,cmdline}` inspection. Two detection modes:
- `find_descendants(parent_pid)` for the in-session case
(PPid-direct-match while pytest is still alive).
- `find_orphans(repo_root)` for the CLI / post- mortem case (`PPid==1`
reparented to init + `cwd` filter to repo root + `python` cmdline
filter).
- `reap(pids, *, grace=3.0, poll=0.25)` does the signal ladder: SIGINT
all, poll up to `grace` for exit, SIGKILL any survivors. Returns
`(signalled, killed)` for caller-side reporting.
- new `_reap_orphaned_subactors` session-scoped autouse fixture in
`tractor/_testing/pytest.py` — after `yield`, runs
`find_descendants(os.getpid())` + `reap(...)` so each pytest session
leaves no surviving forks.
- companion CLI scaffolding lives at `scripts/tractor-reap` (separate
commit) for the pytest-died-mid-session case where the in-session
fixture didn't get to run.
Also,
- promote `from tractor.spawn._spawn import SpawnMethodKey` to
module-top in `pytest.py` (was inline-imported inside
`pytest_generate_tests`), and reuse it in
`pytest_collection_modifyitems` to assert each `skipon_spawn_backend`
mark arg is a valid spawn-method literal — catches typos at collection
time.
- inline `# ?TODO` flags running these through the `try_set_backend`
checker for stronger validation.
Cross-refs `feedback_sc_graceful_cancel_first.md` for the
SIGINT-before-SIGKILL discipline rationale.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit eae478f3d5)
A reusable `@pytest.mark.skipon_spawn_backend( '<backend>' [, ...],
reason='...')` marker for backend-specific known-hang / -borked cases
— avoids scattering `@pytest.mark.skipif(lambda ...)` branches across
tests that misbehave under a particular `--spawn-backend`.
Deats,
- `pytest_configure()` registers the marker via
`addinivalue_line('markers', ...)`.
- New `pytest_collection_modifyitems()` hook walks
each collected item with `item.iter_markers(
name='skipon_spawn_backend')`, checks whether the
active `--spawn-backend` appears in `mark.args`, and
if so injects a concrete `pytest.mark.skip(
reason=...)`. `iter_markers()` makes the decorator
work at function, class, or module (`pytestmark =
[...]`) scope transparently.
- First matching mark wins; default reason is
`f'Borked on --spawn-backend={backend!r}'` if the
caller doesn't supply one.
Also, tighten type annotations on nearby `pytest`
integration points — `pytest_configure`, `debug_mode`,
`spawn_backend`, `tpt_protos`, `tpt_proto` — now taking
typed `pytest.Config` / `pytest.FixtureRequest` params.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 3b26b59dad)
Two coupled changes that let downstream projects (eg. `modden`) inherit
the test-harness loglevel plumbing for free via
`tractor._testing.pytest`:
Plugin lift (`tests/conftest.py` → `_testing/pytest.py`),
- mv `pytest_addoption(--ll)`, the `loglevel` autouse
fixture, and `test_log` fixture out of the test-suite-
local conftest into the reusable plugin.
- add `--tl`/`--tractor-loglevel` as a DISTINCT flag from
`--ll`: `--ll` is the consuming-project's OWN app
loglevel (scoped to its pkg-hierarchy), `--tl` is the
`tractor.*` runtime loglevel. `--tl` falls back to
`--ll` when unset (preserves current `tractor`-suite
behavior).
- add `testing_pkg_name` session fixture (default
`'tractor'`) — downstream projects override to e.g.
`'modden'` so `--ll` scopes to their own hierarchy
instead of `tractor.*`.
- `loglevel` fixture now yields the resolved
tractor-runtime level (passed to
`open_root_actor(loglevel=<.>)` by `@tractor_test`)
AND separately applies `--ll` to the
`testing_pkg_name` hierarchy when that isn't
`tractor`. `test_log` scopes the per-test logger to
`testing_pkg_name`.
`tractor.log` "logging-spec" mini-DSL,
- `LogSpec = str|bool`. Accepted forms:
- `True` → enable `pkg_name` root at `default_level`
(fallback `'cancel'`).
- `False` → no-op.
- bare level eg. `'info'` → root-logger at that level.
- `'sub:info,x:cancel'` → per-sub-logger filter-spec;
each `<name>` is RELATIVE to `pkg_name` (must NOT
include the pkg-token).
- `parse_logspec()` → `{sublog|None: level}` mapping.
`None` key = root-logger. Mixed bare-level + filters
in one spec is rejected w/ a helpful err msg; so is
embedding the `pkg_name` token in a sub-name.
- `apply_logspec()` → `(primary_level, {name: log})`:
parses then enables a `colorlog` stderr handler per
named (sub)logger. Authoritative sub-logger filters
get `propagate=False` so they don't double-emit
through a parallel root-level handler.
- !GRANULARITY CAVEAT! sub-logger names match at
sub-pkg granularity, not leaf-module — so `devx.debug`
collapses to the same `tractor.devx` logger as a bare
`devx`, and top-level lib modules (eg.
`tractor.to_asyncio`) emit under the *root* logger
rather than a phantom `to_asyncio` child. Documented
inline on `LogSpec`.
Other,
- `tests/conftest.py` keeps a NOTE pointing to the
plugin for future-debugging clarity (don't remove
silently — the lift is the relevant signal).
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 19a77708ba)
New `--enable-stackscope` CLI flag installs a SIGUSR1 →
trio-task-tree-dump handler in pytest itself + every
spawned subactor for live stack visibility during hang
investigations. Lighter than `--tpdb` (no pdb machinery
/ tty-lock contention) — pure stack-only triage.
Plumbing:
- `_testing.pytest.pytest_addoption()` adds the flag.
- `_testing.pytest.pytest_configure()` (when flag set):
* exports `TRACTOR_ENABLE_STACKSCOPE=1` so fork-children
inherit it via environ,
* installs the handler in pytest itself via
`enable_stack_on_sig()`.
- `runtime._runtime.Actor.async_main()` extends the
existing `_debug_mode` gate to ALSO fire when
`TRACTOR_ENABLE_STACKSCOPE` is in env — so subactors
install the same handler at runtime startup.
Capture-bypass tee in `dump_task_tree()`:
Pytest's default `--capture=fd` swallows `log.devx()`
output, making SIGUSR1 dumps invisible right when you
need them. Render the dump once to a `full_dump` str,
then unconditionally tee to:
- `/tmp/tractor-stackscope-<pid>.log` (append-mode,
always written) — guaranteed-readable artifact even
under CI / `nohup` / no-tty. `tail -f` to follow.
- `/dev/tty` (best-effort) — pytest never captures the
tty; ignored if device is missing.
Other,
- squelch the benign `RuntimeWarning` ("coroutine method
'asend'/'athrow' was never awaited") from
`stackscope._glue`'s import-time async-gen type
introspection so `--enable-stackscope` setup stays
quiet.
- log msg in the `_runtime` ImportError branch now
mentions `--enable-stackscope` alongside debug-mode.
Usage,
pytest --enable-stackscope -k <hang-test>
# in another shell, find the pid + signal:
kill -USR1 <pytest-or-subactor-pid>
# tail the artifact:
tail -f /tmp/tractor-stackscope-<pid>.log
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 5418f2dc3c)
(factored: only the flag + activation hunks; the surrounding skipon-marker/reap-fixture context rides with the testing-harness segment)
Prep for a future sub-interpreter (PEP 734
`concurrent.interpreters`) spawn backend per issue
#379 — land just the py-version range bump and the
test-harness error-gating; the backend itself comes
later.
Deats,
- bump `pyproject.toml` `requires-python` to
`>=3.12, <3.15` and list the `3.14` classifier —
the new stdlib `concurrent.interpreters` module
only ships on 3.14
- `_testing.pytest.pytest_configure` wraps
`try_set_start_method()` in a `pytest.UsageError`
handler so an unsupported `--spawn-backend` on the
running py-version prints a clean banner instead
of a traceback
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit d318f1f8f4)
(factored: kept only the pyproject + `_testing/pytest.py` parts of
"Add `'subint'` spawn backend scaffold (#379)"; dropped
tractor/spawn/_spawn.py + tractor/spawn/_subint.py)
Split the monolithic `spawn._spawn` into a slim
"core" + per-backend submodules so a future
`._subint` backend (per issue #379) can drop in
without piling more onto `_spawn.py`.
`._spawn` retains the cross-backend supervisor
machinery: `SpawnMethodKey`, `_methods` registry,
`_spawn_method`/`_ctx` state, `try_set_start_method()`,
the `new_proc()` dispatcher, and the shared helpers
`exhaust_portal()`, `cancel_on_completion()`,
`hard_kill()`, `soft_kill()`, `proc_waiter()`.
Deats,
- mv `trio_proc()` → new `spawn._trio`
- mv `mp_proc()` → new `spawn._mp`, reads `_ctx` and
`_spawn_method` via `from . import _spawn` for
late binding bc both get mutated by
`try_set_start_method()`
- `_methods` wires up the new submods via late
bottom-of-module imports to side-step circular
dep (both backend mods pull shared helpers from
`._spawn`)
- prune now-unused imports from `_spawn.py` — `sys`,
`is_root_process`, `current_actor`,
`is_main_process`, `_mp_main`, `ActorFailure`,
`pretty_struct`, `_pformat`
Also,
- `_testing.pytest.pytest_generate_tests()` now
drives the valid-backend set from
`typing.get_args(SpawnMethodKey)` so adding a
new backend (e.g. `'subint'`) doesn't require
touching the harness
- refresh `spawn/__init__.py` docstring for the
new layout
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
The `functools` rewrite forwarded all `kwargs`
through `_main(**kwargs)` to `wrapped(**kwargs)`
unchanged — the Windows `start_method` default
could leak to test fns that don't declare it.
The pre-wrapt code guarded against this with
named wrapper params.
Extract runtime settings (`reg_addr`, `loglevel`,
`debug_mode`, `start_method`) as closure locals
in `wrapper`; `_main` uses them directly for
`open_root_actor()` while `kwargs` passes to
`wrapped()` unmodified.
Review: PR #439 (Copilot)
https://github.com/goodboy/tractor/pull/439#pullrequestreview-4091005202
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Realized a bit late that (pretty sure) i already tried this using
`wrapt` idea and waay back and found the same "issue" XD
The `wrapt.decorator` transparently proxies `__code__` from the async
test fn, fooling `pytest`'s coroutine detection into skipping wrapped
tests as "unhandled coroutines". `functools.wraps` preserves the sig for
fixture injection via `__wrapped__` without leaking the async nature.
So i let `claude` rework the latest code to go back to using the old
stdlib wrapping again..
Deats,
- `functools.partial` replaces `wrapt.PartialCallableObjectProxy`.
- wrapper takes plain `**kwargs`; runtime settings extracted via
`kwargs.get()` in `_main()`.
- `iscoroutinefunction()` guard moved before wrapper definition.
- drop all `*args` passing (fixture kwargs only).
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Move the `Arbiter` class out of `runtime._runtime` into its
logical home at `discovery._registry` as `Registrar(Actor)`.
This completes the long-standing terminology migration from
"arbiter" to "registrar/registry" throughout the codebase.
Deats,
- add new `discovery/_registry.py` mod with `Registrar`
class + backward-compat `Arbiter = Registrar` alias.
- rename `Actor.is_arbiter` attr -> `.is_registrar`;
old attr now a `@property` with `DeprecationWarning`.
- `_root.py` imports `Registrar` directly for
root-actor instantiation.
- export `Registrar` + `Arbiter` from `tractor.__init__`.
- `_runtime.py` re-imports from `discovery._registry`
for backward compat.
Also,
- update all test files to use `.is_registrar`
(`test_local`, `test_rpc`, `test_spawning`,
`test_discovery`, `test_multi_program`).
- update "arbiter" -> "registrar" in comments/docstrings
across `_discovery.py`, `_server.py`, `_transport.py`,
`_testing/pytest.py`, and examples.
- drop resolved TODOs from `_runtime.py` and `_root.py`.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Restructure the flat `tractor/` top-level private mods
into (more nested) subpackages:
- `runtime/`: `_runtime`, `_portal`, `_rpc`, `_state`,
`_supervise`
- `spawn/`: `_spawn`, `_entry`, `_forkserver_override`,
`_mp_fixup_main`
- `discovery/`: `_addr`, `_discovery`, `_multiaddr`
Each subpkg `__init__.py` is kept lazy (no eager
imports) to avoid circular import issues.
Also,
- update all intra-pkg imports across ~35 mods to use
the new subpkg paths (e.g. `from .runtime._state`
instead of `from ._state`)
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Refactor the test-fn deco to use `wrapt.decorator`
instead of `functools.wraps` for better fn-sig
preservation and optional-args support via
`PartialCallableObjectProxy`.
Deats,
- add `timeout` and `hide_tb` deco params
- wrap test-fn body with `trio.fail_after(timeout)`
- consolidate per-fixture `if` checks into a loop
- add `iscoroutinefunction()` type-check on wrapped fn
- set `__tracebackhide__` at each wrapper level
Also,
- update imports for new subpkg paths:
`tractor.spawn._spawn`, `tractor.discovery._addr`,
`tractor.runtime._state`
(see upcoming, likely large patch commit ;)
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
- add `is_valid` and `sockpath.resolve()` asserts in
`get_rando_addr()` for the `'uds'` case plus an
explicit `UDSAddress` type annotation.
- rename no-runtime sockname prefixes from
`'<unknown-actor>'`/`'root'` to
`'no_runtime_root'`/`'no_runtime_actor'` with a proper
if/else branch in `UDSAddress.get_random()`.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Add UDS skip-guard to `test_streaming_to_actor_cluster()`
and plumb `tpt_proto` through the `@tractor_test` wrapper
so transport-parametrized tests can receive it.
Deats,
- skip cluster test when `tpt_proto == 'uds'` with
descriptive msg, add TODO about `@pytest.mark.no_tpt`.
- add `tpt_proto: str|None` param to inner wrapper in
`tractor_test()`, forward to decorated fn when its sig
accepts it.
- register custom `no_tpt` marker via `pytest_configure()`
to avoid unknown-marker warnings.
- add masked todo for `no_tpt` marker-check code in `tpt_proto` fixture
(needs fn-scope to work, left as TODO).
- add `request` param to `tpt_proto` fixture for future
marker inspection.
Also,
- add doc-string to `test_streaming_to_actor_cluster()`.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Forward the `tpt_proto` fixture val into spawned daemon
subprocesses via `run_daemon(enable_transports=..)` and
sync `_runtime_vars['_enable_tpts']` in the `tpt_proto`
fixture so sub-actors inherit the transport setting.
Deats,
- add `enable_transports={enable_tpts}` to the daemon
spawn-cmd template in `tests/conftest.py`.
- set `_state._runtime_vars['_enable_tpts']` in the
`tpt_proto` fixture in `_testing/pytest.py`.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Namely any CLI driven runtime-config fixtures such as,
- `--spawn-backend` and `start_method`,
- `--tpdb` and `debug_mode`,
- `--tpt-proto` and `tpt_protos`/`tpt_proto`,
- `reg_addr` as driven by the above.
This moves all fixtures and necessary hook funcs (CLI parsing,
configuring and test-gen) to the `._testing.pytest` module and thus
allows any dependent project to leverage these fixtures in their own
test suites after pointing to that plugin mod using,
```python
# conftest.py
pytest_plugins: tuple[str] = (
"tractor._testing.pytest",
)
```
Also, add a new `._testing.addr` helper mod which now contains
a factored `get_rando_addr()` helper for creating test-sesh unique
tpt-specific registry (or other) IPC endpoint addrs.
EventFD class now expects the fd to already be init with open_eventfd
RingBuff Sender and Receiver fully manage SharedMemory and EventFD lifecycles, no aditional ctx mngrs needed
Separate ring buf tests into its own test bed
Add parametrization to test and cancellation
Add docstrings
Add simple testing data gen module .samples
Just like we do from the `.devx._debug.open_crash_handler()`, this
allows checking various attrs on the raised `ContextCancelled` much like
`with pytest.raises() as excinfo:`.
Since I needed the `break_ipc()` helper from the
`examples/advanced_faults/ipc_failure_during_stream.py` used in the
`test_advanced_faults` suite, might as well move it into a pkg-wide
importable module. Also changed the default break method to be
`socket_close` which just calls `Stream.socket.close()` underneath in
`trio`.
Also tweak that example to not keep sending after the stream has been
broken since with new `trio` that will raise `ClosedResourceError` and
in the wrapping test we generally speaking want to see a hang and then
cancel via simulated user sent SIGINT/ctl-c.
Since importing from our top level `conftest.py` is not scaleable
or as "future forward thinking" in terms of:
- LoC-wise (it's only one file),
- prevents "external" (aka non-test) example scripts from importing
content easily,
- seemingly(?) can't be used via abs-import if using
a `[tool.pytest.ini_options]` in a `pyproject.toml` vs.
a `pytest.ini`, see:
https://docs.pytest.org/en/8.0.x/reference/customize.html#pyproject-toml)
=> Go back to having an internal "testing" pkg like `trio` (kinda) does.
Deats:
- move generic top level helpers into pkg-mod including the new
`expect_ctxc()` (which i needed in the advanced faults testing script.
- move `@tractor_test` into `._testing.pytest` sub-mod.
- adjust all the helper imports to be a `from tractor._testing import <..>`
Rework `test_ipc_channel_break_during_stream()` and backing script:
- make test(s) pull `debug_mode` from new fixture (which is now
controlled manually from `--tpdb` flag) and drop the previous
parametrized input.
- update logic in ^ test for "which-side-fails" cases to better match
recently updated/stricter cancel/failure semantics in terms of
`ClosedResouruceError` vs. `EndOfChannel` expectations.
- handle `ExceptionGroup`s with expected embedded errors in test.
- better pendantics around whether to expect a user simulated KBI.
- for `examples/advanced_faults/ipc_failure_during_stream.py` script:
- generalize ipc breakage in new `break_ipc()` with support for diff
internal `trio` methods and a #TODO for future disti frameworks
- only make one sub-actor task break and the other just stream.
- use new `._testing.expect_ctxc()` around ctx block.
- add a bit of exception handling with `print()`s around ctxc (unused
except if 'msg' break method is set) and eoc cases.
- don't break parent side ipc in loop any more then once
after first break, checked via flag var.
- add a `pre_close: bool` flag to control whether
`MsgStreama.aclose()` is called *before* any ipc breakage method.
Still TODO:
- drop `pytest.ini` and add the alt section to `pyproject.py`.
-> currently can't get `--rootdir=` opt to work.. not showing in
console header.
-> ^ also breaks on 'tests' `enable_modules` imports in subactors
during discovery tests?