Compare commits

253 Commits

Author SHA1 Message Date
Gud Boi b524ee4633 Bump `xonsh` to latest pre `0.23` release 2026-04-23 18:47:36 -04:00
Gud Boi b1a0753a3f Expand `/run-tests` venv pre-flight to cover all cases
Rework section 3 from a worktree-only check into a
structured 3-step flow: detect active venv, interpret
results (Case A: active, B: none, C: worktree), then
run import + collection checks.

Deats,
- Case B prompts via `AskUserQuestion` when no venv
  is detected, offering `uv sync` or manual activate
- add `uv run` fallback section for envs where venv
  activation isn't practical
- new allowed-tools: `uv run python`, `uv run pytest`,
  `uv pip show`, `AskUserQuestion`

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-23 18:47:36 -04:00
Gud Boi ba86d482e3 Add `lastfailed` cache inspection to `/run-tests` skill
New "Inspect last failures" section reads the pytest
`lastfailed` cache JSON directly — instant, no
collection overhead, and filters to `tests/`-prefixed
entries to avoid stale junk paths.

Also,
- add `jq` tool permission for `.pytest_cache/` files

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-23 18:47:36 -04:00
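
The `lastfailed` inspection from ba86d482e3 can be sketched in Python, assuming pytest's default cache location `.pytest_cache/v/cache/lastfailed` and a JSON object keyed by test node ID. The helper name `load_last_failed` is hypothetical; the skill itself reads the file with `jq`.

```python
import json
from pathlib import Path


def load_last_failed(
    cache_root: Path = Path('.pytest_cache'),
    prefix: str = 'tests/',
) -> list[str]:
    '''
    Return node IDs from pytest's `lastfailed` cache which start
    with `prefix`; hypothetical helper, not part of the skill.

    '''
    cache_file = cache_root / 'v' / 'cache' / 'lastfailed'
    if not cache_file.is_file():
        return []  # no recorded failures (or no prior run)

    entries: dict[str, bool] = json.loads(cache_file.read_text())
    # drop entries outside the test tree, e.g. stale paths
    return sorted(
        nodeid for nodeid in entries
        if nodeid.startswith(prefix)
    )
```

Reading the cache file directly avoids a collection pass, which is the speed-up the commit message describes.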
Gud Boi d3d6f646f9 Reorganize `.gitignore` by skill/purpose
Group `.claude/` ignores per-skill instead of a
flat list: `ai.skillz` symlinks, `/open-wkt`,
`/code-review-changes`, `/pr-msg`, `/commit-msg`.
Add missing symlink entries (`yt-url-lookup` ->
`resolve-conflicts`, `inter-skill-review`). Drop
stale `Claude worktrees` section (already covered
by `.claude/wkts/`).

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-23 18:47:36 -04:00
Gud Boi 9cf3d588e7 Ignore notes & snippets subdirs in `git` 2026-04-23 18:47:36 -04:00
Bd e75e29b1dc
Merge pull request #444 from goodboy/spawn_modularize
Spawner modules: split up subactor spawning backends
2026-04-23 18:42:33 -04:00
Gud Boi a7b1ee34ef Restore fn-arg `_runtime_vars` in `trio_proc` teardown
During the Phase A extraction of `trio_proc()` out of
`spawn._spawn` into its own submod, the
`debug.maybe_wait_for_debugger(child_in_debug=...)` call site in
the hard-reap `finally` got refactored from the original
`_runtime_vars.get('_debug_mode', ...)` (the fn parameter — the
dict that was constructed by the *parent* for the *child*'s
`SpawnSpec`) to `get_runtime_vars().get(...)` (a global getter that
returns the *parent's* live `_state`). Those are semantically
different — the first asks "is the child we just spawned in debug
mode?", the second asks "are *we* in debug mode?". Under
mixed-debug-mode trees the swap can incorrectly skip (or
unnecessarily delay) the debugger-lock wait during teardown.

Revert to the fn-parameter lookup and add an inline `NOTE` comment
calling out the distinction so it's harder to regress again.

Deats,
- `spawn/_trio.py`: `child_in_debug=get_runtime_vars().get(...)` →
  `child_in_debug=_runtime_vars.get(...)` at the
  `debug.maybe_wait_for_debugger(...)` call in the hard-reap block;
  add 4-line `NOTE` explaining the parent-vs-child distinction.
- `spawn/__init__.py`: drop trailing whitespace after the
  `'mp_forkserver'` docstring bullet.
- `ai/prompt-io/prompts/subints_spawner.md`: drop duplicated `with`
  in `"as with with subprocs"` prose (copilot grammar catch).

Review: PR #444 (Copilot)
https://github.com/goodboy/tractor/pull/444#pullrequestreview-4165928469

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-23 18:30:11 -04:00
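
The parent-vs-child distinction restored in a7b1ee34ef can be illustrated with plain dicts. Everything below is a hypothetical stand-in (`parent_state`, `child_in_debug_mode`, `child_vars`); only the names `get_runtime_vars` and `_runtime_vars` come from the commit message, and the real objects carry far more state.

```python
# Hypothetical stand-ins: the parent's live state vs the vars
# dict the parent built for one specific child.
parent_state: dict[str, bool] = {'_debug_mode': False}


def get_runtime_vars() -> dict[str, bool]:
    # global getter: always answers for *this* (parent) process
    return parent_state


def child_in_debug_mode(_runtime_vars: dict[str, bool]) -> bool:
    # fn-parameter lookup: answers for the child being reaped
    return _runtime_vars.get('_debug_mode', False)


child_vars = {'_debug_mode': True}  # child spawned w/ debug on
```

In a mixed-debug-mode tree like this one, the global getter reports `False` while the per-child dict reports `True`, so only the fn-parameter lookup would wait on the debugger lock.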
Gud Boi ae5b63c0bc Bump to `msgspec>=0.21.0` in lock file 2026-04-17 19:28:11 -04:00
Gud Boi f75865fb2e Tidy `spawn/` subpkg docstrings and imports
Drop unused `TYPE_CHECKING` imports (`Channel`,
`_server`), remove commented-out `import os` in
`_entry.py`, and use `get_runtime_vars()` accessor
instead of bare `_runtime_vars` in `_trio.py`.

Also,
- freshen `__init__.py` layout docstring for the
  new per-backend submod structure
- update `_spawn.py` + `_trio.py` module docstrings

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-17 19:03:00 -04:00
Gud Boi e0b8f23cbc Add prompt-io files for "phase-A", fix typos caught by copilot 2026-04-17 18:26:41 -04:00
Gud Boi 8d662999a4 Bump to `msgspec>=0.21` for py314 support 2026-04-17 16:54:07 -04:00
Gud Boi d7ca68cf61 Mv `trio_proc`/`mp_proc` to per-backend submods
Split the monolithic `spawn._spawn` into a slim
"core" + per-backend submodules so a future
`._subint` backend (per issue #379) can drop in
without piling more onto `_spawn.py`.

`._spawn` retains the cross-backend supervisor
machinery: `SpawnMethodKey`, `_methods` registry,
`_spawn_method`/`_ctx` state, `try_set_start_method()`,
the `new_proc()` dispatcher, and the shared helpers
`exhaust_portal()`, `cancel_on_completion()`,
`hard_kill()`, `soft_kill()`, `proc_waiter()`.

Deats,
- mv `trio_proc()` → new `spawn._trio`
- mv `mp_proc()` → new `spawn._mp`, reads `_ctx` and
  `_spawn_method` via `from . import _spawn` for
  late binding bc both get mutated by
  `try_set_start_method()`
- `_methods` wires up the new submods via late
  bottom-of-module imports to side-step circular
  dep (both backend mods pull shared helpers from
  `._spawn`)
- prune now-unused imports from `_spawn.py` — `sys`,
  `is_root_process`, `current_actor`,
  `is_main_process`, `_mp_main`, `ActorFailure`,
  `pretty_struct`, `_pformat`

Also,
- `_testing.pytest.pytest_generate_tests()` now
  drives the valid-backend set from
  `typing.get_args(SpawnMethodKey)` so adding a
  new backend (e.g. `'subint'`) doesn't require
  touching the harness
- refresh `spawn/__init__.py` docstring for the
  new layout

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-17 16:48:22 -04:00
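
The harness change in d7ca68cf61 relies on `typing.get_args()` returning the members of a `Literal` type. A minimal sketch, assuming `SpawnMethodKey` is a `Literal` of backend names; the member list here is illustrative and may not match the real definition in `spawn._spawn`.

```python
from typing import Literal, get_args

# Hypothetical copy of the key type; the real members live in
# `spawn._spawn` and may differ.
SpawnMethodKey = Literal['trio', 'mp_spawn', 'mp_forkserver']


def valid_backends() -> tuple[str, ...]:
    # the `Literal` members, in declaration order
    return get_args(SpawnMethodKey)
```

Adding a new member to the `Literal` then extends the valid-backend set without any edit to the test harness.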
Gud Boi b5b0504918 Add prompt-IO log for subint spawner design kickoff
Log the `claude-opus-4-7` design session that produced the phased plan
(A: modularize `_spawn`, B: `_subint` backend, C: harness) and concrete
Phase A file-split for #379. Substantive bc the plan directly drives
upcoming impl.

Prompt-IO: ai/prompt-io/claude/20260417T034918Z_9703210_prompt_io.md

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-17 16:48:22 -04:00
Gud Boi de78a6445b Initial prompt to vibe subint support Bo 2026-04-17 16:48:18 -04:00
Bd 5c98ab1fb6
Merge pull request #429 from goodboy/multiaddr_support
Multiaddresses: a novel `libp2p` peep's idea worth embracing
2026-04-16 23:59:11 -04:00
Gud Boi 3867403fab Scale `test_open_local_sub_to_stream` timeout by CPU factor
Import and apply `cpu_scaling_factor()` from
`conftest`; bump base from 3.6 -> 4 and multiply
through so CI boxes with slow CPUs don't flake.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-16 20:03:32 -04:00
Gud Boi 7c8e5a6732 Drop `snippets/multiaddr_ex.py` scratch script
Since we no longer need the example after integrating `multiaddr` into
the `.discovery` subsys.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-16 17:45:38 -04:00
Gud Boi 3152f423d8 Condense `.raw.md` prompt-IO logs, add `diff_cmd` refs
Replace verbose inline code dumps in `.raw.md`
entries with terse summaries and `git diff`
cmd references. Add `diff_cmd` metadata to each
entry's YAML frontmatter so readers can reproduce
the actual output diff.

Also,
- rename `multiaddr_declare_eps.md_` -> `.md`
  (drop trailing `_` suffix)

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-16 17:44:14 -04:00
Gud Boi ed65301d32 Fix misc bugs caught by Copilot review
Deats,
- use `proc.poll() is None` in `sig_prog()` to
  distinguish "still running" from exit code 0;
  drop stale `breakpoint()` from fallback kill
  path (would hang CI).
- add missing `raise` on the `RuntimeError` in
  `async_main()` when no tpt bind addrs given.
- clean up stale uid entries from the registrar
  `_registry` when addr eviction empties the
  addr list.
- update `discovery.__init__` docstring to match
  the new eager `._multiaddr` import.
- fix `registar` -> `registrar` typo in teardown
  report log msg.

Review: PR #429 (Copilot)
https://github.com/goodboy/tractor/pull/429

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:15 -04:00
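
The `proc.poll() is None` fix in ed65301d32 matters because `Popen.poll()` returns `None` while the child runs and the integer exit code once it has ended, so a truthiness check conflates "still running" with "exited 0". A small stdlib-only demonstration (not the actual `sig_prog()` code):

```python
import subprocess
import sys

# a child which exits immediately with code 0
proc = subprocess.Popen([sys.executable, '-c', 'pass'])
proc.wait()

exit_code = proc.poll()
# truthiness can't tell "exited with 0" from "still running"
looks_running_by_truthiness = not exit_code
still_running = exit_code is None
```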
Gud Boi 8817032c90 Prefer fresh conn for unreg, fallback to `_parent_chan`
The prior approach eagerly reused `_parent_chan` when
parent IS the registrar, but that channel may still
carry ctx/stream teardown protocol traffic —
concurrent `unregister_actor` RPC causes protocol
conflicts. Now try a fresh `get_registry()` conn
first; only fall back to the parent channel on
`OSError` (listener already closed/unlinked).

Deats,
- fresh `get_registry()` is the primary path for
  all addrs regardless of `parent_is_reg`
- `OSError` handler checks `parent_is_reg` +
  `rent_chan.connected()` before fallback
- fallback catches `OSError` and
  `trio.ClosedResourceError` separately
- drop unused `reg_addr: Address` annotation

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:15 -04:00
Gud Boi 70dc60a199 Bump UDS `listen()` backlog 1 -> 128 for multi-actor unreg
A backlog of 1 caused `ECONNREFUSED` when multiple
sub-actors simultaneously connect to deregister from
a remote-daemon registrar. Now matches the TCP
transport's default backlog (~128).

Also,
- add cross-ref comments between
  `_uds.close_listener()` and `async_main()`'s
  `parent_is_reg` deregistration path explaining
  the UDS socket-file lifecycle

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:15 -04:00
Gud Boi cd287c7e93 Fix `test_registrar_merge_binds_union` for UDS collision
`get_random()` can produce the same UDS filename for a given
pid+actor-state, so the "disjoint addrs" premise doesn't always hold.
Gate the `len(bound) >= 2` assertion on whether the registry and bind
addrs actually differ via `expect_disjoint`.

Also,
- drop unused `partial` import

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:15 -04:00
Gud Boi 7b04b2cdfc Reuse `_parent_chan` to unregister from parent-registrar
When the parent actor IS the registrar, reuse the existing parent
channel for `unregister_actor` RPC instead of opening a new connection
via `get_registry()`. This avoids failures when the registrar's listener
socket is already closed during teardown (e.g. UDS transport unlinks the
socket file rapidly).

Deats,
- detect `parent_is_reg` by comparing `_parent_chan.raddr` against
  `reg_addrs` and if matched, create a `Portal(rent_chan)` directly
  instead of `async with get_registry()`.
- rename `failed` -> `failed_unreg` for clarity.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 75b07c4b7c Show trailing bindspace-path-div in `repr(UDSAddress)` 2026-04-14 19:54:14 -04:00
Gud Boi 86d4e0d3ed Harden `sig_prog()` retries, adjust debugger test timeouts
Retry signal delivery in `sig_prog()` up to `tries`
times (default 3) w/ `canc_timeout` sleep between
attempts; only fall back to `_KILL_SIGNAL` after all
retries exhaust. Bump default timeout 0.1 -> 0.2.

Also,
- `test_multi_nested_subactors_error_through_nurseries`
  gives the first prompt iteration a 5s timeout even
  on linux bc the initial crash sequence can be slow
  to arrive at a `pdb` prompt

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
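
The retry-then-kill flow from 86d4e0d3ed can be sketched generically. This is an assumption-laden outline, not the real `sig_prog()`: the function name and the `send`, `is_done` and `kill` callables are hypothetical, while `tries=3` and the `0.2` default come from the commit message.

```python
import time
from typing import Callable


def signal_with_retries(
    send: Callable[[], None],
    is_done: Callable[[], bool],
    kill: Callable[[], None],
    tries: int = 3,
    canc_timeout: float = 0.2,
) -> bool:
    '''
    Return `True` if the target stopped after a polite signal,
    `False` if the hard-kill fallback was needed.

    '''
    for _ in range(tries):
        send()
        time.sleep(canc_timeout)
        if is_done():
            return True

    kill()  # only after every retry is exhausted
    return False
```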
Gud Boi ccb013a615 Add `prefer_addr()` transport selection to `_api`
New locality-aware addr preference for multihomed
actors: UDS > local TCP > remote TCP. Uses
`ipaddress` + `socket.getaddrinfo()` to detect
whether a `TCPAddress` is on the local host.

Deats,
- `_is_local_addr()` checks loopback or
  same-host IPs via interface enumeration
- `prefer_addr()` classifies an addr list into
  three tiers and picks the latest entry from
  the highest-priority non-empty tier
- `query_actor()` and `wait_for_actor()` now
  call `prefer_addr()` instead of grabbing
  `addrs[-1]` or a single pre-selected addr

Also,
- `Registrar.find_actor()` returns full
  `list[UnwrappedAddress]|None` so callers can
  apply transport preference

Prompt-IO: ai/prompt-io/claude/20260414T163300Z_befedc49_prompt_io.md

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi c3d6cc9007 Rename `discovery._discovery` to `._api`
Adjust all imports to match.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi cb7b76c44f Use multi-addr `dict` registry, drop `bidict`
Replace `Registrar._registry: bidict[uid, addr]`
with `dict[uid, list[UnwrappedAddress]]` to
support actors binding on multiple transports
simultaneously (multi-homed).

Deats,
- `find_actor_addr()` returns first addr from
  the uid's list
- `get_registry()` now returns per-uid addr
  lists
- `find_actor_addrs()` uses `.extend()` to
  collect all addrs for a given actor name
- `register_actor_addr()` appends to the uid's
  list (dedup'd) and evicts stale entries where
  a different uid claims the same addr
- `delete_actor_addr()` does a linear scan +
  `.remove()` instead of `bidict.inverse.pop()`;
  deletes the uid entry entirely when no addrs
  remain

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
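
The multi-addr registry semantics from cb7b76c44f can be sketched with a module-level dict. The type aliases and the free-function form are hypothetical simplifications of the `Registrar` methods; the sketch also deletes a uid emptied by eviction, which the log lists as a later fix (ed65301d32).

```python
Uid = tuple[str, str]
Addr = tuple[str, int]  # hypothetical unwrapped TCP-style addr

registry: dict[Uid, list[Addr]] = {}


def register_actor_addr(uid: Uid, addr: Addr) -> None:
    # evict stale entries where a different uid claims `addr`
    for other in list(registry):
        if other != uid and addr in registry[other]:
            registry[other].remove(addr)
            if not registry[other]:
                del registry[other]

    addrs = registry.setdefault(uid, [])
    if addr not in addrs:  # dedup
        addrs.append(addr)


def delete_actor_addr(addr: Addr) -> None:
    # linear scan; drop the uid entry once its list is empty
    for uid, addrs in list(registry.items()):
        if addr in addrs:
            addrs.remove(addr)
            if not addrs:
                del registry[uid]
            return
```

The linear scan costs O(n) per delete, the price of giving up the `bidict` inverse mapping in exchange for multiple addrs per uid.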
Gud Boi 23677f8a3c Use distinct startup report for registrar vs client
Set `header` to "Contacting existing registry"
for non-registrar actors and "Opening new
registry" for registrars, so the boot log
reflects the actual role.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 06ff2dd5f2 Permit the `prompt-io` skill by default 2026-04-14 19:54:14 -04:00
Gud Boi a891e003b2 Expose `_multiaddr` API from `tractor.discovery`
Re-export `parse_endpoints`, `parse_maddr`, and
`mk_maddr` in `discovery.__init__` so downstream
(piker) can import directly from the pkg ns.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi e90241baaa Add `parse_endpoints()` to `_multiaddr`
Provide a service-table parsing API for downstream projects (like
`piker`) to declare per-actor transport bind addresses as a config map
of actor-name -> multiaddr strings (e.g. from a TOML `[network]`
section).

Deats,
- `EndpointsTable` type alias: input `dict[str, list[str|tuple]]`.
- `ParsedEndpoints` type alias: output `dict[str, list[Address]]`.
- `parse_endpoints()` iterates the table and delegates each entry to the
  existing `tractor.discovery._discovery.wrap_address()` helper, which
  handles maddr strings, raw `(host, port)` tuples, and pre-wrapped
  `Address` objs.
- UDS maddrs use the multiaddr spec name `/unix/...` (not tractor's
  internal `/uds/` proto_key)

Also add new tests,
- 7 new pure unit tests (no trio runtime): TCP-only, mixed tpts,
  unwrapped tuples, mixed str+tuple, unsupported proto (`/udp/`),
  empty table, empty actor list
- all 22 multiaddr tests pass rn.

Prompt-IO:
ai/prompt-io/claude/20260413T205048Z_269d939c_prompt_io.md

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 7079a597c5 Add `test_tpt_bind_addrs.py` + fix type-mixing bug
Add 9 test variants (6 fns) covering all three
`tpt_bind_addrs` code paths in `open_root_actor()`:
- registrar w/ explicit bind (eq, subset, disjoint)
- non-registrar w/ explicit bind (same/diff
  bindspace) using `daemon` fixture
- non-registrar default random bind (baseline)
- maddr string input parsing
- registrar merge produces union
- `open_nursery()` forwards `tpt_bind_addrs`

Fix type-mixing bug at `_root.py:446` where the
registrar merge path did `set(Address + tuple)`,
preventing dedup and causing double-bind `OSError`.
Wrap `uw_reg_addrs` before the set union so both
sides are `Address` objs.

Also,
- add prompt-io output log for this session
- stage original prompt input for tracking

Prompt-IO: ai/prompt-io/claude/20260413T192116Z_f851f28_prompt_io.md

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
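
The type-mixing bug fixed in 7079a597c5 comes down to a wrapped object and a raw tuple never comparing equal inside a `set`. A self-contained illustration with a hypothetical `Address` dataclass and a simplified `wrap_address()` (the real types differ):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    # hypothetical stand-in for a wrapped transport address
    host: str
    port: int


def wrap_address(addr: 'Address | tuple[str, int]') -> Address:
    return addr if isinstance(addr, Address) else Address(*addr)


bind_addrs = [Address('127.0.0.1', 1616)]
uw_reg_addrs = [('127.0.0.1', 1616)]  # same endpoint, unwrapped

mixed = set(bind_addrs) | set(uw_reg_addrs)
wrapped = set(bind_addrs) | {wrap_address(a) for a in uw_reg_addrs}
```

`mixed` keeps both entries for the one endpoint, which is the double-bind precondition; `wrapped` collapses them to a single `Address`.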
Gud Boi bc60aa1ec5 Add `tpt_bind_addrs` param to `open_root_actor()`
Allow callers to explicitly declare transport
bind addrs instead of always auto-generating
random ones from ponged registrar addresses.

Deats,
- new `tpt_bind_addrs` kwarg wraps each input
  addr via `wrap_address()` at init time.
- non-registrar path only auto-generates random
  bind addrs when `tpt_bind_addrs` is empty.
- registrar path merges user-provided bind addrs
  with `uw_reg_addrs` via `set()` union.
- drop the deprecated `arbiter_addr` param and
  its `DeprecationWarning` shim entirely.

Also,
- expand `registry_addrs` type annotation to
  `Address|UnwrappedAddress`.
- replace bare `assert accept_addrs` in
  `async_main()` with a descriptive
  `RuntimeError` msg.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi f881683c97 Tweak timeouts and rm `arbiter_addr` in tests
Use `cpu_scaling_factor()` headroom in
`test_peer_spawns_and_cancels_service_subactor`'s `fail_after` to avoid
flaky timeouts on throttled CI runners. Rename `arbiter_addr=` ->
`registry_addrs=[..]` throughout `test_spawning` and
`test_task_broadcasting` suites to match the current `open_root_actor()`
/ `open_nursery()` API.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 490fac432c Preserve absolute UDS paths in `parse_maddr()`
Drop the `.lstrip('/')` on the unix protocol value
so the lib-prepended `/` restores the absolute-path
semantics that `mk_maddr()` strips when encoding.
Pass `Path` components (not `str`) to `UDSAddress`.

Also, update all UDS test params to use absolute
paths (`/tmp/tractor_test/...`, `/tmp/tractor_rt/...`)
matching real runtime sockpath behavior; tighten
`test_parse_maddr_uds` to assert exact `filedir`.

Review: PR #429 (copilot-pull-request-reviewer[bot])
https://github.com/goodboy/tractor/pull/429#pullrequestreview-4018448152

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 5f6e45e1d4 Fix `mk_maddr()` crash on absolute UDS paths
Strip leading `/` from `filepath` before building
the `/unix/{path}` multiaddr string; OW absolute
sockpaths like `/run/user/1000/tractor/foo.sock`
produce `/unix//run/..` which `py-multiaddr`
rejects as "empty protocol path".

Woops, missed this in the initial `mk_maddr()` impl
bc the unit tests only used relative `filedir` values
(which was even noted in a comment..). The bug only
surfaces when the `.maddr` property on `UDSTransport`
is hit during logging/repr with real runtime addrs.

Found-via: cross-suite `pytest tests/ipc/ tests/msg/`
where `tpt_proto='uds'` leaks into msg tests

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
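
The double-slash problem from 5f6e45e1d4 is visible with plain string formatting. The helper name `unix_maddr_str` is hypothetical, and the sketch only builds the string; it does not call `py-multiaddr`.

```python
def unix_maddr_str(sockpath: str) -> str:
    # hypothetical helper: strip the leading `/` so an absolute
    # path doesn't yield an empty `//` component
    return f"/unix/{sockpath.lstrip('/')}"


abs_path = '/run/user/1000/tractor/foo.sock'
naive = f'/unix/{abs_path}'
fixed = unix_maddr_str(abs_path)
```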
Gud Boi 5c4438bacc Add `parse_maddr()` tests + registrar maddr integ test
Cover `parse_maddr()` with unit tests for tcp/ipv4,
tcp/ipv6, uds, and unsupported-protocol error paths,
plus full `addr -> mk_maddr -> str -> parse_maddr`
roundtrip verification.

Adds,
- a `_maddr_to_tpt_proto` inverse-mapping assertion.
- a `wrap_address()` maddr-string acceptance test.
- a `test_reg_then_unreg_maddr` end-to-end suite which audits passing
  the registry addr as multiaddr str through the entire runtime.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 90ba0e3658 Add `parse_maddr()` + `str` arm in `wrap_address()`
Inverse of `mk_maddr()`: parse a multiaddr string like
`/ip4/127.0.0.1/tcp/1234` back into a tractor `Address`.

Deats,
- add `_maddr_to_tpt_proto` reverse mapping dict
- add `parse_maddr()` fn dispatching on protocol
  combo: `[ip4|ip6, tcp]` -> `TCPAddress`,
  `[unix]` -> `UDSAddress`
- strip leading `/` the multiaddr lib prepends to
  unix protocol values for correct round-trip
- add `str` match case in `wrap_address()` for
  `/`-prefixed multiaddr strings, broaden type hint
  to `UnwrappedAddress|str`

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 1f1e09a786 Move `test_discovery` to `tests/discovery/test_registrar`
All tests are registrar-actor integration scenarios
sharing intertwined helpers + `enable_modules=[__name__]`
task fns, so keep as one mod but rename to reflect
content. Now lives alongside `test_multiaddr.py` in
the new `tests/discovery/` subpkg.

Also,
- update 5 refs in `/run-tests` SKILL.md to match
  the new path
- add `discovery/` subdir to the test directory
  layout tree

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 7cf3b5d00d Add `test_multiaddr.py` suite for `mk_maddr()`
Cover `_tpt_proto_to_maddr` mapping, TCP (ipv4/ipv6),
UDS, unsupported `proto_key` error, and round-trip
re-parse for both transport types.

Deats,
- new `tests/discovery/` subpkg w/ empty `__init__.py`
- `test_tpt_proto_to_maddr_mapping`: verify `tcp` and
  `uds` entries
- `test_mk_maddr_tcp_ipv4`: full assertion on
  `/ip4/127.0.0.1/tcp/1234` incl protocol iteration
- `test_mk_maddr_tcp_ipv6`: verify `/ip6/::1/tcp/5678`
- `test_mk_maddr_uds`: relative `filedir` bc the
  multiaddr parser rejects double-slash from abs paths
- `test_mk_maddr_unsupported_proto_key`: `ValueError`
  on `proto_key='quic'` via `SimpleNamespace` mock
- `test_mk_maddr_roundtrip`: parametrized over tcp +
  uds, re-parse `str(maddr)` back through `Multiaddr`

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi c72d495d68 Use `_tpt_proto_to_maddr` lookup in `mk_maddr()`
Address Copilot review: the mapping table was
defined but never referenced. Now `mk_maddr()`
resolves `proto_key` -> maddr protocol name via
the table and rejects unknown keys upfront.

Also add missing `Path` import to the `multiaddr`
usage snippet.

Review: PR #429 (Copilot)
https://github.com/goodboy/tractor/pull/429#pullrequestreview-4010456884

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 926e861f52 Use upstream `py-multiaddr` for `._multiaddr`
Drop the NIH (not-invented-here) custom parser (`parse_maddr()`,
`iter_prot_layers()`, `prots`/`prot_params` tables) which was never
called anywhere in the codebase.

Replace with a thin `mk_maddr()` factory that wraps the upstream
`multiaddr.Multiaddr` type, dispatching on `Address.proto_key` to build
spec-compliant paths.

Deats,
- `'tcp'` addrs detect ipv4 vs ipv6 via stdlib
  `ipaddress` (resolves existing TODO)
- `'uds'` addrs map to `/unix/{path}` per the
  multiformats protocol registry (code 400)
- fix UDS `.maddr` to include full sockpath
  (previously only used `filedir`, dropped filename)
- standardize protocol names: `ipv4`->`ip4`,
  `uds`->`unix`
- `.maddr` properties now return `Multiaddr` objs
  (`__str__()` gives the canonical path form so all
  existing f-string/log consumers work unchanged)
- update `MsgTransport` protocol hint accordingly

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi d9cb38372f Add `multiaddr` dep to `pyproject.toml`
Bump lock file to match obvi.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi b4c975b48b Add `multiaddr` usage snippet for IP4 and UDS
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 8344537aa6 Add `uds` to `._multiaddr`, tweak typing 2026-04-14 19:54:14 -04:00
Bd 3a9f4ea383
Merge pull request #434 from mahmoudhas/add-replay-parent-main-opt-out
Add per-actor parent-main inheritance opt-out
2026-04-10 21:29:42 -04:00
mahmoud ca1b01f926 mpi integration test 2026-04-10 20:58:54 -04:00
Gud Boi 570c975f14 Fix test typo + use `wait_for_result()` API
- "boostrap" → "bootstrap" in mod docstring
- replace deprecated `portal.result()` with
  `portal.wait_for_result()` + value assertion
  inside the nursery block

Review: PR #1 (Copilot)
https://github.com/mahmoudhas/tractor/pull/1#pullrequestreview-4091096072

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 20:58:54 -04:00
Gud Boi a0a7668670 Fix typos + typing in `_mp_fixup_main`
- "spawing" → "spawning", close unbalanced
  backtick on `` `start_method='trio'` ``
- "uneeded" → "unneeded", "deats" → "details"
- Remove double `d` annotation; filter
  `get_preparation_data()` result into only
  `ParentMainData` keys before returning
- Use `pop('authkey', None)` for safety

Review: PR #1 (Copilot)
https://github.com/mahmoudhas/tractor/pull/1#pullrequestreview-4091096072

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 20:58:54 -04:00
Gud Boi 27bf566d75 Guard `_mp_fixup_main` on non-empty parent data
Use walrus `:=` to combine the assignment and
truthiness check for `_parent_main_data` into the
`if` condition, cleanly skipping the fixup block
when `inherit_parent_main=False` yields `{}`.

Review: PR #438 (Copilot)
https://github.com/goodboy/tractor/pull/438

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 20:58:54 -04:00
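
The walrus guard from 27bf566d75 folds the assignment and the emptiness check into one condition. A minimal sketch with a hypothetical `maybe_fixup_main()` and a plain `dict` standing in for the spawn spec:

```python
def maybe_fixup_main(spec: dict) -> str:
    # hypothetical stand-in for the child-side bootstrap step
    if (parent_main_data := spec.get('_parent_main_data', {})):
        return f'fixup from {sorted(parent_main_data)}'

    return 'skipped'
```

An empty `{}`, as produced when `inherit_parent_main=False`, is falsy, so the fixup branch is skipped without a separate check.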
Gud Boi e8f1eca8d2 Tighten `test_spawning` types, parametrize loglevel
Parametrize `test_loglevel_propagated_to_subactor`
across `'debug'`, `'cancel'`, `'critical'` levels
(was hardcoded to just `'critical'`) and move it
above the parent-main tests for logical grouping.

Also,
- add `start_method: str` annotations throughout
- use `portal.wait_for_result()` in
  `test_most_beautiful_word` (replaces `.result()`)
- expand mod docstring to describe test coverage
- reformat `check_parent_main_inheritance` docstr

Review: PR #438 (Copilot)
https://github.com/goodboy/tractor/pull/438

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 20:58:54 -04:00
Gud Boi 656c6c30d1 Delegate `_mp_fixup_main` to stdlib `mp.spawn`
Drop hand-copied `_fixup_main_from_name()` and `_fixup_main_from_path()`
in favor of direct re-exports from `multiprocessing.spawn`. Simplify
`_mp_figure_out_main()` to call stdlib's `get_preparation_data()`
instead of reimplementing `__main__` module inspection inline.

Also,
- drop `ORIGINAL_DIR` global and `os`, `sys`, `platform`, `types`,
  `runpy` imports.
- pop `authkey` from prep data (unserializable and unneeded by our spawn
  path).
- update mod docstring to reflect delegation.

Review: PR #438 (Copilot)
https://github.com/goodboy/tractor/pull/438

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 20:58:54 -04:00
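
The delegation in 656c6c30d1 rests on `multiprocessing.spawn.get_preparation_data()`, the stdlib helper that gathers `__main__` bootstrap info for spawned children. A rough sketch, assuming the retained keys are `init_main_from_name` and `init_main_from_path`; the wrapper name and the key filter are hypothetical and may not match `_mp_figure_out_main()`.

```python
import multiprocessing.spawn as mp_spawn


def figure_out_main(name: str = 'child') -> dict:
    # hypothetical wrapper; `get_preparation_data()` is the stdlib
    # helper `multiprocessing` itself uses when spawning
    data = mp_spawn.get_preparation_data(name)
    # the auth key isn't needed here and resists pickling
    data.pop('authkey', None)
    return {
        key: val for key, val in data.items()
        if key in ('init_main_from_name', 'init_main_from_path')
    }
```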
Gud Boi acf6568275 Clarify `inherit_parent_main` docstring scope
Note the opt-out only applies to the trio spawn
backend; `multiprocessing` `spawn`/`forkserver`
reconstruct `__main__` via stdlib bootstrap.

Review: PR #438 (Copilot)
https://github.com/goodboy/tractor/pull/438

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 20:58:54 -04:00
Gud Boi c6c591e61a Drop `spawn_test_support` pkg, inline parent-main tests
Replace the subproc-based test harness with inline
`tractor.open_nursery()` calls that directly check
`actor._parent_main_data` instead of comparing
`__main__.__name__` across a process boundary
(which is a no-op under pytest bc the parent
`__main__` is `pytest.__main__`).

Deats,
- delete `tests/spawn_test_support/` pkg (3 files)
- add `check_parent_main_inheritance()` helper fn
  that asserts on `_parent_main_data` emptiness
- rewrite both `run_in_actor` and `start_actor`
  parent-main tests as inline async fns
- drop `tmp_path` fixture and unused imports

Review: PR #434 (goodboy, Copilot)
https://github.com/goodboy/tractor/pull/434

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 20:58:54 -04:00
mahmoud b883b27646 Exercise parent-main inheritance through spawn test support
Move the subprocess probe into dedicated spawn test support files so the inheritance tests cover the real __main__ replay path without monkeypatching or inline script strings.
2026-04-10 20:58:54 -04:00
mahmoud 00637764d9 Address review follow-ups for parent-main inheritance opt-out
Clean up mutable defaults, give parent-main bootstrap data a named type, and add direct start_actor coverage so the opt-out change is clearer to review.
2026-04-10 20:58:54 -04:00
mahmoud ea971d25aa Rename parent-main inheritance flag.
Use `inherit_parent_main` across the actor APIs and helper to better describe the behavior, and restore the reviewer note at child bootstrap where the inherited `__main__` data is copied from `SpawnSpec`.
2026-04-10 20:58:54 -04:00
mahmoud 83b6c4270a Simplify parent-main replay opt-out.
Keep actor-owned parent-main capture and let `_mp_figure_out_main()` decide whether to return `__main__` bootstrap data, avoiding the extra SpawnSpec plumbing while preserving the per-actor flag.
2026-04-10 20:58:54 -04:00
mahmoud 6309c2e6fc Route parent-main replay through SpawnSpec
Keep trio child bootstrap data in the spawn handshake instead of stashing it on Actor state so the replay opt-out stays explicit and avoids stale-looking runtime fields.
2026-04-10 20:58:54 -04:00
mahmoud f5301d3fb0 Add per-actor parent-main replay opt-out
Let actor callers skip replaying the parent __main__ during child startup so downstream integrations can avoid inheriting incompatible bootstrap state without changing the default spawn behavior.
2026-04-10 20:58:54 -04:00
Bd 9f8e9eb739
Merge pull request #440 from goodboy/moar_skillz_squashed
Moar skillz (squashed): refinements, factorings and extensions
2026-04-10 18:31:45 -04:00
Gud Boi 6b04650187 Widen `allowed-tools` and dedup `settings.local`
Expand `run-tests` skill `allowed-tools` to cover
the documented pre-flight workflow: `git rev-parse`
for worktree detection, `python --version`, and
`UV_PROJECT_ENVIRONMENT=py* uv sync` for venv
setup. Also dedup `gh api`/`gh pr` entries in
`settings.local.json` and widen `py313` → `py*`
so non-3.13 setups aren't blocked.

Review: PR #440 (copilot-pull-request-reviewer)
https://github.com/goodboy/tractor/pull/440

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 18:21:45 -04:00
Gud Boi a0607f363e Allow open/close-wkt skills for model 2026-04-10 16:37:34 -04:00
Gud Boi 2ee86ddb1a Migrate shared skills to `ai.skillz` symlinks
Drop inline `commit-msg/SKILL.md` — now deployed
as a symlink from the central `ai.skillz` repo via
`deploy-skill.sh`.

Gitignore all symlinked skill dirs so they stay
machine-local:
- fully-symlinked: `py-codestyle`, `close-wkt`,
  `open-wkt`, `plan-io`, `prompt-io`,
  `code-review-changes`, `resolve-conflicts`,
  `inter-skill-review`, `yt-url-lookup`
- hybrid (symlinked SKILL.md + references):
  `commit-msg/SKILL.md`, `pr-msg/SKILL.md`,
  `pr-msg/references`

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 16:37:34 -04:00
Gud Boi 0286d36ed7 Add repo-local `claude` skills + settings + gitignore
Add `/run-tests`, `/conc-anal` skill definitions and
`/pr-msg` `format-reference.md` that live in-repo
(not symlinked from `ai.skillz`).

- `/run-tests`: `pytest` suite runner with
  dev-workflow helpers, never-auto-commit rule.
- `/conc-anal`: concurrency analysis skill.
- `/pr-msg` `format-reference.md`: canonical PR
  description structure + cross-service ref-links.
- `ai_notes/docs_todos.md`: `literalinclude` idea.
- `settings.local.json`: permission rules for `gh`,
  `git`, `python3`, `cat`, skill invocations.
- `.gitignore`: ignore commit-msg/pr-msg `msgs/`,
  `LATEST` files, review ctx, session conf, claude
  worktrees.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 16:37:34 -04:00
Bd 2db6f97130
Merge pull request #439 from goodboy/revert_wrapt_tractor_test_deco
Drop `wrapt` for `tractor_test`, revert to `functools`
2026-04-10 16:36:03 -04:00
Gud Boi 9af6adc181 Fix runtime kwarg leaking in `tractor_test`
The `functools` rewrite forwarded all `kwargs`
through `_main(**kwargs)` to `wrapped(**kwargs)`
unchanged — the Windows `start_method` default
could leak to test fns that don't declare it.
The pre-wrapt code guarded against this with
named wrapper params.

Extract runtime settings (`reg_addr`, `loglevel`,
`debug_mode`, `start_method`) as closure locals
in `wrapper`; `_main` uses them directly for
`open_root_actor()` while `kwargs` passes to
`wrapped()` unmodified.
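
A minimal sketch of the closure-locals pattern described above. It assumes
`asyncio` as a stdlib stand-in for the real runtime; `runtime_test` and the
`runtime` dict are hypothetical names, not the actual `tractor_test` code:

```python
import asyncio
import functools


def runtime_test(wrapped):
    # Hypothetical decorator, for illustration only.
    @functools.wraps(wrapped)
    def wrapper(**kwargs):
        # Read runtime settings once, as closure locals, falling back to
        # a default when the test fn did not request that fixture.
        start_method = kwargs.get('start_method', 'trio')
        loglevel = kwargs.get('loglevel')

        async def _main():
            # Stand-in for opening the runtime with the closure locals.
            runtime = {'start_method': start_method, 'loglevel': loglevel}
            # `kwargs` reaches the test fn unmodified, so a defaulted
            # setting never shows up unless the caller passed it in.
            return runtime, await wrapped(**kwargs)

        return asyncio.run(_main())

    return wrapper


@runtime_test
async def my_test(loglevel):
    return loglevel
```

Here `my_test(loglevel='info')` runs even though `my_test` never declares
`start_method`, since the default only lives in the wrapper's closure.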

Review: PR #439 (Copilot)
https://github.com/goodboy/tractor/pull/439#pullrequestreview-4091005202

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 16:20:01 -04:00
Gud Boi 452a32fb23 Drop `wrapt` for `tractor_test`, revert to `functools`
Realized a bit late that (pretty sure) i already tried this `wrapt`
idea waay back and found the same "issue" XD

The `wrapt.decorator` transparently proxies `__code__` from the async
test fn, fooling `pytest`'s coroutine detection into skipping wrapped
tests as "unhandled coroutines". `functools.wraps` preserves the sig for
fixture injection via `__wrapped__` without leaking the async nature.
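
The `functools` half of that claim can be checked with the stdlib alone.
A small sketch (the `async_test`/`sync_wrapper` names are made up here, and
the `wrapt` side is not reproduced):

```python
import functools
import inspect


async def async_test(reg_addr, loglevel):
    return reg_addr, loglevel


@functools.wraps(async_test)
def sync_wrapper(**kwargs):
    # A plain sync fn; real code would drive the coroutine on a loop.
    return async_test(**kwargs)


# The wrapper itself is not reported as a coroutine function..
is_coro = inspect.iscoroutinefunction(sync_wrapper)
# ..but signature inspection follows `__wrapped__` to the orig params.
params = list(inspect.signature(sync_wrapper).parameters)
```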

So i let `claude` rework the latest code to go back to using the old
stdlib wrapping again..

Deats,
- `functools.partial` replaces `wrapt.PartialCallableObjectProxy`.
- wrapper takes plain `**kwargs`; runtime settings extracted via
  `kwargs.get()` in `_main()`.
- `iscoroutinefunction()` guard moved before wrapper definition.
- drop all `*args` passing (fixture kwargs only).

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-10 12:08:46 -04:00
Bd f47010d7e9
Merge pull request #436 from goodboy/moc_revert_to_fix
Moc revert to fix: reversion fix for bug in #435
2026-04-09 17:54:41 -04:00
Gud Boi 3f198bc86c Drop commented-out `tractor.pause()` debug hooks
Remove 3 leftover `# await tractor.pause(shield=True)`
/ `# await tractor.pause()` calls in
`maybe_open_context()` that were used during the
`_Cache.run_ctx` teardown race diagnostic session
(PR #436). These are dead commented-out code with no
runtime effect — just noise.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-09 17:41:28 -04:00
Gud Boi 391c8d3566 Address Copilot review fixes on `maybe_open_context()`
Deats,
- drop unused `import tractor` (F401)
- fix `_Cache.locks` annotation to `trio.StrictFIFOLock`
- fix typos: "mabye-value", "Acquir lock"
- add `resources.pop()` cleanup in the caller if
  `service_tn.start()` fails — prevents a
  permanent `_Cache.resources` leak on
  `__aenter__` failure (note: Copilot's suggested
  outer `try/finally` in `run_ctx` would
  re-introduce the atomicity gap)
- add `user_registered` flag so `users -= 1` only
  runs when the task actually incremented
- move lock pop into the `users <= 0` teardown
  block so the last exiting user always cleans up,
  regardless of who created the lock; drop
  now-dead `lock_registered` var

Also,
- swap `fid` for `ctx_key` in debug log msgs
- remove stale commented-out `# fid` refs

Review: PR #436 (copilot-pull-request-reviewer)
https://github.com/goodboy/tractor/pull/436

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-09 15:10:07 -04:00
Gud Boi 4fc477cfd6 Revert `resources.pop()` back inside `run_ctx` inner finally
Reverts the `_Cache.run_ctx` change from 93aa39db which
moved `resources.pop(ctx_key)` to an outer `finally`
*after* the acm's `__aexit__()`. That introduced an
atomicity gap: `values` was already popped in the inner
finally but `resources` survived through the acm teardown
checkpoints. A re-entering task that creates a fresh lock
(the old one having been popped by the exiting caller)
could then acquire immediately and find stale `resources`
(for which now we raise a `RuntimeError('Caching resources ALREADY
exist?!')`).

Deats,
- the orig 93aa39db rationale was a preemptive guard
  against acm `__aexit__()` code accessing `_Cache`
  mid-teardown, but no `@acm` in `tractor` (or `piker`) ever
  does that; the scenario never materialized.
- by popping both `values` AND `resources` atomically
  (no checkpoint between them) in the inner finally,
  the re-entry race window is closed: either the new
  task sees both entries (cache hit) or neither
  (clean cache miss).
- `test_moc_reentry_during_teardown` now passes
  without `xfail`! (:party:)

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-09 14:42:42 -04:00
Gud Boi cd1cd03725 Add prompt-io log for `run_ctx` teardown analysis
Documents the diagnostic session tracing why
per-`ctx_key` locking alone doesn't close the
`_Cache.run_ctx` teardown race — the lock pops
in the exiting caller's task but resource cleanup
runs in the `run_ctx` task inside `service_tn`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-09 14:42:42 -04:00
Gud Boi febe587c6c Drop `xfail` from `test_moc_reentry_during_teardown`
The per-`ctx_key` locking fix in f086222d intended to resolve the
teardown race reproduced by the new test suite, so the test SHOULD now
pass. TLDR, it doesn't Bp

Also add `collapse_eg()` to the test's ctx-manager stack so that when
run with `pytest <...> --tpdb` we'll actually `pdb`-REPL the RTE when it
hits (previously an assert-error).

(this commit-msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-09 14:42:42 -04:00
Gud Boi 4d3c5b9163 Use per-key locking+user tracking in `maybe_open_context()`
(Hopefully!) solving a long-run bug with the `brokerd.kraken` backend in
`piker`..

- Track `_Cache.users` per `ctx_key` via a `defaultdict[..., int]`
  instead of a single global counter; fix premature teardown when
  multiple ctx keys are active simultaneously.
- Key `_Cache.locks` on `ctx_key` (not bare `fid`) so different kwarg
  sets for the same `acm_func` get independent `StrictFIFOLock`s.
- Add `_UnresolvedCtx` sentinel class to replace bare `None` check;
  avoid false-positive teardown when a wrapped acm legitimately yields
  `None`.
- Swap resource-exists `assert` for detailed `RuntimeError`.

Also,
- fix "whih" typo.
- add debug logging for lock acquire/release lifecycle.
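
A rough sketch of the per-key user counting and sentinel check listed above.
The `Cache` class and `_Unresolved` name are simplified stand-ins (no locks,
no async), not the real `_Cache` layout:

```python
from collections import defaultdict


class _Unresolved:
    '''Sentinel meaning "no value cached yet" (distinct from `None`).'''


class Cache:
    # Hypothetical stand-in, for illustration only.
    def __init__(self):
        self.users = defaultdict(int)  # per-key user counts
        self.values = {}

    def enter(self, key, make):
        self.users[key] += 1
        value = self.values.get(key, _Unresolved)
        if value is _Unresolved:  # a miss, even when a hit may be `None`
            value = self.values[key] = make()
        return value

    def exit(self, key):
        self.users[key] -= 1
        if self.users[key] <= 0:  # last user of *this* key tears down
            self.users.pop(key)
            self.values.pop(key, None)
```

With a single global counter, exiting key `'b'` could not tell whether `'a'`
still had users; counting per key keeps the two lifetimes independent.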

(this commit-msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-09 14:42:42 -04:00
Bd 8b106b9144
Merge pull request #435 from goodboy/moc_coverage_test_by_claude
`.trionics.maybe_open_context()` race-edge-case coverage
2026-04-09 14:41:59 -04:00
Gud Boi d6ece8eab3 Only run CI on pushes to `main`, ow just on PR/dev branches 2026-04-07 14:17:08 -04:00
Gud Boi 8494eb9b8a Run CI workflow on PRs even from forks 2026-04-07 14:02:43 -04:00
Gud Boi cab366cd65 Add xfail test for `_Cache.run_ctx` teardown race
Reproduce the piker `open_cached_client('kraken')` scenario: identical
`ctx_key` callers share one cached resource, and a new task re-enters
during `__aexit__` — hitting `assert not resources.get()` bc `values`
was popped but `resources` wasn't yet.

Deats,
- `test_moc_reentry_during_teardown` uses an `in_aexit` event to
  deterministically land in the teardown window.
- marked `xfail(raises=AssertionError)` against unpatched code (fix in
  `9e49eddd` or wtv lands on the `maybe_open_ctx_locking` or thereafter
  patch branch).

Also, add prompt-io log for the session.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

Prompt-IO: ai/prompt-io/claude/20260406T193125Z_85f9c5d_prompt_io.md
2026-04-06 18:17:04 -04:00
Gud Boi 85f9c5df6f Add per-`ctx_key` isolation tests for `maybe_open_context()`
Add `test_per_ctx_key_resource_lifecycle` to verify that per-key user
tracking correctly tears down resources independently - exercises the
fix from 02b2ef18 where a global `_Cache.users` counter caused stale
cache hits when the same `acm_func` was called with different kwargs.

Also, add a paired `acm_with_resource()` helper `@acm` that yields its
`resource_id` for per-key testing in the above suite.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

Prompt-IO: ai/prompt-io/claude/20260406T172848Z_02b2ef1_prompt_io.md
2026-04-06 14:37:47 -04:00
Gud Boi ebe9d5e4b5 Parametrize `test_resource_cache.test_open_local_sub_to_stream`
Namely with multiple pre-sleep `delay`-parametrizations before either,

- parent-scope cancel-calling (as originally) or,
- depending on the new `cancel_by_cs: bool` suite parameter, optionally
  just immediately exiting from (the newly named)
  `maybe_cancel_outer_cs()` a checkpoint.

In the latter case we ensure we **don't** inf sleep to avoid leaking
those tasks into the `Actor._service_tn` (though we should really have
a better soln for this)..

Deats,
- make `cs` args optional and adjust internal logic to match.
- add some notes around various edge cases and issues with using the
  actor-service-tn as the scope by default.
2026-04-06 14:37:47 -04:00
Bd bbf01d5161
Merge pull request #430 from goodboy/dependabot/uv/pygments-2.20.0
Bump pygments from 2.19.2 to 2.20.0
2026-04-05 13:33:33 -04:00
Bd ec8e8a2786
Merge pull request #426 from goodboy/remote_exc_type_registry
Fix remote exc relay + add `reg_err_types()` tests
2026-04-02 22:44:36 -04:00
Gud Boi c3d1ec22eb Fix `Type[BaseException]` annots, guard `.src_type` resolve
- Use `Type[BaseException]` (not bare `BaseException`)
  for all err-type references: `get_err_type()` return,
  `._src_type`, `boxed_type` in `unpack_error()`.
- Add `|None` where types can be unresolvable
  (`get_err_type()`, `.boxed_type` property).
- Add `._src_type_resolved` flag to prevent repeated
  lookups and guard against `._ipc_msg is None`.
- Fix `recevier` and `exeptions` typos.

Review: PR #426 (Copilot)
https://github.com/goodboy/tractor/pull/426

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 18:21:19 -04:00
Gud Boi 8f44efa327 Drop stale `.cancel()`, fix docstring typo in tests
- Remove leftover `await an.cancel()` in
  `test_registered_custom_err_relayed`; the
  nursery already cancels on scope exit.
- Fix `This document` -> `This documents` typo in
  `test_unregistered_err_still_relayed` docstring.

Review: PR #426 (Copilot)
https://github.com/goodboy/tractor/pull/426

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 18:21:19 -04:00
Gud Boi 5968a3c773 Use `'<unknown>'` for unresolvable `.boxed_type_str`
Add a teensie unit test to match.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 18:21:19 -04:00
Gud Boi 80597b80bf Add passing test for unregistered err relay
Drop the `xfail` test and instead add a new one that ensures the
`tractor._exceptions` fixes enable graceful relay of
remote-but-unregistered error types via the unboxing of just the
`rae.src_type_str/boxed_type_str` content. The test also ensures
a warning is included with remote error content indicating the user
should register their error type for effective cross-actor re-raising.

Deats,
- add `test_unregistered_err_still_relayed`: verify the
  `RemoteActorError` IS raised with `.boxed_type`
  as `None` but `.src_type_str`, `.boxed_type_str`,
  and `.tb_str` all preserved from the IPC msg.
- drop `test_unregistered_boxed_type_resolution_xfail`
  since the new above case covers it and we don't need to have
  an effectively entirely repeated test just with an inverse assert
  as its last line..

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 18:21:19 -04:00
Gud Boi a41c6d5c70 Fix unregistered-remote-error-type relay crash
Make `RemoteActorError` resilient to unresolved
custom error types so that errors from remote actors
always relay back to the caller - even when the user
hasn't called `reg_err_types()` to register the exc type.

Deats,
- `.src_type`: log warning + return `None` instead
  of raising `TypeError` which was crashing the
  entire `_deliver_msg()` -> `pformat()` chain
  before the error could be relayed.
- `.boxed_type_str`: fallback to `_ipc_msg.boxed_type_str`
  when the type obj can't be resolved so the type *name* is always
  available.
- `unwrap_src_err()`: fallback to `RuntimeError` preserving
  original type name + traceback.
- `unpack_error()`: log warning when `get_err_type()` returns
  `None` telling the user to call `reg_err_types()`.
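
The fallback behaviour above can be sketched roughly like so. Every name
here (`register`, `lookup`, `rebuild`) is hypothetical and only mirrors the
idea; the real lookup lives in `tractor._exceptions` and may differ:

```python
import builtins
from typing import Optional

# Hypothetical registry of app-defined error types, keyed by name.
_registered: dict = {}


def register(types: list) -> None:
    for exc_type in types:
        _registered[exc_type.__name__] = exc_type


def lookup(type_name: str) -> Optional[type]:
    exc_type = _registered.get(type_name) or getattr(builtins, type_name, None)
    if isinstance(exc_type, type) and issubclass(exc_type, BaseException):
        return exc_type
    return None  # unresolvable: the caller must cope, not crash


def rebuild(type_name: str, tb_str: str) -> BaseException:
    exc_type = lookup(type_name)
    if exc_type is None:
        # Fall back to `RuntimeError`, keeping the orig name + traceback.
        return RuntimeError(f'{type_name} (unregistered)\n{tb_str}')
    return exc_type(tb_str)


class CustomAppError(Exception):
    pass
```

Before `register([CustomAppError])` is called, `rebuild()` still hands back
an error carrying the remote type name instead of raising during relay.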

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 18:21:19 -04:00
Gud Boi 9c37b3f956 Add `reg_err_types()` test suite for remote exc relay
Verify registered custom error types round-trip correctly over IPC via
`reg_err_types()` + `get_err_type()`.

Deats,
- `TestRegErrTypesPlumbing`: 5 unit tests for the type-registry plumbing
  (register, lookup, builtins, tractor-native types, unregistered
  returns `None`)
- `test_registered_custom_err_relayed`: IPC end-to-end for a registered
  `CustomAppError` checking `.boxed_type`, `.src_type`, and `.tb_str`
- `test_registered_another_err_relayed`: same for `AnotherAppError`
  (multi-type coverage)
- `test_unregistered_custom_err_fails_lookup`: `xfail` documenting that
  `.boxed_type` can't resolve without `reg_err_types()` registration

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 18:21:19 -04:00
Bd 8f6bc56174
Merge pull request #427 from goodboy/subsys_reorg
Mv core mods to `runtime/`, `spawn/`, `discovery/` subpkgs
2026-04-02 18:21:00 -04:00
Gud Boi b14dbde77b Skip `test_empty_mngrs_input_raises` on UDS tpt
The `open_actor_cluster()` teardown hangs
intermittently on UDS when `gather_contexts(mngrs=())`
raises `ValueError` mid-setup; likely a race in the
actor-nursery cleanup vs UDS socket shutdown. TCP
passes reliably (5/5 runs).

- Add `tpt_proto` fixture param to the test
- `pytest.skip()` on UDS with a TODO for deeper
  investigation of `._clustering`/`._supervise`
  teardown paths

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi cd6509b724 Fix `tractor_test` kwarg check and Windows `start_method` default
- Use `kw in kwargs` membership test instead of
  `kwargs[kw]` to avoid `KeyError` on missing params.
- Restructure Windows `start_method` logic to properly
  default to `'trio'` when unset; only raise on an
  explicit non-trio value.

Review: PR #427 (Copilot)
https://github.com/goodboy/tractor/pull/427#pullrequestreview-4009934142

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi 93d99ed2eb Move `get_cpu_state()` to `conftest` as shared latency headroom
Factor the CPU-freq-scaling helper out of
`test_legacy_one_way_streaming` into `conftest.py`
alongside a new `cpu_scaling_factor()` convenience fn
that returns a latency-headroom multiplier (>= 1.0).

Apply it to the two other flaky-timeout tests,
- `test_cancel_via_SIGINT_other_task`: 2s -> scaled
- `test_example[we_are_processes.py]`: 16s -> scaled

Deats,
- add `get_cpu_state()` + `cpu_scaling_factor()` to
  `conftest.py` so all test mods can share the logic.
- catch `IndexError` (empty glob) in addition to
  `FileNotFoundError`.
- rename `factor` var -> `headroom` at call sites for
  clarity on intent.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi 6215e3b2dd Adjust `test_a_quadruple_example` time-limit for CPU scaling
Add `get_cpu_state()` helper to read CPU freq settings
from `/sys/devices/system/cpu/` and use it to compensate
the perf time-limit when `auto-cpufreq` (or similar)
scales down the max frequency.

Deats,
- read `*_pstate_max_freq` and `scaling_max_freq`
  to compute a `cpu_scaled` ratio.
- when `cpu_scaled != 1.`, increase `this_fast` limit
  proportionally (factoring dual-threaded cores).
- log a warning via `test_log` when compensating.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi be5d8da8c0 Just alias `Arbiter` via assignment 2026-04-02 17:59:13 -04:00
Gud Boi 21ed181835 Filter `__pycache__` from example discovery in tests
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi 9ec2749ab7 Rename `Arbiter` -> `Registrar`, mv to `discovery._registry`
Move the `Arbiter` class out of `runtime._runtime` into its
logical home at `discovery._registry` as `Registrar(Actor)`.
This completes the long-standing terminology migration from
"arbiter" to "registrar/registry" throughout the codebase.

Deats,
- add new `discovery/_registry.py` mod with `Registrar`
  class + backward-compat `Arbiter = Registrar` alias.
- rename `Actor.is_arbiter` attr -> `.is_registrar`;
  old attr now a `@property` with `DeprecationWarning`.
- `_root.py` imports `Registrar` directly for
  root-actor instantiation.
- export `Registrar` + `Arbiter` from `tractor.__init__`.
- `_runtime.py` re-imports from `discovery._registry`
  for backward compat.
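
The alias + deprecated-property combo above, as a stripped-down sketch (the
`Actor` class here is a tiny stand-in and the warning text is made up):

```python
import warnings


class Actor:
    # Hypothetical minimal stand-in for the real `Actor` type.
    is_registrar: bool = False

    @property
    def is_arbiter(self) -> bool:
        # Old attr name still works but nags the caller to migrate.
        warnings.warn(
            '`.is_arbiter` is deprecated, use `.is_registrar`',
            DeprecationWarning,
            stacklevel=2,
        )
        return self.is_registrar


class Registrar(Actor):
    is_registrar = True


# Old name keeps working for existing imports.
Arbiter = Registrar
```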

Also,
- update all test files to use `.is_registrar`
  (`test_local`, `test_rpc`, `test_spawning`,
  `test_discovery`, `test_multi_program`).
- update "arbiter" -> "registrar" in comments/docstrings
  across `_discovery.py`, `_server.py`, `_transport.py`,
  `_testing/pytest.py`, and examples.
- drop resolved TODOs from `_runtime.py` and `_root.py`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi f3441a6790 Update tests+examples imports for new subpkgs
Adjust all `tractor._state`, `tractor._addr`,
`tractor._supervise`, etc. refs in tests and examples
to use the new `runtime/`, `discovery/`, `spawn/` paths.

Also,
- use `tractor.debug_mode()` pub API instead of
  `tractor._state.debug_mode()` in a few test mods
- add explicit `timeout=20` to `test_respawn_consumer_task`
  `@tractor_test` deco call

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi cc42d38284 Mv core mods to `runtime/`, `spawn/`, `discovery/` subpkgs
Restructure the flat `tractor/` top-level private mods
into (more nested) subpackages:

- `runtime/`: `_runtime`, `_portal`, `_rpc`, `_state`,
  `_supervise`
- `spawn/`: `_spawn`, `_entry`, `_forkserver_override`,
  `_mp_fixup_main`
- `discovery/`: `_addr`, `_discovery`, `_multiaddr`

Each subpkg `__init__.py` is kept lazy (no eager
imports) to avoid circular import issues.

Also,
- update all intra-pkg imports across ~35 mods to use
  the new subpkg paths (e.g. `from .runtime._state`
  instead of `from ._state`)

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi 6827ceba12 Use `wrapt` for `tractor_test()` decorator
Refactor the test-fn deco to use `wrapt.decorator`
instead of `functools.wraps` for better fn-sig
preservation and optional-args support via
`PartialCallableObjectProxy`.

Deats,
- add `timeout` and `hide_tb` deco params
- wrap test-fn body with `trio.fail_after(timeout)`
- consolidate per-fixture `if` checks into a loop
- add `iscoroutinefunction()` type-check on wrapped fn
- set `__tracebackhide__` at each wrapper level

Also,
- update imports for new subpkg paths:
  `tractor.spawn._spawn`, `tractor.discovery._addr`,
  `tractor.runtime._state`
  (see upcoming, likely large patch commit ;)

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi 94458807ce Expose `RuntimeVars` + `get_runtime_vars()` from pkg
Export the new `RuntimeVars` struct and `get_runtime_vars()`
from `tractor.__init__` and improve the accessor to
optionally return the struct form.

Deats,
- add `RuntimeVars` and `get_runtime_vars` to
  `__init__.py` exports; alphabetize `_state` imports.
- move `get_runtime_vars()` up in `_state.py` to sit
  right below `_runtime_vars` dict definition.
- add `as_dict: bool = True` param so callers can get
  either the legacy `dict` or the new `RuntimeVars`
  struct.
- drop the old stub fn at bottom of `_state.py`.
- rm stale `from .msg.pretty_struct import Struct` comment.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi be5e7e446b Proto a `._state.RuntimeVars` struct
So we can start the transition from a runtime-vars `dict` to a typed struct
for better clarity and wire-ready monitoring potential, as well as
better traceability.

Deats,
- add a new `RuntimeVars(Struct)` with all fields from `_runtime_vars`
  dict typed out
- include `__setattr__()` with `breakpoint()` for debugging
  any unexpected mutations.
- add `.update()` method for batch-updating compat with `dict`.
- keep old `_runtime_vars: dict` in place (we need to port a ton of
  stuff to adjust..).

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi 571b2b320e Add `reg_err_types()` for custom remote exc lookup
Allow external app code to register custom exception types
on `._exceptions` so they can be re-raised on the receiver
side of an IPC dialog via `get_err_type()`.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
Gud Boi c7b5d00f19 Add `get_runtime_vars()` accessor to `._state`
Expose a copy of the current actor's `_runtime_vars` dict
via a public fn; TODO to convert to `RuntimeVars` struct.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-02 17:59:13 -04:00
dependabot[bot] 1049f7bf38
Bump pygments from 2.19.2 to 2.20.0
Bumps [pygments](https://github.com/pygments/pygments) from 2.19.2 to 2.20.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.19.2...2.20.0)

---
updated-dependencies:
- dependency-name: pygments
  dependency-version: 2.20.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-30 20:28:44 +00:00
Bd cc3bfac741
Merge pull request #366 from goodboy/dereg_on_oserror
Make `find_actor()` delete stale sockaddr entries from registrar on `OSError`
2026-03-25 03:27:27 -04:00
Gud Boi e71eec07de Refine type annots in `_discovery` and `_runtime`
- Add `LocalPortal` union to `query_actor()` return
  type and `reg_portal` var annotation since the
  registrar yields a `LocalPortal` instance.
- Update docstring to note the `LocalPortal` case.
- Widen `.delete_addr()` `addr` param to accept
  `list[str|int]` bc msgpack deserializes tuples as
  lists over IPC.
- Tighten `uid` annotation to `tuple[str, str]|None`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-25 02:16:48 -04:00
Gud Boi b557ec20a7 Coerce IPC `addr` to `tuple` in `.delete_addr()`
`msgpack` deserializes tuples as lists over IPC so
the `bidict.inverse.pop()` needs a `tuple`-cast to
match registry keys.
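
A small sketch of why the cast matters. It assumes `json` as a stdlib
stand-in for the msgpack round trip (both hand back a `list` for a `tuple`)
and a plain inverse `dict` in place of `bidict`; the table contents are
invented:

```python
import json

# uid -> addr table plus its inverse; a stand-in for a `bidict`.
registry = {('actor', 'uuid-1'): ('127.0.0.1', 1616)}
inverse = {addr: uid for uid, addr in registry.items()}


def delete_addr(addr):
    # A `list` off the wire is unhashable and could never match a
    # `tuple` key, so cast before the reverse lookup.
    uid = inverse.pop(tuple(addr), None)
    if uid is not None:
        registry.pop(uid, None)
    return uid


# Round-trip the addr through a wire format lacking a tuple type.
wire_addr = json.loads(json.dumps(('127.0.0.1', 1616)))
```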

Regressed-by: 85457cb (`registry_addrs` change)
Found-via: `/run-tests` test_stale_entry_is_deleted

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-25 01:36:58 -04:00
Gud Boi 85457cb839 Address Copilot review suggestions on PR #366
- Use `bidict.forceput()` in `register_actor()` to handle
  duplicate addr values from stale entries or actor restarts.
- Fix `uid` annotation to `tuple[str, str]|None` in
  `maybe_open_portal()` and handle the `None` return from
  `delete_addr()` in log output.
- Pass explicit `registry_addrs=[reg_addr]` to `open_nursery()`
  and `find_actor()` in `test_stale_entry_is_deleted` to ensure
  the test uses the remote registrar.
- Update `query_actor()` docstring to document the new
  `(addr, reg_portal)` yield shape.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 00:21:09 -04:00
Gud Boi 850219f60c Guard `reg_portal` for `None` in `maybe_open_portal()`
Fix potential `AttributeError` when `query_actor()` yields
a `None` portal (peer-found-locally path) and an `OSError`
is raised during transport connect.

Also,
- fix `Arbiter.delete_addr()` return type to
  `tuple[str, str]|None` bc it can return `None`.
- fix "registar" typo -> "registrar" in comment.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-17 17:26:57 -04:00
Tyler Goodlet d929fb75b5 Rename `.delete_sockaddr()` -> `.delete_addr()` 2026-03-13 21:51:15 -04:00
Tyler Goodlet 403c2174a1 Always no-raise try-to-pop registry addrs 2026-03-13 21:51:15 -04:00
Tyler Goodlet 528012f35f Add stale entry deleted from registrar test
By spawning an actor task that immediately shuts down the transport
server and then sleeps, verify that attempting to connect via the
`._discovery.find_actor()` helper delivers `None` for the `Portal`
value.

Relates to #184 and #216
2026-03-13 21:51:15 -04:00
Tyler Goodlet 0dfa6f4a8a Don't unwrap an unwrapped addr, just warn on delete XD 2026-03-13 21:51:15 -04:00
Tyler Goodlet a0d3741fac Ensure `._registry` values are hashable, since `bidict`! 2026-03-13 21:51:15 -04:00
Tyler Goodlet 149b800c9f Handle stale registrar entries; detect and delete
In cases where an actor's transport server task (by default handling new
TCP connections) terminates early but does not de-register from the
pertaining registry (aka the registrar) actor's address table, the
trying-to-connect client actor will get a connection error on that
address. In the case where the client handles a (local) `OSError` (meaning
the target actor address is likely being contacted over `localhost`)
exception, make a further call to the registrar to delete the stale
entry and `yield None` gracefully indicating to calling code that no
`Portal` can be delivered to the target address.

This issue was originally discovered in `piker` where the `emsd`
(clearing engine) actor would sometimes crash on rapid client
re-connects and then leave a `pikerd` stale entry. With this fix new
clients will attempt connect via an endpoint which will re-spawn the
`emsd` when a `None` portal is delivered (via `maybe_spawn_em()`).
2026-03-13 21:51:15 -04:00
Tyler Goodlet 03f458a45c Add `Arbiter.delete_sockaddr()` to remove addrs
Since stale addrs can be leaked where the actor transport server task
crashes but doesn't (successfully) unregister from the registrar, we
need a remote way to remove such entries; hence this new (registrar)
method.

To implement this make use of the `bidict` lib for the `._registry`
table thus making it super simple to do reverse uuid lookups from an
input socket-address.
2026-03-13 21:51:15 -04:00
Bd e77198bb64
Merge pull request #422 from goodboy/global_uds_in_test_harness
Run (some) test suites in CI with `--tpt-proto uds`
2026-03-13 21:50:45 -04:00
Gud Boi 5b8f6cf4c7 Use `.aid.uid` to avoid deprecation warns in tests
- `test_inter_peer_cancellation`: swap all `.uid` refs
  on `Actor`, `Channel`, and `Portal` to `.aid.uid`
- `test_legacy_one_way_streaming`: same + fix `print()`
  to multiline style

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi 8868ff19f3 Flip to `ActorNursery.cancel_called` API
Avoid deprecation warnings, prepping for property removal.
2026-03-13 21:10:52 -04:00
Gud Boi 066011b83d Bump `fail_after` delay on non-linux for sync-sleep test
Use 6s timeout on non-linux (vs 4s) in
`test_cancel_while_childs_child_in_sync_sleep()` to avoid
flaky `TooSlowError` on slower CI runners.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi b1d003d850 Add `--tpt-proto` CI matrix and wire to `pytest`
- add `tpt_proto: ['tcp', 'uds']` matrix dimension
  to the `testing` job.
- exclude `uds` on `macos-latest` for now.
- pass `--tpt-proto=${{ matrix.tpt_proto }}` to the
  `pytest` invocation.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi 8991ec2bf5 Fix warns and de-reg race in `test_discovery`
Removes the `pytest` deprecation warns and attempts to avoid
some de-registration raciness, though i'm starting to think the
real issue is due to not having the fixes from #366 (which handle
the new dereg on `OSError` case from UDS)?

- use `.channel.aid.uid` over deprecated `.channel.uid`
  throughout `test_discovery.py`.
- add polling loop (up to 5s) for subactor de-reg check
  in `spawn_and_check_registry()` to handle slower
  transports like UDS where teardown takes longer.
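
A bounded polling loop of the kind mentioned in the last bullet might look like the sketch below. It assumes a plain synchronous predicate and uses `time.sleep()`; `wait_until()` is a hypothetical helper name, and the real test presumably polls from a `trio` task instead.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.1,
) -> bool:
    '''
    Poll `predicate()` until it returns true or `timeout` seconds
    elapse; report whether the condition was ever met.

    '''
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The de-reg check then becomes something like `assert wait_until(lambda: uid not in registry)`, where `uid` and `registry` are whatever the test has in scope.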

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi dfc153c228 'Bump `daemon` pre-wait for 'uds' parametrization' 2026-03-13 21:10:52 -04:00
Gud Boi 52e8fb43ee Tighten UDS addr validation and sockname prefixes
- add `is_valid` and `sockpath.resolve()` asserts in
  `get_rando_addr()` for the `'uds'` case plus an
  explicit `UDSAddress` type annotation.
- rename no-runtime sockname prefixes from
  `'<unknown-actor>'`/`'root'` to
  `'no_runtime_root'`/`'no_runtime_actor'` with a proper
  if/else branch in `UDSAddress.get_random()`.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi 99577b719a Skip cluster test on UDS, wire `tpt_proto` fixture
Add UDS skip-guard to `test_streaming_to_actor_cluster()`
and plumb `tpt_proto` through the `@tractor_test` wrapper
so transport-parametrized tests can receive it.

Deats,
- skip cluster test when `tpt_proto == 'uds'` with
  descriptive msg, add TODO about `@pytest.mark.no_tpt`.
- add `tpt_proto: str|None` param to inner wrapper in
  `tractor_test()`, forward to decorated fn when its sig
  accepts it.
- register custom `no_tpt` marker via `pytest_configure()`
  to avoid unknown-marker warnings.
- add masked todo for `no_tpt` marker-check code in `tpt_proto` fixture
  (needs fn-scope to work, left as TODO).
- add `request` param to `tpt_proto` fixture for future
  marker inspection.

Also,
- add doc-string to `test_streaming_to_actor_cluster()`.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi 4092db60b2 Revert advanced-fault UDS edge case handling
Namely the workaround expected exc branches added in ef7ed7a for the UDS
parametrization. With the new boxing of the underlying CREs as
tpt-closed, we can expect the same exc outcomes as in the TCP cases.

Also this tweaks some error report logging content used while debugging
this,
- properly `repr()` the `TransportClosed.src_exc`-type from
  the maybe emit in `.report_n_maybe_raise()`.
- remove the redundant `chan.raddr` from the "closed abruptly"
  header in the tpt-closed handler of `._rpc.process_messages()`,
  the `Channel.__repr__()` now contains it by default.
2026-03-13 21:10:52 -04:00
Gud Boi 4f333dee05 Pass `enable_transports` in `daemon` fixture
Forward the `tpt_proto` fixture val into spawned daemon
subprocesses via `run_daemon(enable_transports=..)` and
sync `_runtime_vars['_enable_tpts']` in the `tpt_proto`
fixture so sub-actors inherit the transport setting.

Deats,
- add `enable_transports={enable_tpts}` to the daemon
  spawn-cmd template in `tests/conftest.py`.
- set `_state._runtime_vars['_enable_tpts']` in the
  `tpt_proto` fixture in `_testing/pytest.py`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi 8a2f74da2c Bump `_PROC_SPAWN_WAIT` and use `test_log` in `daemon`
For more reliability with the oob registrar using tests
via the `daemon` fixture,
- increase spawn-wait to `2` in CI, `1` OW; drop
  the old py<3.7 branch.
- move `_ci_env` to module-level (above `_non_linux`)
  so `_PROC_SPAWN_WAIT` can reference it at parse time.
- add `test_log` fixture param to `daemon()`, use it
  for the error-on-exit log line instead of bare `log`.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi 2bf155131d Make `spawn()` `expect_timeout` configurable
Add `expect_timeout: float` param to `_spawn()`
so individual tests can tune `pexpect` timeouts
instead of relying on the hard-coded 3/10 split.

Deats,
- default to 4s, bump by +6 on non-linux CI.
- use walrus `:=` to capture resolved timeout and assert
  `spawned.timeout == timeout` for sanity.
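
As a rough sketch of the walrus pattern from the bullets above: the resolved value is bound inline while being passed along, then re-checked. `spawn_with_timeout()` and the `SimpleNamespace` stand-in for the `pexpect` child are hypothetical, and whether the +6 bump applies only to the default is an assumption here.

```python
from types import SimpleNamespace


def spawn_with_timeout(
    expect_timeout: float = 4,
    non_linux_ci: bool = False,
) -> SimpleNamespace:
    # stand-in for the spawned child; only `.timeout` is modelled
    spawned = SimpleNamespace(
        timeout=(
            # bind the resolved value so it can be re-used below
            timeout := (
                expect_timeout + 6
                if non_linux_ci
                else expect_timeout
            )
        ),
    )
    # sanity: the child really carries the timeout we resolved
    assert spawned.timeout == timeout
    return spawned
```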

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi 0f29f5717a `ci.yml` add todos for mp-backend runs and for supporting subints soon! 2026-03-13 21:10:52 -04:00
Gud Boi 5ea721683b Use `.aid.uid` to avoid deprecation warns
I started getting annoyed by all the warnings from `pytest` during work
on macos support in CI, so this replaces all `Actor.uid`/`Channel.uid`
accesses with `.aid.uid` (or `.aid.reprol()` for log msgs) across the
core runtime and IPC subsystems to avoid the noise.

This also provides incentive to start the adjustment to all
`.uid`-holding/tracking internal `dict`-tables/data-structures to
instead use `.msg.types.Aid`. Hopefully that will come a (vibed?) follow
up shortly B)

Deats,
- `._context`: swap all `self._actor.uid`, `self.chan.uid`,
  and `portal.actor.uid` refs to `.aid.uid`; use
  `.aid.reprol()` for log/error formatting.
- `._rpc`: same treatment for `actor.uid`, `chan.uid` in
  log msgs and cancel-scope handling; fix `str(err)` typo
  in `ContextCancelled` log.
- `._runtime`: update `chan.uid` -> `chan.aid.uid` in ctx
  cache lookups, RPC `Start` msg, registration and
  cancel-request handling; improve ctxc log formatting.
- `._spawn`: replace all `subactor.uid` with
  `.aid.uid` for child-proc tracking, IPC peer waiting,
  debug-lock acquisition, and nursery child dict ops.
- `._supervise`: same for `subactor.uid` in cancel and
  portal-wait paths; use `actor.aid.uid` for error dict.
- `._state`: fix `last.uid` -> `last.aid.uid` in
  `current_actor()` error msg.

Also,
- `._chan`: make `Channel.aid` a proper `@property` backed
  by `._aid` so we can add validation/typing later.
- `.log`: use `current_actor().aid.uuid` instead of
  `.uid[1]` for actor-uid log field.
- `.msg.types`: add TODO comment for `Start.aid` field
  conversion.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi f84ef44992 Repair lifetime-stack suite's flakiness
Even on linux i was noticing lotsa false negatives based on sub
teardown race conditions, so this tries to both make way for
(eventually?) expanding the set of suite cases and ensure the current
ones are more reliable on every run.

The main change is to change the `error_in_child=False` case to use
parent-side-cancellation via a new `trio.move_on_after(timeout)` instead
of `actor.cancel_soon()` (which is now toggled by a new `self_cancel:
bool` but unused rn), and add better teardown assertions.

Low level deats,
- add `rent_cancel`/`self_cancel` params to
  `crash_and_clean_tmpdir()` for different cancel paths;
  default to `rent_cancel=True` which just sleeps forever
  letting the parent's timeout do the work.
- use `trio.move_on_after()` with longer timeouts per
  case: 1.6s for error, 1s for cancel.
- use the `.move_on_after()` cancel-scope to assert `.cancel_called`
  only when `error_in_child=False`, indicating we
  parent-graceful-cancelled the sub.
- add `loglevel` fixture, pass to `open_nursery()`.
- log caught `RemoteActorError` via console logger.
- add `ids=` to parametrize for readable test names.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi 1e0c57c6c5 Wrap cluster test in `trio.fail_after()`
Add a 6s timeout guard around `test_streaming_to_actor_cluster()`
to catch hangs, and nest the `async with` block inside it.
Found this when running `pytest tests/ --tpt-proto uds`.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 21:10:52 -04:00
Gud Boi 65660c77c7 Add note about `--tpt-proto` controlling `reg_addr`-type 2026-03-13 21:10:52 -04:00
Bd c9b415475f
Merge pull request #413 from goodboy/to_asyncio_channel_iface
Extend the `to_asyncio` inter-loop-task channel iface
2026-03-13 21:09:13 -04:00
Gud Boi 359bcf691f Update `docs/README.rst` to use `chan` API style
Sync the inline "infected asyncio" echo-server example
with the new `LinkedTaskChannel` iface from prior commits.

- `to_trio`/`from_trio` params -> `chan: LinkedTaskChannel`
- use `chan.started_nowait()`, `.send_nowait()`, `.get()`
- swap yield order to `(chan, first)`
- update blurb to describe the new unified channel API

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 20:54:49 -04:00
Gud Boi b3ce5ab4f6 Swap `open_channel_from()` to yield `(chan, first)`
Deliver `(LinkedTaskChannel, Any)` instead of the prior `(first, chan)`
order from `open_channel_from()` to match the type annotation and be
consistent with `trio.open_*_channel()` style where the channel obj
comes first.

- flip `yield first, chan` -> `yield chan, first`
- update type annotation + docstring to match
- swap all unpack sites in tests and examples

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 19:28:57 -04:00
Gud Boi e89fe03da7 Fix `LinkedTaskChannel` docstrings from GH bot review
Address valid findings from copilot's PR #413 review
(https://github.com/goodboy/tractor/pull/413
 #pullrequestreview-3925876037):

- `.get()` docstring referenced non-existent
  `._from_trio` attr, correct to `._to_aio`.
- `.send()` docstring falsely claimed error-raising
  on missing `from_trio` arg; reword to describe the
  actual `.put_nowait()` enqueue behaviour.
- `.open_channel_from()` return type annotation had
  `tuple[LinkedTaskChannel, Any]` but `yield` order
  is `(first, chan)`; fix annotation + docstring to
  match actual `tuple[Any, LinkedTaskChannel]`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 19:28:57 -04:00
Gud Boi 417b796169 Use `chan: LinkedTaskChannel` API in all aio-task fns
Convert every remaining `to_trio`/`from_trio` fn-sig style
to the new unified `chan: LinkedTaskChannel` iface added in
prior commit (c46e9ee8).

Deats,
- `to_trio.send_nowait(val)` (1st call) -> `chan.started_nowait(val)`
- `to_trio.send_nowait(val)` (subsequent) -> `chan.send_nowait(val)`
- `await from_trio.get()` -> `await chan.get()`

Converted fns,
- `sleep_and_err()`, `push_from_aio_task()` in
  `tests/test_infected_asyncio.py`
- `sync_and_err()` in `tests/test_root_infect_asyncio.py`
- `aio_streamer()` in
  `tests/test_child_manages_service_nursery.py`
- `aio_echo_server()` in
  `examples/infected_asyncio_echo_server.py`
- `bp_then_error()` in `examples/debugging/asyncio_bp.py`

Also,
- drop stale comments referencing old param names.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 19:28:57 -04:00
Gud Boi 36cbc07602 Tried out an alt approach for `.to_asyncio` crashes
This change is masked out now BUT i'm leaving it in for reference.

I was debugging a multi-actor fault where the primary source actor was
an infected-aio-subactor (`brokerd.ib`) and it seemed like the REPL was only
entering on the `trio` side (at a `.open_channel_from()`) and not
eventually breaking in the `asyncio.Task`. But, since (changing
something?) it seems to be working now, it's just that the `trio` side
seems to sometimes handle before the (source/causing and more
child-ish) `asyncio`-task, which is a bit odd and not expected..
We could likely refine (maybe with an inter-loop-task REPL lock?) this
at some point and ensure a child-`asyncio` task which errors always
grabs the REPL **first**?

Lowlevel deats/further-todos,
- add (masked) `maybe_open_crash_handler()` block around
  `asyncio.Task` execution with notes about weird parent-addr
  delivery bug in `test_sync_pause_from_aio_task`
  * yeah dunno what that's about but made a bug; seems to be IPC
    serialization of the `TCPAddress` struct somewhere??
- add inter-loop lock TODO for avoiding aio-task clobbering
  trio-tasks when both crash in debug-mode

Also,
- change import from `tractor.devx.debug` to `tractor.devx`
- adjust `get_logger()` call to use new implicit mod-name detection
  added to `.log.get_logger()`, i.e. sin `name=__name__`.
- some teensie refinements to `open_channel_from()`:
  * swap return type annotation to `tuple[LinkedTaskChannel, Any]`
    (was `Any`).
  * update doc-string to clarify started-value delivery
  * add err-log before `.pause()` in what should be an unreachable path.
  * add todo to swap the `(first, chan)` pair to match that of ctx..

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])

[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-13 19:28:57 -04:00
Tyler Goodlet 1f2fad22ee Extend `.to_asyncio.LinkedTaskChannel` for aio side
With comms methods similar to those that exist for the `trio` side,
- `.get()` which proxies verbatim to the `._to_aio: asyncio.Queue`,
- `.send_nowait()` which thin-wraps to `._to_trio: trio.MemorySendChannel`.

Obviously the more correct design is to break up the channel type into
a pair of handle types, one for each "side's" task in each event-loop,
that's hopefully coming shortly in a follow up patch B)

Also,
- fill in some missing doc strings, tweak some explanation comments and
  update todos.
- adjust the `test_aio_errors_and_channel_propagates_and_closes()` suite
  to use the new `chan` fn-sig-API with `.open_channel_from()` including
  the new methods for msg comms; ensures everything added here works e2e.
2026-03-13 19:28:57 -04:00
Tyler Goodlet ca5f6f50a8 Explain the `infect_asyncio: bool` param to pass in RTE msg 2026-03-13 19:28:57 -04:00
Bd a7ff1387c7
Merge pull request #414 from goodboy/struct_field_filtering
Hide private fields in `Struct.pformat()` output
2026-03-13 19:22:22 -04:00
Gud Boi abbb4a79c8 Drop unused import noticed by `copilot` 2026-03-13 11:52:18 -04:00
Gud Boi 1529095c32 Add `tests/msg/` sub-pkg, audit `pformat()` filtering
Reorganize existing msg-related test suites under
a new `tests/msg/` subdir (matching `tests/devx/`
and `tests/ipc/` convention) and add unit tests for
the `_`-prefixed field filtering in `pformat()`.

Deats,
- `git mv` `test_ext_types_msgspec` and `test_pldrx_limiting` into
  `tests/msg/`.
- add `__init__.py` + `conftest.py` for the new test sub-pkg.
- add new `test_pretty_struct.py` suite with 8 unit tests:
  - parametrized field visibility (public shown, `_`-private hidden,
    mixed)
  - direct `iter_struct_ppfmt_lines()` assertion
  - nested struct recursion filtering
  - empty struct edge case
  - real `MsgDec` via `mk_dec()` hiding `_dec`
  - `repr()` integration via `Struct.__repr__()`

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-12 18:32:31 -04:00
Gud Boi 8215a7ba34 Hide private fields in `Struct.pformat()` output
Skip fields starting with `_` in pretty-printed struct output
to avoid cluttering displays with internal/private state (and/or accessing
private properties which have errors Bp).

Deats,
- add `if k[0] == '_': continue` check to skip private fields
- change nested `if isinstance(v, Struct)` to `elif` since we
  now have early-continue for private fields
- mv `else:` comment to clarify it handles top-level fields
- fix indentation of `yield` statement to only output
  non-private, non-nested fields
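
The filtering logic listed above can be approximated as below. This sketch uses stdlib `dataclasses` as a stand-in for the real `Struct` type, and `iter_ppfmt_lines()` plus the `Inner`/`Outer` types are hypothetical names; the output format is also invented for illustration.

```python
from dataclasses import dataclass, field, fields, is_dataclass


def iter_ppfmt_lines(obj, indent: int = 0):
    pad: str = ' ' * indent
    for f in fields(obj):
        k: str = f.name
        # skip private fields before ever touching their values
        if k[0] == '_':
            continue

        v = getattr(obj, k)
        if is_dataclass(v) and not isinstance(v, type):
            # nested struct: emit a header then recurse (also filtered)
            yield f'{pad}{k}:'
            yield from iter_ppfmt_lines(v, indent + 2)
        else:
            yield f'{pad}{k}: {v!r}'


@dataclass
class Inner:
    x: int = 1
    _secret: str = 'hidden'


@dataclass
class Outer:
    name: str = 'a'
    _cache: int = 0
    inner: Inner = field(default_factory=Inner)
```

With these types, `list(iter_ppfmt_lines(Outer()))` only contains lines for `name`, `inner` and the nested `x`.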

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-10 00:33:32 -04:00
Bd c1c4d85958
Merge pull request #406 from goodboy/macos_support
macOS support (for the fancy peeps with nice furniture)
2026-03-10 00:28:54 -04:00
Gud Boi 88b084802f Merge `testing-macos` into unified `testing` matrix
Drop the separate `testing-macos` job and add
`macos-latest` to the existing OS matrix; bump
timeout to 16 min to accommodate macOS runs.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-09 23:28:58 -04:00
Gud Boi bf1dcea9d1 Announce macOS support in `pyproject` and README
- add `"Operating System :: MacOS"` classifier.
- add macOS bullet to README's TODO/status section.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-09 23:23:58 -04:00
Bd 5c270b89d5
Merge pull request #342 from goodboy/macos_in_ci
Macos in ci
2026-03-09 20:33:38 -04:00
Gud Boi 6ee0149e8d Another cancellation test timeout bump for non-linux 2026-03-09 19:46:42 -04:00
Gud Boi 9c4cd869fb OK-FINE, skip streaming docs example on macos!
It seems something is up with their VM-img or wtv bc i keep increasing
the subproc timeout and nothing is changing. Since i can't try
a `-xlarge` one without paying i'm just muting this test for now.
2026-03-09 19:46:42 -04:00
Gud Boi afd66ce3b7 Final try, drop logging level in streaming example to see if macos can cope.. 2026-03-09 19:46:42 -04:00
Gud Boi f9bdb1b35d Try one more timeout bumps for flaky docs streaming ex.. 2026-03-09 19:46:42 -04:00
Gud Boi d135ce94af Restyle `test_legacy_one_way_streaming` mod
- convert all doc-strings to `'''` multiline style.
- rename `nursery` -> `an`, `n` -> `tn` to match
  project-wide conventions.
- add type annotations to fn params (fixtures, test
  helpers).
- break long lines into multiline style for fn calls,
  assertions, and `parametrize` decorator lists.
- add `ids=` to `@pytest.mark.parametrize`.
- use `'` over `"` for string literals.
- add `from typing import Callable` import.
- drop spurious blank lines inside generators.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-09 19:46:25 -04:00
Gud Boi fb94aa0095 Tidy a typing-typo, add explicit `ids=` for paramed suites 2026-03-09 19:35:47 -04:00
Gud Boi b71e8575e5 Skip a couple more `ctlc` flaking suites 2026-03-09 19:31:16 -04:00
Gud Boi bbc028e84c Increase macos job timeout to 16 min 2026-03-09 19:31:16 -04:00
Gud Boi 016306adf5 Allow `ctlcs_bish(<condition-args>)` skipping
Via ensuring `all(mark.args)` on wtv expressions are arg-passed to the
mark decorator; use it to skip the `test_subactor_breakpoint` suite when
`ctlc=True` since it seems too unreliable in CI.
2026-03-09 19:31:15 -04:00
Gud Boi 712c009790 Hike `testdir.spawn()` timeout on non-linux in CI 2026-03-09 19:30:41 -04:00
Gud Boi 79396b4a26 2x the ctl-c loop prompt-timeout for non-linux in CI 2026-03-09 19:30:41 -04:00
Gud Boi 5b2905b702 Xplatform tweaks for `daemon` fixture
There's a very sloppy registrar-actor-bootup syncing approach used in
this fixture (basically just guessing how long to sleep to wait for it
to init and bind the registry socket) using a `global _PROC_SPAWN_WAIT`
that needs to be made more reliable. But, for now i'm just playing along
with what's there to try and make fewer CI runs flaky by,

- sleeping *another* 1s when run from non-linux CI.
- reporting stdout (if any) alongside stderr on teardown.
- not strictly requiring a `proc.returncode == -2` indicating successful
  graceful cancellation via SIGINT; instead we now error-log and only
  raise the RTE on `< 0` exit code.
  * though i can't think of why this would happen other than an
    underlying crash which should propagate.. but i don't think any test
    suite does this intentionally rn?
  * though i don't think it should ever happen, having a CI run
    "error"-fail bc of this isn't all that illuminating, if there is
    some weird `.returncode == 0` termination case it's likely not
    a failure?

For later, see the new todo list; we should sync to some kind of "ping"
polling of the tpt address if possible which is already easy enough for
TCP reusing an internal closure from `._root.open_root_actor()`.
2026-03-09 19:30:41 -04:00
Gud Boi 776af3fce6 Register our `ctlcs_bish` marker to avoid warnings 2026-03-09 19:30:41 -04:00
Gud Boi 4639685770 Fill out types in `test_discovery` mod 2026-03-09 19:30:41 -04:00
Gud Boi 98a7d69341 Always pre-sleep in `daemon` fixture when in non-linux CI.. 2026-03-09 19:30:41 -04:00
Gud Boi ab6c955949 Lol fine! bump it a bit more XD 2026-03-09 19:30:41 -04:00
Gud Boi a72bb9321e Bleh, bump timeout again for docs-exs suite when in CI 2026-03-09 19:30:41 -04:00
Gud Boi 0e2949ea59 Bump docs-exs subproc timeout, exception log any timeouts 2026-03-09 19:30:41 -04:00
Gud Boi fb73935dbc Add a `test_log` fixture for emitting from *within* test bodies or fixtures 2026-03-09 19:30:41 -04:00
Gud Boi 94dfeb1441 Add delay before root-actor open, macos in CI.. 2026-03-09 19:30:41 -04:00
Gud Boi 9c1bcb23af Skip legacy-one-way suites on non-linux in CI 2026-03-09 19:30:41 -04:00
Gud Boi a1ea373f34 Ok.. try a longer prompt timeout? 2026-03-09 19:30:41 -04:00
Gud Boi e8f3d64e71 Increase prompt timeout for macos in CI 2026-03-09 19:30:41 -04:00
Gud Boi b30faaca82 Adjust debugger test suites for macos
Namely, after trying to get `test_multi_daemon_subactors` to work for
the `ctlc=True` case (for way too long), give up on that (see
todo/comments) and skip it; the normal case works just fine. Also tweak
the `test_ctxep_pauses_n_maybe_ipc_breaks` pattern matching for
non-`'UDS'` per the previous script commit; we can't use UDS alongside
`pytest`'s tmp dir generation, mega lulz.
2026-03-09 19:30:40 -04:00
Gud Boi 51701fc8dc Ok just skip `test_shield_pause` for macos..
Something something the SIGINT handler isn't being swapped correctly?
2026-03-09 19:29:47 -04:00
Gud Boi 7b89204afd Tweak `do_ctlc()`'s `delay` default
To be a null default and set to `0.1` when not passed by the caller so
as to avoid having to pass `0.1` if you wanted the
param-defined-default.

Also,
- in the `spawn()` fixture's `unset_colors()` closure, add in a masked
  `os.environ['NO_COLOR'] = '1'` since i found it while trying to debug
  debugger tests.
- always return the `child.before` content from `assert_before()`
  helper; again it comes in handy when debugging console matching tests.
2026-03-09 19:29:18 -04:00
Gud Boi 82d02ef404 Lul, never use `'uds'` tpt for macos test-scripts
It's explained in the comment and i really think it's getting more
hilarious the more i learn about the arbitrary limitations of user space
with this tina platform.
2026-03-09 19:29:18 -04:00
Gud Boi b7546fd221 Longer timeout for `test_one_end_stream_not_opened`
On non-linux that is.
2026-03-09 19:29:18 -04:00
Gud Boi 86c95539ca Loosen shml test assert for key shortening on macos 2026-03-09 19:29:18 -04:00
Gud Boi 706a4b761b Add 6sec timeout around `test_simple_rpc` suite for macos 2026-03-09 19:29:18 -04:00
Gud Boi c5af2fa778 Add a `@no_macos` skipif deco 2026-03-09 19:29:18 -04:00
Gud Boi 86489cc453 Use py version in job `name`, consider macos in linux matrix? 2026-03-09 19:29:18 -04:00
Gud Boi 2631fb4ff3 Only run CI on <py3.14 2026-03-09 19:29:18 -04:00
Gud Boi aee86f2544 Run macos job on `uv` and newer `actions@v4` 2026-03-09 19:29:18 -04:00
Tyler Goodlet 83c8a8ad78 Add macos run using only the `trio` spawner 2026-03-09 19:29:18 -04:00
Gud Boi daae196048 Warn if `.ipc._uds.get_peer_pid()` returns null 2026-03-08 19:17:16 -04:00
Gud Boi 70efcb09a0 Slight refinements to `._state.get_rt_dir()`
Per the `copilot` review,
https://github.com/goodboy/tractor/pull/406#pullrequestreview-3893270953

now we also,
- pass `exist_ok=True` to `.mkdir()` to avoid conc races.
- expose `appname: str` param for caller override.
- normalize `subdir` to avoid escaping the base rt-dir location.
2026-03-08 19:17:16 -04:00
Gud Boi a7e74acdff Doc `getsockopt()` args (for macOS)
Per the questionable `copilot` review which is detailed for follow up in
https://github.com/goodboy/tractor/issues/418. These constants are
directly linked from the kernel sources fwiw.
2026-03-08 19:17:16 -04:00
Gud Boi 9c3d3bcec1 Add prompt flush hack for `bash` on macos as well.. 2026-03-08 19:17:16 -04:00
Gud Boi 521fb97fe9 Support UDS on macos (for realz)
Though it was a good (vibed) try by @dnks, the previous "fix" was not
actually adding unix socket support but merely sidestepping a crash due
to `get_peer_info()`'s impl never going to work on MacOS (and it was
never intended to).

This patch instead solves the underlying issue by implementing a new
`get_peer_pid()` helper which does in fact retrieve the peer's PID in
a more generic/cross-platform way (:fingers_crossed:); much thanks to
the linked SO answer for this solution!

Impl deats,
- add `get_peer_pid()` and call it from
  `MsgpackUDSStream.get_stream_addrs()` when we detect a non-'linux'
  platform, OW use the original soln: `get_stream_addrs()`.
- add a new case for the `match (peername, sockname)` with a
  `case (str(), str()):` which seems to at least work on macos.
- drop all the `LOCAL_PEERCRED` dynamic import branching since it was
  never needed and was never going to work.
2026-03-08 19:17:16 -04:00
Gud Boi d8a3969048 Also shorten shm-key for `ShmList` on macos
Same problem as for the `ShmArray` tokens, so tweak and reuse
the `_shorten_key_for_macos()` helper and call it from
`open_shm_list()` similarly.

Some tweaks/updates to the various helpers,
- support `prefix/suffix` inputs and if provided take their lengths and
  subtract them from the known *macOS shm_open() has a 31 char limit
  (PSHMNAMLEN)* when generating and using the `hashlib.sha256()` value
  which overrides (for now..) wtv `key` is passed by the caller.
- pass the appropriate `suffix='_first/_last'` values for the `ShmArray`
  token generators case.
- add a `prefix: str = 'shml_'` param to `open_shm_list()`.
- better log formatting with `!r` to report any key shortening.
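
The length budgeting described in the first bullet can be sketched like so. This is an assumption-laden illustration: `shorten_key()` and its `hexlen` param are hypothetical, the `t_` default prefix and 16-hex-char digest are borrowed from the port commit, and whether a leading `/` counts toward the limit is not handled.

```python
import hashlib

# macOS `shm_open()` name limit as cited in the commit msg
PSHMNAMLEN: int = 31


def shorten_key(
    key: str,
    prefix: str = 't_',
    suffix: str = '',
    hexlen: int = 16,
) -> str:
    # chars left for the hash once prefix + suffix are accounted for
    budget: int = PSHMNAMLEN - len(prefix) - len(suffix)
    if budget < 1:
        raise ValueError('prefix + suffix leave no room for a hash')

    digest: str = hashlib.sha256(key.encode()).hexdigest()
    return f'{prefix}{digest[:min(hexlen, budget)]}{suffix}'
```

Since the digest is derived from the caller's `key`, the same descriptive key always maps to the same short OS-level name.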
2026-03-08 19:17:16 -04:00
Gud Boi 01c0db651a Port macOS shm 31-char name limit hack from `piker`
Adapt the `PSHMNAMLEN` fix from `piker.data._sharedmem` (orig commit
96fb79ec thx @dnks!) to `tractor.ipc._shm` accounting for the
module-local differences:

- Add `hashlib` import for sha256 key hashing
- Add `key: str|None` field to `NDToken` for storing
  the original descriptive key separate from the
  (possibly shortened) OS-level `shm_name`
- Add `__eq__()`/`__hash__()` to `NDToken` excluding
  the `key` field from identity comparison
- Add `_shorten_key_for_macos()` using `t_` prefix
  (vs piker's `p_`) with 16 hex chars of sha256
- Use `platform.system() == 'Darwin'` in `_make_token()`
  (tractor already imports the `platform` module vs
  piker's `sys.platform`)
- Wrap `shm_unlink()` in `ShmArray.destroy()` with
  `try/except FileNotFoundError` for teardown races
  (was already done in `SharedInt.destroy()`)
- Move token creation before `SharedMemory()` alloc in
  `open_shm_ndarray()` so `token.shm_name` is used
  as the OS-level name
- Use `lookup_key` pattern in `attach_shm_ndarray()`
  to decouple `_known_tokens` dict key from OS name

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-08 19:17:16 -04:00
Gud Boi 7bcd7aca2b Reorg `socket` conditional imports a bit
Move the multi-platform-supporting conditional/dynamic `socket` constant
imports to *after* the main cross-platform ones.
Also add constant typing and reformat comments a bit for the macOS case.
2026-03-08 19:17:16 -04:00
Gud Boi 920d0043b4 Force parent subdirs for macos 2026-03-08 19:17:16 -04:00
wygud 93b9a6cd97 Add macOS compatibility for Unix socket credential passing
Make socket credential imports platform-conditional in `.ipc._uds`.
- Linux: use `SO_PASSCRED`/`SO_PEERCRED` from socket module
- macOS: use `LOCAL_PEERCRED` (0x0001) instead, no need for `SO_PASSCRED`
- Conditionally call `setsockopt(SO_PASSCRED)` only on Linux

Fixes AttributeError on macOS where SO_PASSCRED doesn't exist.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-08 19:17:16 -04:00
Tyler Goodlet e7cefba67f Use `platformdirs` for `.config.get_rt_dir()`
Thanks to the `tox`-dev community for such a lovely pkg which seems to
solve all the current cross-platform user-dir problems B)

Also this,
- now passes `platformdirs.user_runtime_dir(appname='tractor')`
  and allows caller to pass an optional `subdir` under `tractor/`
  if desired.
- drops the `.config._rtdir: Path` mod var.
- bumps the lock file with the new dep.
2026-03-08 19:16:49 -04:00
Bd 683476cc96
Merge pull request #421 from goodboy/py_pkging_update
Py pkging and version support update, go 3.12 and 3.13
2026-03-08 19:14:39 -04:00
Gud Boi ad24df0ed7 Drop `pytest.ini`, now covered in `pyproject.toml` 2026-03-08 18:58:30 -04:00
Gud Boi a1622c0b94 Bump `ruff.toml` to target py313 2026-03-08 18:39:08 -04:00
Gud Boi a385d20810 Disable the `xonsh` autoloaded `pytest` plugin 2026-03-08 18:38:31 -04:00
Gud Boi 7f9044c1ef Bump pkg classifiers to match py versions range 2026-03-08 18:16:30 -04:00
Gud Boi d0618e3cb4 Pin to py<3.14 (particularly for macos) 2026-03-08 18:16:30 -04:00
Gud Boi a5bebf76d5 Pin to py-3.12+ and pin-up some deps
Namely to get a fix (i patched in) to `pdbp` and the latest
prompt-injection feats from `xonsh` B)

Also leave in a (masked) `.uv.sources.pdbp` section for easy
patch-test-submit in the future from my own fork and bump the lock file
to match!
2026-03-08 18:16:21 -04:00
Bd 814b2e7e62
Merge pull request #416 from goodboy/claudy_skillz
Claudy skillz: kicking off some `claude` code skill-files
2026-03-04 21:36:49 -05:00
Gud Boi 1704f73504 Add local `claude` settings for commit-msg perms
Auto-allow the tool calls used by the `/commit-msg` skill
so the workflow requires minimal user prompting.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-04 16:09:37 -05:00
Gud Boi c735fc8544 Update `.gitignore` for `gish` local files
Rename `gitea/` comment to `gish`, add `gh/` ignore, and
expand TODOs about syncing with git hosting services.
Also mute any `.claude/*_commit*.txt` files.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-04 16:06:27 -05:00
Gud Boi c5ea6040bf Improve `/commit-msg` skill fmting + consistency
Align `SKILL.md` and `style-guide-reference.md` with the
claude-code skills docs and actual usage conventions.

Deats,
- add missing frontmatter fields: `argument-hint`,
  `disable-model-invocation`, and scope `Bash` tool to
  `Bash(git *)` prefix pattern per docs.
- add `Grep`/`Glob` to `allowed-tools`.
- restructure `!`backtick`` usage for proper dynamic
  context injection (not mixed with instructional text).
- use markdown link for `style-guide-reference.md` ref
  per docs' supporting-files convention.
- switch timestamp format to cross-OS-safe
  `date -u +%Y%m%dT%H%M%SZ`.
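
For reference, a rough Python equivalent of that timestamp format, assuming only the stdlib `datetime` module (the `stamp` name is just illustrative and not part of the skill files):

```python
from datetime import datetime, timezone

# same layout as `date -u +%Y%m%dT%H%M%SZ`: UTC, compact, no
# chars that are awkward in file names on any common OS.
stamp: str = datetime.now(timezone.utc).strftime('%Y%m%dT%H%M%SZ')
print(stamp)
```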

Also,
- rm errant blank lines between footer attribution and
  reference-def lines in both files.
- fix double space in style guide subject length desc.
- reflow long lines for 67 char limit.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-04 15:05:17 -05:00
Gud Boi d4f2fa547a Extend `.gitignore` for dev/gen artifacts
Add ignore rules for commit-msg gen tmp files, nix dev
profiles, vim sessions, macOS metadata, gitea local docs,
and private LLM conversation logs.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-01 17:46:03 -05:00
Gud Boi 20896bfbab Add `commit-msg` skill + style guide reference
Add a `claude-code` skill for generating commit msgs
matching `tractor`'s style, plus a companion style guide
derived from analysis of 500 repo commits.

Deats,
- `SKILL.md` defines the `commit-msg` skill with YAML
  frontmatter, outlining the generation process, formatting
  rules, and footer conventions.
- `style-guide-reference.md` documents verb frequencies,
  backtick usage patterns, section markers, abbreviations,
  and tone from historical commit analysis.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-03-01 15:46:27 -05:00
Bd 70bb77280e
Merge pull request #411 from goodboy/tpt_tolerance
Tpt-tolerance: more lowlevel `trio` CRE/BRE -> `TransportClosed` translations
2026-02-19 16:40:17 -05:00
Gud Boi 916f88a070 Less newlines in `._rpc` log msg 2026-02-19 16:31:54 -05:00
Gud Boi 91f2f3ec10 Use test-harness `loglevel` in inter-peer suite 2026-02-19 16:29:20 -05:00
Tyler Goodlet 3e5124e184 Hide `._rpc._invoke()` frame, again.. 2026-02-19 16:28:22 -05:00
Gud Boi fa86269e30 Stuff from auto-review in https://github.com/goodboy/tractor/pull/412 .. 2026-02-19 16:20:21 -05:00
Gud Boi d0b92bbeba Clean up `._transport` error-case comment
Expand and clarify the comment for the default `case _`
block in the `.send()` error matcher, noting that we
console-error and raise-thru for unexpected disconnect
conditions.

(this patch was suggested by copilot in,
 https://github.com/goodboy/tractor/pull/411)

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 16:18:39 -05:00
Gud Boi 9470815f5a Fix `spawn` fixture cleanup + test assertions
Improve the `spawn` fixture teardown logic in
`tests/devx/conftest.py` fixing the while-else bug, and fix
`test_advanced_faults` genexp for `TransportClosed` exc type
checking.

Deats,
- replace broken `while-else` pattern with direct
  `if ptyproc.isalive()` check after the SIGINT loop.
- fix undefined `spawned` ref -> `ptyproc.isalive()` in
  while condition.
- improve walrus expr formatting in timeout check (multiline
  style).

Also fix `test_ipc_channel_break_during_stream()` assertion,
- wrap genexp in `all()` call so it actually checks all excs
  are `TransportClosed` instead of just creating an unused
  generator.
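
A minimal sketch of why the un-wrapped genexp could never fail; the `TransportClosed` class here is a local stand-in and not the real `tractor` type:

```python
class TransportClosed(Exception):
    '''Stand-in exc type, only for this sketch.'''


excs = [TransportClosed('by peer'), ValueError('unrelated')]

# A generator object is always truthy, so asserting on it
# directly never inspects a single element.
bare_genexp = (isinstance(exc, TransportClosed) for exc in excs)
assert bare_genexp  # passes despite the `ValueError` entry

# Wrapping in `all()` consumes the generator and checks each item.
all_tpt_closed: bool = all(
    isinstance(exc, TransportClosed)
    for exc in excs
)
print(all_tpt_closed)
```

With the `all()` wrapper the mixed list above is correctly reported as not being all-`TransportClosed`.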

(this patch was suggested by copilot in,
 https://github.com/goodboy/tractor/pull/411)

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 16:14:11 -05:00
Gud Boi 592d918394 Tweak `test_inter_peer_cancellation` for races
Adjust `basic_echo_server()` default sequence len to avoid the race
where the `tell_little_bro()` finished streaming **before** the
echo-server sub is cancelled by its peer subactor (which is the whole
thing we're testing!).

Deats,
- bump `rng_seed` default from 50 -> 100 to ensure peer
  cancel req arrives before echo dialog completes on fast hw.
- add `trio.sleep(0.001)` between send/receive in msg loop on the
  "client" streamer side to give cancel request transit more time to
  arrive.

Also,
- add more native `tractor`-type hints.
- reflow `basic_echo_server()` doc-string for 67 char limit
- add masked `pause()` call with comment about unreachable
  code path
- alphabetize imports: mv `current_actor` and `open_nursery`
  below typed imports

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 15:24:42 -05:00
Gud Boi 0cddc67bdb Add doc-strs to `get_root()` + `maybe_open_portal()`
Brief descriptions for both fns in `._discovery` clarifying
what each delivers and under what conditions.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 13:55:02 -05:00
Gud Boi 052fe2435f Improve `Channel` doc-strs + minor cleanups
Flesh out missing method doc-strings, improve log msg formatting and
assert -> `RuntimeError` for un-inited tpt layer.

Deats,
- add doc-string to `.send()` noting `TransportClosed` raise
  on comms failures.
- add doc-string to `.recv()`.
- expand `._aiter_msgs()` doc-string, line-len reflow.
- add doc-string to `.connected()`.
- convert `assert self._transport` -> `RuntimeError` raise
  in `._aiter_msgs()` for more explicit crashing.
- expand `_connect_chan()` doc-string, note it's lowlevel
  and suggest `.open_portal()` to user instead.
- factor out `src_exc_str` in `TransportClosed` log handler
  to avoid double-call
- use multiline style for `.connected()` return expr.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 13:55:02 -05:00
Gud Boi 28819bf5d3 Add `Actor.is_root()` convenience predicate meth 2026-02-19 13:55:02 -05:00
Gud Boi 07c2ba5c0d Drop `trio`-exc-catching if tpt-closed covers them
Remove the `trio.ClosedResourceError` and `trio.BrokenResourceError`
handling that should now be subsumed by `TransportClosed` re-raising out
of the `.ipc` stack.

Deats,
- drop CRE and BRE from `._streaming.MsgStream.aclose()/.send()` blocks.
- similarly rm from `._context.open_context_from_portal()`.
- also from `._portal.Portal.cancel_actor()` and drop the
  (now-completed-todo) comment about this exact thing.

Also add comment in `._rpc.try_ship_error_to_remote()` noting the
remaining `trio` catches there are bc the `.ipc` layers *should* be
wrapping them; thus `log.critical()` use is warranted.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 13:55:02 -05:00
Gud Boi 50f40f427b Include `TransportClosed` in tpt-layer err handling
Add `TransportClosed` to except clauses where `trio`'s own
resource-closed errors are already caught, ensuring our
higher-level tpt exc is also tolerated in those same spots.
Likely i will follow up with a removal of the `trio` variants since most
*should be* caught and re-raised as tpt-closed out of the `.ipc` stack
now?

Add `TransportClosed` to various handler blocks,
- `._streaming.MsgStream.aclose()/.send()` except blocks.
- the broken-channel except in `._context.open_context_from_portal()`.
- obvi import it where necessary in those ^ mods.

Adjust `test_advanced_faults` suite + exs-script to match,
- update `ipc_failure_during_stream.py` example to catch
  `TransportClosed` alongside `trio.ClosedResourceError`
  in both the break and send-check paths.
- shield the `trio.sleep(0.01)` after tpt close in example to avoid
  taskc-raise/masking on that checkpoint since we want to simulate
  waiting for a user to send a KBI.
- loosen `ExceptionGroup` assertion to `len(excs) <= 2` and ensure all
  excs are `TransportClosed`.
- improve multi-line formatting, minor style/formatting fixes in
  condition expressions.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 13:55:02 -05:00
Gud Boi bf6de55865 Improve tpt-closed msg-fmt/content and CRE case matching
Refine tpt-error reporting to include closure attribution (`'locally'`
vs `'by peer'`), tighten match conditions and reduce needless newlines
in exc reprs.

Deats,
- factor out `trans_err_msg: str` and `by_whom: str` into a `dict`
  lookup before the `match:` block to pair specific err msgs to closure
  attribution strings.
- use `by_whom` directly as `CRE` case guard condition
  (truthy when msg matches known underlying CRE msg content).
- conveniently include `by_whom!r` in `TransportClosed` message.
- fix `'locally ?'` -> `'locally?'` in send-side `CRE`
  handler (drop errant space).
- add masked `maybe_pause_bp()` calls at both `CRE` sites (from when
  i was tracing a test harness issue where the UDS socket path wasn't
  being cleaned up on teardown).
- drop trailing `\n` from `body=` args to `TransportClosed`.
- reuse `trans_err_msg` for the `BRE`/broken-pipe guard.

Also adjust testing, namely `test_ctxep_pauses_n_maybe_ipc_breaks`'s
expected patts-set for new msg formats to be raised out of
`.ipc._transport`.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 13:55:02 -05:00
Gud Boi 5ded99a886 Add a `._trace.maybe_pause_bp()` for tpt-broken cases
Internal helper which falls back to sync `pdb` when the
child actor can't reach root to acquire the TTY lock.

Useful when debugging tpt layer failures (intentional or
otherwise) where a sub-actor can no longer IPC-contact the
root to coordinate REPL access; root uses `.pause()` as
normal while non-root falls back to `mk_pdb().set_trace()`.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 13:55:02 -05:00
Gud Boi 7145fa364f Add `SIGINT` cleanup to `spawn` fixture in `devx/conftest`
Convert `spawn` fixture to a generator and add post-test graceful
subproc cleanup via `SIGINT`/`SIGKILL` to avoid leaving stale `pexpect`
child procs around between test runs as well as any UDS-tpt socket files
under the system runtime-dir.

Deats,
- convert `return _spawn` -> `yield _spawn` to enable
  post-yield teardown logic.
- add a new `nonlocal spawned` ref so teardown logic can access the last
  spawned child from outside the delivered spawner fn-closure.
- add `SIGINT`-loop after yield with 5s timeout, then
  `SIGKILL` if proc still alive.
- add masked `breakpoint()` and TODO about UDS path cleanup

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-19 13:55:02 -05:00
Gud Boi f8e25688c7 Unmask `ClosedResourceError` handling in `._transport`
Unmask the CRE case block for peer-closed socket errors which already
had a TODO about reproducing the condition. It appears this case can
happen during inter-actor comms teardowns in `piker`, but i haven't been
able to figure out exactly what reproduces it yet..

So activate the block again for that 'socket already closed'-msg case,
and add a TODO questioning how to reproduce it.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-12 00:51:50 -05:00
Tyler Goodlet c3f455a8ec Mask tpt-closed handling of `chan.send(return_msg)`
A partial revert of commit c05d08e426 since it seems we already
suppress tpt-closed errors lower down in `.ipc.Channel.send()`; given
that i'm pretty sure this new handler code should basically never run?

Left in a todo to remove the masked content once i'm done more
thoroughly testing under `piker`.
2026-02-12 00:51:50 -05:00
Tyler Goodlet f78e842fba More `TransportClosed`-handling around IPC-IO
For IPC-disconnects-during-teardown edge cases, augment some `._rpc`
machinery,
- in `._invoke()` around the `await chan.send(return_msg)` where we
  suppress if the underlying `Channel` already disconnected.
- add a disjoint handler in `_errors_relayed_via_ipc()` which just
  reports-n-reraises the exc (same as prior behaviour).
  * originally i thought it needed to be handled specially (to avoid
    being crash handled) but turns out that isn't necessary?
  * hence the also-added-but-masked-out `debug_filter` / guard expression
    around the `await debug._maybe_enter_pm()` line.
- show the `._invoke()` frame for the moment.
2026-02-12 00:51:50 -05:00
Bd 3638b80c9d
Merge pull request #412 from goodboy/root_actor_raddrs_fix
Non-registrar, root actor `_root_addrs` runtime-vars fix
2026-02-12 00:49:40 -05:00
Gud Boi 2ed9e65530 Clear rtvs state on root shutdown..
Fixes the bug discovered in last test update, not sure how this wasn't
caught already XD
2026-02-11 22:17:26 -05:00
Gud Boi 6cab363c51 Catch-n-fail on stale `_root_addrs` state..
Turns out we aren't clearing the `._state._runtime_vars` entries in
between `open_root_actor` calls.. This test refinement catches that by
adding runtime-vars asserts on the expected root-addrs value; ensure
`_runtime_vars['_root_addrs']` ONLY matches the values provided by the
test's CURRENT root actor.

This causes a failure when the (just added)
`test_non_registrar_spawns_child` is run as part of the module suite,
it's fine when run standalone.
2026-02-11 22:17:26 -05:00
Gud Boi 8aee24e83f Fix when root-actor addrs is set as rtvs
Move `_root_addrs` assignment to after `async_main()` unblocks (via
`.started()`) which now delivers the bind addrs, ensuring correct
`UnwrappedAddress` propagation into `._state._runtime_vars` for
non-registrar root actors..

Previously for non-registrar root actors the `._state._runtime_vars`
entries were being set as `Address` values which ofc IPC serialize
incorrectly rn vs. the unwrapped versions, (well until we add a msgspec
for their structs anyway) and thus are passed in incorrect form to
children/subactors during spawning..

This fixes the issue by waiting for the `.ipc.*` stack to
bind-and-resolve any randomly allocated addrs (by the OS) until after
the initial `Actor` startup is complete.

Deats,
- primarily, mv `_root_addrs` assignment from before `root_tn.start()`
  to after, using started(-ed) `accept_addrs` now delivered from
  `._runtime.async_main()`..
- update `task_status` type hints to match.
- unpack and set the `(accept_addrs, reg_addrs)` tuple from
  `root_tn.start()` call into `._state._runtime_vars` entries.
- improve and embolden comments distinguishing registrar vs non-registrar
  init paths, ensure typing reflects wrapped vs. unwrapped addrs.

Also,
- add a masked `mk_pdb().set_trace()` for debugging `raddrs` values
  being "off".
- add TODO about using UDS on linux for root mailbox
- rename `trans_bind_addrs` -> `tpt_bind_addrs` for clarity.
- expand comment about random port allocation for
  non-registrar case

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-11 22:17:26 -05:00
Gud Boi cdcc1b42fc Add test for non-registrar root sub-spawning
Ensure non-registrar root actors can spawn children and that
those children receive correct parent contact info. This test
catches the bug reported in,

https://github.com/goodboy/tractor/issues/410

Add new `test_non_registrar_spawns_child()` which spawns a sub-actor
from a non-registrar root and verifies the child can manually connect
back to its parent using `get_root()` API, auditing
`._state._runtime_vars` addr propagation from rent to child.

Also,
- improve type hints throughout test suites
  (`subprocess.Popen`, `UnwrappedAddress`, `Aid` etc.)
- rename `n` -> `an` for actor nursery vars
- use multiline style for function signatures

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-11 22:17:26 -05:00
Bd 51ac0c623e
Merge pull request #402 from goodboy/log_sys_testing
Log sys testing, starting to get "serious" about it..
2026-02-11 22:13:17 -05:00
Gud Boi 3f0bde1bf8 Use bare `get_logger()` in `.to_asyncio` 2026-02-11 22:02:41 -05:00
Gud Boi fa1a15dce8 Cleanups per copilot PR review 2026-02-11 21:51:40 -05:00
Gud Boi 5850844297 Mk `test_implicit_mod_name_applied_for_child()` check init-mods
Test pkg-level init module and sub-pkg module logger naming
to better validate auto-naming logic.

Deats,
- create `pkg_init_mod` and write `mod_code` to it for
  testing pkg-level `__init__.py` logger instance creation.
  * assert `snakelib.__init__` logger name is `proj_name`.
- write `mod_code` to `subpkg/__init__.py` as well and check the same.

Also,
- rename some vars,
  * `pkg_mod` -> `pkg_submod`,
  * `pkgmod` -> `subpkgmod`
- add `ModuleType` import for type hints
- improve comments explaining pkg init vs first-level
  sub-module naming expectations.
- drop trailing whitespace and unused TODO comment
- remove masked `breakpoint()` call

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-11 21:43:37 -05:00
Gud Boi ff02939213 Toss in some `colorlog` alts to try 2026-02-11 21:05:16 -05:00
Gud Boi d61e8caab2 Improve `test_log_sys` for new auto-naming logic
Add assertions and comments to better test the reworked
implicit module-name detection in `get_logger()`.

Deats,
- add `assert not tractor.current_actor()` check to verify
  no runtime is active during test.
- import `.log` submod directly for use.
- add masked `breakpoint()` for debugging mod loading.
- add comment about using `ranger` to inspect `testdir` layout
  of auto-generated py pkg + module-files.
- improve comments explaining pkg-root-log creation.
- add TODO for testing `get_logger()` call from pkg
  `__init__.py`
- add comment about first-pkg-level module naming.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-11 21:05:07 -05:00
Gud Boi 0b0c83e9da Drop `name=__name__` from all `get_logger()` calls
Use new implicit module-name detection throughout codebase to simplify
logger creation and leverage auto-naming from caller mod.

Main changes,
- drop `name=__name__` arg from all `get_logger()` calls
  (across 29 modules).
- update `get_console_log()` calls to include `name='tractor'` for
  enabling root logger in test harness and entry points; this ensures
  logic in `get_logger()` triggers so that **all** `tractor`-internal
  logging emits to console.
- add info log msg in test `conftest.py` showing test-harness
  log level

Also,
- fix `.actor.uid` ref to `.actor.aid.uid` in `._trace`.
- adjust a `._context` log msg formatting for clarity.
- add TODO comments in `._addr`, `._uds` for when we mv to
  using `multiaddr`.
- add TODO for `RuntimeVars` type hint in `.msg.types` (once we
  eventually get that all going obvi!)

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-11 21:04:49 -05:00
Gud Boi 5e7c0f264d Rework `.get_logger()`, better autonaming, deduping
Overhaul of the automatic-calling-module-name detection and sub-log
creation logic to avoid (at least warn) on duplication(s) and still
handle the common usage of a call with `name=__name__` from a mod's top
level scope.

Main `get_logger()` changes,
- refactor auto-naming logic for implicit `name=None` case such that we
  handle at least `tractor` internal "bare" calls from internal submods.
- factor out the `get_caller_mod()` closure (still inside
  `get_logger()`) for introspecting caller's module with configurable
  frame depth.
- use `.removeprefix()` instead of `.lstrip()` for stripping pkg-name
  from mod paths
- mv root-logger creation before sub-logger name processing
- improve duplicate detection for `pkg_name` in `name`
- add `_strict_debug=True`-only-emitted warnings for duplicate
  pkg/leaf-mod names.
- use `print()` fallback for warnings when no actor runtime is up at
  call time.
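
To illustrate the `.removeprefix()` vs `.lstrip()` point from the list above, a small sketch; the module path is only an example input and `pkg_name`/`mod_path` are illustrative names:

```python
pkg_name: str = 'tractor'
mod_path: str = 'tractor.trionics._broadcast'

# `.lstrip()` treats its arg as a *set of chars* to strip, so it
# keeps eating into the next path component as long as the
# leading chars are in that set.
via_lstrip: str = mod_path.lstrip(pkg_name + '.')

# `.removeprefix()` (py3.9+) only drops the exact leading
# substring, once.
via_removeprefix: str = mod_path.removeprefix(pkg_name + '.')

print(via_lstrip)        # 'ionics._broadcast'
print(via_removeprefix)  # 'trionics._broadcast'
```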

Surrounding tweaks,
- add `.level` property to `StackLevelAdapter` for getting
  current emit level as lowercase `str`.
- mv `_proj_name` def to just above `get_logger()`
- use `_curr_actor_no_exc` partial in `_conc_name_getters`
  to avoid runtime errors
- improve comments/doc-strings throughout
- keep some masked `breakpoint()` calls for future debugging

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-02-11 21:04:29 -05:00
Gud Boi edf1189fe0 Multi-line styling in `test.devx.conftest` 2026-02-11 21:04:22 -05:00
Tyler Goodlet de24bfe052 Mv `load_module_from_path()` to a new `._code_load` submod 2026-02-11 21:03:29 -05:00
Tyler Goodlet e235b96894 Use new `pkg_name` in log-sys test suites 2026-02-11 21:03:07 -05:00
Tyler Goodlet dea4b9fd93 Implicitly name sub-logs by caller's mod
That is when no `name` is passed to `get_logger()`, try to introspect
the caller's `module.__name__` and use it to infer/get the "namespace
path" to that module the same as if using `name=__name__` as in the most
common usage.

Further, change the `_root_name` to be `pkg_name: str`, a public and
more obvious param name, and deprecate the former. This obviously adds
the necessary impl to make the new
`test_sys_log::test_implicit_mod_name_applied_for_child` test pass.

Impl detalles for `get_logger()`,
- add `pkg_name` and deprecate `_root_name`, include failover logic
  and a warning.
- implement calling module introspection using
  `inspect.stack()/getmodule()` to get both the `.__name__` and
  `.__package__` info alongside adjusted logic to set the `name`
  when not provided but only when a new `mk_sublog: bool` is set.
- tweak the `name` processing for implicitly set case,
  - rename `sub_name` -> `pkg_path: str` which is the path
    to the calling module minus that module's name component.
  - only partition `name` if `pkg_name` is `in` it.
  - use the `_root_log` for `pkg_name` duplication warnings.
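
A rough sketch of the caller-module introspection described above, using only the stdlib `inspect` API. The function names and the `frames_up` param are assumptions for illustration and do not claim to match the actual `get_logger()` internals:

```python
from __future__ import annotations

import inspect
from types import ModuleType


def get_caller_mod(frames_up: int = 1) -> ModuleType | None:
    '''
    Return the module found `frames_up` frames above this
    function's own frame, or `None` if it can't be resolved.

    '''
    frame_info = inspect.stack()[frames_up]
    return inspect.getmodule(frame_info.frame)


def infer_logger_name() -> str | None:
    '''
    Hypothetical helper: derive a logger name from whatever
    module called this function.

    '''
    # stack index 0 is `get_caller_mod()`, 1 is this fn and
    # 2 is the frame that called us.
    mod = get_caller_mod(frames_up=2)
    if mod is None:
        return None

    # same value the caller would have passed as `name=__name__`
    return mod.__name__


print(infer_logger_name())
```

Note `inspect.getmodule()` can return `None` (eg. for dynamically `exec()`-ed code), so any real impl needs a fallback for that case.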

Other/related,
- add types to various public mod vars missing them.
- rename `.log.log` -> `.log._root_log`.
2026-02-11 21:03:07 -05:00
Tyler Goodlet 557e2cec6a Add an implicit-pkg-path-as-logger-name test
A bit of test driven dev to anticipate support of `.log.get_logger()`
usage such that it can be called from arbitrary sub-modules, themselves
embedded in arbitrary sub-pkgs, of some project; when not provided,
the `sub_name` passed to the `Logger.getChild(<sub_name>)` will be set
as the sub-pkg path "down to" the calling module.

IOW if you call something like,

`log = tractor.log.get_logger(pkg_name='mypylib')`

from some `submod.py` in a project-dir that looks like,

mypylib/
  mod.py
  subpkg/
    submod.py  <- calling module

the `log: StackLevelAdapter` child-`Logger` instance will have a
`.name: str = 'mypylib.subpkg'`, discluding the `submod` part since this
is already rendered as the `{filename}` header in `log.LOG_FORMAT`.

Previously similar behaviour would be obtained by passing
`get_logger(name=__name__)` in the calling module and so much so it
motivated me to make this the default, presuming we can introspect for
the info.

Impl deats,
- duplicated a `load_module_from_path()` from `modden` to load the
  `testdir` rendered py project dir from its path.
 |_should prolly factor it down to this lib anyway bc we're going to
   need it for hot code reload? (well that and `watchfiles` Bp)
- in each of `mod.py` and `submod.py` render the `get_logger()` code
  without `name`, expecting the (coming shortly) implicit introspection
  feat to do this.
- do `.name` and `.parent` checks against expected sub-logger values
  from `StackLevelAdapter.logger.getChildren()`.
2026-02-11 21:03:07 -05:00
Tyler Goodlet 0e3229f16d Start a logging-sys unit-test module
To start ensuring that when `name=__name__` is passed we try to
de-duplicate the `_root_name` and any `leaf_mod: str` since it's already
included in the headers as `{filename}`.

Deats,
- heavily document the de-duplication `str.partition()`s in
  `.log.get_logger()` and provide the end fix by changing the predicate,
  `if rname == 'tractor':` -> `if rname == _root_name`.
  * also toss in some warnings for when we still detect duplicates.
- add todo comments around logging "filters" (vs. our "adapter").
- create the new `test_log_sys.test_root_pkg_not_duplicated()` which
  runs green with the fixes from ^.
- add a ton of test-suite todos both for existing and anticipated
  logging sys feats in the new mod.
2026-02-11 21:03:07 -05:00
Bd 448d25aef4
Merge pull request #409 from goodboy/nixos_flake
Nixos flake, for the *too-hip-for-arch-ers*
2026-02-11 21:02:37 -05:00
Gud Boi 343c9e0034 Tweaks per the `copilot` PR review 2026-02-11 20:55:08 -05:00
Gud Boi 1dc27c5161 Add a dev-overlay nix flake
Based on the impure template from `pyproject.nix` and providing
a dev-shell for easy bypass-n-hack on nix(os) using `uv`.

Deats,
- include bash completion pkgs for devx/happiness.
- pull `ruff` from <nixpkgs> to avoid wheel (build) issues.
- pin to py313 `cpython` for now.
2026-01-23 16:27:19 -05:00
Gud Boi 14aefa4b11 Reorg dev deps into nested groups
Namely,
- `devx` for console debugging extras used in `tractor.devx`.
- `repl` for @goodboy's `xonsh` hackin utils.
- `testing` for harness stuffs.
- `lint` for whenever we start doing that; it requires special
  separation on nixos in order to pull `ruff` from pkgs.

Oh and bump the lock file.
2026-01-23 16:24:24 -05:00
135 changed files with 9824 additions and 2502 deletions


@ -0,0 +1,38 @@
# Docs TODOs
## Auto-sync README code examples with source
The `docs/README.rst` has inline code blocks that
duplicate actual example files (e.g.
`examples/infected_asyncio_echo_server.py`). Every time
the public API changes we have to manually sync both.
Sphinx's `literalinclude` directive can pull code directly
from source files:
```rst
.. literalinclude:: ../examples/infected_asyncio_echo_server.py
   :language: python
   :caption: examples/infected_asyncio_echo_server.py
```
Or to include only a specific function/section:
```rst
.. literalinclude:: ../examples/infected_asyncio_echo_server.py
   :language: python
   :pyobject: aio_echo_server
```
This way the docs always reflect the actual code without
manual syncing.
### Considerations
- `README.rst` is also rendered on GitHub/PyPI which do
NOT support `literalinclude` - so we'd need a build
step or a separate `_sphinx_readme.rst` (which already
exists at `docs/github_readme/_sphinx_readme.rst`).
- Could use a pre-commit hook or CI step to extract code
from examples into the README for GitHub rendering.
- Another option: a `sphinx.ext.autodoc`-style approach where
docstrings from the actual module are pulled in.
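
As a very rough sketch of the pre-commit/CI idea, the helper below rewrites the region between two rST comment markers with the current contents of an example file. The `.. BEGIN-EXAMPLE`/`.. END-EXAMPLE` markers and the `sync_example()` name are purely hypothetical; nothing like them exists in the repo yet:

```python
import textwrap


def sync_example(
    readme: str,
    name: str,
    code: str,
) -> str:
    '''
    Replace whatever sits between the (hypothetical) markers
    for `name` with a fresh `.. code::` block holding `code`.

    '''
    begin: str = f'.. BEGIN-EXAMPLE {name}'
    end: str = f'.. END-EXAMPLE {name}'

    head, found, rest = readme.partition(begin)
    if not found:
        raise ValueError(f'missing marker: {begin!r}')

    _stale, found, tail = rest.partition(end)
    if not found:
        raise ValueError(f'missing marker: {end!r}')

    # rST code blocks need their content indented
    block: str = textwrap.indent(code.rstrip('\n'), '    ')
    return (
        f'{head}{begin}\n\n'
        f'.. code:: python\n\n'
        f'{block}\n\n'
        f'{end}{tail}'
    )
```

A hook could read each example file, call this once per marker pair and fail the commit when the output differs from the checked-in `README.rst`.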


@ -0,0 +1,36 @@
{
  "permissions": {
    "allow": [
      "Bash(date *)",
      "Bash(cp .claude/*)",
      "Bash(git diff *)",
      "Bash(git log *)",
      "Bash(git status)",
      "Bash(git remote:*)",
      "Bash(git stash:*)",
      "Bash(git mv:*)",
      "Bash(git rev-parse:*)",
      "Bash(test:*)",
      "Bash(ls:*)",
      "Bash(grep:*)",
      "Bash(find:*)",
      "Bash(ln:*)",
      "Bash(cat:*)",
      "Bash(mkdir:*)",
      "Bash(gh pr:*)",
      "Bash(gh api:*)",
      "Bash(gh issue:*)",
      "Bash(UV_PROJECT_ENVIRONMENT=py* uv sync:*)",
      "Bash(UV_PROJECT_ENVIRONMENT=py* uv run:*)",
      "Bash(echo EXIT:$?:*)",
      "Write(.claude/*commit_msg*)",
      "Write(.claude/git_commit_msg_LATEST.md)",
      "Skill(run-tests)",
      "Skill(close-wkt)",
      "Skill(open-wkt)",
      "Skill(prompt-io)"
    ],
    "deny": [],
    "ask": []
  }
}


@ -0,0 +1,225 @@
# Commit Message Style Guide for `tractor`
Analysis based on 500 recent commits from the `tractor` repository.
## Core Principles
Write commit messages that are technically precise yet casual in
tone. Use abbreviations and informal language while maintaining
clarity about what changed and why.
## Subject Line Format
### Length and Structure
- Target: ~50 chars with a hard-max of 67.
- Use backticks around code elements (72.2% of commits)
- Rarely use colons (5.2%), except for file prefixes
- End with '?' for uncertain changes (rare: 0.8%)
- End with '!' for important changes (rare: 2.0%)
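
A minimal sketch of how the length rule above could be checked mechanically; `check_subject()` and `HARD_MAX` are hypothetical names, and the backtick-balance check is an extra assumption rather than a rule stated in this guide:

```python
HARD_MAX: int = 67  # hard limit from the list above


def check_subject(subject: str) -> list[str]:
    '''
    Return a list of problems found in a commit subject line;
    an empty list means it passed these (partial) checks.

    '''
    problems: list[str] = []
    if len(subject) > HARD_MAX:
        problems.append(
            f'subject is {len(subject)} chars, hard-max is {HARD_MAX}'
        )

    # an odd backtick count means some code element is unclosed
    if subject.count('`') % 2:
        problems.append('unbalanced backticks')

    return problems


assert not check_subject('Bump `ruff.toml` to target py313')
assert check_subject('Fix `.to_asyncio inter-task-cancellation!')
```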
### Opening Verbs (Present Tense)
Most common verbs from analysis:
- `Add` (14.4%) - wholly new features/functionality
- `Use` (4.4%) - adopt new approach/tool
- `Drop` (3.6%) - remove code/feature
- `Fix` (2.4%) - bug fixes
- `Move`/`Mv` (3.6%) - relocate code
- `Adjust` (2.0%) - minor tweaks
- `Update` (1.6%) - enhance existing feature
- `Bump` (1.2%) - dependency updates
- `Rename` (1.2%) - identifier changes
- `Set` (1.2%) - configuration changes
- `Handle` (1.0%) - add handling logic
- `Raise` (1.0%) - add error raising
- `Pass` (0.8%) - pass parameters/values
- `Support` (0.8%) - add support for something
- `Hide` (1.4%) - make private/internal
- `Always` (1.4%) - enforce consistent behavior
- `Mk` (1.4%) - make/create (abbreviated)
- `Start` (1.0%) - begin implementation
Other frequent verbs: `More`, `Change`, `Extend`, `Disable`, `Log`,
`Enable`, `Ensure`, `Expose`, `Allow`
### Backtick Usage
Always use backticks for:
- Module names: `trio`, `asyncio`, `msgspec`, `greenback`, `stackscope`
- Class names: `Context`, `Actor`, `Address`, `PldRx`, `SpawnSpec`
- Method names: `.pause_from_sync()`, `._pause()`, `.cancel()`
- Function names: `breakpoint()`, `collapse_eg()`, `open_root_actor()`
- Decorators: `@acm`, `@context`
- Exceptions: `Cancelled`, `TransportClosed`, `MsgTypeError`
- Keywords: `finally`, `None`, `False`
- Variable names: `tn`, `debug_mode`
- Complex expressions: `trio.Cancelled`, `asyncio.Task`
Most backticked terms in tractor:
`trio`, `asyncio`, `Context`, `.pause_from_sync()`, `tn`,
`._pause()`, `breakpoint()`, `collapse_eg()`, `Actor`, `@acm`,
`.cancel()`, `Cancelled`, `open_root_actor()`, `greenback`
### Examples
Good subject lines:
```
Add `uds` to `._multiaddr`, tweak typing
Drop `DebugStatus.shield` attr, add `.req_finished`
Use `stackscope` for all actor-tree rendered "views"
Fix `.to_asyncio` inter-task-cancellation!
Bump `ruff.toml` to target py313
Mv `load_module_from_path()` to new `._code_load` submod
Always use `tuple`-cast for singleton parent addrs
```
## Body Format
### General Structure
- 43.2% of commits have no body (simple changes)
- Use blank line after subject
- Max line length: 67 chars
- Use `-` bullets for lists (28.0% of commits)
- Rarely use `*` bullets (2.4%)
### Section Markers
Use these markers to organize longer commit bodies:
- `Also,` (most common: 26 occurrences)
- `Other,` (13 occurrences)
- `Deats,` (11 occurrences) - for implementation details
- `Further,` (7 occurrences)
- `TODO,` (3 occurrences)
- `Impl details,` (2 occurrences)
- `Notes,` (1 occurrence)
### Common Abbreviations
Use these freely (sorted by frequency):
- `msg` (63) - message
- `bg` (37) - background
- `ctx` (30) - context
- `impl` (27) - implementation
- `mod` (26) - module
- `obvi` (17) - obviously
- `tn` (16) - task nursery
- `fn` (15) - function
- `vs` (15) - versus
- `bc` (14) - because
- `var` (14) - variable
- `prolly` (9) - probably
- `ep` (6) - entry point
- `OW` (5) - otherwise
- `rn` (4) - right now
- `sig` (4) - signal/signature
- `deps` (3) - dependencies
- `iface` (2) - interface
- `subproc` (2) - subprocess
- `tho` (2) - though
- `ofc` (2) - of course
### Tone and Style
- Casual but technical (use `XD` for humor: 23 times)
- Use `..` for trailing thoughts (108 occurrences)
- Use `Woops,` to acknowledge mistakes (4 subject lines)
- Don't be afraid to show personality while being precise
### Example Bodies
Simple with bullets:
```
Add `multiaddr` and bump up some deps
Since we're planning to use it for (discovery)
addressing, allowing replacement of the hacky (pretend)
attempt in `tractor._multiaddr` Bp
Also pin some deps,
- make us py312+
- use `pdbp` with my frame indexing fix.
- mv to latest `xonsh` for fancy cmd/suggestion injections.
Bump lock file to match obvi!
```
With section markers:
```
Use `stackscope` for all actor-tree rendered "views"
Instead of the (much more) limited and hacky `.devx._code`
impls, move to using the new `.devx._stackscope` API which
wraps the `stackscope` project.
Deats,
- make new `stackscope.extract_stack()` wrapper
- port over frame-descing to `_stackscope.pformat_stack()`
- move `PdbREPL` to use `stackscope` render approach
- update tests for new stack output format
Also,
- tweak log formatting for consistency
- add typing hints throughout
```
## Special Patterns
### WIP Commits
Rare (0.2%) - avoid committing WIP if possible
### Merge Commits
Auto-generated (4.4%), don't worry about style
### File References
- Use `module.py` or `.submodule` style
- Rarely use `file.py:line` references (0 in analysis)
### Links
- GitHub links used sparingly (3 total)
- Prefer code references over external links
## Footer
The default footer should credit `claude` (you) for helping generate
the commit msg content:
```
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
```
Further, if the patch was solely or in part written
by `claude`, instead add:
```
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
```
## Summary Checklist
Before committing, verify:
- [ ] Subject line uses present tense verb
- [ ] Subject line ~50 chars (hard max 67)
- [ ] Code elements wrapped in backticks
- [ ] Body lines ≤67 chars
- [ ] Abbreviations used where natural
- [ ] Casual yet precise tone
- [ ] Section markers if body >3 paragraphs
- [ ] Technical accuracy maintained
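The mechanical items in this checklist (line lengths, blank line after the subject) could be checked by a small script. The following is only a sketch: `lint_commit_msg` is a hypothetical helper, not part of tractor or `claude-code`, and it assumes that lines carrying links (such as the footer) are exempt from the 67-char wrap.

```python
import re

MAX_LEN = 67  # subject hard max and body wrap width from this guide

# Assumption: lines carrying links (e.g. the footer) are exempt from wrapping.
_LINK = re.compile(r"https?://|\[[^\]]+\]\[[^\]]+\]")


def lint_commit_msg(msg: str) -> list[str]:
    """Return a list of style problems found in a commit message."""
    problems: list[str] = []
    lines = msg.splitlines()
    if not lines or not lines[0].strip():
        return ["missing subject line"]

    subject = lines[0]
    if len(subject) > MAX_LEN:
        problems.append(f"subject is {len(subject)} chars (max {MAX_LEN})")

    # A body, when present, must be separated by one blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("no blank line after subject")

    # Body lines start at line 3 (1-indexed) of the message.
    for num, line in enumerate(lines[2:], start=3):
        if len(line) > MAX_LEN and not _LINK.search(line):
            problems.append(f"line {num} is {len(line)} chars (max {MAX_LEN})")

    return problems
```

Tone, tense, and technical accuracy still need a human (or model) read; the script only covers what can be counted.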
## Analysis Metadata
```
Source: tractor repository
Commits analyzed: 500
Date range: 2019-2025
Analysis date: 2026-02-08
```
---
(this style guide was generated by [`claude-code`][claude-code-gh]
analyzing commit history)
[claude-code-gh]: https://github.com/anthropics/claude-code


@@ -0,0 +1,231 @@
---
name: conc-anal
description: >
Concurrency analysis for tractor's trio-based
async primitives. Trace task scheduling across
checkpoint boundaries, identify race windows in
shared mutable state, and verify synchronization
correctness. Invoke on code segments the user
points at, OR proactively when reviewing/writing
concurrent cache, lock, or multi-task acm code.
argument-hint: "[file:line-range or function name]"
allowed-tools:
- Read
- Grep
- Glob
- Task
---
Perform a structured concurrency analysis on the
target code. This skill should be invoked:
- **On demand**: user points at a code segment
(file:lines, function name, or pastes a snippet)
- **Proactively**: when writing or reviewing code
that touches shared mutable state across trio
tasks — especially `_Cache`, locks, events, or
multi-task `@acm` lifecycle management
## 0. Identify the target
If the user provides a file:line-range or function
name, read that code. If not explicitly provided,
identify the relevant concurrent code from context
(e.g. the current diff, a failing test, or the
function under discussion).
## 1. Inventory shared mutable state
List every piece of state that is accessed by
multiple tasks. For each, note:
- **What**: the variable/dict/attr (e.g.
`_Cache.values`, `_Cache.resources`,
`_Cache.users`)
- **Scope**: class-level, module-level, or
closure-captured
- **Writers**: which tasks/code-paths mutate it
- **Readers**: which tasks/code-paths read it
- **Guarded by**: which lock/event/ordering
protects it (or "UNGUARDED" if none)
Format as a table:
```
| State | Writers | Readers | Guard |
|---------------------|-----------------|-----------------|----------------|
| _Cache.values | run_ctx, moc¹ | moc | ctx_key lock |
| _Cache.resources | run_ctx, moc | moc, run_ctx | UNGUARDED |
```
¹ `moc` = `maybe_open_context`
## 2. Map checkpoint boundaries
For each code path through the target, mark every
**checkpoint** — any `await` expression where trio
can switch to another task. Use line numbers:
```
L325: await lock.acquire() ← CHECKPOINT
L395: await service_tn.start(...) ← CHECKPOINT
L411: lock.release() ← (not a checkpoint, but changes lock state)
L414: yield (False, yielded) ← SUSPEND (caller runs)
L485: no_more_users.set() ← (wakes run_ctx, no switch yet)
```
**Key trio scheduling rules to apply:**
- `Event.set()` makes waiters *ready* but does NOT
switch immediately
- `lock.release()` is not a checkpoint
- `await sleep(0)` IS a checkpoint
- Code in `finally` blocks CAN have checkpoints
(unlike asyncio)
- `await` inside `except` blocks can be
`trio.Cancelled`-masked
## 3. Trace concurrent task schedules
Write out the **interleaved execution trace** for
the problematic scenario. Number each step and tag
which task executes it:
```
[Task A] 1. acquires lock
[Task A] 2. cache miss → allocates resources
[Task A] 3. releases lock
[Task A] 4. yields to caller
[Task A] 5. caller exits → finally runs
[Task A] 6. users-- → 0, sets no_more_users
[Task A] 7. pops lock from _Cache.locks
[run_ctx] 8. wakes from no_more_users.wait()
[run_ctx] 9. values.pop(ctx_key)
[run_ctx] 10. acm __aexit__ → CHECKPOINT
[Task B] 11. creates NEW lock (old one popped)
[Task B] 12. acquires immediately
[Task B] 13. values[ctx_key] → KeyError
[Task B] 14. resources[ctx_key] → STILL EXISTS
[Task B] 15. 💥 RuntimeError
```
Identify the **race window**: the range of steps
where state is inconsistent. In the example above,
steps 9-10 are the window (values gone, resources
still alive).
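To make the window concrete, the teardown can be replayed in a toy model. This is only a sketch under loud assumptions: plain generators stand in for trio tasks, each `yield` stands in for a checkpoint, and two dicts stand in for `_Cache.values` and `_Cache.resources`. None of it is tractor code.

```python
# Toy model only: generators stand in for trio tasks, `yield` for a checkpoint.
values: dict = {}
resources: dict = {}
ctx_key = ("fid", ())


def run_ctx_teardown():
    """Mimic steps 9 and 10: pop `values`, checkpoint, then pop `resources`."""
    values.pop(ctx_key)       # step 9
    yield "acm __aexit__"     # step 10: another task may run here
    resources.pop(ctx_key)    # cleanup that only runs after the checkpoint


def is_inconsistent() -> bool:
    """True while `values` is gone but `resources` is still registered."""
    return ctx_key not in values and ctx_key in resources


# Task A has populated the cache and exited; teardown is about to start.
values[ctx_key] = "client"
resources[ctx_key] = "acm state"

teardown = run_ctx_teardown()
observed = [is_inconsistent()]       # before teardown starts
next(teardown)                       # run up to the checkpoint
observed.append(is_inconsistent())   # what a re-entering Task B would see
for _ in teardown:                   # resume teardown to completion
    pass
observed.append(is_inconsistent())
print(observed)
```

The middle observation is the race window: any task scheduled while the generator is parked at its `yield` sees the half-torn-down state.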
## 4. Classify the bug
Categorize what kind of concurrency issue this is:
- **TOCTOU** (time-of-check-to-time-of-use): state
changes between a check and the action based on it
- **Stale reference**: a task holds a reference to
state that another task has invalidated
- **Lifetime mismatch**: a synchronization primitive
(lock, event) has a shorter lifetime than the
state it's supposed to protect
- **Missing guard**: shared state is accessed
without any synchronization
- **Atomicity gap**: two operations that should be
atomic have a checkpoint between them
## 5. Propose fixes
For each proposed fix, provide:
- **Sketch**: pseudocode or diff showing the change
- **How it closes the window**: which step(s) from
the trace it eliminates or reorders
- **Tradeoffs**: complexity, perf, new edge cases,
impact on other code paths
- **Risk**: what could go wrong (deadlocks, new
races, cancellation issues)
Rate each fix: `[simple|moderate|complex]` impl
effort.
## 6. Output format
Structure the full analysis as:
```markdown
## Concurrency analysis: `<target>`
### Shared state
<table from step 1>
### Checkpoints
<list from step 2>
### Race trace
<interleaved trace from step 3>
### Classification
<bug type from step 4>
### Fixes
<proposals from step 5>
```
## Tractor-specific patterns to watch
These are known problem areas in tractor's
concurrency model. Flag them when encountered:
### `_Cache` lock vs `run_ctx` lifetime
The `_Cache.locks` entry is managed by
`maybe_open_context` callers, but `run_ctx` runs
in `service_tn` — a different task tree. Lock
pop/release in the caller's `finally` does NOT
wait for `run_ctx` to finish tearing down. Any
state that `run_ctx` cleans up in its `finally`
(e.g. `resources.pop()`) is vulnerable to
re-entry races after the lock is popped.
### `values.pop()` → acm `__aexit__` → `resources.pop()` gap
In `_Cache.run_ctx`, the inner `finally` pops
`values`, then the acm's `__aexit__` runs (which
has checkpoints), then the outer `finally` pops
`resources`. This creates a window where `values`
is gone but `resources` still exists — a classic
atomicity gap.
### Global vs per-key counters
`_Cache.users` as a single `int` (pre-fix) meant
that users of different `ctx_key`s inflated each
other's counts, preventing teardown when one key's
users hit zero. Always verify that per-key state
(`users`, `locks`) is actually keyed on `ctx_key`
and not on `fid` or some broader key.
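The difference between the two counter shapes can be shown with plain Python. This is a minimal sketch, assuming two hypothetical `ctx_key` values that share one `fid`; the variable names are illustrative and do not mirror the real `_Cache` attributes.

```python
from collections import Counter

# Hypothetical stand-ins for two cache keys that share one `fid`.
key_a = ("fid", (("resource_id", "a"),))
key_b = ("fid", (("resource_id", "b"),))

# Pre-fix shape: one counter shared by every key.
global_users = 0
# Post-fix shape: one counter per `ctx_key`.
users: Counter = Counter()

# 'a' is entered and stays open; 'b' is entered and then exited.
for key in (key_a, key_b):
    global_users += 1
    users[key] += 1
global_users -= 1
users[key_b] -= 1

# The teardown condition for 'b' under each scheme.
teardown_b_global = global_users == 0
teardown_b_per_key = users[key_b] == 0
print(teardown_b_global, teardown_b_per_key)
```

With the shared counter, `'b'` never sees zero while `'a'` is alive, so its teardown is skipped; the per-key counter reaches zero independently.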
### `Event.set()` wakes but doesn't switch
`trio.Event.set()` makes waiting tasks *ready* but
the current task continues executing until its next
checkpoint. Code between `.set()` and the next
`await` runs atomically from the scheduler's
perspective. Use this to your advantage (or watch
for bugs where code assumes the woken task runs
immediately).
### `except` block checkpoint masking
`await` expressions inside `except` handlers can
be masked by `trio.Cancelled`. If a `finally`
block runs from an `except` and contains
`lock.release()`, the release happens — but any
`await` after it in the same `except` may be
swallowed. This is why `maybe_open_context`'s
cache-miss path does `lock.release()` in a
`finally` inside the `except KeyError`.
### Cancellation in `finally`
Unlike asyncio, trio allows checkpoints in
`finally` blocks. This means `finally` cleanup
that does `await` can itself be cancelled (e.g.
by nursery shutdown). Watch for cleanup code that
assumes it will run to completion.


@@ -0,0 +1,241 @@
# PR/Patch-Request Description Format Reference
Canonical structure for `tractor` patch-request
descriptions, designed to work across GitHub,
Gitea, SourceHut, and GitLab markdown renderers.
**Line length: wrap at 72 chars** for all prose
content (Summary bullets, Motivation paragraphs,
Scopes bullets, etc.). Fill lines *to* 72 — don't
stop short at 50-65. Only raw URLs in
reference-link definitions may exceed this.
## Template
```markdown
<!-- pr-msg-meta
branch: <branch-name>
base: <base-branch>
submitted:
github: ___
gitea: ___
srht: ___
-->
## <Title: present-tense verb + backticked code>
### Summary
- [<hash>][<hash>] Description of change ending
with period.
- [<hash>][<hash>] Another change description
ending with period.
- [<hash>][<hash>] [<hash>][<hash>] Multi-commit
change description.
### Motivation
<1-2 paragraphs: problem/limitation first,
then solution. Hard-wrap at 72 chars.>
### Scopes changed
- [<hash>][<hash>] `pkg.mod.func()` — what
changed.
* [<hash>][<hash>] Also adjusts
`.related_thing()` in same module.
- [<hash>][<hash>] `tests.test_mod` — new/changed
test coverage.
<!--
### Cross-references
Also submitted as
[github-pr][] | [gitea-pr][] | [srht-patch][].
### Links
- [relevant-issue-or-discussion](url)
- [design-doc-or-screenshot](url)
-->
(this pr content was generated in some part by
[`claude-code`][claude-code-gh])
[<hash>]: https://<service>/<owner>/<repo>/commit/<hash>
[claude-code-gh]: https://github.com/anthropics/claude-code
<!-- cross-service pr refs (fill after submit):
[github-pr]: https://github.com/<owner>/<repo>/pull/___
[gitea-pr]: https://<host>/<owner>/<repo>/pulls/___
[srht-patch]: https://git.sr.ht/~<owner>/<repo>/patches/___
-->
```
## Markdown Reference-Link Strategy
Use reference-style links for ALL commit hashes
and cross-service PR refs to ensure cross-service
compatibility:
**Inline usage** (in bullets):
```markdown
- [f3726cf9][f3726cf9] Add `reg_err_types()`
for custom exc lookup.
```
**Definition** (bottom of document):
```markdown
[f3726cf9]: https://github.com/goodboy/tractor/commit/f3726cf9
```
### Why reference-style?
- Keeps prose readable without long inline URLs.
- All URLs in one place — trivially swappable
per-service.
- Most git services auto-link bare SHAs anyway,
but explicit refs guarantee it works in *any*
md renderer.
- The `[hash][hash]` form is self-documenting —
display text matches the ref ID.
- Cross-service PR refs use the same mechanism:
`[github-pr][]` resolves via a ref-link def
at the bottom, trivially fillable post-submit.
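As an illustration of the `[hash][hash]` convention, the bullets and their definitions could be generated together so the two never drift apart. This is a sketch only: `render_summary` is a hypothetical helper, and it does not hard-wrap descriptions at 72 chars.

```python
def render_summary(commits: dict[str, str], base_url: str) -> str:
    """Render summary bullets plus their reference-link definitions.

    `commits` maps a short hash to its one-line description;
    `base_url` is the repo URL without a trailing slash.
    """
    # Display text and ref ID are the same hash, per the convention above.
    bullets = [f"- [{h}][{h}] {desc}" for h, desc in commits.items()]
    # All URLs live in one block so they can be swapped per service.
    defs = [f"[{h}]: {base_url}/commit/{h}" for h in commits]
    return "\n".join(bullets + [""] + defs)


text = render_summary(
    {"f3726cf9": "Add `reg_err_types()` for custom exc lookup."},
    "https://github.com/goodboy/tractor",
)
print(text)
```

Retargeting the same description at another service then only requires a different `base_url`.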
## Cross-Service PR Placeholder Mechanism
The generated description includes three layers
of cross-service support, all using native md
reference-links:
### 1. Metadata comment (top of file)
```markdown
<!-- pr-msg-meta
branch: remote_exc_type_registry
base: main
submitted:
github: ___
gitea: ___
srht: ___
-->
```
A YAML-ish HTML comment block. The `___`
placeholders get filled with PR/patch numbers
after submission. Machine-parseable for tooling
(e.g. `gish`) but invisible in rendered md.
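A parser for this block could be as small as the sketch below. It assumes the layout shown above (one `key: value` per line inside a `pr-msg-meta` HTML comment); `parse_pr_meta` is a hypothetical name, and how `gish` actually reads the block is not specified here.

```python
import re

# Capture everything between the `pr-msg-meta` marker and the comment close.
META_RE = re.compile(r"<!--\s*pr-msg-meta\n(.*?)-->", re.DOTALL)


def parse_pr_meta(md: str) -> dict[str, str]:
    """Extract `key: value` pairs from the `pr-msg-meta` comment.

    Nesting is flattened, so `github: 42` under `submitted:`
    comes back as the plain key `github`.
    """
    match = META_RE.search(md)
    if match is None:
        return {}
    meta: dict[str, str] = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip section-only lines such as `submitted:` (no value).
        if sep and value.strip():
            meta[key.strip()] = value.strip()
    return meta


sample = """<!-- pr-msg-meta
branch: remote_exc_type_registry
base: main
submitted:
  github: 42
  gitea: ___
  srht: ___
-->
"""
meta = parse_pr_meta(sample)
print(meta)
```

An unfilled `___` value is returned as-is, so callers can tell which services are still pending.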
### 2. Cross-references section (in body)
```markdown
<!--
### Cross-references
Also submitted as
[github-pr][] | [gitea-pr][] | [srht-patch][].
-->
```
Commented out at generation time. After submitting
to multiple services, uncomment and the ref-links
resolve via the stubs at the bottom.
### 3. Ref-link stubs (bottom of file)
```markdown
<!-- cross-service pr refs (fill after submit):
[github-pr]: https://github.com/goodboy/tractor/pull/___
[gitea-pr]: https://pikers.dev/goodboy/tractor/pulls/___
[srht-patch]: https://git.sr.ht/~goodboy/tractor/patches/___
-->
```
Commented out with `___` number placeholders.
After submission: uncomment, replace `___` with
the actual number. Each service-specific copy
fills in all services' numbers so any copy can
cross-reference the others.
### Post-submission file layout
```
pr_msg_LATEST.md # latest draft (skill root)
msgs/
20260325T002027Z_mybranch_pr_msg.md # timestamped
github/
42_pr_msg.md # github PR #42
gitea/
17_pr_msg.md # gitea PR #17
srht/
5_pr_msg.md # srht patch #5
```
Each `<service>/<num>_pr_msg.md` is a copy with:
- metadata `submitted:` fields filled in
- cross-references section uncommented
- ref-link stubs uncommented with real numbers
- all services cross-linked in each copy
This mirrors the `gish` skill's
`<backend>/<num>.md` pattern.
## Commit-Link URL Patterns by Service
| Service | Pattern |
|-----------|-------------------------------------|
| GitHub | `https://github.com/<o>/<r>/commit/<h>` |
| Gitea | `https://<host>/<o>/<r>/commit/<h>` |
| SourceHut | `https://git.sr.ht/~<o>/<r>/commit/<h>` |
| GitLab | `https://gitlab.com/<o>/<r>/-/commit/<h>` |
## PR/Patch URL Patterns by Service
| Service | Pattern |
|-----------|-------------------------------------|
| GitHub | `https://github.com/<o>/<r>/pull/<n>` |
| Gitea | `https://<host>/<o>/<r>/pulls/<n>` |
| SourceHut | `https://git.sr.ht/~<o>/<r>/patches/<n>` |
| GitLab | `https://gitlab.com/<o>/<r>/-/merge_requests/<n>` |
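The two tables translate directly into lookup dicts. A minimal sketch, assuming hypothetical helper names (`commit_url`, `pr_url`) and the short service keys used in the metadata comment plus `gitlab`; `host` only matters for self-hosted Gitea.

```python
# URL templates copied from the tables above; `{host}` is Gitea-only.
COMMIT_URL = {
    "github": "https://github.com/{o}/{r}/commit/{h}",
    "gitea": "https://{host}/{o}/{r}/commit/{h}",
    "srht": "https://git.sr.ht/~{o}/{r}/commit/{h}",
    "gitlab": "https://gitlab.com/{o}/{r}/-/commit/{h}",
}
PR_URL = {
    "github": "https://github.com/{o}/{r}/pull/{n}",
    "gitea": "https://{host}/{o}/{r}/pulls/{n}",
    "srht": "https://git.sr.ht/~{o}/{r}/patches/{n}",
    "gitlab": "https://gitlab.com/{o}/{r}/-/merge_requests/{n}",
}


def commit_url(service: str, o: str, r: str, h: str, host: str = "") -> str:
    """Build a commit link for owner `o`, repo `r`, and hash `h`."""
    return COMMIT_URL[service].format(host=host, o=o, r=r, h=h)


def pr_url(service: str, o: str, r: str, n: int, host: str = "") -> str:
    """Build a PR/patch link for owner `o`, repo `r`, and number `n`."""
    return PR_URL[service].format(host=host, o=o, r=r, n=n)


print(commit_url("github", "goodboy", "tractor", "f3726cf9"))
print(pr_url("gitea", "goodboy", "tractor", 17, host="pikers.dev"))
```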
## Scope Naming Convention
Use Python namespace-resolution syntax for
referencing changed code scopes:
| File path | Scope reference |
|---------------------------|-------------------------------|
| `tractor/_exceptions.py` | `tractor._exceptions` |
| `tractor/_state.py` | `tractor._state` |
| `tests/test_foo.py` | `tests.test_foo` |
| Function in module | `tractor._exceptions.func()` |
| Method on class | `.RemoteActorError.src_type` |
| Class | `tractor._exceptions.RAE` |
Prefix with the package path for top-level refs;
use leading-dot shorthand (`.ClassName.method()`)
for sub-bullets where the parent module is already
established.
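The path-to-scope mapping in the table is mechanical enough to sketch. `scope_ref` below is a hypothetical helper; it assumes repo-relative POSIX paths and does not special-case `__init__.py`.

```python
from pathlib import PurePosixPath


def scope_ref(path: str, attr: str = "") -> str:
    """Convert a repo-relative file path to a dotted scope reference.

    `attr` optionally appends a function, class, or method name.
    """
    # Drop the `.py` suffix, then join the path parts with dots.
    mod = ".".join(PurePosixPath(path).with_suffix("").parts)
    return f"{mod}.{attr}" if attr else mod


print(scope_ref("tractor/_exceptions.py"))
print(scope_ref("tractor/_exceptions.py", "func()"))
```

The leading-dot shorthand for sub-bullets is left to the writer, since it depends on which parent module the surrounding bullet has already established.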
## Title Conventions
Same verb vocabulary as commit messages:
- `Add` — wholly new feature/API
- `Fix` — bug fix
- `Drop` — removal
- `Use` — adopt new approach
- `Move`/`Mv` — relocate code
- `Adjust` — minor tweak
- `Update` — enhance existing feature
- `Support` — add support for something
Target 50 chars, hard max 70. Always backtick
code elements.
## Tone
Casual yet technically precise — matching the
project's commit-msg style. Terse but every bullet
carries signal. Use project abbreviations freely
(msg, bg, ctx, impl, mod, obvi, fn, bc, var,
prolly, ep, etc.).
---
(this format reference was generated by
[`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code


@@ -0,0 +1,358 @@
---
name: run-tests
description: >
Run tractor test suite (or subsets). Use when the user wants
to run tests, verify changes, or check for regressions.
argument-hint: "[test-path-or-pattern] [--opts]"
allowed-tools:
- Bash(python -m pytest *)
- Bash(python -c *)
- Bash(python --version *)
- Bash(UV_PROJECT_ENVIRONMENT=py* uv run python *)
- Bash(UV_PROJECT_ENVIRONMENT=py* uv run pytest *)
- Bash(UV_PROJECT_ENVIRONMENT=py* uv sync *)
- Bash(UV_PROJECT_ENVIRONMENT=py* uv pip show *)
- Bash(git rev-parse *)
- Bash(ls *)
- Bash(cat *)
- Bash(jq * .pytest_cache/*)
- Read
- Grep
- Glob
- Task
- AskUserQuestion
---
Run the `tractor` test suite using `pytest`. Follow this
process:
## 1. Parse user intent
From the user's message and any arguments, determine:
- **scope**: full suite, specific file(s), specific
test(s), or a keyword pattern (`-k`).
- **transport**: which IPC transport protocol to test
against (default: `tcp`, also: `uds`).
- **options**: any extra pytest flags the user wants
(e.g. `--ll debug`, `--tpdb`, `-x`, `-v`).
If the user provides a bare path or pattern as argument,
treat it as the test target. Examples:
- `/run-tests` → full suite
- `/run-tests test_local.py` → single file
- `/run-tests test_registrar -v` → file + verbose
- `/run-tests -k cancel` → keyword filter
- `/run-tests tests/ipc/ --tpt-proto uds` → subdir + UDS
## 2. Construct the pytest command
Base command:
```
python -m pytest
```
### Default flags (always include unless user overrides):
- `-x` (stop on first failure)
- `--tb=short` (concise tracebacks)
- `--no-header` (reduce noise)
### Path resolution:
- If the user gives a bare filename like `test_local.py`,
resolve it under `tests/`.
- If the user gives a subdirectory like `ipc/`, resolve
under `tests/ipc/`.
- Glob if needed: `tests/**/test_*<pattern>*.py`
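The resolution rules above can be read as a small pure-string function. This is one possible interpretation, not the skill's actual logic: `resolve_target` is a hypothetical name, it never touches the filesystem, and stripping a leading `test_` from bare patterns is an assumption made to avoid a doubled prefix in the glob.

```python
def resolve_target(arg: str, root: str = "tests") -> str:
    """Map a user-supplied test target onto a path or glob under `root`."""
    if arg.startswith(f"{root}/"):
        return arg  # already rooted, leave untouched
    if arg.endswith(("/", ".py")) or "::" in arg:
        return f"{root}/{arg}"  # bare filename, subdir, or node ID
    # Bare pattern: build the glob form, avoiding a doubled `test_` prefix.
    return f"{root}/**/test_*{arg.removeprefix('test_')}*.py"


for target in ("test_local.py", "ipc/", "tests/ipc/", "test_registrar"):
    print(target, "->", resolve_target(target))
```

The returned glob still has to be expanded (for example with the `Glob` tool) before it is handed to `pytest`.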
### Key pytest options for this project:
| Flag | Purpose |
|---|---|
| `--ll <level>` | Set tractor log level (e.g. `debug`, `info`, `runtime`) |
| `--tpdb` / `--debug-mode` | Enable tractor's multi-proc debugger |
| `--tpt-proto <key>` | IPC transport: `tcp` (default) or `uds` |
| `--spawn-backend <be>` | Spawn method: `trio` (default), `mp_spawn`, `mp_forkserver` |
| `-k <expr>` | pytest keyword filter |
| `-v` / `-vv` | Verbosity |
| `-s` | No output capture (useful with `--tpdb`) |
### Common combos:
```sh
# quick smoke test of core modules
python -m pytest tests/test_local.py tests/test_rpc.py -x --tb=short --no-header
# full suite, stop on first failure
python -m pytest tests/ -x --tb=short --no-header
# specific test with debug
python -m pytest tests/discovery/test_registrar.py::test_reg_then_unreg -x -s --tpdb --ll debug
# run with UDS transport
python -m pytest tests/ -x --tb=short --no-header --tpt-proto uds
# keyword filter
python -m pytest tests/ -x --tb=short --no-header -k "cancel and not slow"
```
## 3. Pre-flight: venv detection (MANDATORY)
**Always verify a `uv` venv is active before running
`python` or `pytest`.** This project uses
`UV_PROJECT_ENVIRONMENT=py<MINOR>` naming (e.g.
`py313`) — never `.venv`.
### Step 1: detect active venv
Run this check first:
```sh
python -c "
import sys, os
venv = os.environ.get('VIRTUAL_ENV', '')
prefix = sys.prefix
print(f'VIRTUAL_ENV={venv}')
print(f'sys.prefix={prefix}')
print(f'executable={sys.executable}')
"
```
### Step 2: interpret results
**Case A — venv is active** (`VIRTUAL_ENV` is set
and points to a `py<MINOR>/` dir under the project
root or worktree):
Use bare `python` / `python -m pytest` for all
commands. This is the normal, fast path.
**Case B — no venv active** (`VIRTUAL_ENV` is empty
or `sys.prefix` points to a system Python):
Use `AskUserQuestion` to ask the user:
> "No uv venv is active. Should I activate one
> via `UV_PROJECT_ENVIRONMENT=py<MINOR> uv sync`,
> or would you prefer to activate your shell venv
> first?"
Options:
1. **"Create/sync venv"** — run
`UV_PROJECT_ENVIRONMENT=py<MINOR> uv sync` where
`<MINOR>` is detected from `python --version`
(e.g. `313` for 3.13). Then use
`py<MINOR>/bin/python` for all subsequent
commands in this session.
2. **"I'll activate it myself"** — stop and let the
user `source py<MINOR>/bin/activate` or similar.
**Case C — inside a git worktree** (`git rev-parse
--git-common-dir` differs from `--git-dir`):
Verify Python resolves from the **worktree's own
venv**, not the main repo's:
```sh
python -c "import tractor; print(tractor.__file__)"
```
If the path points outside the worktree, create a
worktree-local venv:
```sh
UV_PROJECT_ENVIRONMENT=py<MINOR> uv sync
```
Then use `py<MINOR>/bin/python` for all commands.
**Why this matters**: without the correct venv,
subprocesses spawned by tractor resolve modules
from the wrong editable install, causing spurious
`AttributeError` / `ModuleNotFoundError`.
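The Case A / Case B distinction from Step 2 can be expressed as two small helpers. Both names (`venv_name`, `is_project_venv`) are hypothetical, and the check assumes the venv directory sits directly under the project or worktree root; the worktree comparison in Case C still needs the `git rev-parse` calls described above.

```python
import re
import sys
from pathlib import PurePosixPath


def venv_name(version: tuple[int, int] = sys.version_info[:2]) -> str:
    """Build the `py<MINOR>` directory name, e.g. (3, 13) -> 'py313'."""
    major, minor = version
    return f"py{major}{minor}"


def is_project_venv(virtual_env: str, root: str) -> bool:
    """Case A check: `VIRTUAL_ENV` is a `py<MINOR>` dir directly under `root`."""
    if not virtual_env:
        return False  # Case B: nothing is active
    venv = PurePosixPath(virtual_env)
    is_named_right = re.fullmatch(r"py\d+", venv.name) is not None
    return venv.parent == PurePosixPath(root) and is_named_right


print(venv_name((3, 13)))
print(is_project_venv("/src/tractor/py313", "/src/tractor"))
```

In practice `virtual_env` would come from `os.environ.get('VIRTUAL_ENV', '')`, as in the Step 1 check.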
### Fallback: `uv run`
If the user can't or won't activate a venv, all
`python` and `pytest` commands can be prefixed
with `UV_PROJECT_ENVIRONMENT=py<MINOR> uv run`:
```sh
# instead of: python -m pytest tests/ -x
UV_PROJECT_ENVIRONMENT=py313 uv run pytest tests/ -x
# instead of: python -c 'import tractor'
UV_PROJECT_ENVIRONMENT=py313 uv run python -c 'import tractor'
```
`uv run` auto-discovers the project and venv,
but is slower than a pre-activated venv due to
lock-file resolution on each invocation. Prefer
activating the venv when possible.
### Step 3: import + collection checks
After venv is confirmed, always run these
(especially after refactors or module moves):
```sh
# 1. package import smoke check
python -c 'import tractor; print(tractor)'
# 2. verify all tests collect (no import errors)
python -m pytest tests/ -x -q --co 2>&1 | tail -5
```
If either fails, fix the import error before running
any actual tests.
## 4. Run and report
- Run the constructed command.
- Use a timeout of **600000ms** (10min) for full suite
runs, **120000ms** (2min) for single-file runs.
- If the suite is large (full `tests/`), consider running
in the background and checking output when done.
- Use `--lf` (last-failed) to re-run only previously
failing tests when iterating on a fix.
### On failure:
- Show the failing test name(s) and short traceback.
- If the failure looks related to recent changes, point
out the likely cause and suggest a fix.
- **Check the known-flaky list** (section 8) before
investigating — don't waste time on pre-existing
timeout issues.
- **NEVER auto-commit fixes.** If you apply a code fix
during test iteration, leave it unstaged. Tell the
user what changed and suggest they review the
worktree state, stage files manually, and use
`/commit-msg` (inline or in a separate session) to
generate the commit message. The human drives all
`git add` and `git commit` operations.
### On success:
- Report the pass/fail/skip counts concisely.
## 5. Test directory layout (reference)
```
tests/
├── conftest.py # root fixtures, daemon, signals
├── devx/ # debugger/tooling tests
├── ipc/ # transport protocol tests
├── msg/ # messaging layer tests
├── discovery/ # discovery subsystem tests
│ ├── test_multiaddr.py # multiaddr construction
│ └── test_registrar.py # registry/discovery protocol
├── test_local.py # registrar + local actor basics
├── test_rpc.py # RPC error handling
├── test_spawning.py # subprocess spawning
├── test_multi_program.py # multi-process tree tests
├── test_cancellation.py # cancellation semantics
├── test_context_stream_semantics.py # ctx streaming
├── test_inter_peer_cancellation.py # peer cancel
├── test_infected_asyncio.py # trio-in-asyncio
└── ...
```
## 6. Change-type → test mapping
After modifying specific modules, run the corresponding
test subset first for fast feedback:
| Changed module(s) | Run these tests first |
|---|---|
| `runtime/_runtime.py`, `runtime/_state.py` | `test_local.py test_rpc.py test_spawning.py test_root_runtime.py` |
| `discovery/` (`_registry`, `_discovery`, `_addr`) | `tests/discovery/ test_multi_program.py test_local.py` |
| `_context.py`, `_streaming.py` | `test_context_stream_semantics.py test_advanced_streaming.py` |
| `ipc/` (`_chan`, `_server`, `_transport`) | `tests/ipc/ test_2way.py` |
| `runtime/_portal.py`, `runtime/_rpc.py` | `test_rpc.py test_cancellation.py` |
| `spawn/` (`_spawn`, `_entry`) | `test_spawning.py test_multi_program.py` |
| `devx/debug/` | `tests/devx/test_debugger.py` (slow!) |
| `to_asyncio.py` | `test_infected_asyncio.py test_root_infect_asyncio.py` |
| `msg/` | `tests/msg/` |
| `_exceptions.py` | `test_remote_exc_relay.py test_inter_peer_cancellation.py` |
| `runtime/_supervise.py` | `test_cancellation.py test_spawning.py` |
## 7. Quick-check shortcuts
### After refactors (fastest first-pass):
```sh
# import + collect check
python -c 'import tractor' && python -m pytest tests/ -x -q --co 2>&1 | tail -3
# core subset (~10s)
python -m pytest tests/test_local.py tests/test_rpc.py tests/test_spawning.py tests/discovery/test_registrar.py -x --tb=short --no-header
```
### Inspect last failures (without re-running):
When the user asks "what failed?", "show failures",
or wants to check the last-failed set before
re-running — read the pytest cache directly. This
is instant and avoids test collection overhead.
```sh
python -c "
import json, pathlib, sys
p = pathlib.Path('.pytest_cache/v/cache/lastfailed')
if not p.exists():
print('No lastfailed cache found.'); sys.exit()
data = json.loads(p.read_text())
# filter to real test node IDs (ignore junk
# entries that can accumulate from system paths)
tests = sorted(k for k in data if k.startswith('tests/'))
if not tests:
print('No failures recorded.')
else:
print(f'{len(tests)} last-failed test(s):')
for t in tests:
print(f' {t}')
"
```
**Why not `--cache-show` or `--co --lf`?**
- `pytest --cache-show 'cache/lastfailed'` works
but dumps raw dict repr including junk entries
(stale system paths that leak into the cache).
- `pytest --co --lf` actually *collects* tests which
triggers import resolution and is slow (~0.5s+).
Worse, when cached node IDs don't exactly match
current parametrize IDs (e.g. param names changed
between runs), pytest falls back to collecting
the *entire file*, giving false positives.
- Reading the JSON directly is instant, filterable
to `tests/`-prefixed entries, and shows exactly
what pytest recorded — no interpretation.
**After inspecting**, re-run the failures:
```sh
python -m pytest --lf -x --tb=short --no-header
```
### Full suite in background:
When core tests pass and you want full coverage while
continuing other work, run in background:
```sh
python -m pytest tests/ -x --tb=short --no-header -q
```
(use `run_in_background=true` on the Bash tool)
## 8. Known flaky tests
These tests have **pre-existing** timing/environment
sensitivity. If they fail with `TooSlowError` or
pexpect `TIMEOUT`, they are almost certainly NOT caused
by your changes — note them and move on.
| Test | Typical error | Notes |
|---|---|---|
| `devx/test_debugger.py::test_multi_nested_subactors_error_through_nurseries` | pexpect TIMEOUT | Debugger pexpect timing |
| `test_cancellation.py::test_cancel_via_SIGINT_other_task` | TooSlowError | Signal handling race |
| `test_inter_peer_cancellation.py::test_peer_spawns_and_cancels_service_subactor` | TooSlowError | Async timing (both param variants) |
| `test_docs_examples.py::test_example[we_are_processes.py]` | `assert None == 0` | `__main__` missing `__file__` in subproc |
**Rule of thumb**: if a test fails with `TooSlowError`,
`trio.TooSlowError`, or `pexpect.TIMEOUT` and you didn't
touch the relevant code path, it's flaky — skip it.


@@ -1,10 +1,18 @@
 name: CI
+# NOTE distilled from,
+# https://github.com/orgs/community/discussions/26276
 on:
-  # any time someone pushes a new branch to origin
+  # any time a new update to 'main'
   push:
+    branches:
+      - main
-  # Allows you to run this workflow manually from the Actions tab
+  # for on all (forked) PRs to repo
+  # NOTE, use a draft PR if you just want CI triggered..
+  pull_request:
+  # to run workflow manually from the "Actions" tab
   workflow_dispatch:
 jobs:
@@ -74,24 +82,44 @@ jobs:
     # run: mypy tractor/ --ignore-missing-imports --show-traceback
-  testing-linux:
-    name: '${{ matrix.os }} Python ${{ matrix.python }} - ${{ matrix.spawn_backend }}'
-    timeout-minutes: 10
+  testing:
+    name: '${{ matrix.os }} Python${{ matrix.python-version }} spawn_backend=${{ matrix.spawn_backend }} tpt_proto=${{ matrix.tpt_proto }}'
+    timeout-minutes: 16
     runs-on: ${{ matrix.os }}
     strategy:
       fail-fast: false
       matrix:
-        os: [ubuntu-latest]
-        python-version: ['3.13']
+        os: [
+          ubuntu-latest,
+          macos-latest,
+        ]
+        python-version: [
+          '3.13',
+          # '3.14',
+        ]
         spawn_backend: [
           'trio',
           # 'mp_spawn',
           # 'mp_forkserver',
+          # ?TODO^ is it worth it to get these running again?
+          #
+          # - [ ] next-gen backends, on 3.13+
+          # https://github.com/goodboy/tractor/issues/379
+          # 'subinterpreter',
+          # 'subint',
         ]
+        tpt_proto: [
+          'tcp',
+          'uds',
+        ]
+        # https://github.com/orgs/community/discussions/26253#discussioncomment-3250989
+        exclude:
+          # don't do UDS run on macOS (for now)
+          - os: macos-latest
+            tpt_proto: 'uds'
     steps:
       - uses: actions/checkout@v4
       - name: 'Install uv + py-${{ matrix.python-version }}'
@@ -118,7 +146,11 @@ jobs:
         run: uv tree
       - name: Run tests
-        run: uv run pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rsx
+        run: >
+          uv run
+          pytest tests/ -rsx
+          --spawn-backend=${{ matrix.spawn_backend }}
+          --tpt-proto=${{ matrix.tpt_proto }}
       # XXX legacy NOTE XXX
       #

.gitignore

@@ -102,3 +102,69 @@ venv.bak/
# mypy
.mypy_cache/
# all files under
.git/
# require very explicit staging for anything we **really**
# want put/kept in repo.
notes_to_self/
snippets/
# ------- AI shiz -------
# `ai.skillz` symlinks,
# (machine-local, deploy via deploy-skill.sh)
.claude/skills/py-codestyle
.claude/skills/close-wkt
.claude/skills/plan-io
.claude/skills/prompt-io
.claude/skills/resolve-conflicts
.claude/skills/inter-skill-review
# /open-wkt specifics
.claude/skills/open-wkt
.claude/wkts/
claude_wkts
# /code-review-changes specifics
.claude/skills/code-review-changes
# review-skill ephemeral ctx (per-PR, single-use)
.claude/review_context.md
.claude/review_regression.md
# /pr-msg specifics
.claude/skills/pr-msg/*
# repo-specific
!.claude/skills/pr-msg/format-reference.md
# XXX, so u can nvim-telescope this file.
# !.claude/skills/pr-msg/pr_msg_LATEST.md
# /commit-msg specifics
# - any commit-msg gen tmp files
.claude/*_commit_*.md
.claude/*_commit*.txt
.claude/skills/commit-msg/*
!.claude/skills/commit-msg/style-duie-reference.md
# use prompt-io instead?
.claude/plans
# nix develop --profile .nixdev
.nixdev*
# :Obsession .
Session.vim
# `gish` local `.md`-files
# TODO? better all around automation!
# -[ ] it'd be handy to also commit and sync with wtv git service?
# -[ ] everything should be put under a `.gish/` no?
gitea/
gh/
# ------ macOS ------
# Finder metadata
**/.DS_Store
# LLM conversations that should remain private
docs/conversations/


@@ -0,0 +1,54 @@
---
model: claude-opus-4-6
service: claude
session: (ad-hoc, not tracked via conf.toml)
timestamp: 2026-04-06T17:28:48Z
git_ref: 02b2ef1
scope: tests
substantive: true
raw_file: 20260406T172848Z_02b2ef1_prompt_io.raw.md
---
## Prompt
User asked to extend `tests/test_resource_cache.py` with a test
that reproduces the edge case fixed in commit `02b2ef18` (per-key
locking+user tracking in `maybe_open_context()`). The bug was
originally triggered in piker's `brokerd.kraken` backend where the
same `acm_func` was called with different kwargs, and the old
global `_Cache.users` counter caused:
- teardown skipped for one `ctx_key` bc another key's users kept
the global count > 0
- re-entry hitting `assert not resources.get(ctx_key)` during the
teardown window
User requested a test that would fail under the old code and pass
with the fix.
## Response summary
Designed and implemented `test_per_ctx_key_resource_lifecycle`
which verifies per-`ctx_key` resource isolation by:
1. Holding resource `'a'` open in a bg task
2. Opening+closing resource `'b'` (same `acm_func`, different
kwargs) while `'a'` is still alive
3. Re-opening `'b'` and asserting cache MISS — proving `'b'` was
torn down independently despite `'a'` keeping its own user
count > 0
With the old global counter, phase 3 would produce a stale cache
HIT (leaked resource) or crash on the assert.
Also added a trivial `acm_with_resource(resource_id)` ACM helper
at module level.
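The difference between the old global counter and per-`ctx_key` counting can be sketched without any runtime. This is a minimal model only: `GlobalCountCache`, `PerKeyCountCache` and `lifecycle()` are hypothetical stand-ins, not `tractor`'s `_Cache` or the actual test body.

```python
from collections import defaultdict


class GlobalCountCache:
    '''Hypothetical stand-in: one user counter shared by every key.'''

    def __init__(self) -> None:
        self.users = 0
        self.values: dict[str, object] = {}

    def enter(self, key: str) -> bool:
        '''Return True on a cache hit, False on a miss.'''
        self.users += 1
        hit = key in self.values
        if not hit:
            self.values[key] = object()
        return hit

    def exit(self, key: str) -> None:
        self.users -= 1
        # teardown only happens once nobody uses *any* key
        if self.users == 0:
            self.values.pop(key, None)


class PerKeyCountCache:
    '''Hypothetical stand-in: one user counter per key.'''

    def __init__(self) -> None:
        self.users: dict[str, int] = defaultdict(int)
        self.values: dict[str, object] = {}

    def enter(self, key: str) -> bool:
        self.users[key] += 1
        hit = key in self.values
        if not hit:
            self.values[key] = object()
        return hit

    def exit(self, key: str) -> None:
        self.users[key] -= 1
        # teardown is decided per key, independent of other keys
        if self.users[key] == 0:
            self.values.pop(key, None)


def lifecycle(cache) -> bool:
    '''Replay the three phases; return whether phase 3 was a cache hit.'''
    cache.enter('a')          # phase 1: 'a' stays open
    cache.enter('b')          # phase 2: open 'b' ...
    cache.exit('b')           # ... and close it while 'a' is alive
    return cache.enter('b')   # phase 3: re-open 'b'
```

With the global counter, phase 3 reports a stale hit because `'a'` keeps the count above zero; with per-key counts it reports a miss.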
## Files changed
- `tests/test_resource_cache.py` — add `acm_with_resource` ACM +
`test_per_ctx_key_resource_lifecycle` test fn
## Human edits
None — committed as generated (pending user review).

@ -0,0 +1,57 @@
---
model: claude-opus-4-6
service: claude
session: (ad-hoc, not tracked via conf.toml)
timestamp: 2026-04-06T19:31:25Z
git_ref: 85f9c5d
scope: tests
substantive: true
raw_file: 20260406T193125Z_85f9c5d_prompt_io.raw.md
---
## Prompt
User asked to reproduce the exact `assert not resources.get(ctx_key)`
crash originally seen in piker's `brokerd.kraken` backend via
`open_cached_client('kraken')`. Key constraints from user:
- In piker, kwargs were the **same** (empty) so all callers
share one `ctx_key = (fid, ())`
- The root issue is `_Cache.locks` being indexed by `fid`
rather than `ctx_key`, plus the race window between
`values.pop()` and `resources.pop()` in `_Cache.run_ctx`
(the acm `__aexit__` has checkpoints in between)
- Mark the test `@pytest.mark.xfail` so it can be committed
against the unpatched code and later adjusted once the fix
from `9e49eddd` is applied
User pointed to piker's `open_cached_client('kraken')` usage
as the real-world pattern to model.
## Response summary
Added `test_moc_reentry_during_teardown` which deterministically
reproduces the teardown race:
1. A `cached_client()` acm (no params, like
`kraken.api.get_client()`) signals via `in_aexit` event
when its `__aexit__` starts, then sleeps to hold the
window open
2. `use_and_exit` task enters+exits the cached resource
3. `reenter_during_teardown` waits on `in_aexit` then
re-enters — landing exactly in the window where
`values` is popped but `resources` still exists
4. The old `assert not resources.get(ctx_key)` fires
5. `@pytest.mark.xfail(raises=AssertionError)` catches it
Test confirmed: `XFAIL` with the exact traceback at
`_mngrs.py:327`.
## Files changed
- `tests/test_resource_cache.py` — add
`test_moc_reentry_during_teardown` xfail test fn
## Human edits
None — committed as generated (pending user review).

@ -0,0 +1,52 @@
---
model: claude-opus-4-6
service: claude
session: (ad-hoc, not tracked via conf.toml)
timestamp: 2026-04-06T22:26:48Z
git_ref: f086222
scope: code
substantive: true
raw_file: 20260406T222648Z_f086222_prompt_io.raw.md
---
## Prompt
User applied the per-key locking patch (`f086222d`),
removed the `@pytest.mark.xfail` from
`test_moc_reentry_during_teardown`, and found the
test STILL fails — now with `RuntimeError` instead
of `AssertionError`. Asked to diagnose why the fix
doesn't actually resolve this race.
## Response summary
Traced the exact scheduling sequence across three
concurrent tasks to show that the per-`ctx_key`
locking change doesn't close the fundamental gap:
- `maybe_open_context`'s finally pops the lock and
sets `no_more_users` in the **exiting caller's
task**
- But the actual resource cleanup (`values.pop`,
acm `__aexit__`, `resources.pop`) runs in the
**`run_ctx` task** inside `service_tn`
- A re-entering task creates a fresh lock (old one
popped), acquires immediately, and hits stale
`resources[ctx_key]` while `run_ctx` is still in
`__aexit__`
Identified three possible fix strategies:
1. Let `run_ctx` own lock release/pop after
`resources.pop()`
2. Add a "teardown complete" event that new callers
wait on
3. Move `resources.pop()` into `run_ctx`'s inner
finally (before acm `__aexit__`)
## Files changed
No code changes — analysis only.
## Human edits
N/A — diagnostic output, no code generated.

@ -0,0 +1,32 @@
---
model: claude-opus-4-6
service: claude
timestamp: 2026-04-06T22:26:48Z
git_ref: f086222
diff_cmd: git diff f086222~1..f086222
---
Here's the problem. The per-key locking doesn't fix this race because the **lock lifetime is decoupled from `run_ctx`'s lifetime**. Trace through what happens:
**`maybe_open_context` finally** (lines 468-495):
1. `users[ctx_key] -= 1` → 0
2. `no_more_users.set()` → wakes `run_ctx` (but it doesn't run yet, just becomes ready)
3. `_Cache.locks.pop(ctx_key)` → **lock is gone**
4. `maybe_open_context` exits
**`run_ctx`** now gets scheduled (lines 237-241):
5. Inner finally: `values.pop(ctx_key)` → values gone
6. `mng.__aexit__()` runs → `in_aexit.set()`, `await trio.sleep(10)` → **CHECKPOINT**
**`reenter_during_teardown`** wakes:
7. Enters `maybe_open_context` → no lock exists for `ctx_key` → creates **fresh** lock, acquires immediately
8. `values[ctx_key]` → `KeyError`
9. `resources.get(ctx_key)` → **still exists!** (step 6 hasn't reached `resources.pop()` in `run_ctx`'s outer finally yet)
10. `raise RuntimeError('Caching resources ALREADY exist?!')`
The core issue: `no_more_users.set()` (step 2) and `locks.pop()` (step 3) happen in the **exiting caller's task**, but the actual resource cleanup (`values.pop`, acm `__aexit__`, `resources.pop`) happens in the **`run_ctx` task** inside `service_tn`. There's no synchronization between "lock removed, new callers welcome" and "run_ctx has actually finished tearing down". The per-`ctx_key` lock change just made the lock more granular — it didn't close this gap.
The fix needs to ensure a new caller can't acquire a lock for `ctx_key` until `run_ctx` has fully exited (including `resources.pop()`). Either:
- **don't pop the lock in `maybe_open_context`** — let `run_ctx` release/pop it after `resources.pop()`
- **wait on a "teardown complete" event** in the cache-miss path when stale `resources` exist
- **pop `resources` before the acm `__aexit__`** (move `resources.pop()` into the inner finally alongside `values.pop()`)
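The teardown window and the "teardown complete" event option can be sketched as below. This is an illustration under stated assumptions: it uses stdlib `asyncio` instead of `trio` to stay self-contained, and `demo()`, `run_ctx()`, `reenter()` and the `torn_down` event are hypothetical stand-ins rather than the real `_mngrs.py` code.

```python
import asyncio


async def demo(wait_for_teardown: bool) -> str:
    values: dict[str, str] = {}
    resources: dict[str, str] = {}
    no_more_users = asyncio.Event()
    in_aexit = asyncio.Event()
    torn_down = asyncio.Event()  # hypothetical "teardown complete" signal

    async def run_ctx(key: str) -> None:
        # stands in for the service task that owns the resource
        values[key] = resources[key] = 'client'
        await no_more_users.wait()
        values.pop(key)
        in_aexit.set()
        await asyncio.sleep(0.01)  # checkpoint inside the slow `__aexit__`
        resources.pop(key)
        torn_down.set()

    async def reenter(key: str) -> str:
        # a new caller arriving exactly while `__aexit__` is running
        await in_aexit.wait()
        if wait_for_teardown:
            await torn_down.wait()
        if key not in values and key in resources:
            return 'stale resources'
        return 'clean miss'

    owner = asyncio.create_task(run_ctx('k'))
    await asyncio.sleep(0)   # let `run_ctx` allocate the resource
    no_more_users.set()      # the last user exits
    result = await reenter('k')
    await owner
    return result
```

Without the wait, the re-entering caller observes the popped `values` but still-present `resources`; with it, the caller only proceeds once both are gone.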

@ -0,0 +1,74 @@
---
model: claude-opus-4-6
service: claude
session: (ad-hoc, not tracked via conf.toml)
timestamp: 2026-04-13T19:21:16Z
git_ref: f851f28
scope: code
substantive: true
---
## Prompt
User provided a detailed implementation plan for
a new `test_tpt_bind_addrs.py` test suite covering
`open_root_actor(tpt_bind_addrs=...)` — the three
runtime code paths in `_root.py:385-450`:
1. Non-registrar, no explicit bind -> random addrs
2. Registrar, no explicit bind -> registry_addrs
3. Explicit bind given -> `wrap_address()` + merge
Plan specified 6 test functions (~10 parametrized
variants), predicted a type-mixing bug at line 446,
and asked for an in-flight fix if confirmed.
## Response summary
Created `tests/discovery/test_tpt_bind_addrs.py`
with 9 collected test variants across 6 functions:
- `test_registrar_root_tpt_bind_addrs` (3 variants:
`bind-eq-reg`, `bind-subset-reg`,
`bind-disjoint-reg`)
- `test_non_registrar_root_tpt_bind_addrs`
(2 variants: `bind-same-bindspace`,
`bind-disjoint`)
- `test_non_registrar_default_random_bind`
(baseline, no explicit bind)
- `test_tpt_bind_addrs_as_maddr_str`
(multiaddr string input)
- `test_registrar_merge_binds_union`
(registrar + disjoint bind -> union)
- `test_open_nursery_forwards_tpt_bind_addrs`
(`open_nursery(**kwargs)` forwarding)
Confirmed and fixed the predicted bug at
`_root.py:446`: the registrar merge path mixed
`Address` objects (`tpt_bind_addrs`) with raw tuples
(`uw_reg_addrs`) inside `set()`, preventing
deduplication and causing double-bind `OSError`.
Fix: wrap `uw_reg_addrs` before the set union:
```python
# before (broken)
tpt_bind_addrs = list(set(
    tpt_bind_addrs + uw_reg_addrs
))
# after (fixed)
tpt_bind_addrs = list(set(
    tpt_bind_addrs
    + [wrap_address(a) for a in uw_reg_addrs]
))
```
All 9 tests pass after the fix.
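Why the mixed-type union fails to deduplicate can be shown in isolation. A minimal sketch, assuming a hashable wrapper type: `Addr` and `wrap()` are hypothetical stand-ins for `Address` and `wrap_address()`, and the port is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Addr:
    '''Hypothetical stand-in for a wrapped `Address` type.'''
    host: str
    port: int


def wrap(addr: 'Addr | tuple[str, int]') -> Addr:
    '''Stand-in for `wrap_address()`: pass through or wrap a raw tuple.'''
    return addr if isinstance(addr, Addr) else Addr(*addr)


bind_addrs = [Addr('127.0.0.1', 1616)]
reg_addrs = [('127.0.0.1', 1616)]  # same endpoint, but unwrapped

# a wrapped object never compares equal to a raw tuple, so both survive
mixed = set(bind_addrs + reg_addrs)

# wrapping first makes equal endpoints hash and compare the same
uniform = set(bind_addrs + [wrap(a) for a in reg_addrs])
```

Here `mixed` keeps two entries for one endpoint, which is the shape of input that would lead to a double bind, while `uniform` keeps one.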
## Files changed
- `tests/discovery/test_tpt_bind_addrs.py` (new)
- `tractor/_root.py:446` (bug fix, 1 line)
## Human edits
N/A — pending review.

@ -0,0 +1,50 @@
---
model: claude-opus-4-6
service: claude
session: 76154e65-d8e1-4b5f-9275-0ea45ba7e98a
timestamp: 2026-04-13T20:50:48Z
git_ref: 269d939c
scope: code
substantive: true
raw_file: 20260413T205048Z_269d939c_prompt_io.raw.md
---
## Prompt
Implement a `parse_endpoints()` API in
`tractor.discovery._multiaddr` that lets downstream
projects declare per-actor transport bind addresses
as a service table of actor-name -> multiaddr
strings (loaded from config, e.g. TOML `[network]`
section). Add type aliases `EndpointsTable` and
`ParsedEndpoints`, plus 7 unit tests covering TCP,
UDS, tuples, mixed input, unsupported protocols, and
edge cases.
A detailed implementation plan was provided
specifying insertion points, type signatures, test
names, and verification commands.
## Response summary
Add `parse_endpoints()` function with two supporting
type aliases to `_multiaddr.py`. The function
iterates a `dict[str, list[str|tuple]]` service
table and delegates each entry to the existing
`wrap_address()` helper, which already handles maddr
strings, raw tuples, and pre-wrapped `Address`
objects. Seven pure unit tests (no trio runtime)
validate all input variants and edge cases. All 22
multiaddr tests pass.
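The table-to-addresses mapping described above can be sketched roughly as follows. Everything here is an assumption made for illustration: `Addr`, `wrap()` and `parse_endpoints_sketch()` are hypothetical, and the toy parser only understands `/ip4/<host>/tcp/<port>` and `/uds/<path>` strings; the real function delegates to `wrap_address()` instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Addr:
    '''Hypothetical stand-in for a typed `Address`.'''
    proto: str
    target: tuple


def wrap(entry) -> Addr:
    '''Toy stand-in for `wrap_address()`; NOT the real parser.'''
    if isinstance(entry, Addr):
        return entry
    if isinstance(entry, tuple):
        return Addr('tcp', entry)
    parts = entry.strip('/').split('/')
    if parts[0] == 'ip4' and len(parts) == 4 and parts[2] == 'tcp':
        return Addr('tcp', (parts[1], int(parts[3])))
    if parts[0] == 'uds':
        return Addr('uds', ('/' + '/'.join(parts[1:]),))
    raise ValueError(f'Unsupported multiaddr: {entry!r}')


def parse_endpoints_sketch(table: dict) -> dict:
    '''Map actor-name -> list of wrapped addresses.'''
    return {
        name: [wrap(entry) for entry in entries]
        for name, entries in table.items()
    }


parsed = parse_endpoints_sketch({
    'pikerd': [
        '/ip4/127.0.0.1/tcp/6116',
        '/uds/run/user/1000/piker/pikerd@6116.sock',
    ],
    'chart': [('127.0.0.1', 3333)],
})
```

Keeping the per-entry work in one helper means maddr strings, raw tuples and pre-wrapped objects all go through the same code path, and an unsupported protocol surfaces as an error at parse time.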
## Files changed
- `tractor/discovery/_multiaddr.py` — add
`EndpointsTable`, `ParsedEndpoints` type aliases
and `parse_endpoints()` function after line 133
- `tests/discovery/test_multiaddr.py` — add
`parse_endpoints` import and 7 new test functions
## Human edits
None — committed as generated.

@ -0,0 +1,32 @@
---
model: claude-opus-4-6
service: claude
timestamp: 2026-04-13T20:50:48Z
git_ref: 269d939c
diff_cmd: git diff 269d939c~1..269d939c
---
## Implementation output
### `tractor/discovery/_multiaddr.py`
> `git diff 269d939c~1..269d939c -- tractor/discovery/_multiaddr.py`
Type aliases `EndpointsTable` and `ParsedEndpoints`
plus `parse_endpoints()` helper that wraps each
entry via `wrap_address()`.
### `tests/discovery/test_multiaddr.py`
> `git diff 269d939c~1..269d939c -- tests/discovery/test_multiaddr.py`
Add import of `parse_endpoints` to existing import
block, then 7 new test functions.
### Test run output
```
22 passed, 1 warning in 0.05s
```
All 22 tests pass (15 existing + 7 new).

@ -0,0 +1,38 @@
---
model: claude-opus-4-6
service: claude
session: multiaddr-support-rename-prefer
timestamp: 2026-04-14T16:33:00Z
git_ref: befedc49
scope: code
substantive: true
raw_file: 20260414T163300Z_befedc49_prompt_io.raw.md
---
## Prompt
Create a helper function that determines the best transport given
actor locality (distributed vs same host). Use PID/hostname
comparison for locality detection, apply at registry addr selection
only (not spawn-time).
## Response summary
New `prefer_addr()` + `_is_local_addr()` helpers
in `_api.py` using `socket.getaddrinfo()` and
`ipaddress` for PID/hostname locality detection.
Preference: UDS > local TCP > remote TCP.
Integrated into `query_actor()` and
`wait_for_actor()`. Also changed
`Registrar.find_actor()` to return full addr list
so callers can apply preference.
## Files changed
- `tractor/discovery/_discovery.py` → `_api.py`
— renamed + added `prefer_addr()`,
`_is_local_addr()`; updated `query_actor()` and
`wait_for_actor()` call sites
- `tractor/discovery/_registry.py`:
`Registrar.find_actor()` returns
`list[UnwrappedAddress]|None`

@ -0,0 +1,62 @@
---
model: claude-opus-4-6
service: claude
timestamp: 2026-04-14T16:33:00Z
git_ref: befedc49
diff_cmd: git diff befedc49~1..befedc49
---
### `tractor/discovery/_api.py`
> `git diff befedc49~1..befedc49 -- tractor/discovery/_api.py`
Add `_is_local_addr()` and `prefer_addr()` transport
preference helpers.
#### `_is_local_addr(addr: Address) -> bool`
Determines whether an `Address` is reachable on the
local host:
- `UDSAddress`: always returns `True`
(filesystem-bound, inherently local)
- `TCPAddress`: checks if `._host` is a loopback IP
via `ipaddress.ip_address().is_loopback`, then
falls back to comparing against the machine's own
interface IPs via
`socket.getaddrinfo(socket.gethostname(), None)`
#### `prefer_addr(addrs: list[UnwrappedAddress]) -> UnwrappedAddress`
Selects the "best" transport address from a
multihomed actor's address list. Wraps each
candidate via `wrap_address()` to get typed
`Address` objects, then classifies into three tiers:
1. **UDS** (same-host guaranteed, lowest overhead)
2. **TCP loopback / same-host IP** (local network)
3. **TCP remote** (only option for distributed)
Within each tier, the last-registered (latest) entry
is preferred. Falls back to `addrs[-1]` if no
heuristic matches.
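The tiering described above can be sketched with plain tuples. This is a hedged illustration, not the `_api.py` code: `is_local_host()` and `prefer()` are hypothetical names, addresses are modelled as `('uds', path)` or `('tcp', host, port)` tuples, hostnames that are not IP literals are treated as remote, and a non-empty input list is assumed.

```python
import ipaddress
import socket


def is_local_host(host: str) -> bool:
    '''Loopback check first, then compare against this machine's own IPs.'''
    try:
        if ipaddress.ip_address(host).is_loopback:
            return True
    except ValueError:
        return False  # not an IP literal; this sketch skips DNS names
    try:
        infos = socket.getaddrinfo(socket.gethostname(), None)
    except socket.gaierror:
        return False
    # each entry is (family, type, proto, canonname, sockaddr)
    return host in {info[4][0] for info in infos}


def prefer(addrs: list[tuple]) -> tuple:
    '''Pick UDS > local TCP > remote TCP; latest entry wins per tier.'''
    uds = [a for a in addrs if a[0] == 'uds']
    tcp = [a for a in addrs if a[0] == 'tcp']
    local = [a for a in tcp if is_local_host(a[1])]
    remote = [a for a in tcp if a not in local]
    for tier in (uds, local, remote):
        if tier:
            return tier[-1]
    return addrs[-1]  # nothing classified; fall back to the latest entry


candidates = [
    ('tcp', '203.0.113.7', 1616),   # documentation-range IP, assumed remote
    ('tcp', '127.0.0.1', 1616),
    ('uds', '/tmp/example.sock'),   # made-up path
]
```

Calling `prefer(candidates)` picks the UDS entry; dropping it from the list makes the loopback TCP entry win over the remote one.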
### `tractor/discovery/_registry.py`
> `git diff befedc49~1..befedc49 -- tractor/discovery/_registry.py`
`Registrar.find_actor()` return type broadened from
single addr to `list[UnwrappedAddress]|None` — full
addr list lets callers apply transport preference.
#### Integration
`query_actor()` and `wait_for_actor()` now call
`prefer_addr(addrs)` instead of `addrs[-1]`.
### Verification
All discovery tests pass (13/13 non-daemon).
`test_local.py` and `test_multi_program.py` also
pass (daemon fixture teardown failures are
pre-existing and unrelated).

@ -0,0 +1,101 @@
---
model: claude-opus-4-7[1m]
service: claude
session: subints-spawner-design-kickoff
timestamp: 2026-04-17T03:49:18Z
git_ref: 9703210
scope: docs
substantive: true
raw_file: 20260417T034918Z_9703210_prompt_io.raw.md
---
## Prompt
Drive the "first big boi, from GH issue" task seeded by
`ai/prompt-io/prompts/subints_spawner.md`: design, plan
and implement sub-interpreter (subint) spawn-backend
support per issue #379, including (1) modularizing
`tractor.spawn._spawn` into per-backend submods, (2) a new
`._subint` backend, and (3) harness parametrization via the
existing `--spawn-backend` / `start_method` pytest fixture
in `tractor._testing.pytest`.
Follow-up clarifications from the user (this turn):
1. Pin `<3.15` on this dev branch and feature-gate subint
tests — chose option (a).
2. Split Phase A (modularization) into its own PR first.
3. Defer the `fork()`-via-subint hack to a follow-up.
4. Harness flag is `pytest --spawn-backend <key>` CLI →
`start_method` fixture (prompt file updated to match).
## Response summary
Produced a three-phase plan and a concrete Phase A (pure
modularization) file-split plan for user review; no code
written yet — the green-light to start Phase A was given
in this same turn conditional on logging this prompt-io
entry first.
Phases:
- **A — modularize** `tractor/spawn/_spawn.py` (847 LOC):
keep generic machinery in `_spawn.py`, extract
`trio_proc` → `spawn/_trio.py`, `mp_proc` →
`spawn/_mp.py`. No pin bump.
- **B — `_subint` backend**: bump `pyproject.toml`
`requires-python` upper to `<3.15`; add `'subint'` to
`SpawnMethodKey`; reuse existing UDS transport; shm
escape-hatch deferred.
- **C — harness**: drive the valid-backend tuple in
`tractor/_testing/pytest.py:345-349` from
`typing.get_args(SpawnMethodKey)`; skip subint tests on
Python < 3.14.
Key findings surfaced to the user:
- `pyproject.toml:12` currently pins `<3.14`; PEP 734
`concurrent.interpreters` only ships in 3.14 — the
load-bearing constraint.
- `_testing/pytest.py:345-349` hardcodes valid backends
as a string tuple (`'mp_spawn'`, `'mp_forkserver'`,
`'trio'`) — should be `get_args(SpawnMethodKey)`.
- `_testing/pytest.py:228` already imports
`try_set_start_method` from `tractor.spawn._spawn`;
keeping the `_spawn.py` path as the "core" module
avoids breaking external refs during Phase A.
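Driving the valid-backend list from the type alias can be sketched like this. It assumes `SpawnMethodKey` is a `typing.Literal` alias (the definition below is a guess at its shape, not copied from `_spawn.py`), and `validate_backend()` is a hypothetical helper.

```python
from typing import Literal, get_args

# assumed shape of the alias; the real one lives in `tractor.spawn._spawn`
SpawnMethodKey = Literal[
    'trio',
    'mp_spawn',
    'mp_forkserver',
]


def validate_backend(name: str) -> str:
    '''Reject any backend key that the alias does not list.'''
    valid = get_args(SpawnMethodKey)
    if name not in valid:
        raise ValueError(
            f'Unknown spawn backend {name!r}, expected one of {valid}'
        )
    return name
```

Adding a key such as `'subint'` to the alias then widens what the harness accepts without touching the validation code.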
Phase A commit breakdown (each green via `/run-tests`):
1. Drive `_testing/pytest.py` backend-tuple from
`get_args(SpawnMethodKey)` — prep.
2. Extract `trio_proc` → `spawn/_trio.py`; `_methods`
registry imports it.
3. Extract `mp_proc` → `spawn/_mp.py`; `_methods` imports
it.
4. Tidy `spawn/__init__.py` docstring for the new layout.
Workflow: Phase A runs in a new worktree via `/open-wkt
spawn_modularize` (branch `wkt/spawn_modularize`) and
lands as its own PR before Phase B/C returns to the
current `claude_bun_opus4.7_subint_spawner_backend`
branch.
## Files changed
None this turn — design/planning only. The user updated
`ai/prompt-io/prompts/subints_spawner.md` themselves to
complete the truncated sentence on line 32 (harness-flag
reference).
Upcoming (Phase A, next turn):
- `tractor/spawn/_spawn.py` — trimmed to "core"
- `tractor/spawn/_trio.py` — new; holds `trio_proc()`
- `tractor/spawn/_mp.py` — new; holds `mp_proc()`
- `tractor/spawn/__init__.py` — docstring refresh
- `tractor/_testing/pytest.py` — backend tuple from
`get_args(SpawnMethodKey)`
## Human edits
None — committed as generated. User completed the
truncated sentence on line 32 of
`ai/prompt-io/prompts/subints_spawner.md` themselves
*before* this prompt-io log was written; that edit was
the user's authoring, not an edit to AI output.

@ -0,0 +1,117 @@
---
model: claude-opus-4-7[1m]
service: claude
timestamp: 2026-04-17T03:49:18Z
git_ref: 9703210
diff_cmd: git diff HEAD~1..HEAD
---
No code was produced in this turn — this is a design/planning
exchange for subint spawner support per issue #379. Captured
verbatim below are the two substantive AI responses, which
together fix the phased plan (A: modularize, B: subint, C:
harness) and the concrete Phase A file split before any
implementation begins.
## Turn 1 — initial triage + gating questions
> Read: issue #379, current `spawn/_spawn.py` (847 lines),
> and pyproject pins.
>
> ## Task summary
> Design/plan/implement subint backend per #379, plus
> modularize `spawn/_spawn.py` into per-backend submods, plus
> harness flag for test parametrization.
>
> ## Key finding
> `pyproject.toml:12` pins `requires-python = ">=3.12, <3.14"`
> — but `concurrent.interpreters` / PEP 734 **only lands in
> 3.14**. This is the load-bearing design question before
> coding.
>
> ## Proposed phasing (incremental commits)
> - **Phase A — modularize** `_spawn.py`: extract `trio_proc`
> → `spawn/_trio.py`, `mp_proc` → `spawn/_mp.py`, keep
> generic helpers (`hard_kill`/`soft_kill`/`new_proc`
> dispatcher/`SpawnMethodKey`) in a slim core. Tests green
> at each step.
> - **Phase B — `_subint` backend**: extend `SpawnMethodKey`
> with `'subint'`; reuse UDS transport by default; shm is
> escape-hatch follow-up.
> - **Phase C — harness**: add backend-parametrize flag, skip
> subint when Python < 3.14.
>
> ## Gating questions before I start
> 1. **Python pin**: bump `<3.15` and feature-gate subint
> tests (a), require 3.14 outright (b), or backport shim
> (c)? I'd recommend (a).
> 2. **PR strategy**: one branch, commits per phase (current
> `claude_bun_opus4.7_subint_spawner_backend`)? Or split
> Phase A as its own PR first?
> 3. **`fork()`-via-subint hack** from the issue's "Our own
> thoughts" section — v1 or follow-up? I'd defer to
> follow-up.
> 4. **Prompt file line 32 is truncated** — what's the
> existing harness flag name you had in mind ("using the
> existing ___")?
>
> Which of these are already decided vs. open?
## User answers (for the raw record)
1. bump `<3.15`, feature-gate — option (a).
2. split Phase A as its own PR first.
3. defer `fork()`-via-subint hack.
4. harness flag is `--spawn-backend` CLI → `start_method`
fixture, per `tractor._testing._pytest`. Prompt file
updated accordingly.
## Turn 2 — concrete Phase A plan
> Good — prompt file now ends cleanly; harness flag
> confirmed as `--spawn-backend` CLI → `start_method`
> fixture (via `_testing/pytest.py:193,355`). Also noted
> `pytest.py:345-349` hardcodes valid backends as a string
> tuple — `typing.get_args(SpawnMethodKey)` should drive
> that list.
>
> ## Phase A — concrete plan (separate PR)
>
> **File split** in `tractor/spawn/`:
> - `_spawn.py` stays as "core": keeps `SpawnMethodKey`,
> `_spawn_method`, `proc_waiter`, `try_set_start_method`,
> `exhaust_portal`, `cancel_on_completion`, `hard_kill`,
> `soft_kill`, `new_proc` dispatcher, `_methods`
> registry. Kept at this path so `_testing/pytest.py:228`
> and other external imports don't break.
> - `_trio.py`: `trio_proc()` (lines 448-671, ~224 lines)
> - `_mp.py`: `mp_proc()` (lines 672-842, ~171 lines)
> - `_methods` dict imports the two backend fns from the
> new modules.
>
> **Commits** (each green via `/run-tests`):
> 1. Drive `_testing/pytest.py` backend-tuple from
> `get_args(SpawnMethodKey)` — tiny prep.
> 2. Extract `trio_proc` → `spawn/_trio.py`; `_methods`
> imports it.
> 3. Extract `mp_proc` → `spawn/_mp.py`; `_methods`
> imports it.
> 4. Tidy `spawn/__init__.py` docstring to describe new
> layout.
>
> No pyproject pin bump in Phase A (that's Phase B).
>
> **Workflow**: I'd run this in a new worktree via
> `/open-wkt spawn_modularize` (branch
> `wkt/spawn_modularize`), then open a PR before returning
> to the current `claude_bun_opus4.7_subint_spawner_backend`
> branch for Phase B/C.
>
> Green-light to `/open-wkt` and start Phase A?
## User final response
> all this first then yes, green light.
— log this `/prompt-io` entry first, then proceed with
Phase A in a worktree.

@ -0,0 +1,86 @@
---
model: claude-opus-4-7[1m]
service: claude
session: subints-spawner-phase-a-impl
timestamp: 2026-04-17T03:58:05Z
git_ref: 61a73ba
scope: code
substantive: true
raw_file: 20260417T035800Z_61a73ba_prompt_io.raw.md
---
## Prompt
Follow-up from the design session logged in
`20260417T034918Z_9703210_prompt_io.md`: the user
greenlit Phase A after `/prompt-io` tracking, so
execute the Phase A "modularize `spawn._spawn`" plan
end-to-end in a new worktree — four logical steps
(pytest harness prep, `trio_proc` extraction,
`mp_proc` extraction, docstring refresh), each green
via `/run-tests`.
User directives during execution:
- **Option 3** for the test cadence ("continue A.2-A.4
first, then run the full suite once at the end of
Phase A").
- **One commit** for the whole phase ("can't we just
commit the whole patch in one commit?") instead of
the 3/4-commit split I initially proposed.
- **Don't pre-draft** commit messages — wait for the
user to invoke `/commit-msg` (captured as feedback
memory `feedback_no_auto_draft_commit_msgs.md`).
## Response summary
Produced the cohesive Phase A modularization patch,
landed as commit `61a73bae` (subject: "Mv
`trio_proc`/`mp_proc` to per-backend submods"). Five
files changed, +565 / -418 lines.
Key pieces of the patch (generated by claude,
reviewed by the human before commit):
- `tractor/spawn/_trio.py`: **new**; receives
`trio_proc()` verbatim from `_spawn.py`; imports
cross-backend helpers back from `._spawn`.
- `tractor/spawn/_mp.py`: **new**; receives
`mp_proc()` verbatim; uses `from . import _spawn`
for late-binding access to the mutable `_ctx` /
`_spawn_method` globals (design decision made
during impl, not the original plan).
- `tractor/spawn/_spawn.py` — shrunk 847 → 448 LOC;
import pruning; bottom-of-module late imports for
`trio_proc` / `mp_proc` with a one-line comment
explaining the circular-dep reason.
- `tractor/spawn/__init__.py` — docstring refresh
describing the new layout.
- `tractor/_testing/pytest.py` — the valid-backend
set now comes from `typing.get_args(SpawnMethodKey)`
so future additions (`'subint'`) don't need harness
edits.
## Files changed
See `git diff 61a73ba~1..61a73ba --stat`:
```
tractor/_testing/pytest.py | 12 +-
tractor/spawn/__init__.py | 31 +++-
tractor/spawn/_mp.py | 235 ++++++++++++++++++++++++
tractor/spawn/_spawn.py | 413 +-------------------------------
tractor/spawn/_trio.py | 292 ++++++++++++++++++++++++++++
5 files changed, 565 insertions(+), 418 deletions(-)
```
Validation:
- import probe + `_methods` wiring check — OK
- spawn-relevant test subset — 37 passed, 1 skipped
- full suite — 350 passed, 14 skipped, 7 xfailed, 1
xpassed
## Human edits
None — committed as generated by claude (no diff
between `.claude/git_commit_msg_LATEST.md` and the
committed body, as far as the assistant could
observe).

@ -0,0 +1,138 @@
---
model: claude-opus-4-7[1m]
service: claude
timestamp: 2026-04-17T03:58:05Z
git_ref: 61a73ba
diff_cmd: git diff 61a73ba~1..61a73ba
---
Code generated in this turn was committed verbatim as
`61a73bae` ("Mv `trio_proc`/`mp_proc` to per-backend
submods"). Per diff-ref mode, per-file code is captured
via the pointers below, each followed by a prose
summary of what the AI generated. Non-code output
(sanity-check results, design rationale) is included
verbatim.
## Per-file generated content
### `tractor/spawn/_trio.py` (new, 292 lines)
> `git diff 61a73ba~1..61a73ba -- tractor/spawn/_trio.py`
Pure lift-and-shift of `trio_proc()` out of
`tractor/spawn/_spawn.py` (previously lines 448-670).
Added AGPL header + module docstring describing the
backend; imports include local `from ._spawn import
cancel_on_completion, hard_kill, soft_kill` which
creates the bottom-of-module late-import pattern in
the core file to avoid a cycle. All call sites,
log-format strings, and body logic are byte-identical
to the originals — no semantic change.
### `tractor/spawn/_mp.py` (new, 235 lines)
> `git diff 61a73ba~1..61a73ba -- tractor/spawn/_mp.py`
Pure lift-and-shift of `mp_proc()` out of
`tractor/spawn/_spawn.py` (previously lines 672-842).
Same AGPL header convention. Key difference from
`_trio.py`: uses `from . import _spawn` (module
import, not from-import) for `_ctx` and
`_spawn_method` references — these are mutated at
runtime by `try_set_start_method()`, so late binding
via `_spawn._ctx` / `_spawn._spawn_method` is required
for correctness. Also imports `cancel_on_completion`,
`soft_kill`, `proc_waiter` from `._spawn`.
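The late-binding point can be demonstrated with a throwaway module. A minimal sketch: `_demo_core` is a hypothetical module built in-process purely for illustration, and `set_start_method()` stands in for the role `try_set_start_method()` plays in mutating the global.

```python
import sys
import types

# build a tiny stand-in "core" module with a mutable global
core = types.ModuleType('_demo_core')
core._spawn_method = 'trio'
sys.modules['_demo_core'] = core

# from-import: copies the *current* binding into this namespace
from _demo_core import _spawn_method as snapshot

# module import: attribute lookups happen at use time
import _demo_core as late


def set_start_method(key: str) -> None:
    '''Rebind the global on the core module at runtime.'''
    core._spawn_method = key


set_start_method('mp_spawn')

stale_view = snapshot            # still the value seen at import time
live_view = late._spawn_method   # reflects the runtime rebinding
```

After the rebinding, `stale_view` still holds `'trio'` while `live_view` holds `'mp_spawn'`, which is why a backend that reads such globals needs the module-attribute form.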
### `tractor/spawn/_spawn.py` (modified, 847 → 448 LOC)
> `git diff 61a73ba~1..61a73ba -- tractor/spawn/_spawn.py`
- removed `trio_proc()` body (moved to `_trio.py`)
- removed `mp_proc()` body (moved to `_mp.py`)
- pruned imports now unused in core: `sys`,
`is_root_process`, `current_actor`,
`is_main_process`, `_mp_main`, `ActorFailure`,
`pretty_struct`, `_pformat`
- added bottom-of-file late imports
`from ._trio import trio_proc` and
`from ._mp import mp_proc` with a one-line
comment explaining why (circular dep)
- `_methods` dict unchanged structurally; still binds
`'trio' → trio_proc`, `'mp_spawn' → mp_proc`,
`'mp_forkserver' → mp_proc`
### `tractor/spawn/__init__.py` (modified)
> `git diff 61a73ba~1..61a73ba -- tractor/spawn/__init__.py`
Docstring rewrite only — no code. New text describes
the per-backend `Layout` with `- ._spawn` (core),
`- ._trio` (default), `- ._mp` (two mp variants), and
reiterates the "no eager import" NOTE.
### `tractor/_testing/pytest.py` (modified)
> `git diff 61a73ba~1..61a73ba -- tractor/_testing/pytest.py`
Two-line logical change (plus a comment):
- `from typing import` grows `get_args`
- `pytest_generate_tests()` replaces the hardcoded
`('mp_spawn','mp_forkserver','trio')` tuple with
`get_args(SpawnMethodKey)` via lazy import
`from tractor.spawn._spawn import SpawnMethodKey`
inside the function body (matching the project's
existing lazy-import convention for the spawn
module)
## Non-code output (verbatim)
### Design rationale for the file split
From my Phase A plan turn (already captured in
`20260417T034918Z_9703210_prompt_io.md`):
- `_spawn.py` stays as the "core" rather than renamed
to `_core.py` so external imports
(`_testing/pytest.py:228` imports
`try_set_start_method` from `tractor.spawn._spawn`)
keep working without churn.
- Per-backend extraction chosen over alternatives
(e.g. splitting generic helpers further) because
the immediate motivation is hosting a 3rd
`_subint.py` sibling cleanly in Phase B.
### Sanity-check output (verbatim terminal excerpts)
Post-extraction import probe:
```
extraction OK
_methods: {'trio': 'tractor.spawn._trio.trio_proc',
'mp_spawn': 'tractor.spawn._mp.mp_proc',
'mp_forkserver': 'tractor.spawn._mp.mp_proc'}
```
Spawn-relevant test subset (`tests/test_local.py
test_rpc.py test_spawning.py test_multi_program.py
test_discovery.py`):
```
37 passed, 1 skipped, 14 warnings in 55.37s
```
Full suite:
```
350 passed, 14 skipped, 7 xfailed, 1 xpassed,
151 warnings in 437.73s (0:07:17)
```
No regressions vs. `main`. One transient `-x`
early-stop `ERROR` on
`test_close_channel_explicit_remote_registrar[trio-True]`
was flaky (passed solo, passed without `-x`), not
caused by this refactor.
### Commit message
Also AI-drafted (via `/commit-msg`) — the 40-line
message on commit `61a73bae` itself. Not reproduced
here; see `git log -1 61a73bae`.

@ -0,0 +1,27 @@
# AI Prompt I/O Log — claude
This directory tracks prompt inputs and model
outputs for AI-assisted development using
`claude` (Claude Code).
## Policy
Prompt logging follows the
[NLNet generative AI policy][nlnet-ai].
All substantive AI contributions are logged
with:
- Model name and version
- Timestamps
- The prompts that produced the output
- Unedited model output (`.raw.md` files)
[nlnet-ai]: https://nlnet.nl/foundation/policies/generativeAI/
## Usage
Entries are created by the `/prompt-io` skill
or automatically via `/commit-msg` integration.
Human contributors remain accountable for all
code decisions. AI-generated content is never
presented as human-authored work.

@ -0,0 +1,76 @@
ok now i want you to take a look at the most recent commit adding
a `tpt_bind_addrs` to `open_root_actor()` and extend the existing
tests/discovery/test_multiaddr* and friends to use this new param in
at least one suite with parametrizations over,
- `registry_addrs == tpt_bind_addrs`, as in both inputs are the same.
- `set(registry_addrs) >= set(tpt_bind_addrs)`, as in the registry
addrs include the bind set.
- `registry_addrs != tpt_bind_addrs`, where the reg set is disjoint from
the bind set in all possible combos you can imagine.
All of the ^above cases should further be parametrized over,
- the root being the registrar,
- a non-registrar root using our bg `daemon` fixture.
once we have a fairly thorough test suite and have flushed out all
bugs and edge cases we want to design a wrapping API which allows
declaring full trees of actors' tpt endpoints using multiaddrs such
that a `dict[str, list[str]]` of actor-name -> multiaddr can be used
to configure a tree of actors-as-services given such an input
"endpoints-table" can be matched with the number of appropriately
named subactor spawns in a `tractor` user-app.
Here is a small example from piker,
- in piker's root conf.toml we define a `[network]` section which can
define various actor-service-daemon names set to a maddr
(multiaddress str).
- each actor whether part of the `pikerd` tree (as a sub) or spawned
in other non-registrar rooted trees (such as `piker chart`) should
be configurable in terms of its `tractor` tpt bind addresses via
a simple service lookup table,
```toml
[network]
pikerd = [
    '/ip4/127.0.0.1/tcp/6116',  # std localhost daemon-actor tree
    '/uds/run/user/1000/piker/pikerd@6116.sock',  # same but serving UDS
]
chart = [
    '/ip4/127.0.0.1/tcp/3333',  # std localhost daemon-actor tree
    '/uds/run/user/1000/piker/chart@3333.sock',
]
```
We should take whatever common API is needed to support this and
distill it into a
```python
tractor.discovery.parse_endpoints(
) -> dict[
str,
list[Address]
|dict[str, list[Address]]
# ^recursive case, see below
]:
```
style API which can,
- be re-used easily across dependent projects.
- correctly raise tpt-backend support errors when a maddr specifying
an unsupported proto is passed.
- be used to handle "tunnelled" maddrs per
https://github.com/multiformats/py-multiaddr/#tunneling such that
for any such tunneled maddr-`str`-entry we deliver a data-structure
which can easily be passed to nested `@acm`s which consecutively
setup nested net bindspaces for binding the endpoint addrs using
a combo of our `.ipc.*` machinery and, say for example something like
https://github.com/svinota/pyroute2, more precisely say for
managing tunnelled wireguard eps within network-namespaces,
* https://docs.pyroute2.org/
* https://docs.pyroute2.org/netns.html
remember to include use of all default `.claude/skills` throughout
this work!

@ -0,0 +1,34 @@
This is your first big boi, "from GH issue" design, plan and
implement task.
We need to try and add sub-interpreter (aka subint) support per the
issue,
https://github.com/goodboy/tractor/issues/379
Part of this work should include,
- modularizing and thus better organizing the `.spawn.*` subpkg by
breaking up various backends currently in `spawn._spawn` into
separate submods where it makes sense.
- add a new `._subint` backend which tries to keep as much of the
inter-process-isolation machinery in use as possible but with plans
to optimize for localhost only benefits as offered by python's
subints where possible.
* utilizing localhost-only tpts like UDS, shm-buffers for
performant IPC between subactors but also leveraging the benefits from
the traditional OS subprocs mem/storage-domain isolation, linux
namespaces where possible and as available/permitted by whatever
is happening under the hood with how cpython implements subints.
* default configuration should encourage state isolation as with
subprocs, but explicit public escape hatches to enable rigorously
managed shm channels for high performance apps.
- all tests should be (able to be) parameterized to use the new
`subints` backend and enabled by flag in the harness using the
existing `pytest --spawn-backend <spawn-backend>` support offered in
the `open_root_actor()` and `.testing._pytest` harness override
fixture.
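
One way to picture the "one submodule per backend, selected by name" layout described above is a small dispatch table. This is a hedged sketch only: `register`, `spawn` and the string-returning stub functions are hypothetical and stand in for real spawn machinery; nothing here reflects the actual `tractor.spawn` internals.

```python
from typing import Callable

# Hypothetical signature: takes an actor name, returns a description.
SpawnFn = Callable[[str], str]

# name -> backend entrypoint, filled in as each "submodule" is imported
_backends: dict[str, SpawnFn] = {}


def register(name: str) -> Callable[[SpawnFn], SpawnFn]:
    '''Decorator which records a backend under `name`.'''
    def deco(fn: SpawnFn) -> SpawnFn:
        _backends[name] = fn
        return fn
    return deco


@register('trio')
def _spawn_trio(actor_name: str) -> str:
    # stand-in for a subprocess based backend
    return f'{actor_name} via trio subprocess'


@register('subint')
def _spawn_subint(actor_name: str) -> str:
    # stand-in for a sub-interpreter based backend
    return f'{actor_name} via sub-interpreter'


def spawn(actor_name: str, backend: str = 'trio') -> str:
    '''Dispatch to the backend selected by name (e.g. from a CLI flag).'''
    try:
        fn = _backends[backend]
    except KeyError:
        raise ValueError(
            f'Unknown spawn backend {backend!r}; '
            f'choose from {sorted(_backends)}'
        ) from None
    return fn(actor_name)
```

With a table like this, a test harness flag such as `--spawn-backend` only needs to pass its value through as the `backend` key.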


@ -420,20 +420,17 @@ Check out our experimental system for `guest`_-mode controlled
async def aio_echo_server( async def aio_echo_server(
to_trio: trio.MemorySendChannel, chan: tractor.to_asyncio.LinkedTaskChannel,
from_trio: asyncio.Queue,
) -> None: ) -> None:
# a first message must be sent **from** this ``asyncio`` # a first message must be sent **from** this ``asyncio``
# task or the ``trio`` side will never unblock from # task or the ``trio`` side will never unblock from
# ``tractor.to_asyncio.open_channel_from():`` # ``tractor.to_asyncio.open_channel_from():``
to_trio.send_nowait('start') chan.started_nowait('start')
# XXX: this uses an ``from_trio: asyncio.Queue`` currently but we
# should probably offer something better.
while True: while True:
# echo the msg back # echo the msg back
to_trio.send_nowait(await from_trio.get()) chan.send_nowait(await chan.get())
await asyncio.sleep(0) await asyncio.sleep(0)
@ -445,7 +442,7 @@ Check out our experimental system for `guest`_-mode controlled
# message. # message.
async with tractor.to_asyncio.open_channel_from( async with tractor.to_asyncio.open_channel_from(
aio_echo_server, aio_echo_server,
) as (first, chan): ) as (chan, first):
assert first == 'start' assert first == 'start'
await ctx.started(first) await ctx.started(first)
@ -504,8 +501,10 @@ Yes, we spawn a python process, run ``asyncio``, start ``trio`` on the
``asyncio`` loop, then send commands to the ``trio`` scheduled tasks to ``asyncio`` loop, then send commands to the ``trio`` scheduled tasks to
tell ``asyncio`` tasks what to do XD tell ``asyncio`` tasks what to do XD
We need help refining the `asyncio`-side channel API to be more The ``asyncio``-side task receives a single
`trio`-like. Feel free to sling your opinion in `#273`_! ``chan: LinkedTaskChannel`` handle providing a ``trio``-like
API: ``.started_nowait()``, ``.send_nowait()``, ``.get()``
and more. Feel free to sling your opinion in `#273`_!
.. _#273: https://github.com/goodboy/tractor/issues/273 .. _#273: https://github.com/goodboy/tractor/issues/273
@ -641,13 +640,15 @@ Help us push toward the future of distributed `Python`.
- Typed capability-based (dialog) protocols ( see `#196 - Typed capability-based (dialog) protocols ( see `#196
<https://github.com/goodboy/tractor/issues/196>`_ with draft work <https://github.com/goodboy/tractor/issues/196>`_ with draft work
started in `#311 <https://github.com/goodboy/tractor/pull/311>`_) started in `#311 <https://github.com/goodboy/tractor/pull/311>`_)
- We **recently disabled CI-testing on windows** and need help getting - **macOS is now officially supported** and tested in CI
it running again! (see `#327 alongside Linux!
<https://github.com/goodboy/tractor/pull/327>`_). **We do have windows - We **recently disabled CI-testing on windows** and need
support** (and have for quite a while) but since no active hacker help getting it running again! (see `#327
exists in the user-base to help test on that OS, for now we're not <https://github.com/goodboy/tractor/pull/327>`_). **We do
actively maintaining testing due to the added hassle and general have windows support** (and have for quite a while) but
latency.. since no active hacker exists in the user-base to help
test on that OS, for now we're not actively maintaining
testing due to the added hassle and general latency..
Feel like saying hi? Feel like saying hi?


@ -17,6 +17,7 @@ from tractor import (
MsgStream, MsgStream,
_testing, _testing,
trionics, trionics,
TransportClosed,
) )
import trio import trio
import pytest import pytest
@ -208,11 +209,15 @@ async def main(
# TODO: is this needed or no? # TODO: is this needed or no?
raise raise
except trio.ClosedResourceError: except (
trio.ClosedResourceError,
TransportClosed,
) as _tpt_err:
# NOTE: don't send if we already broke the # NOTE: don't send if we already broke the
# connection to avoid raising a closed-error # connection to avoid raising a closed-error
# such that we drop through to the ctl-c # such that we drop through to the ctl-c
# mashing by user. # mashing by user.
with trio.CancelScope(shield=True):
await trio.sleep(0.01) await trio.sleep(0.01)
# timeout: int = 1 # timeout: int = 1
@ -247,6 +252,7 @@ async def main(
await stream.send(i) await stream.send(i)
pytest.fail('stream not closed?') pytest.fail('stream not closed?')
except ( except (
TransportClosed,
trio.ClosedResourceError, trio.ClosedResourceError,
trio.EndOfChannel, trio.EndOfChannel,
) as send_err: ) as send_err:


@ -18,15 +18,14 @@ async def aio_sleep_forever():
async def bp_then_error( async def bp_then_error(
to_trio: trio.MemorySendChannel, chan: to_asyncio.LinkedTaskChannel,
from_trio: asyncio.Queue,
raise_after_bp: bool = True, raise_after_bp: bool = True,
) -> None: ) -> None:
# sync with `trio`-side (caller) task # sync with `trio`-side (caller) task
to_trio.send_nowait('start') chan.started_nowait('start')
# NOTE: what happens here inside the hook needs some refinement.. # NOTE: what happens here inside the hook needs some refinement..
# => seems like it's still `.debug._set_trace()` but # => seems like it's still `.debug._set_trace()` but
@ -60,7 +59,7 @@ async def trio_ctx(
to_asyncio.open_channel_from( to_asyncio.open_channel_from(
bp_then_error, bp_then_error,
# raise_after_bp=not bp_before_started, # raise_after_bp=not bp_before_started,
) as (first, chan), ) as (chan, first),
trio.open_nursery() as tn, trio.open_nursery() as tn,
): ):


@ -20,7 +20,7 @@ async def sleep(
async def open_ctx( async def open_ctx(
n: tractor._supervise.ActorNursery n: tractor.runtime._supervise.ActorNursery
): ):
# spawn both actors # spawn both actors


@ -3,6 +3,7 @@ Verify we can dump a `stackscope` tree on a hang.
''' '''
import os import os
import platform
import signal import signal
import trio import trio
@ -31,13 +32,26 @@ async def main(
from_test: bool = False, from_test: bool = False,
) -> None: ) -> None:
if platform.system() != 'Darwin':
tpt = 'uds'
else:
# XXX, precisely we can't use pytest's tmp-path generation
# for tests.. apparently because:
#
# > The OSError: AF_UNIX path too long in macOS Python occurs
# > because the path to the Unix domain socket exceeds the
# > operating system's maximum path length limit (around 104
#
# WHICH IS just, wtf hilarious XD
tpt = 'tcp'
async with ( async with (
tractor.open_nursery( tractor.open_nursery(
debug_mode=True, debug_mode=True,
enable_stack_on_sig=True, enable_stack_on_sig=True,
# maybe_enable_greenback=False, # maybe_enable_greenback=False,
loglevel='devx', loglevel='devx',
enable_transports=['uds'], enable_transports=[tpt],
) as an, ) as an,
): ):
ptl: tractor.Portal = await an.start_actor( ptl: tractor.Portal = await an.start_actor(


@ -1,3 +1,5 @@
import platform
import tractor import tractor
import trio import trio
@ -34,9 +36,22 @@ async def just_bp(
async def main(): async def main():
if platform.system() != 'Darwin':
tpt = 'uds'
else:
# XXX, precisely we can't use pytest's tmp-path generation
# for tests.. apparently because:
#
# > The OSError: AF_UNIX path too long in macOS Python occurs
# > because the path to the Unix domain socket exceeds the
# > operating system's maximum path length limit (around 104
#
# WHICH IS just, wtf hilarious XD
tpt = 'tcp'
async with tractor.open_nursery( async with tractor.open_nursery(
debug_mode=True, debug_mode=True,
enable_transports=['uds'], enable_transports=[tpt],
loglevel='devx', loglevel='devx',
) as n: ) as n:
p = await n.start_actor( p = await n.start_actor(


@ -90,7 +90,7 @@ async def main() -> list[int]:
# yes, a nursery which spawns `trio`-"actors" B) # yes, a nursery which spawns `trio`-"actors" B)
an: ActorNursery an: ActorNursery
async with tractor.open_nursery( async with tractor.open_nursery(
loglevel='cancel', loglevel='error',
# debug_mode=True, # debug_mode=True,
) as an: ) as an:
@ -118,8 +118,10 @@ async def main() -> list[int]:
cancelled: bool = await portal.cancel_actor() cancelled: bool = await portal.cancel_actor()
assert cancelled assert cancelled
print(f"STREAM TIME = {time.time() - start}") print(
print(f"STREAM + SPAWN TIME = {time.time() - pre_start}") f"STREAM TIME = {time.time() - start}\n"
f"STREAM + SPAWN TIME = {time.time() - pre_start}\n"
)
assert result_stream == list(range(seed)) assert result_stream == list(range(seed))
return result_stream return result_stream


@ -11,21 +11,17 @@ import tractor
async def aio_echo_server( async def aio_echo_server(
to_trio: trio.MemorySendChannel, chan: tractor.to_asyncio.LinkedTaskChannel,
from_trio: asyncio.Queue,
) -> None: ) -> None:
# a first message must be sent **from** this ``asyncio`` # a first message must be sent **from** this ``asyncio``
# task or the ``trio`` side will never unblock from # task or the ``trio`` side will never unblock from
# ``tractor.to_asyncio.open_channel_from():`` # ``tractor.to_asyncio.open_channel_from():``
to_trio.send_nowait('start') chan.started_nowait('start')
# XXX: this uses an ``from_trio: asyncio.Queue`` currently but we
# should probably offer something better.
while True: while True:
# echo the msg back # echo the msg back
to_trio.send_nowait(await from_trio.get()) chan.send_nowait(await chan.get())
await asyncio.sleep(0) await asyncio.sleep(0)
@ -37,7 +33,7 @@ async def trio_to_aio_echo_server(
# message. # message.
async with tractor.to_asyncio.open_channel_from( async with tractor.to_asyncio.open_channel_from(
aio_echo_server, aio_echo_server,
) as (first, chan): ) as (chan, first):
assert first == 'start' assert first == 'start'
await ctx.started(first) await ctx.started(first)


@ -0,0 +1,5 @@
import os
async def child_fn() -> str:
return f"child OK pid={os.getpid()}"


@ -0,0 +1,50 @@
"""
Integration test: spawning tractor actors from an MPI process.
When a parent is launched via ``mpirun``, Open MPI sets ``OMPI_*`` env
vars that bind ``MPI_Init`` to the ``orted`` daemon. Tractor children
inherit those env vars, so if ``inherit_parent_main=True`` (the default)
the child re-executes ``__main__``, re-imports ``mpi4py``, and
``MPI_Init_thread`` fails because the child was never spawned by
``orted``::
getting local rank failed
--> Returned value No permission (-17) instead of ORTE_SUCCESS
Passing ``inherit_parent_main=False`` and placing RPC functions in a
separate importable module (``_child``) avoids the re-import entirely.
Usage::
mpirun --allow-run-as-root -np 1 python -m \
examples.integration.mpi4py.inherit_parent_main
"""
from mpi4py import MPI
import os
import trio
import tractor
from ._child import child_fn
async def main() -> None:
rank = MPI.COMM_WORLD.Get_rank()
print(f"[parent] rank={rank} pid={os.getpid()}", flush=True)
async with tractor.open_nursery(start_method='trio') as an:
portal = await an.start_actor(
'mpi-child',
enable_modules=[child_fn.__module__],
# Without this the child replays __main__, which
# re-imports mpi4py and crashes on MPI_Init.
inherit_parent_main=False,
)
result = await portal.run(child_fn)
print(f"[parent] got: {result}", flush=True)
await portal.cancel_actor()
if __name__ == "__main__":
trio.run(main)


@ -10,7 +10,7 @@ async def main(service_name):
await an.start_actor(service_name) await an.start_actor(service_name)
async with tractor.get_registry() as portal: async with tractor.get_registry() as portal:
print(f"Arbiter is listening on {portal.channel}") print(f"Registrar is listening on {portal.channel}")
async with tractor.wait_for_actor(service_name) as sockaddr: async with tractor.wait_for_actor(service_name) as sockaddr:
print(f"my_service is found at {sockaddr}") print(f"my_service is found at {sockaddr}")

27
flake.lock 100644

@ -0,0 +1,27 @@
{
"nodes": {
"nixpkgs": {
"locked": {
"lastModified": 1769018530,
"narHash": "sha256-MJ27Cy2NtBEV5tsK+YraYr2g851f3Fl1LpNHDzDX15c=",
"owner": "nixos",
"repo": "nixpkgs",
"rev": "88d3861acdd3d2f0e361767018218e51810df8a1",
"type": "github"
},
"original": {
"owner": "nixos",
"ref": "nixos-unstable",
"repo": "nixpkgs",
"type": "github"
}
},
"root": {
"inputs": {
"nixpkgs": "nixpkgs"
}
}
},
"root": "root",
"version": 7
}

70
flake.nix 100644

@ -0,0 +1,70 @@
# An "impure" template thx to `pyproject.nix`,
# https://pyproject-nix.github.io/pyproject.nix/templates.html#impure
# https://github.com/pyproject-nix/pyproject.nix/blob/master/templates/impure/flake.nix
{
description = "An impure overlay (w dev-shell) using `uv`";
inputs = {
nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
};
outputs =
{ nixpkgs, ... }:
let
inherit (nixpkgs) lib;
forAllSystems = lib.genAttrs lib.systems.flakeExposed;
in
{
devShells = forAllSystems (
system:
let
pkgs = nixpkgs.legacyPackages.${system};
# XXX NOTE XXX, for now we overlay specific pkgs via
# a major-version-pinned-`cpython`
cpython = "python313";
venv_dir = "py313";
pypkgs = pkgs."${cpython}Packages";
in
{
default = pkgs.mkShell {
packages = [
# XXX, ensure sh completions activate!
pkgs.bashInteractive
pkgs.bash-completion
# XXX, on nix(os), use pkgs version to avoid
# build/sys-sh-integration issues
pkgs.ruff
pkgs.uv
pkgs.${cpython}  # ?TODO^ how to set from `cpython` above?
];
shellHook = ''
# unmask to debug **this** dev-shell-hook
# set -e
# link-in c++ stdlib for various AOT-ext-pkgs (numpy, etc.)
LD_LIBRARY_PATH="${pkgs.stdenv.cc.cc.lib}/lib:$LD_LIBRARY_PATH"
export LD_LIBRARY_PATH
# RUNTIME-SETTINGS
# ------ uv ------
# - always use the ./py313/ venv-subdir
# - sync env with all extras
export UV_PROJECT_ENVIRONMENT=${venv_dir}
uv sync --dev --all-extras
# ------ TIPS ------
# NOTE, to launch the py-venv installed `xonsh` (like @goodboy)
# run the `nix develop` cmd with,
# >> nix develop -c uv run xonsh
'';
};
}
);
};
}


@ -9,7 +9,7 @@ name = "tractor"
version = "0.1.0a6dev0" version = "0.1.0a6dev0"
description = 'structured concurrent `trio`-"actors"' description = 'structured concurrent `trio`-"actors"'
authors = [{ name = "Tyler Goodlet", email = "goodboy_foss@protonmail.com" }] authors = [{ name = "Tyler Goodlet", email = "goodboy_foss@protonmail.com" }]
requires-python = ">= 3.11" requires-python = ">=3.12, <3.14"
readme = "docs/README.rst" readme = "docs/README.rst"
license = "AGPL-3.0-or-later" license = "AGPL-3.0-or-later"
keywords = [ keywords = [
@ -24,11 +24,13 @@ keywords = [
classifiers = [ classifiers = [
"Development Status :: 3 - Alpha", "Development Status :: 3 - Alpha",
"Operating System :: POSIX :: Linux", "Operating System :: POSIX :: Linux",
"Operating System :: MacOS",
"Framework :: Trio", "Framework :: Trio",
"License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)", "License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)",
"Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: 3 :: Only", "Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.11", "Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Topic :: System :: Distributed Computing", "Topic :: System :: Distributed Computing",
] ]
dependencies = [ dependencies = [
@ -42,48 +44,65 @@ dependencies = [
"wrapt>=1.16.0,<2", "wrapt>=1.16.0,<2",
"colorlog>=6.8.2,<7", "colorlog>=6.8.2,<7",
# built-in multi-actor `pdb` REPL # built-in multi-actor `pdb` REPL
"pdbp>=1.6,<2", # windows only (from `pdbp`) "pdbp>=1.8.2,<2", # windows only (from `pdbp`)
# typed IPC msging # typed IPC msging
"msgspec>=0.19.0", "msgspec>=0.21.0",
"cffi>=1.17.1", "cffi>=1.17.1",
"bidict>=0.23.1", "bidict>=0.23.1",
"multiaddr>=0.2.0",
"platformdirs>=4.4.0",
] ]
# ------ project ------ # ------ project ------
[dependency-groups] [dependency-groups]
dev = [ dev = [
# test suite {include-group = 'devx'},
# TODO: maybe some of these layout choices? {include-group = 'testing'},
# https://docs.pytest.org/en/8.0.x/explanation/goodpractices.html#choosing-a-test-layout-import-rules {include-group = 'repl'},
"pytest>=8.3.5", ]
"pexpect>=4.9.0,<5", devx = [
# `tractor.devx` tooling # `tractor.devx` tooling
"greenback>=1.2.1,<2", "greenback>=1.2.1,<2",
"stackscope>=0.2.2,<0.3", "stackscope>=0.2.2,<0.3",
# ^ requires this? # ^ requires this?
"typing-extensions>=4.14.1", "typing-extensions>=4.14.1",
]
testing = [
# test suite
# TODO: maybe some of these layout choices?
# https://docs.pytest.org/en/8.0.x/explanation/goodpractices.html#choosing-a-test-layout-import-rules
"pytest>=8.3.5",
"pexpect>=4.9.0,<5",
]
repl = [
"pyperclip>=1.9.0", "pyperclip>=1.9.0",
"prompt-toolkit>=3.0.50", "prompt-toolkit>=3.0.50",
"xonsh>=0.19.2", "xonsh>=0.22.8",
"psutil>=7.0.0", "psutil>=7.0.0",
] ]
lint = [
"ruff>=0.9.6"
]
# TODO, add these with sane versions; were originally in # TODO, add these with sane versions; were originally in
# `requirements-docs.txt`.. # `requirements-docs.txt`..
# docs = [ # docs = [
# "sphinx>=" # "sphinx>="
# "sphinx_book_theme>=" # "sphinx_book_theme>="
# ] # ]
# ------ dependency-groups ------ # ------ dependency-groups ------
# ------ dependency-groups ------
[tool.uv.sources] [tool.uv.sources]
# XXX NOTE, only for @goodboy's hacking on `pprint(sort_dicts=False)` # XXX NOTE, only for @goodboy's hacking on `pprint(sort_dicts=False)`
# for the `pp` alias.. # for the `pp` alias..
# pdbp = { path = "../pdbp", editable = true }
# [tool.uv.sources.pdbp]
# XXX, in case we need to tmp patch again.
# git = "https://github.com/goodboy/pdbp.git"
# branch ="repair_stack_trace_frame_indexing"
# path = "../pdbp"
# editable = true
# ------ tool.uv.sources ------ # ------ tool.uv.sources ------
# TODO, distributed (multi-host) extensions # TODO, distributed (multi-host) extensions
@ -145,6 +164,7 @@ all_bullets = true
[tool.pytest.ini_options] [tool.pytest.ini_options]
minversion = '6.0' minversion = '6.0'
# https://docs.pytest.org/en/stable/reference/reference.html#configuration-options
testpaths = [ testpaths = [
'tests' 'tests'
] ]
@ -155,10 +175,17 @@ addopts = [
'--import-mode=importlib', '--import-mode=importlib',
# don't show frickin captured logs AGAIN in the report.. # don't show frickin captured logs AGAIN in the report..
'--show-capture=no', '--show-capture=no',
# disable `xonsh` plugin
# https://docs.pytest.org/en/stable/how-to/plugins.html#disabling-plugins-from-autoloading
# https://docs.pytest.org/en/stable/how-to/plugins.html#deactivating-unregistering-a-plugin-by-name
'-p no:xonsh'
] ]
log_cli = false log_cli = false
# TODO: maybe some of these layout choices? # TODO: maybe some of these layout choices?
# https://docs.pytest.org/en/8.0.x/explanation/goodpractices.html#choosing-a-test-layout-import-rules # https://docs.pytest.org/en/8.0.x/explanation/goodpractices.html#choosing-a-test-layout-import-rules
# pythonpath = "src" # pythonpath = "src"
# https://docs.pytest.org/en/stable/reference/reference.html#confval-console_output_style
console_output_style = 'progress'
# ------ tool.pytest ------ # ------ tool.pytest ------


@ -1,8 +0,0 @@
# vim: ft=ini
# pytest.ini for tractor
[pytest]
# don't show frickin captured logs AGAIN in the report..
addopts = --show-capture='no'
log_cli = false
; minversion = 6.0


@ -35,8 +35,8 @@ exclude = [
line-length = 88 line-length = 88
indent-width = 4 indent-width = 4
# Assume Python 3.9 # assume latest minor cpython
target-version = "py311" target-version = "py313"
[lint] [lint]
# Enable Pyflakes (`F`) and a subset of the pycodestyle (`E`) codes by default. # Enable Pyflakes (`F`) and a subset of the pycodestyle (`E`) codes by default.


@ -9,8 +9,11 @@ import os
import signal import signal
import platform import platform
import time import time
from pathlib import Path
from typing import Literal
import pytest import pytest
import tractor
from tractor._testing import ( from tractor._testing import (
examples_dir as examples_dir, examples_dir as examples_dir,
tractor_test as tractor_test, tractor_test as tractor_test,
@ -22,6 +25,8 @@ pytest_plugins: list[str] = [
'tractor._testing.pytest', 'tractor._testing.pytest',
] ]
_ci_env: bool = os.environ.get('CI', False)
_non_linux: bool = platform.system() != 'Linux'
# Sending signal.SIGINT on subprocess fails on windows. Use CTRL_* alternatives # Sending signal.SIGINT on subprocess fails on windows. Use CTRL_* alternatives
if platform.system() == 'Windows': if platform.system() == 'Windows':
@ -34,9 +39,8 @@ else:
_INT_SIGNAL = signal.SIGINT _INT_SIGNAL = signal.SIGINT
_INT_RETURN_CODE = 1 if sys.version_info < (3, 8) else -signal.SIGINT.value _INT_RETURN_CODE = 1 if sys.version_info < (3, 8) else -signal.SIGINT.value
_PROC_SPAWN_WAIT = ( _PROC_SPAWN_WAIT = (
0.6 2 if _ci_env
if sys.version_info < (3, 7) else 1
else 0.4
) )
@ -44,6 +48,80 @@ no_windows = pytest.mark.skipif(
platform.system() == "Windows", platform.system() == "Windows",
reason="Test is unsupported on windows", reason="Test is unsupported on windows",
) )
no_macos = pytest.mark.skipif(
platform.system() == "Darwin",
reason="Test is unsupported on MacOS",
)
def get_cpu_state(
icpu: int = 0,
setting: Literal[
'scaling_governor',
'*_pstate_max_freq',
'scaling_max_freq',
# 'scaling_cur_freq',
] = '*_pstate_max_freq',
) -> tuple[
Path,
str|int,
]|None:
'''
Attempt to read the (first) CPU's setting according
to the set `setting` from under the file-sys,
/sys/devices/system/cpu/cpu0/cpufreq/{setting}
Useful to determine latency headroom for various perf affected
test suites.
'''
try:
# Read governor for core 0 (usually same for all)
setting_path: Path = list(
Path(f'/sys/devices/system/cpu/cpu{icpu}/cpufreq/')
.glob(f'{setting}')
)[0] # <- XXX must be single match!
with open(
setting_path,
'r',
) as f:
return (
setting_path,
f.read().strip(),
)
except (FileNotFoundError, IndexError):
return None
def cpu_scaling_factor() -> float:
'''
Return a latency-headroom multiplier (>= 1.0) reflecting how
much to inflate time-limits when CPU-freq scaling is active on
linux.
When no scaling info is available (non-linux, missing sysfs),
returns 1.0 (i.e. no headroom adjustment needed).
'''
if _non_linux:
return 1.
mx = get_cpu_state()
cur = get_cpu_state(setting='scaling_max_freq')
if mx is None or cur is None:
return 1.
_mx_pth, max_freq = mx
_cur_pth, cur_freq = cur
cpu_scaled: float = int(cur_freq) / int(max_freq)
if cpu_scaled != 1.:
return 1. / (
cpu_scaled * 2 # <- bc likely "dual threaded"
)
return 1.
def pytest_addoption( def pytest_addoption(
@ -61,16 +139,52 @@ def pytest_addoption(
@pytest.fixture(scope='session', autouse=True) @pytest.fixture(scope='session', autouse=True)
def loglevel(request): def loglevel(request) -> str:
import tractor import tractor
orig = tractor.log._default_loglevel orig = tractor.log._default_loglevel
level = tractor.log._default_loglevel = request.config.option.loglevel level = tractor.log._default_loglevel = request.config.option.loglevel
tractor.log.get_console_log(level) log = tractor.log.get_console_log(
level=level,
name='tractor', # <- enable root logger
)
log.info(
f'Test-harness set runtime loglevel: {level!r}\n'
)
yield level yield level
tractor.log._default_loglevel = orig tractor.log._default_loglevel = orig
_ci_env: bool = os.environ.get('CI', False) @pytest.fixture(scope='function')
def test_log(
request,
loglevel: str,
) -> tractor.log.StackLevelAdapter:
'''
Deliver a per test-module-fn logger instance for reporting from
within actual test bodies/fixtures.
For example this can be handy to report certain error cases from
exception handlers using `test_log.exception()`.
'''
modname: str = request.function.__module__
log = tractor.log.get_logger(
name=modname, # <- enable root logger
# pkg_name='tests',
)
_log = tractor.log.get_console_log(
level=loglevel,
logger=log,
name=modname,
# pkg_name='tests',
)
_log.debug(
f'In-test-logging requested\n'
f'test_log.name: {log.name!r}\n'
f'level: {loglevel!r}\n'
)
yield _log
@pytest.fixture(scope='session') @pytest.fixture(scope='session')
@ -85,15 +199,39 @@ def ci_env() -> bool:
def sig_prog( def sig_prog(
proc: subprocess.Popen, proc: subprocess.Popen,
sig: int, sig: int,
canc_timeout: float = 0.1, canc_timeout: float = 0.2,
tries: int = 3,
) -> int: ) -> int:
"Kill the actor-process with ``sig``." '''
Kill the actor-process with `sig`.
Prefer to kill with the provided signal and
failing a `canc_timeout`, send a `SIGKILL`-like signal
to ensure termination.
'''
for i in range(tries):
proc.send_signal(sig) proc.send_signal(sig)
if proc.poll() is None:
print(
f'WARNING, proc still alive after,\n'
f'canc_timeout={canc_timeout!r}\n'
f'sig={sig!r}\n'
f'\n'
f'{proc.args!r}\n'
)
time.sleep(canc_timeout) time.sleep(canc_timeout)
if not proc.poll(): else:
# TODO: why sometimes does SIGINT not work on teardown? # TODO: why sometimes does SIGINT not work on teardown?
# seems to happen only when trace logging enabled? # seems to happen only when trace logging enabled?
if proc.poll() is None:
print(
f'XXX WARNING KILLING PROG WITH {_KILL_SIGNAL!r} XXX\n'
f'canc_timeout={canc_timeout!r}\n'
f'{proc.args!r}\n'
)
proc.send_signal(_KILL_SIGNAL) proc.send_signal(_KILL_SIGNAL)
ret: int = proc.wait() ret: int = proc.wait()
assert ret assert ret
@ -106,6 +244,8 @@ def daemon(
testdir: pytest.Pytester, testdir: pytest.Pytester,
reg_addr: tuple[str, int], reg_addr: tuple[str, int],
tpt_proto: str, tpt_proto: str,
ci_env: bool,
test_log: tractor.log.StackLevelAdapter,
) -> subprocess.Popen: ) -> subprocess.Popen:
''' '''
@ -121,10 +261,12 @@ def daemon(
"import tractor; " "import tractor; "
"tractor.run_daemon([], " "tractor.run_daemon([], "
"registry_addrs={reg_addrs}, " "registry_addrs={reg_addrs}, "
"enable_transports={enable_tpts}, "
"debug_mode={debug_mode}, " "debug_mode={debug_mode}, "
"loglevel={ll})" "loglevel={ll})"
).format( ).format(
reg_addrs=str([reg_addr]), reg_addrs=str([reg_addr]),
enable_tpts=str([tpt_proto]),
ll="'{}'".format(loglevel) if loglevel else None, ll="'{}'".format(loglevel) if loglevel else None,
debug_mode=debug_mode, debug_mode=debug_mode,
) )
@ -143,13 +285,25 @@ def daemon(
**kwargs, **kwargs,
) )
# TODO! we should poll for the registry socket-bind to take place
# and only once that's done yield to the requester!
# -[ ] TCP: use the `._root.open_root_actor()`::`ping_tpt_socket()`
# closure!
# -[ ] UDS: can we do something similar for "pinging" the
# file-socket?
#
global _PROC_SPAWN_WAIT
# UDS sockets are **really** fast to bind()/listen()/connect() # UDS sockets are **really** fast to bind()/listen()/connect()
# so it's often required that we delay a bit more starting # so it's often required that we delay a bit more starting
# the first actor-tree.. # the first actor-tree..
if tpt_proto == 'uds': if tpt_proto == 'uds':
global _PROC_SPAWN_WAIT _PROC_SPAWN_WAIT += 1.6
_PROC_SPAWN_WAIT = 0.6
if _non_linux and ci_env:
_PROC_SPAWN_WAIT += 1
# XXX, allow time for the sub-py-proc to boot up.
# !TODO, see ping-polling ideas above!
time.sleep(_PROC_SPAWN_WAIT) time.sleep(_PROC_SPAWN_WAIT)
assert not proc.returncode assert not proc.returncode
@ -159,18 +313,30 @@ def daemon(
# XXX! yeah.. just be reaaal careful with this bc sometimes it # XXX! yeah.. just be reaaal careful with this bc sometimes it
# can lock up on the `_io.BufferedReader` and hang.. # can lock up on the `_io.BufferedReader` and hang..
stderr: str = proc.stderr.read().decode() stderr: str = proc.stderr.read().decode()
if stderr: stdout: str = proc.stdout.read().decode()
if (
stderr
or
stdout
):
print( print(
f'Daemon actor tree produced STDERR:\n' f'Daemon actor tree produced output:\n'
f'{proc.args}\n' f'{proc.args}\n'
f'\n' f'\n'
f'{stderr}\n' f'stderr: {stderr!r}\n'
f'stdout: {stdout!r}\n'
) )
if proc.returncode != -2:
raise RuntimeError( if (rc := proc.returncode) != -2:
'Daemon actor tree failed !?\n' msg: str = (
f'{proc.args}\n' f'Daemon actor tree was not cancelled !?\n'
f'proc.args: {proc.args!r}\n'
f'proc.returncode: {rc!r}\n'
) )
if rc < 0:
raise RuntimeError(msg)
test_log.error(msg)
# @pytest.fixture(autouse=True) # @pytest.fixture(autouse=True)

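The arithmetic inside `cpu_scaling_factor()` above is easier to follow with the sysfs reads stripped away. The sketch below copies only the formula from the diff; `headroom` is a hypothetical name and the frequency values are made-up inputs, not measurements.

```python
def headroom(cur_freq: int, max_freq: int) -> float:
    '''
    Same formula as in `cpu_scaling_factor()`, minus the file reads.

    `cur_freq` plays the role of `scaling_max_freq` and `max_freq`
    that of `*_pstate_max_freq` (both in kHz under sysfs).

    '''
    cpu_scaled: float = cur_freq / max_freq
    if cpu_scaled != 1.0:
        # the `* 2` mirrors the "likely dual threaded" comment
        return 1.0 / (cpu_scaled * 2)
    return 1.0


# capped to a quarter of max -> time-limits doubled
quarter = headroom(1_000_000, 4_000_000)   # 2.0
# capped to half of max -> no extra headroom
half = headroom(2_000_000, 4_000_000)      # 1.0
# capped to 80% of max -> multiplier drops below 1.0
mild = headroom(3_200_000, 4_000_000)
```

Worth noting: for any cap above half of the max frequency this formula yields a value below 1.0, which seems at odds with the docstring's ">= 1.0" wording. If that is unintended, clamping with `max(1.0, ...)` would be one option.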

@ -3,6 +3,8 @@
''' '''
from __future__ import annotations from __future__ import annotations
import platform
import signal
import time import time
from typing import ( from typing import (
Callable, Callable,
@ -32,9 +34,23 @@ if TYPE_CHECKING:
from pexpect import pty_spawn from pexpect import pty_spawn
_non_linux: bool = platform.system() != 'Linux'
def pytest_configure(config):
# register custom marks to avoid warnings see,
# https://docs.pytest.org/en/stable/how-to/writing_plugins.html#registering-custom-markers
config.addinivalue_line(
'markers',
'ctlcs_bish: test will (likely) not behave under SIGINT..'
)
# a fn that sub-instantiates a `pexpect.spawn()` # a fn that sub-instantiates a `pexpect.spawn()`
# and returns it. # and returns it.
type PexpectSpawner = Callable[[str], pty_spawn.spawn] type PexpectSpawner = Callable[
[str],
pty_spawn.spawn,
]
@pytest.fixture @pytest.fixture
@ -64,26 +80,65 @@ def spawn(
''' '''
import os import os
# disable colored tbs
os.environ['PYTHON_COLORS'] = '0' os.environ['PYTHON_COLORS'] = '0'
# disable all ANSI color output
# os.environ['NO_COLOR'] = '1'
spawned: PexpectSpawner|None = None
def _spawn( def _spawn(
cmd: str, cmd: str,
expect_timeout: float = 4,
**mkcmd_kwargs, **mkcmd_kwargs,
) -> pty_spawn.spawn: ) -> pty_spawn.spawn:
nonlocal spawned
unset_colors() unset_colors()
return testdir.spawn( spawned = testdir.spawn(
cmd=mk_cmd( cmd=mk_cmd(
cmd, cmd,
**mkcmd_kwargs, **mkcmd_kwargs,
), ),
expect_timeout=3, expect_timeout=(timeout:=(
expect_timeout + 6
if _non_linux and _ci_env
else expect_timeout
)),
# preexec_fn=unset_colors, # preexec_fn=unset_colors,
# ^TODO? get `pytest` core to expose underlying # ^TODO? get `pytest` core to expose underlying
# `pexpect.spawn()` stuff? # `pexpect.spawn()` stuff?
) )
# sanity
assert spawned.timeout == timeout
return spawned
# such that test-dep can pass input script name. # such that test-dep can pass input script name.
return _spawn # the `PexpectSpawner`, type alias. yield _spawn # the `PexpectSpawner`, type alias.
if (
spawned
and
(ptyproc := spawned.ptyproc)
):
start: float = time.time()
timeout: float = 5
while (
ptyproc.isalive()
and
(
(_time_took := (time.time() - start))
<
timeout
)
):
ptyproc.kill(signal.SIGINT)
time.sleep(0.01)
if ptyproc.isalive():
ptyproc.kill(signal.SIGKILL)
# TODO? ensure we've cleaned up any UDS-paths?
# breakpoint()
@pytest.fixture( @pytest.fixture(
@ -109,7 +164,13 @@ def ctlc(
'https://github.com/goodboy/tractor/issues/320' 'https://github.com/goodboy/tractor/issues/320'
) )
if mark.name == 'ctlcs_bish': if (
mark.name == 'ctlcs_bish'
and
use_ctlc
and
all(mark.args)
):
pytest.skip( pytest.skip(
f'Test {node} prolly uses something from the stdlib (namely `asyncio`..)\n' f'Test {node} prolly uses something from the stdlib (namely `asyncio`..)\n'
f'The test and/or underlying example script can *sometimes* run fine ' f'The test and/or underlying example script can *sometimes* run fine '
@ -214,12 +275,13 @@ def assert_before(
err_on_false=True, err_on_false=True,
**kwargs **kwargs
) )
return str(child.before.decode())
def do_ctlc( def do_ctlc(
child, child,
count: int = 3, count: int = 3,
delay: float = 0.1, delay: float|None = None,
patt: str|None = None, patt: str|None = None,
# expect repl UX to reprint the prompt after every # expect repl UX to reprint the prompt after every
@ -231,6 +293,7 @@ def do_ctlc(
) -> str|None: ) -> str|None:
before: str|None = None before: str|None = None
delay = delay or 0.1
# make sure ctl-c sends don't do anything but repeat output # make sure ctl-c sends don't do anything but repeat output
for _ in range(count): for _ in range(count):
@ -241,7 +304,10 @@ def do_ctlc(
# if you run this test manually it works just fine.. # if you run this test manually it works just fine..
if expect_prompt: if expect_prompt:
time.sleep(delay) time.sleep(delay)
child.expect(PROMPT) child.expect(
PROMPT,
timeout=(child.timeout * 2) if _ci_env else child.timeout,
)
before = str(child.before.decode()) before = str(child.before.decode())
time.sleep(delay) time.sleep(delay)
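
The `spawn` fixture changes above do two things: they pad the expect-timeout only when the host is both non-Linux and in CI, and at teardown they send SIGINT repeatedly until a deadline before falling back to SIGKILL. Below is a minimal standalone sketch of both ideas, assuming a POSIX host and using `subprocess` rather than `pexpect`; the helper names `padded_timeout()` and `reap()` are hypothetical and are not part of the test suite.

```python
import signal
import subprocess
import time


def padded_timeout(
    base: float,
    non_linux: bool,
    ci_env: bool,
    pad: float = 6,
) -> float:
    # only pad when BOTH flags hold, mirroring the
    # `_non_linux and _ci_env` check in the fixture.
    return base + pad if (non_linux and ci_env) else base


def reap(
    proc: subprocess.Popen,
    timeout: float = 5,
    poll: float = 0.01,
) -> int:
    # ask politely (SIGINT) until the deadline passes..
    start: float = time.time()
    while (
        proc.poll() is None
        and (time.time() - start) < timeout
    ):
        proc.send_signal(signal.SIGINT)
        time.sleep(poll)

    # ..then hard-kill whatever is still alive.
    if proc.poll() is None:
        proc.kill()

    return proc.wait()
```

Keeping the padding in one expression means the fixture can assert the effective value straight after spawning, which is what the `assert spawned.timeout == timeout` sanity check does.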

View File

@ -37,6 +37,9 @@ from .conftest import (
in_prompt_msg, in_prompt_msg,
assert_before, assert_before,
) )
from ..conftest import (
_ci_env,
)
if TYPE_CHECKING: if TYPE_CHECKING:
from ..conftest import PexpectSpawner from ..conftest import PexpectSpawner
@ -51,13 +54,14 @@ if TYPE_CHECKING:
# - recurrent root errors # - recurrent root errors
_non_linux: bool = platform.system() != 'Linux'
if platform.system() == 'Windows': if platform.system() == 'Windows':
pytest.skip( pytest.skip(
'Debugger tests have no windows support (yet)', 'Debugger tests have no windows support (yet)',
allow_module_level=True, allow_module_level=True,
) )
# TODO: was trying to this xfail style but some weird bug i see in CI # TODO: was trying to this xfail style but some weird bug i see in CI
# that's happening at collect time.. pretty soon gonna dump actions i'm # that's happening at collect time.. pretty soon gonna dump actions i'm
# thinkin... # thinkin...
@ -193,6 +197,11 @@ def test_root_actor_bp_forever(
child.expect(EOF) child.expect(EOF)
# skip on non-Linux CI
@pytest.mark.ctlcs_bish(
_non_linux,
_ci_env,
)
@pytest.mark.parametrize( @pytest.mark.parametrize(
'do_next', 'do_next',
(True, False), (True, False),
@ -258,6 +267,11 @@ def test_subactor_error(
child.expect(EOF) child.expect(EOF)
# skip on non-Linux CI
@pytest.mark.ctlcs_bish(
_non_linux,
_ci_env,
)
def test_subactor_breakpoint( def test_subactor_breakpoint(
spawn, spawn,
ctlc: bool, ctlc: bool,
@ -480,8 +494,24 @@ def test_multi_daemon_subactors(
stream. stream.
''' '''
child = spawn('multi_daemon_subactors') non_linux = _non_linux
if non_linux and ctlc:
pytest.skip(
'Ctl-c + MacOS is too unreliable/racy for this test..\n'
)
# !TODO, if someone with more patience than i wants to muck
# with the timings on this please feel free to see all the
# `non_linux` branching logic i added on my first attempt
# below!
#
# my conclusion was that if i were to run the script
# manually, and thus as slowly as a human would, the test
# would and should pass as described in this test fn, however
# after fighting with it for >= 1hr. i decided more than
# likely the more extensive `linux` testing should cover most
# regressions.
child = spawn('multi_daemon_subactors')
child.expect(PROMPT) child.expect(PROMPT)
# there can be a race for which subactor will acquire # there can be a race for which subactor will acquire
@ -511,8 +541,19 @@ def test_multi_daemon_subactors(
else: else:
raise ValueError('Neither log msg was found !?') raise ValueError('Neither log msg was found !?')
non_linux_delay: float = 0.3
if ctlc: if ctlc:
do_ctlc(child) do_ctlc(
child,
delay=(
non_linux_delay
if non_linux
else None
),
)
if non_linux:
time.sleep(1)
# NOTE: previously since we did not have clobber prevention # NOTE: previously since we did not have clobber prevention
# in the root actor this final resume could result in the debugger # in the root actor this final resume could result in the debugger
@ -543,33 +584,66 @@ def test_multi_daemon_subactors(
# assert "in use by child ('bp_forever'," in before # assert "in use by child ('bp_forever'," in before
if ctlc: if ctlc:
do_ctlc(child) do_ctlc(
child,
delay=(
non_linux_delay
if non_linux
else None
),
)
if non_linux:
time.sleep(1)
# expect another breakpoint actor entry # expect another breakpoint actor entry
child.sendline('c') child.sendline('c')
child.expect(PROMPT) child.expect(PROMPT)
try: try:
assert_before( before: str = assert_before(
child, child,
bp_forev_parts, bp_forev_parts,
) )
except AssertionError: except AssertionError:
assert_before( before: str = assert_before(
child, child,
name_error_parts, name_error_parts,
) )
else: else:
if ctlc: if ctlc:
do_ctlc(child) before: str = do_ctlc(
child,
delay=(
non_linux_delay
if non_linux
else None
),
)
if non_linux:
time.sleep(1)
# should crash with the 2nd name error (simulates # should crash with the 2nd name error (simulates
# a retry) and then the root eventually (boxed) errors # a retry) and then the root eventually (boxed) errors
# after 1 or more further bp actor entries. # after 1 or more further bp actor entries.
child.sendline('c') child.sendline('c')
child.expect(PROMPT) try:
child.expect(
PROMPT,
timeout=3,
)
except EOF:
before: str = child.before.decode()
print(
f'\n'
f'??? NEVER RXED `pdb` PROMPT ???\n'
f'\n'
f'{before}\n'
)
raise
assert_before( assert_before(
child, child,
name_error_parts, name_error_parts,
@ -689,7 +763,8 @@ def test_multi_subactors_root_errors(
@has_nested_actors @has_nested_actors
def test_multi_nested_subactors_error_through_nurseries( def test_multi_nested_subactors_error_through_nurseries(
spawn, ci_env: bool,
spawn: PexpectSpawner,
# TODO: address debugger issue for nested tree: # TODO: address debugger issue for nested tree:
# https://github.com/goodboy/tractor/issues/320 # https://github.com/goodboy/tractor/issues/320
@ -710,9 +785,29 @@ def test_multi_nested_subactors_error_through_nurseries(
# timed_out_early: bool = False # timed_out_early: bool = False
for send_char in itertools.cycle(['c', 'q']): for (
i,
send_char,
) in enumerate(itertools.cycle(['c', 'q'])):
timeout: float = -1
if (
_non_linux
and
ci_env
):
timeout: float = 6
# XXX linux but the first crash sequence
# can take longer to arrive at a prompt.
elif i == 0:
timeout = 5
try: try:
child.expect(PROMPT) child.expect(
PROMPT,
timeout=timeout,
)
child.sendline(send_char) child.sendline(send_char)
time.sleep(0.01) time.sleep(0.01)
@ -889,6 +984,11 @@ def test_different_debug_mode_per_actor(
) )
# skip on non-Linux CI
@pytest.mark.ctlcs_bish(
_non_linux,
_ci_env,
)
def test_post_mortem_api( def test_post_mortem_api(
spawn, spawn,
ctlc: bool, ctlc: bool,
@ -1133,12 +1233,21 @@ def test_ctxep_pauses_n_maybe_ipc_breaks(
# closed so verify we see error reporting as well as # closed so verify we see error reporting as well as
# a failed crash-REPL request msg and can CTL-c our way # a failed crash-REPL request msg and can CTL-c our way
# out. # out.
# ?TODO, match depending on `tpt_proto(s)`?
# - [ ] how can we pass it into the script tho?
tpt: str = 'UDS'
if _non_linux:
tpt: str = 'TCP'
assert_before( assert_before(
child, child,
['peer IPC channel closed abruptly?', ['peer IPC channel closed abruptly?',
'another task closed this fd', 'another task closed this fd',
'Debug lock request was CANCELLED?', 'Debug lock request was CANCELLED?',
"TransportClosed: 'MsgpackUDSStream' was already closed locally ?",] f"'Msgpack{tpt}Stream' was already closed locally?",
f"TransportClosed: 'Msgpack{tpt}Stream' was already closed 'by peer'?",
]
# XXX races on whether these show/hit? # XXX races on whether these show/hit?
# 'Failed to REPl via `_pause()` You called `tractor.pause()` from an already cancelled scope!', # 'Failed to REPl via `_pause()` You called `tractor.pause()` from an already cancelled scope!',
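
The `ctlcs_bish` marker used in the hunks above now takes positional flags, and the `ctlc` fixture only skips when ctl-c mode is on and every flag is truthy. Below is a hedged sketch of just that predicate; `should_skip()` is a hypothetical stand-in for the fixture logic, not a function from the suite.

```python
def should_skip(
    mark_name: str,
    mark_args: tuple,
    use_ctlc: bool,
) -> bool:
    return (
        mark_name == 'ctlcs_bish'
        and use_ctlc
        # `all(())` is `True`, so a bare marker (no args)
        # still skips whenever ctl-c mode is on.
        and all(mark_args)
    )
```

Under this reading, `@pytest.mark.ctlcs_bish(_non_linux, _ci_env)` leaves Linux runs and local non-CI runs untouched, since a single false flag disables the skip.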

View File

@ -31,6 +31,9 @@ from .conftest import (
PROMPT, PROMPT,
_pause_msg, _pause_msg,
) )
from ..conftest import (
no_macos,
)
import pytest import pytest
from pexpect.exceptions import ( from pexpect.exceptions import (
@ -42,6 +45,7 @@ if TYPE_CHECKING:
from ..conftest import PexpectSpawner from ..conftest import PexpectSpawner
@no_macos
def test_shield_pause( def test_shield_pause(
spawn: PexpectSpawner, spawn: PexpectSpawner,
): ):
@ -57,6 +61,7 @@ def test_shield_pause(
expect( expect(
child, child,
'Yo my child hanging..?', 'Yo my child hanging..?',
timeout=3,
) )
assert_before( assert_before(
child, child,
@ -121,7 +126,7 @@ def test_shield_pause(
child.pid, child.pid,
signal.SIGINT, signal.SIGINT,
) )
from tractor._supervise import _shutdown_msg from tractor.runtime._supervise import _shutdown_msg
expect( expect(
child, child,
# 'Shutting down actor runtime', # 'Shutting down actor runtime',

View File

@ -0,0 +1,376 @@
'''
Multiaddr construction, parsing, and round-trip tests for
`tractor.discovery._multiaddr.mk_maddr()` and
`tractor.discovery._multiaddr.parse_maddr()`.
'''
from pathlib import Path
from types import SimpleNamespace
import pytest
from multiaddr import Multiaddr
from tractor.ipc._tcp import TCPAddress
from tractor.ipc._uds import UDSAddress
from tractor.discovery._multiaddr import (
mk_maddr,
parse_maddr,
parse_endpoints,
_tpt_proto_to_maddr,
_maddr_to_tpt_proto,
)
from tractor.discovery._addr import wrap_address
def test_tpt_proto_to_maddr_mapping():
'''
`_tpt_proto_to_maddr` maps all supported `proto_key`
values to their correct multiaddr protocol names.
'''
assert _tpt_proto_to_maddr['tcp'] == 'tcp'
assert _tpt_proto_to_maddr['uds'] == 'unix'
assert len(_tpt_proto_to_maddr) == 2
def test_mk_maddr_tcp_ipv4():
'''
`mk_maddr()` on a `TCPAddress` with an IPv4 host
produces the correct `/ip4/<host>/tcp/<port>` multiaddr.
'''
addr = TCPAddress('127.0.0.1', 1234)
result: Multiaddr = mk_maddr(addr)
assert isinstance(result, Multiaddr)
assert str(result) == '/ip4/127.0.0.1/tcp/1234'
protos = result.protocols()
assert protos[0].name == 'ip4'
assert protos[1].name == 'tcp'
assert result.value_for_protocol('ip4') == '127.0.0.1'
assert result.value_for_protocol('tcp') == '1234'
def test_mk_maddr_tcp_ipv6():
'''
`mk_maddr()` on a `TCPAddress` with an IPv6 host
produces the correct `/ip6/<host>/tcp/<port>` multiaddr.
'''
addr = TCPAddress('::1', 5678)
result: Multiaddr = mk_maddr(addr)
assert str(result) == '/ip6/::1/tcp/5678'
protos = result.protocols()
assert protos[0].name == 'ip6'
assert protos[1].name == 'tcp'
def test_mk_maddr_uds():
'''
`mk_maddr()` on a `UDSAddress` produces a `/unix/<path>`
multiaddr containing the full socket path.
'''
# NOTE, use an absolute `filedir` to match real runtime
# UDS paths; `mk_maddr()` strips the leading `/` to avoid
# the double-slash `/unix//run/..` that py-multiaddr
# rejects as "empty protocol path".
filedir = '/tmp/tractor_test'
filename = 'test_sock.sock'
addr = UDSAddress(
filedir=filedir,
filename=filename,
)
result: Multiaddr = mk_maddr(addr)
assert isinstance(result, Multiaddr)
result_str: str = str(result)
assert result_str.startswith('/unix/')
# verify the leading `/` was stripped to avoid double-slash
assert '/unix/tmp/tractor_test/' in result_str
sockpath_rel: str = str(
Path(filedir) / filename
).lstrip('/')
unix_val: str = result.value_for_protocol('unix')
assert unix_val.endswith(sockpath_rel)
def test_mk_maddr_unsupported_proto_key():
'''
`mk_maddr()` raises `ValueError` for an unsupported
`proto_key`.
'''
fake_addr = SimpleNamespace(proto_key='quic')
with pytest.raises(
ValueError,
match='Unsupported proto_key',
):
mk_maddr(fake_addr)
@pytest.mark.parametrize(
'addr',
[
pytest.param(
TCPAddress('127.0.0.1', 9999),
id='tcp-ipv4',
),
pytest.param(
UDSAddress(
filedir='/tmp/tractor_rt',
filename='roundtrip.sock',
),
id='uds',
),
],
)
def test_mk_maddr_roundtrip(addr):
'''
`mk_maddr()` output is valid multiaddr syntax that the
library can re-parse back into an equivalent `Multiaddr`.
'''
maddr: Multiaddr = mk_maddr(addr)
reparsed = Multiaddr(str(maddr))
assert reparsed == maddr
assert str(reparsed) == str(maddr)
# ------ parse_maddr() tests ------
def test_maddr_to_tpt_proto_mapping():
'''
`_maddr_to_tpt_proto` is the exact inverse of
`_tpt_proto_to_maddr`.
'''
assert _maddr_to_tpt_proto == {
'tcp': 'tcp',
'unix': 'uds',
}
def test_parse_maddr_tcp_ipv4():
'''
`parse_maddr()` on an IPv4 TCP multiaddr string
produces a `TCPAddress` with the correct host and port.
'''
result = parse_maddr('/ip4/127.0.0.1/tcp/1234')
assert isinstance(result, TCPAddress)
assert result.unwrap() == ('127.0.0.1', 1234)
def test_parse_maddr_tcp_ipv6():
'''
`parse_maddr()` on an IPv6 TCP multiaddr string
produces a `TCPAddress` with the correct host and port.
'''
result = parse_maddr('/ip6/::1/tcp/5678')
assert isinstance(result, TCPAddress)
assert result.unwrap() == ('::1', 5678)
def test_parse_maddr_uds():
'''
`parse_maddr()` on a `/unix/...` multiaddr string
produces a `UDSAddress` with the correct dir and filename,
preserving absolute path semantics.
'''
result = parse_maddr('/unix/tmp/tractor_test/test.sock')
assert isinstance(result, UDSAddress)
filedir, filename = result.unwrap()
assert filename == 'test.sock'
assert str(filedir) == '/tmp/tractor_test'
def test_parse_maddr_unsupported():
'''
`parse_maddr()` raises `ValueError` for an unsupported
protocol combination like UDP.
'''
with pytest.raises(
ValueError,
match='Unsupported multiaddr protocol combo',
):
parse_maddr('/ip4/127.0.0.1/udp/1234')
@pytest.mark.parametrize(
'addr',
[
pytest.param(
TCPAddress('127.0.0.1', 9999),
id='tcp-ipv4',
),
pytest.param(
UDSAddress(
filedir='/tmp/tractor_rt',
filename='roundtrip.sock',
),
id='uds',
),
],
)
def test_parse_maddr_roundtrip(addr):
'''
Full round-trip: `addr -> mk_maddr -> str -> parse_maddr`
produces an `Address` whose `.unwrap()` matches the original.
'''
maddr: Multiaddr = mk_maddr(addr)
maddr_str: str = str(maddr)
parsed = parse_maddr(maddr_str)
assert type(parsed) is type(addr)
assert parsed.unwrap() == addr.unwrap()
def test_wrap_address_maddr_str():
'''
`wrap_address()` accepts a multiaddr-format string and
returns the correct `Address` type.
'''
result = wrap_address('/ip4/127.0.0.1/tcp/9999')
assert isinstance(result, TCPAddress)
assert result.unwrap() == ('127.0.0.1', 9999)
# ------ parse_endpoints() tests ------
def test_parse_endpoints_tcp_only():
'''
`parse_endpoints()` with a single TCP maddr per actor
produces the correct `TCPAddress` instances.
'''
table = {
'registry': ['/ip4/127.0.0.1/tcp/1616'],
'data_feed': ['/ip4/0.0.0.0/tcp/5555'],
}
result = parse_endpoints(table)
assert set(result.keys()) == {'registry', 'data_feed'}
reg_addr = result['registry'][0]
assert isinstance(reg_addr, TCPAddress)
assert reg_addr.unwrap() == ('127.0.0.1', 1616)
feed_addr = result['data_feed'][0]
assert isinstance(feed_addr, TCPAddress)
assert feed_addr.unwrap() == ('0.0.0.0', 5555)
def test_parse_endpoints_mixed_tpts():
'''
`parse_endpoints()` with both TCP and UDS maddrs for
the same actor produces the correct mixed `Address` list.
'''
table = {
'broker': [
'/ip4/127.0.0.1/tcp/4040',
'/unix/tmp/tractor/broker.sock',
],
}
result = parse_endpoints(table)
addrs = result['broker']
assert len(addrs) == 2
assert isinstance(addrs[0], TCPAddress)
assert addrs[0].unwrap() == ('127.0.0.1', 4040)
assert isinstance(addrs[1], UDSAddress)
filedir, filename = addrs[1].unwrap()
assert filename == 'broker.sock'
assert str(filedir) == '/tmp/tractor'
def test_parse_endpoints_unwrapped_tuples():
'''
`parse_endpoints()` accepts raw `(host, port)` tuples
and wraps them as `TCPAddress`.
'''
table = {
'ems': [('127.0.0.1', 6666)],
}
result = parse_endpoints(table)
addr = result['ems'][0]
assert isinstance(addr, TCPAddress)
assert addr.unwrap() == ('127.0.0.1', 6666)
def test_parse_endpoints_mixed_str_and_tuple():
'''
`parse_endpoints()` accepts a mix of maddr strings and
raw tuples in the same actor entry list.
'''
table = {
'quoter': [
'/ip4/127.0.0.1/tcp/7777',
('127.0.0.1', 8888),
],
}
result = parse_endpoints(table)
addrs = result['quoter']
assert len(addrs) == 2
assert isinstance(addrs[0], TCPAddress)
assert addrs[0].unwrap() == ('127.0.0.1', 7777)
assert isinstance(addrs[1], TCPAddress)
assert addrs[1].unwrap() == ('127.0.0.1', 8888)
def test_parse_endpoints_unsupported_proto():
'''
`parse_endpoints()` raises `ValueError` when a maddr
string uses an unsupported protocol like `/udp/`.
'''
table = {
'bad_actor': ['/ip4/127.0.0.1/udp/9999'],
}
with pytest.raises(
ValueError,
match='Unsupported multiaddr protocol combo',
):
parse_endpoints(table)
def test_parse_endpoints_empty_table():
'''
`parse_endpoints()` on an empty table returns an empty
dict.
'''
assert parse_endpoints({}) == {}
def test_parse_endpoints_empty_actor_list():
'''
`parse_endpoints()` with an actor mapped to an empty
list preserves the key with an empty list value.
'''
result = parse_endpoints({'x': []})
assert result == {'x': []}

View File

@ -1,11 +1,14 @@
""" '''
Actor "discovery" testing Discovery subsystem via a "registrar" actor scenarios.
"""
'''
import os import os
import signal import signal
import platform import platform
from functools import partial from functools import partial
import itertools import itertools
import time
from typing import Callable
import psutil import psutil
import pytest import pytest
@ -13,13 +16,17 @@ import subprocess
import tractor import tractor
from tractor.trionics import collapse_eg from tractor.trionics import collapse_eg
from tractor._testing import tractor_test from tractor._testing import tractor_test
from tractor.discovery._addr import wrap_address
from tractor.discovery._multiaddr import mk_maddr
import trio import trio
@tractor_test @tractor_test
async def test_reg_then_unreg(reg_addr): async def test_reg_then_unreg(
reg_addr: tuple,
):
actor = tractor.current_actor() actor = tractor.current_actor()
assert actor.is_arbiter assert actor.is_registrar
assert len(actor._registry) == 1 # only self is registered assert len(actor._registry) == 1 # only self is registered
async with tractor.open_nursery( async with tractor.open_nursery(
@ -27,10 +34,10 @@ async def test_reg_then_unreg(reg_addr):
) as n: ) as n:
portal = await n.start_actor('actor', enable_modules=[__name__]) portal = await n.start_actor('actor', enable_modules=[__name__])
uid = portal.channel.uid uid = portal.channel.aid.uid
async with tractor.get_registry(reg_addr) as aportal: async with tractor.get_registry(reg_addr) as aportal:
# this local actor should be the arbiter # this local actor should be the registrar
assert actor is aportal.actor assert actor is aportal.actor
async with tractor.wait_for_actor('actor'): async with tractor.wait_for_actor('actor'):
@ -48,6 +55,49 @@ async def test_reg_then_unreg(reg_addr):
assert not sockaddrs assert not sockaddrs
@tractor_test
async def test_reg_then_unreg_maddr(
reg_addr: tuple,
):
'''
Same as `test_reg_then_unreg` but pass the registry
address as a multiaddr string to verify `wrap_address()`
multiaddr parsing end-to-end through the runtime.
'''
# tuple -> Address -> multiaddr string
addr_obj = wrap_address(reg_addr)
maddr_str: str = str(mk_maddr(addr_obj))
actor = tractor.current_actor()
assert actor.is_registrar
async with tractor.open_nursery(
registry_addrs=[maddr_str],
) as n:
portal = await n.start_actor(
'actor_maddr',
enable_modules=[__name__],
)
uid = portal.channel.aid.uid
async with tractor.get_registry(maddr_str) as aportal:
assert actor is aportal.actor
async with tractor.wait_for_actor('actor_maddr'):
assert uid in aportal.actor._registry
sockaddrs = actor._registry[uid]
assert sockaddrs
await n.cancel()
await trio.sleep(0.1)
assert uid not in aportal.actor._registry
sockaddrs = actor._registry.get(uid)
assert not sockaddrs
the_line = 'Hi my name is {}' the_line = 'Hi my name is {}'
@ -82,11 +132,15 @@ async def say_hello_use_wait(
@tractor_test @tractor_test
@pytest.mark.parametrize('func', [say_hello, say_hello_use_wait]) @pytest.mark.parametrize(
'func',
[say_hello,
say_hello_use_wait]
)
async def test_trynamic_trio( async def test_trynamic_trio(
func, func: Callable,
start_method, start_method: str,
reg_addr, reg_addr: tuple,
): ):
''' '''
Root actor acting as the "director" and running one-shot-task-actors Root actor acting as the "director" and running one-shot-task-actors
@ -119,7 +173,10 @@ async def stream_forever():
await trio.sleep(0.01) await trio.sleep(0.01)
async def cancel(use_signal, delay=0): async def cancel(
use_signal: bool,
delay: float = 0,
):
# hold on there sally # hold on there sally
await trio.sleep(delay) await trio.sleep(delay)
@ -132,15 +189,17 @@ async def cancel(use_signal, delay=0):
raise KeyboardInterrupt raise KeyboardInterrupt
async def stream_from(portal): async def stream_from(portal: tractor.Portal):
async with portal.open_stream_from(stream_forever) as stream: async with portal.open_stream_from(stream_forever) as stream:
async for value in stream: async for value in stream:
print(value) print(value)
async def unpack_reg(actor_or_portal): async def unpack_reg(
actor_or_portal: tractor.Portal|tractor.Actor,
):
''' '''
Get and unpack a "registry" RPC request from the "arbiter" registry Get and unpack a "registry" RPC request from the registrar
system. system.
''' '''
@ -149,7 +208,10 @@ async def unpack_reg(actor_or_portal):
else: else:
msg = await actor_or_portal.run_from_ns('self', 'get_registry') msg = await actor_or_portal.run_from_ns('self', 'get_registry')
return {tuple(key.split('.')): val for key, val in msg.items()} return {
tuple(key.split('.')): val
for key, val in msg.items()
}
async def spawn_and_check_registry( async def spawn_and_check_registry(
@ -173,24 +235,26 @@ async def spawn_and_check_registry(
registry_addrs=[reg_addr], registry_addrs=[reg_addr],
debug_mode=debug_mode, debug_mode=debug_mode,
): ):
async with tractor.get_registry(reg_addr) as portal: async with tractor.get_registry(
addr=reg_addr,
) as portal:
# runtime needs to be up to call this # runtime needs to be up to call this
actor = tractor.current_actor() actor = tractor.current_actor()
if remote_arbiter: if remote_arbiter:
assert not actor.is_arbiter assert not actor.is_registrar
if actor.is_arbiter: if actor.is_registrar:
extra = 1 # arbiter is local root actor extra = 1 # registrar is local root actor
get_reg = partial(unpack_reg, actor) get_reg = partial(unpack_reg, actor)
else: else:
get_reg = partial(unpack_reg, portal) get_reg = partial(unpack_reg, portal)
extra = 2 # local root actor + remote arbiter extra = 2 # local root actor + remote registrar
# ensure current actor is registered # ensure current actor is registered
registry: dict = await get_reg() registry: dict = await get_reg()
assert actor.uid in registry assert actor.aid.uid in registry
try: try:
async with tractor.open_nursery() as an: async with tractor.open_nursery() as an:
@ -238,22 +302,35 @@ async def spawn_and_check_registry(
# all subactors should have de-registered # all subactors should have de-registered
registry = await get_reg() registry = await get_reg()
start: float = time.time()
while (
not (len(registry) == extra)
and
(time.time() - start) < 5
):
print(
f'Waiting for remaining subs to dereg..\n'
f'{registry!r}\n'
)
await trio.sleep(0.3)
registry = await get_reg()
else:
assert len(registry) == extra assert len(registry) == extra
assert actor.uid in registry
assert actor.aid.uid in registry
@pytest.mark.parametrize('use_signal', [False, True]) @pytest.mark.parametrize('use_signal', [False, True])
@pytest.mark.parametrize('with_streaming', [False, True]) @pytest.mark.parametrize('with_streaming', [False, True])
def test_subactors_unregister_on_cancel( def test_subactors_unregister_on_cancel(
debug_mode: bool, debug_mode: bool,
start_method, start_method: str,
use_signal, use_signal: bool,
reg_addr, reg_addr: tuple,
with_streaming, with_streaming: bool,
): ):
''' '''
Verify that cancelling a nursery results in all subactors Verify that cancelling a nursery results in all subactors
deregistering themselves with the arbiter. deregistering themselves with the registrar.
''' '''
with pytest.raises(KeyboardInterrupt): with pytest.raises(KeyboardInterrupt):
@ -274,15 +351,17 @@ def test_subactors_unregister_on_cancel(
def test_subactors_unregister_on_cancel_remote_daemon( def test_subactors_unregister_on_cancel_remote_daemon(
daemon: subprocess.Popen, daemon: subprocess.Popen,
debug_mode: bool, debug_mode: bool,
start_method, start_method: str,
use_signal, use_signal: bool,
reg_addr, reg_addr: tuple,
with_streaming, with_streaming: bool,
): ):
"""Verify that cancelling a nursery results in all subactors '''
deregistering themselves with a **remote** (not in the local process Verify that cancelling a nursery results in all subactors
tree) arbiter. deregistering themselves with a **remote** (not in the local
""" process tree) registrar.
'''
with pytest.raises(KeyboardInterrupt): with pytest.raises(KeyboardInterrupt):
trio.run( trio.run(
partial( partial(
@ -325,20 +404,24 @@ async def close_chans_before_nursery(
try: try:
get_reg = partial(unpack_reg, aportal) get_reg = partial(unpack_reg, aportal)
async with tractor.open_nursery() as tn: async with tractor.open_nursery() as an:
portal1 = await tn.start_actor( portal1 = await an.start_actor(
name='consumer1', enable_modules=[__name__]) name='consumer1',
portal2 = await tn.start_actor( enable_modules=[__name__],
'consumer2', enable_modules=[__name__]) )
portal2 = await an.start_actor(
'consumer2',
enable_modules=[__name__],
)
# TODO: compact this back as was in last commit once async with (
# 3.9+, see https://github.com/goodboy/tractor/issues/207 portal1.open_stream_from(
async with portal1.open_stream_from(
stream_forever stream_forever
) as agen1: ) as agen1,
async with portal2.open_stream_from( portal2.open_stream_from(
stream_forever stream_forever
) as agen2: ) as agen2,
):
async with ( async with (
collapse_eg(), collapse_eg(),
trio.open_nursery() as tn, trio.open_nursery() as tn,
@ -349,7 +432,7 @@ async def close_chans_before_nursery(
await streamer(agen2) await streamer(agen2)
finally: finally:
# Kill the root nursery thus resulting in # Kill the root nursery thus resulting in
# normal arbiter channel ops to fail during # normal registrar channel ops to fail during
# teardown. It doesn't seem like this is # teardown. It doesn't seem like this is
# reliably triggered by an external SIGINT. # reliably triggered by an external SIGINT.
# tractor.current_actor()._root_nursery.cancel_scope.cancel() # tractor.current_actor()._root_nursery.cancel_scope.cancel()
@ -361,27 +444,30 @@ async def close_chans_before_nursery(
# also kill off channels cuz why not # also kill off channels cuz why not
await agen1.aclose() await agen1.aclose()
await agen2.aclose() await agen2.aclose()
finally: finally:
with trio.CancelScope(shield=True): with trio.CancelScope(shield=True):
await trio.sleep(1) await trio.sleep(1)
# all subactors should have de-registered # all subactors should have de-registered
registry = await get_reg() registry = await get_reg()
assert portal1.channel.uid not in registry assert portal1.channel.aid.uid not in registry
assert portal2.channel.uid not in registry assert portal2.channel.aid.uid not in registry
assert len(registry) == entries_at_end assert len(registry) == entries_at_end
@pytest.mark.parametrize('use_signal', [False, True]) @pytest.mark.parametrize('use_signal', [False, True])
def test_close_channel_explicit( def test_close_channel_explicit(
start_method, start_method: str,
use_signal, use_signal: bool,
reg_addr, reg_addr: tuple,
): ):
"""Verify that closing a stream explicitly and killing the actor's '''
Verify that closing a stream explicitly and killing the actor's
"root nursery" **before** the containing nursery tears down also "root nursery" **before** the containing nursery tears down also
results in subactor(s) deregistering from the arbiter. results in subactor(s) deregistering from the registrar.
"""
'''
with pytest.raises(KeyboardInterrupt): with pytest.raises(KeyboardInterrupt):
trio.run( trio.run(
partial( partial(
@ -394,16 +480,18 @@ def test_close_channel_explicit(
@pytest.mark.parametrize('use_signal', [False, True]) @pytest.mark.parametrize('use_signal', [False, True])
def test_close_channel_explicit_remote_arbiter( def test_close_channel_explicit_remote_registrar(
daemon: subprocess.Popen, daemon: subprocess.Popen,
start_method, start_method: str,
use_signal, use_signal: bool,
reg_addr, reg_addr: tuple,
): ):
"""Verify that closing a stream explicitly and killing the actor's '''
Verify that closing a stream explicitly and killing the actor's
"root nursery" **before** the containing nursery tears down also "root nursery" **before** the containing nursery tears down also
results in subactor(s) deregistering from the arbiter. results in subactor(s) deregistering from the registrar.
"""
'''
with pytest.raises(KeyboardInterrupt): with pytest.raises(KeyboardInterrupt):
trio.run( trio.run(
partial( partial(
@ -413,3 +501,65 @@ def test_close_channel_explicit_remote_arbiter(
remote_arbiter=True, remote_arbiter=True,
), ),
) )
@tractor.context
async def kill_transport(
ctx: tractor.Context,
) -> None:
await ctx.started()
actor: tractor.Actor = tractor.current_actor()
actor.ipc_server.cancel()
await trio.sleep_forever()
# @pytest.mark.parametrize('use_signal', [False, True])
def test_stale_entry_is_deleted(
debug_mode: bool,
daemon: subprocess.Popen,
start_method: str,
reg_addr: tuple,
):
'''
Ensure that when a stale entry is detected in the registrar's
table that the `find_actor()` API takes care of deleting the
stale entry and not delivering a bad portal.
'''
async def main():
name: str = 'transport_fails_actor'
_reg_ptl: tractor.Portal
an: tractor.ActorNursery
async with (
tractor.open_nursery(
debug_mode=debug_mode,
registry_addrs=[reg_addr],
) as an,
tractor.get_registry(reg_addr) as _reg_ptl,
):
ptl: tractor.Portal = await an.start_actor(
name,
enable_modules=[__name__],
)
async with ptl.open_context(
kill_transport,
) as (first, ctx):
async with tractor.find_actor(
name,
registry_addrs=[reg_addr],
) as maybe_portal:
# because the transitive
# `._api.maybe_open_portal()` call should
# fail and implicitly call `.delete_addr()`
assert maybe_portal is None
registry: dict = await unpack_reg(_reg_ptl)
assert ptl.chan.aid.uid not in registry
# should fail since we knocked out the IPC tpt XD
await ptl.cancel_actor()
await an.cancel()
trio.run(main)
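
The deregistration wait in `spawn_and_check_registry()` earlier in this file is a poll-until-deadline loop. Below is a synchronous, stdlib-only sketch of that pattern; the real test is async under `trio`, and `wait_for()` with its `fetch` and `done` parameters is a hypothetical helper rather than anything in `tractor`.

```python
import time
from typing import Callable


def wait_for(
    fetch: Callable[[], dict],
    done: Callable[[dict], bool],
    timeout: float = 5,
    interval: float = 0.3,
) -> dict:
    start: float = time.monotonic()
    state: dict = fetch()
    while (
        not done(state)
        and (time.monotonic() - start) < timeout
    ):
        time.sleep(interval)
        # re-fetch each iteration so the predicate sees
        # fresh state rather than the initial snapshot.
        state = fetch()

    # caller asserts on the final snapshot, pass or fail.
    return state
```

Returning the last snapshot instead of raising on timeout leaves the final `assert` in the caller, so a failure message can show the registry contents that never drained.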

View File

@ -0,0 +1,345 @@
'''
`open_root_actor(tpt_bind_addrs=...)` test suite.
Verify all three runtime code paths for explicit IPC-server
bind-address selection in `_root.py`:
1. Non-registrar, no explicit bind -> random addrs from registry proto
2. Registrar, no explicit bind -> binds to registry_addrs
3. Explicit bind given -> wraps via `wrap_address()` and uses them
'''
import pytest
import trio
import tractor
from tractor.discovery._addr import (
wrap_address,
)
from tractor.discovery._multiaddr import mk_maddr
from tractor._testing.addr import get_rando_addr
# ------------------------------------------------------------------
# helpers
# ------------------------------------------------------------------
def _bound_bindspaces(
actor: tractor.Actor,
) -> set[str]:
'''
Collect the set of bindspace strings from the actor's
currently bound IPC-server accept addresses.
'''
return {
wrap_address(a).bindspace
for a in actor.accept_addrs
}
def _bound_wrapped(
actor: tractor.Actor,
) -> list:
'''
Return the actor's accept addrs as wrapped `Address` objects.
'''
return [
wrap_address(a)
for a in actor.accept_addrs
]
# ------------------------------------------------------------------
# 1) Registrar + explicit tpt_bind_addrs
# ------------------------------------------------------------------
@pytest.mark.parametrize(
'addr_combo',
[
'bind-eq-reg',
'bind-subset-reg',
'bind-disjoint-reg',
],
ids=lambda v: v,
)
def test_registrar_root_tpt_bind_addrs(
reg_addr: tuple,
tpt_proto: str,
debug_mode: bool,
addr_combo: str,
):
'''
Registrar root-actor with explicit `tpt_bind_addrs`:
bound set must include all registry + all bind addr bindspaces
(merge behavior).
'''
reg_wrapped = wrap_address(reg_addr)
if addr_combo == 'bind-eq-reg':
bind_addrs = [reg_addr]
extra_reg = []
elif addr_combo == 'bind-subset-reg':
# extra secondary reg addr for subset test
second_reg = get_rando_addr(tpt_proto)
bind_addrs = [reg_addr]
extra_reg = [second_reg]
elif addr_combo == 'bind-disjoint-reg':
# port=0 on same host -> completely different addr
rando = wrap_address(reg_addr).get_random(
bindspace=reg_wrapped.bindspace,
)
bind_addrs = [rando.unwrap()]
extra_reg = []
all_reg = [reg_addr] + extra_reg
async def _main():
async with tractor.open_root_actor(
registry_addrs=all_reg,
tpt_bind_addrs=bind_addrs,
debug_mode=debug_mode,
):
actor = tractor.current_actor()
assert actor.is_registrar
bound = actor.accept_addrs
bound_bs = _bound_bindspaces(actor)
# all registry bindspaces must appear in bound set
for ra in all_reg:
assert wrap_address(ra).bindspace in bound_bs
# all bind-addr bindspaces must appear
for ba in bind_addrs:
assert wrap_address(ba).bindspace in bound_bs
# registry addr must appear verbatim in bound
# (after wrapping both sides for comparison)
bound_w = _bound_wrapped(actor)
assert reg_wrapped in bound_w
if addr_combo == 'bind-disjoint-reg':
assert len(bound) >= 2
trio.run(_main)
# ------------------------------------------------------------------
# 2) Non-registrar + explicit tpt_bind_addrs
# ------------------------------------------------------------------
@pytest.mark.parametrize(
'addr_combo',
[
'bind-same-bindspace',
'bind-disjoint',
],
ids=lambda v: v,
)
def test_non_registrar_root_tpt_bind_addrs(
daemon,
reg_addr: tuple,
tpt_proto: str,
debug_mode: bool,
addr_combo: str,
):
'''
Non-registrar root with explicit `tpt_bind_addrs`:
bound set must exactly match the requested bind addrs
(no merge with registry).
'''
reg_wrapped = wrap_address(reg_addr)
if addr_combo == 'bind-same-bindspace':
# same bindspace as reg but port=0 so we get a random port
rando = reg_wrapped.get_random(
bindspace=reg_wrapped.bindspace,
)
bind_addrs = [rando.unwrap()]
elif addr_combo == 'bind-disjoint':
# NOTE: built the same way as the `bind-same-bindspace` case above
rando = reg_wrapped.get_random(
bindspace=reg_wrapped.bindspace,
)
bind_addrs = [rando.unwrap()]
async def _main():
async with tractor.open_root_actor(
registry_addrs=[reg_addr],
tpt_bind_addrs=bind_addrs,
debug_mode=debug_mode,
):
actor = tractor.current_actor()
assert not actor.is_registrar
bound = actor.accept_addrs
assert len(bound) == len(bind_addrs)
# bindspaces must match
bound_bs = _bound_bindspaces(actor)
for ba in bind_addrs:
assert wrap_address(ba).bindspace in bound_bs
# TCP port=0 should resolve to a real port
for uw_addr in bound:
w = wrap_address(uw_addr)
if w.proto_key == 'tcp':
_host, port = uw_addr
assert port > 0
trio.run(_main)
# ------------------------------------------------------------------
# 3) Non-registrar, default random bind (baseline)
# ------------------------------------------------------------------
def test_non_registrar_default_random_bind(
daemon,
reg_addr: tuple,
debug_mode: bool,
):
'''
Baseline: no `tpt_bind_addrs`, daemon running.
Bound bindspace matches registry bindspace,
but bound addr differs from reg_addr (random).
'''
reg_wrapped = wrap_address(reg_addr)
async def _main():
async with tractor.open_root_actor(
registry_addrs=[reg_addr],
debug_mode=debug_mode,
):
actor = tractor.current_actor()
assert not actor.is_registrar
bound_bs = _bound_bindspaces(actor)
assert reg_wrapped.bindspace in bound_bs
# bound addr should differ from the registry addr
# (the runtime picks a random port/path)
bound_w = _bound_wrapped(actor)
assert reg_wrapped not in bound_w
trio.run(_main)
# ------------------------------------------------------------------
# 4) Multiaddr string input
# ------------------------------------------------------------------
def test_tpt_bind_addrs_as_maddr_str(
reg_addr: tuple,
debug_mode: bool,
):
'''
Pass multiaddr strings as `tpt_bind_addrs`.
Runtime should parse and bind successfully.
'''
reg_wrapped = wrap_address(reg_addr)
# build a port-0 / random maddr string for binding
rando = reg_wrapped.get_random(
bindspace=reg_wrapped.bindspace,
)
maddr_str: str = str(mk_maddr(rando))
async def _main():
async with tractor.open_root_actor(
registry_addrs=[reg_addr],
tpt_bind_addrs=[maddr_str],
debug_mode=debug_mode,
):
actor = tractor.current_actor()
assert actor.is_registrar
for uw_addr in actor.accept_addrs:
w = wrap_address(uw_addr)
if w.proto_key == 'tcp':
_host, port = uw_addr
assert port > 0
trio.run(_main)
# ------------------------------------------------------------------
# 5) Registrar merge produces union of binds
# ------------------------------------------------------------------
def test_registrar_merge_binds_union(
tpt_proto: str,
debug_mode: bool,
):
'''
Registrar + disjoint bind addr: bound set must include
both registry and explicit bind addresses.
'''
reg_addr = get_rando_addr(tpt_proto)
reg_wrapped = wrap_address(reg_addr)
rando = reg_wrapped.get_random(
bindspace=reg_wrapped.bindspace,
)
bind_addrs = [rando.unwrap()]
# NOTE: for UDS, `get_random()` produces the same
# filename for the same pid+actor-state, so the
# "disjoint" premise only holds when the addrs
# actually differ (always true for TCP, may
# collide for UDS).
expect_disjoint: bool = (
tuple(reg_addr) != rando.unwrap()
)
async def _main():
async with tractor.open_root_actor(
registry_addrs=[reg_addr],
tpt_bind_addrs=bind_addrs,
debug_mode=debug_mode,
):
actor = tractor.current_actor()
assert actor.is_registrar
bound = actor.accept_addrs
bound_w = _bound_wrapped(actor)
if expect_disjoint:
# must have at least 2 (registry + bind)
assert len(bound) >= 2
# registry addr must appear in bound set
assert reg_wrapped in bound_w
trio.run(_main)
# ------------------------------------------------------------------
# 6) open_nursery forwards tpt_bind_addrs
# ------------------------------------------------------------------
def test_open_nursery_forwards_tpt_bind_addrs(
reg_addr: tuple,
debug_mode: bool,
):
'''
`open_nursery(tpt_bind_addrs=...)` forwards through
`**kwargs` to `open_root_actor()`.
'''
reg_wrapped = wrap_address(reg_addr)
rando = reg_wrapped.get_random(
bindspace=reg_wrapped.bindspace,
)
bind_addrs = [rando.unwrap()]
async def _main():
async with tractor.open_nursery(
registry_addrs=[reg_addr],
tpt_bind_addrs=bind_addrs,
debug_mode=debug_mode,
):
actor = tractor.current_actor()
bound_bs = _bound_bindspaces(actor)
for ba in bind_addrs:
assert wrap_address(ba).bindspace in bound_bs
trio.run(_main)
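
Several of the tests above rely on the fact that binding a TCP socket to port 0 makes the OS pick a free port, so the bound port ends up greater than 0. A minimal stdlib-only sketch of that behavior follows; `bind_ephemeral()` is a hypothetical helper written for illustration and is not part of `tractor`.

```python
import socket


def bind_ephemeral(host: str = '127.0.0.1') -> tuple[str, int]:
    '''
    Bind a TCP socket to port 0 and return the address the OS
    actually assigned (hypothetical helper, illustration only).

    '''
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # port 0 asks the OS to choose any free port
        sock.bind((host, 0))
        # `getsockname()` reports the concrete (host, port) pair
        return sock.getsockname()


if __name__ == '__main__':
    host, port = bind_ephemeral()
    print(f'requested port 0, bound to {host}:{port}')
```

This is the same property the `assert port > 0` checks exercise against the actor's accept addresses.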


@ -8,17 +8,16 @@ from pathlib import Path
 import pytest
 import trio
 import tractor
-from tractor import (
-    Actor,
-    _state,
-    _addr,
-)
+from tractor import Actor
+from tractor.runtime import _state
+from tractor.discovery import _addr
 @pytest.fixture
 def bindspace_dir_str() -> str:
-    rt_dir: Path = tractor._state.get_rt_dir()
+    from tractor.runtime._state import get_rt_dir
+    rt_dir: Path = get_rt_dir()
     bs_dir: Path = rt_dir / 'doggy'
     bs_dir_str: str = str(bs_dir)
     assert not bs_dir.is_dir()


@ -13,9 +13,9 @@ from tractor import (
     Portal,
     ipc,
     msg,
-    _state,
-    _addr,
 )
+from tractor.runtime import _state
+from tractor.discovery import _addr
 @tractor.context
 async def chk_tpts(
@ -62,6 +62,13 @@ def test_root_passes_tpt_to_sub(
     reg_addr: tuple,
     debug_mode: bool,
 ):
+    # XXX NOTE, the `reg_addr` addr won't be the same type as the
+    # `tpt_proto_key` would deliver here unless you pass `--tpt-proto
+    # <tpt_proto_key>` on the CLI.
+    #
+    # if tpt_proto_key == 'uds':
+    #     breakpoint()
     async def main():
         async with tractor.open_nursery(
             enable_transports=[tpt_proto_key],


@ -0,0 +1,4 @@
'''
`tractor.msg.*` sub-sys test suite.
'''


@ -0,0 +1,4 @@
'''
`tractor.msg.*` test sub-pkg conf.
'''


@ -61,7 +61,7 @@ async def maybe_expect_raises(
     Async wrapper for ensuring errors propagate from the inner scope.
     '''
-    if tractor._state.debug_mode():
+    if tractor.debug_mode():
         timeout += 999
     with trio.fail_after(timeout):


@ -0,0 +1,240 @@
'''
Unit tests for `tractor.msg.pretty_struct`
private-field filtering in `pformat()`.
'''
import pytest
from tractor.msg.pretty_struct import (
Struct,
pformat,
iter_struct_ppfmt_lines,
)
from tractor.msg._codec import (
MsgDec,
mk_dec,
)
# ------ test struct definitions ------ #
class PublicOnly(Struct):
'''
All-public fields for baseline testing.
'''
name: str = 'alice'
age: int = 30
class PrivateOnly(Struct):
'''
Only underscore-prefixed (private) fields.
'''
_secret: str = 'hidden'
_internal: int = 99
class MixedFields(Struct):
'''
Mix of public and private fields.
'''
name: str = 'bob'
_hidden: int = 42
value: float = 3.14
_meta: str = 'internal'
class Inner(
Struct,
frozen=True,
):
'''
Frozen inner struct with a private field,
for nesting tests.
'''
x: int = 1
_secret: str = 'nope'
class Outer(Struct):
'''
Outer struct nesting an `Inner`.
'''
label: str = 'outer'
inner: Inner = Inner()
class EmptyStruct(Struct):
'''
Struct with zero fields.
'''
pass
# ------ tests ------ #
@pytest.mark.parametrize(
'struct_and_expected',
[
(
PublicOnly(),
{
'shown': ['name', 'age'],
'hidden': [],
},
),
(
MixedFields(),
{
'shown': ['name', 'value'],
'hidden': ['_hidden', '_meta'],
},
),
(
PrivateOnly(),
{
'shown': [],
'hidden': ['_secret', '_internal'],
},
),
],
ids=[
'all-public',
'mixed-pub-priv',
'all-private',
],
)
def test_field_visibility_in_pformat(
struct_and_expected: tuple[
Struct,
dict[str, list[str]],
],
):
'''
Verify `pformat()` shows public fields
and hides `_`-prefixed private fields.
'''
(
struct,
expected,
) = struct_and_expected
output: str = pformat(struct)
for field_name in expected['shown']:
assert field_name in output, (
f'{field_name!r} should appear in:\n'
f'{output}'
)
for field_name in expected['hidden']:
assert field_name not in output, (
f'{field_name!r} should NOT appear in:\n'
f'{output}'
)
def test_iter_ppfmt_lines_skips_private():
'''
Directly verify `iter_struct_ppfmt_lines()`
never yields tuples with `_`-prefixed field
names.
'''
struct = MixedFields()
lines: list[tuple[str, str]] = list(
iter_struct_ppfmt_lines(
struct,
field_indent=2,
)
)
# should have lines for public fields only
assert len(lines) == 2
for _prefix, line_content in lines:
field_name: str = (
line_content.split(':')[0].strip()
)
assert not field_name.startswith('_'), (
f'private field leaked: {field_name!r}'
)
def test_nested_struct_filters_inner_private():
'''
Verify that nested struct's private fields
are also filtered out during recursion.
'''
outer = Outer()
output: str = pformat(outer)
# outer's public field
assert 'label' in output
# inner's public field (recursed into)
assert 'x' in output
# inner's private field must be hidden
assert '_secret' not in output
def test_empty_struct_pformat():
'''
An empty struct should produce a valid
`pformat()` result with no field lines.
'''
output: str = pformat(EmptyStruct())
assert 'EmptyStruct(' in output
assert output.rstrip().endswith(')')
# no field lines => only struct header+footer
lines: list[tuple[str, str]] = list(
iter_struct_ppfmt_lines(
EmptyStruct(),
field_indent=2,
)
)
assert lines == []
def test_real_msgdec_pformat_hides_private():
'''
Verify `pformat()` on a real `MsgDec`
hides the `_dec` internal field.
NOTE: `MsgDec.__repr__` is custom and does
NOT call `pformat()`, so we call it directly.
'''
dec: MsgDec = mk_dec(spec=int)
output: str = pformat(dec)
# the private `_dec` field should be filtered
assert '_dec' not in output
# but the struct type name should be present
assert 'MsgDec(' in output
def test_pformat_repr_integration():
'''
Verify that `Struct.__repr__()` (which calls
`pformat()`) also hides private fields for
custom structs that do NOT override `__repr__`.
'''
mixed = MixedFields()
output: str = repr(mixed)
assert 'name' in output
assert 'value' in output
assert '_hidden' not in output
assert '_meta' not in output
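
The tests above check that `pformat()` shows public fields and hides `_`-prefixed ones. The filtering idea can be sketched without `tractor` or `msgspec`, using a plain dataclass instead; `public_field_lines()` and `pformat_public()` are hypothetical names made up for this illustration and do not reflect how `tractor.msg.pretty_struct` is actually implemented.

```python
from dataclasses import dataclass, fields


@dataclass
class MixedFields:
    # same field layout as the `MixedFields` test struct above
    name: str = 'bob'
    _hidden: int = 42
    value: float = 3.14
    _meta: str = 'internal'


def public_field_lines(obj) -> list[str]:
    '''
    Render one `name=value` entry per public field, skipping
    any field whose name starts with an underscore.

    '''
    return [
        f'{f.name}={getattr(obj, f.name)!r}'
        for f in fields(obj)
        if not f.name.startswith('_')
    ]


def pformat_public(obj) -> str:
    '''
    Wrap the public field entries in a `TypeName(...)` header
    and footer, one field per line.

    '''
    body: str = ''.join(
        f'  {line},\n'
        for line in public_field_lines(obj)
    )
    return f'{type(obj).__name__}(\n{body})'


if __name__ == '__main__':
    print(pformat_public(MixedFields()))
```

Filtering at the point where field lines are generated keeps the header and footer logic unaware of field visibility, which is also why an all-private struct still renders as a valid (empty) body.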


@ -1,7 +1,12 @@
-"""
-Bidirectional streaming.
-"""
+'''
+Audit the simplest inter-actor bidirectional (streaming)
+msg patterns.
+'''
+from __future__ import annotations
+from typing import (
+    Callable,
+)
 import pytest
 import trio
 import tractor
@ -9,10 +14,8 @@ import tractor
 @tractor.context
 async def simple_rpc(
     ctx: tractor.Context,
     data: int,
 ) -> None:
     '''
     Test a small ping-pong server.
@ -39,15 +42,13 @@ async def simple_rpc(
 @tractor.context
 async def simple_rpc_with_forloop(
     ctx: tractor.Context,
     data: int,
 ) -> None:
-    """Same as previous test but using ``async for`` syntax/api.
-
-    """
+    '''
+    Same as previous test but using `async for` syntax/api.
+
+    '''
     # signal to parent that we're up
     await ctx.started(data + 1)
@ -68,21 +69,37 @@ async def simple_rpc_with_forloop(
 @pytest.mark.parametrize(
     'use_async_for',
-    [True, False],
+    [
+        True,
+        False,
+    ],
+    ids='use_async_for={}'.format,
 )
 @pytest.mark.parametrize(
     'server_func',
-    [simple_rpc, simple_rpc_with_forloop],
+    [
+        simple_rpc,
+        simple_rpc_with_forloop,
+    ],
+    ids='server_func={}'.format,
 )
-def test_simple_rpc(server_func, use_async_for):
+def test_simple_rpc(
+    server_func: Callable,
+    use_async_for: bool,
+    loglevel: str,
+    debug_mode: bool,
+):
     '''
     The simplest request response pattern.
     '''
     async def main():
-        async with tractor.open_nursery() as n:
-            portal = await n.start_actor(
-                'rpc_server',
-                enable_modules=[__name__],
-            )
+        with trio.fail_after(6):
+            async with tractor.open_nursery(
+                loglevel=loglevel,
+                debug_mode=debug_mode,
+            ) as an:
+                portal: tractor.Portal = await an.start_actor(
+                    'rpc_server',
+                    enable_modules=[__name__],
+                )


@ -98,7 +98,8 @@ def test_ipc_channel_break_during_stream(
     expect_final_exc = TransportClosed
     mod: ModuleType = import_path(
-        examples_dir() / 'advanced_faults'
+        examples_dir()
+        / 'advanced_faults'
         / 'ipc_failure_during_stream.py',
         root=examples_dir(),
         consider_namespace_packages=False,
@ -113,8 +114,9 @@ def test_ipc_channel_break_during_stream(
     if (
         # only expect EoC if trans is broken on the child side,
         ipc_break['break_child_ipc_after'] is not False
+        and
         # AND we tell the child to call `MsgStream.aclose()`.
-        and pre_aclose_msgstream
+        pre_aclose_msgstream
     ):
         # expect_final_exc = trio.EndOfChannel
         # ^XXX NOPE! XXX^ since now `.open_stream()` absorbs this
@ -144,9 +146,6 @@ def test_ipc_channel_break_during_stream(
         # a user sending ctl-c by raising a KBI.
         if pre_aclose_msgstream:
             expect_final_exc = KeyboardInterrupt
-            if tpt_proto == 'uds':
-                expect_final_exc = TransportClosed
-                expect_final_cause = trio.BrokenResourceError
     # XXX OLD XXX
     # if child calls `MsgStream.aclose()` then expect EoC.
@ -160,16 +159,13 @@ def test_ipc_channel_break_during_stream(
         ipc_break['break_child_ipc_after'] is not False
         and (
             ipc_break['break_parent_ipc_after']
-            > ipc_break['break_child_ipc_after']
+            >
+            ipc_break['break_child_ipc_after']
         )
     ):
         if pre_aclose_msgstream:
             expect_final_exc = KeyboardInterrupt
-            if tpt_proto == 'uds':
-                expect_final_exc = TransportClosed
-                expect_final_cause = trio.BrokenResourceError
     # NOTE when the parent IPC side dies (even if the child does as well
     # but the child fails BEFORE the parent) we always expect the
     # IPC layer to raise a closed-resource, NEVER do we expect
@ -248,8 +244,15 @@ def test_ipc_channel_break_during_stream(
     # get raw instance from pytest wrapper
     value = excinfo.value
     if isinstance(value, ExceptionGroup):
-        excs = value.exceptions
-        assert len(excs) == 1
+        excs: tuple[Exception, ...] = value.exceptions
+        assert (
+            len(excs) <= 2
+            and
+            all(
+                isinstance(exc, TransportClosed)
+                for exc in excs
+            )
+        )
         final_exc = excs[0]
         assert isinstance(final_exc, expect_final_exc)


@ -17,8 +17,8 @@ from tractor._testing import (
 from .conftest import no_windows
-def is_win():
-    return platform.system() == 'Windows'
+_non_linux: bool = platform.system() != 'Linux'
+_friggin_windows: bool = platform.system() == 'Windows'
 async def assert_err(delay=0):
@ -431,7 +431,7 @@ async def test_nested_multierrors(loglevel, start_method):
         for subexc in err.exceptions:
             # verify first level actor errors are wrapped as remote
-            if is_win():
+            if _friggin_windows:
                 # windows is often too slow and cancellation seems
                 # to happen before an actor is spawned
@ -464,7 +464,7 @@ async def test_nested_multierrors(loglevel, start_method):
                 # XXX not sure what's up with this..
                 # on windows sometimes spawning is just too slow and
                 # we get back the (sent) cancel signal instead
-                if is_win():
+                if _friggin_windows:
                     if isinstance(subexc, tractor.RemoteActorError):
                         assert subexc.boxed_type in (
                             BaseExceptionGroup,
@ -490,7 +490,7 @@ def test_cancel_via_SIGINT(
     """Ensure that a control-C (SIGINT) signal cancels both the parent and
     child processes in trionic fashion
     """
-    pid = os.getpid()
+    pid: int = os.getpid()
     async def main():
         with trio.fail_after(2):
@ -507,19 +507,31 @@ def test_cancel_via_SIGINT(
 @no_windows
 def test_cancel_via_SIGINT_other_task(
-    loglevel,
-    start_method,
-    spawn_backend,
+    loglevel: str,
+    start_method: str,
+    spawn_backend: str,
 ):
-    """Ensure that a control-C (SIGINT) signal cancels both the parent
-    and child processes in trionic fashion even a subprocess is started
-    from a seperate ``trio`` child task.
-    """
-    pid = os.getpid()
-    timeout: float = 2
-    if is_win():  # smh
+    '''
+    Ensure that a control-C (SIGINT) signal cancels both the parent
+    and child processes in trionic fashion even a subprocess is
+    started from a separate ``trio`` child task.
+
+    '''
+    from .conftest import cpu_scaling_factor
+    pid: int = os.getpid()
+    timeout: float = (
+        4 if _non_linux
+        else 2
+    )
+    if _friggin_windows:  # smh
         timeout += 1
+    # add latency headroom for CPU freq scaling (auto-cpufreq et al.)
+    headroom: float = cpu_scaling_factor()
+    if headroom != 1.:
+        timeout *= headroom
     async def spawn_and_sleep_forever(
         task_status=trio.TASK_STATUS_IGNORED
     ):
@ -644,7 +656,11 @@ def test_cancel_while_childs_child_in_sync_sleep(
     #
     # delay = 1  # no AssertionError in eg, TooSlowError raised.
     # delay = 2  # is AssertionError in eg AND no TooSlowError !?
-    delay = 4  # is AssertionError in eg AND no _cs cancellation.
+    # is AssertionError in eg AND no _cs cancellation.
+    delay = (
+        6 if _non_linux
+        else 4
+    )
     with trio.fail_after(delay) as _cs:
     # with trio.CancelScope() as cs:
@ -696,7 +712,7 @@ def test_fast_graceful_cancel_when_spawn_task_in_soft_proc_wait_for_daemon(
     kbi_delay = 0.5
     timeout: float = 2.9
-    if is_win():  # smh
+    if _friggin_windows:  # smh
         timeout += 1
     async def main():
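
The timeout handling in this file follows one pattern: pick a base by platform, add a second on Windows, then multiply by a CPU-scaling headroom factor. A standalone sketch of that arithmetic follows; `scaled_timeout()` and its parameters are hypothetical, and the real `cpu_scaling_factor()` lives in the test `conftest` and is not shown here, so the headroom is passed in as a plain number.

```python
def scaled_timeout(
    linux_base: float,
    non_linux_base: float,
    system: str,
    headroom: float = 1.0,
) -> float:
    '''
    Compute a test timeout from a per-platform base plus
    latency headroom (hypothetical helper, illustration only).

    `system` is expected to be a `platform.system()` value such
    as 'Linux', 'Darwin' or 'Windows'.

    '''
    timeout: float = (
        linux_base if system == 'Linux'
        else non_linux_base
    )
    # extra slack for slow process spawning on Windows
    if system == 'Windows':
        timeout += 1
    # scale up only when a non-default headroom is reported
    if headroom != 1.0:
        timeout *= headroom
    return timeout


if __name__ == '__main__':
    import platform
    print(scaled_timeout(2, 4, platform.system(), headroom=1.5))
```

Keeping the computation in a pure function like this makes each platform branch checkable without actually running on that platform.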


@ -18,16 +18,15 @@ from tractor import RemoteActorError
 async def aio_streamer(
-    from_trio: asyncio.Queue,
-    to_trio: trio.abc.SendChannel,
+    chan: tractor.to_asyncio.LinkedTaskChannel,
 ) -> trio.abc.ReceiveChannel:
     # required first msg to sync caller
-    to_trio.send_nowait(None)
+    chan.started_nowait(None)
     from itertools import cycle
     for i in cycle(range(10)):
-        to_trio.send_nowait(i)
+        chan.send_nowait(i)
         await asyncio.sleep(0.01)
@ -69,7 +68,7 @@ async def wrapper_mngr(
     else:
         async with tractor.to_asyncio.open_channel_from(
             aio_streamer,
-        ) as (first, from_aio):
+        ) as (from_aio, first):
             assert not first
             # cache it so next task uses broadcast receiver


@ -10,7 +10,19 @@ from tractor._testing import tractor_test
 MESSAGE = 'tractoring at full speed'
-def test_empty_mngrs_input_raises() -> None:
+def test_empty_mngrs_input_raises(
+    tpt_proto: str,
+) -> None:
+    # TODO, the `open_actor_cluster()` teardown hangs
+    # intermittently on UDS when `gather_contexts(mngrs=())`
+    # raises `ValueError` mid-setup; likely a race in the
+    # actor-nursery cleanup vs UDS socket shutdown. Needs
+    # a deeper look at `._clustering`/`._supervise` teardown
+    # paths with the UDS transport.
+    if tpt_proto == 'uds':
+        pytest.skip(
+            'actor-cluster teardown hangs intermittently on UDS'
+        )
     async def main():
         with trio.fail_after(3):
@ -56,13 +68,27 @@ async def worker(
             print(msg)
             assert msg == MESSAGE
-    # TODO: does this ever cause a hang
+    # ?TODO, does this ever cause a hang?
     # assert 0
+# ?TODO, but needs a fn-scoped tpt_proto fixture..
+# @pytest.mark.no_tpt('uds')
 @tractor_test
-async def test_streaming_to_actor_cluster() -> None:
+async def test_streaming_to_actor_cluster(
+    tpt_proto: str,
+):
+    '''
+    Open an actor "cluster" using the (experimental) `._clustering`
+    API and conduct standard inter-task-ctx streaming.
+    '''
+    if tpt_proto == 'uds':
+        pytest.skip(
+            f'Test currently fails with tpt-proto={tpt_proto!r}\n'
+        )
-    async with (
-        open_actor_cluster(modules=[__name__]) as portals,
+    with trio.fail_after(6):
+        async with (
+            open_actor_cluster(modules=[__name__]) as portals,


@ -9,6 +9,7 @@ from itertools import count
 import math
 import platform
 from pprint import pformat
+import sys
 from typing import (
     Callable,
 )
@ -25,7 +26,7 @@ from tractor._exceptions import (
     StreamOverrun,
     ContextCancelled,
 )
-from tractor._state import current_ipc_ctx
+from tractor.runtime._state import current_ipc_ctx
 from tractor._testing import (
     tractor_test,
@ -938,9 +939,14 @@ def test_one_end_stream_not_opened(
     '''
     overrunner, buf_size_increase, entrypoint = overrun_by
-    from tractor._runtime import Actor
+    from tractor.runtime._runtime import Actor
     buf_size = buf_size_increase + Actor.msg_buffer_size
+    timeout: float = (
+        1 if sys.platform == 'linux'
+        else 3
+    )
     async def main():
         async with tractor.open_nursery(
             debug_mode=debug_mode,
@ -950,7 +956,7 @@ def test_one_end_stream_not_opened(
                 enable_modules=[__name__],
             )
-            with trio.fail_after(1):
+            with trio.fail_after(timeout):
                 async with portal.open_context(
                     entrypoint,
                 ) as (ctx, sent):


@ -9,12 +9,17 @@ import sys
 import subprocess
 import platform
 import shutil
+from typing import Callable
 import pytest
+import tractor
 from tractor._testing import (
     examples_dir,
 )
+_non_linux: bool = platform.system() != 'Linux'
+_friggin_macos: bool = platform.system() == 'Darwin'
 @pytest.fixture
 def run_example_in_subproc(
@ -89,8 +94,10 @@ def run_example_in_subproc(
         for f in p[2]
         if (
-            '__' not in f
-            and f[0] != '_'
+            '__' not in f  # ignore any pkg-mods
+            # ignore any `__pycache__` subdir
+            and '__pycache__' not in str(p[0])
+            and f[0] != '_'  # ignore any WIP "example mods"
             and 'debugging' not in p[0]
             and 'integration' not in p[0]
             and 'advanced_faults' not in p[0]
@ -101,8 +108,10 @@ def run_example_in_subproc(
     ids=lambda t: t[1],
 )
 def test_example(
-    run_example_in_subproc,
-    example_script,
+    run_example_in_subproc: Callable,
+    example_script: str,
+    test_log: tractor.log.StackLevelAdapter,
+    ci_env: bool,
 ):
     '''
     Load and run scripts from this repo's ``examples/`` dir as a user
@ -116,9 +125,39 @@ def test_example(
     '''
     ex_file: str = os.path.join(*example_script)
-    if 'rpc_bidir_streaming' in ex_file and sys.version_info < (3, 9):
+    if (
+        'rpc_bidir_streaming' in ex_file
+        and
+        sys.version_info < (3, 9)
+    ):
         pytest.skip("2-way streaming example requires py3.9 async with syntax")
+    if (
+        'full_fledged_streaming_service' in ex_file
+        and
+        _friggin_macos
+        and
+        ci_env
+    ):
+        pytest.skip(
+            'Streaming example is too flaky in CI\n'
+            'AND their competitor runs this CI service..\n'
+            'This test does run just fine "in person" however..'
+        )
+    from .conftest import cpu_scaling_factor
+    timeout: float = (
+        60
+        if ci_env and _non_linux
+        else 16
+    )
+    # add latency headroom for CPU freq scaling (auto-cpufreq et al.)
+    headroom: float = cpu_scaling_factor()
+    if headroom != 1.:
+        timeout *= headroom
     with open(ex_file, 'r') as ex:
         code = ex.read()
@ -126,9 +165,12 @@ def test_example(
         err = None
         try:
             if not proc.poll():
-                _, err = proc.communicate(timeout=15)
+                _, err = proc.communicate(timeout=timeout)
         except subprocess.TimeoutExpired as e:
+            test_log.exception(
+                f'Example failed to finish within {timeout}s ??\n'
+            )
             proc.kill()
             err = e.stderr


@ -26,8 +26,8 @@ from tractor import (
to_asyncio, to_asyncio,
RemoteActorError, RemoteActorError,
ContextCancelled, ContextCancelled,
_state,
) )
from tractor.runtime import _state
from tractor.trionics import BroadcastReceiver from tractor.trionics import BroadcastReceiver
from tractor._testing import expect_ctxc from tractor._testing import expect_ctxc
@ -47,12 +47,11 @@ async def sleep_and_err(
# just signature placeholders for compat with # just signature placeholders for compat with
# ``to_asyncio.open_channel_from()`` # ``to_asyncio.open_channel_from()``
to_trio: trio.MemorySendChannel|None = None, chan: to_asyncio.LinkedTaskChannel|None = None,
from_trio: asyncio.Queue|None = None,
): ):
if to_trio: if chan:
to_trio.send_nowait('start') chan.started_nowait('start')
await asyncio.sleep(sleep_for) await asyncio.sleep(sleep_for)
assert 0 assert 0
@ -238,7 +237,7 @@ async def trio_ctx(
trio.open_nursery() as tn, trio.open_nursery() as tn,
tractor.to_asyncio.open_channel_from( tractor.to_asyncio.open_channel_from(
sleep_and_err, sleep_and_err,
) as (first, chan), ) as (chan, first),
): ):
assert first == 'start' assert first == 'start'
@ -399,7 +398,7 @@ async def no_to_trio_in_args():
async def push_from_aio_task( async def push_from_aio_task(
sequence: Iterable, sequence: Iterable,
to_trio: trio.abc.SendChannel, chan: to_asyncio.LinkedTaskChannel,
expect_cancel: False, expect_cancel: False,
fail_early: bool, fail_early: bool,
exit_early: bool, exit_early: bool,
@ -407,15 +406,12 @@ async def push_from_aio_task(
) -> None: ) -> None:
try: try:
# print('trying breakpoint')
# breakpoint()
# sync caller ctx manager # sync caller ctx manager
to_trio.send_nowait(True) chan.started_nowait(True)
for i in sequence: for i in sequence:
print(f'asyncio sending {i}') print(f'asyncio sending {i}')
to_trio.send_nowait(i) chan.send_nowait(i)
await asyncio.sleep(0.001) await asyncio.sleep(0.001)
if ( if (
@ -478,7 +474,7 @@ async def stream_from_aio(
trio_exit_early trio_exit_early
)) ))
) as (first, chan): ) as (chan, first):
assert first is True assert first is True
@ -732,15 +728,21 @@ def test_aio_errors_and_channel_propagates_and_closes(
async def aio_echo_server( async def aio_echo_server(
to_trio: trio.MemorySendChannel, chan: to_asyncio.LinkedTaskChannel,
from_trio: asyncio.Queue,
) -> None: ) -> None:
'''
An IPC-msg "echo server" with msgs received and relayed by
a parent `trio.Task` into a child `asyncio.Task`
and then repeated back to that local parent (`trio.Task`)
and sent again back to the original calling remote actor.
to_trio.send_nowait('start') '''
# same semantics as `trio.TaskStatus.started()`
chan.started_nowait('start')
while True: while True:
try: try:
msg = await from_trio.get() msg = await chan.get()
except to_asyncio.TrioTaskExited: except to_asyncio.TrioTaskExited:
print( print(
'breaking aio echo loop due to `trio` exit!' 'breaking aio echo loop due to `trio` exit!'
@ -748,7 +750,7 @@ async def aio_echo_server(
break break
# echo the msg back # echo the msg back
to_trio.send_nowait(msg) chan.send_nowait(msg)
# if we get the terminate sentinel # if we get the terminate sentinel
# break the echo loop # break the echo loop
@ -765,7 +767,10 @@ async def trio_to_aio_echo_server(
): ):
async with to_asyncio.open_channel_from( async with to_asyncio.open_channel_from(
aio_echo_server, aio_echo_server,
) as (first, chan): ) as (
chan,
first, # value from `chan.started_nowait()` above
):
assert first == 'start' assert first == 'start'
await ctx.started(first) await ctx.started(first)
@ -776,7 +781,8 @@ async def trio_to_aio_echo_server(
await chan.send(msg) await chan.send(msg)
out = await chan.receive() out = await chan.receive()
# echo back to parent actor-task
# echo back to parent-actor's remote parent-ctx-task!
await stream.send(out) await stream.send(out)
if out is None: if out is None:
@ -1090,24 +1096,21 @@ def test_sigint_closes_lifetime_stack(
# ?TODO asyncio.Task fn-deco? # ?TODO asyncio.Task fn-deco?
# -[ ] do sig checkingat import time like @context?
# -[ ] maybe name it @aio_task ??
# -[ ] chan: to_asyncio.InterloopChannel ?? # -[ ] chan: to_asyncio.InterloopChannel ??
# -[ ] do fn-sig checking at import time like @context?
# |_[ ] maybe name it @a(sync)io_task ??
# @asyncio_task <- not bad ??
async def raise_before_started( async def raise_before_started(
# from_trio: asyncio.Queue,
# to_trio: trio.abc.SendChannel,
chan: to_asyncio.LinkedTaskChannel, chan: to_asyncio.LinkedTaskChannel,
) -> None: ) -> None:
''' '''
`asyncio.Task` entry point which RTEs before calling `asyncio.Task` entry point which RTEs before calling
`to_trio.send_nowait()`. `chan.started_nowait()`.
''' '''
await asyncio.sleep(0.2) await asyncio.sleep(0.2)
raise RuntimeError('Some shite went wrong before `.send_nowait()`!!') raise RuntimeError('Some shite went wrong before `.send_nowait()`!!')
# to_trio.send_nowait('Uhh we shouldve RTE-d ^^ ??')
chan.started_nowait('Uhh we shouldve RTE-d ^^ ??') chan.started_nowait('Uhh we shouldve RTE-d ^^ ??')
await asyncio.sleep(float('inf')) await asyncio.sleep(float('inf'))
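Review note: the `aio_echo_server()` hunks above replace the separate `to_trio`/`from_trio` pair with a single `chan` object exposing `started_nowait()`, `get()` and `send_nowait()`. The sketch below imitates that call pattern with plain `asyncio` only. `ToyChannel`, its `inbox`/`outbox` queues and the `None` terminate sentinel are assumptions made for illustration; this is not the `to_asyncio.LinkedTaskChannel` implementation.

```python
import asyncio


class ToyChannel:
    '''
    Hypothetical stand-in for a linked-task channel, built from
    two `asyncio.Queue`s (NOT the real `tractor` type).

    '''
    def __init__(self) -> None:
        self.inbox: asyncio.Queue = asyncio.Queue()   # parent -> aio task
        self.outbox: asyncio.Queue = asyncio.Queue()  # aio task -> parent

    def started_nowait(self, val) -> None:
        # first value handed back to the parent, analogous to
        # `trio.TaskStatus.started()`
        self.outbox.put_nowait(val)

    async def get(self):
        return await self.inbox.get()

    def send_nowait(self, msg) -> None:
        self.outbox.put_nowait(msg)


async def echo_server(chan: ToyChannel) -> None:
    chan.started_nowait('start')
    while True:
        msg = await chan.get()
        # echo the msg back
        chan.send_nowait(msg)
        # assumed terminate sentinel
        if msg is None:
            break


async def demo() -> tuple:
    chan = ToyChannel()
    task = asyncio.create_task(echo_server(chan))
    first = await chan.outbox.get()
    echoed: list = []
    for msg in (1, 2, None):
        chan.inbox.put_nowait(msg)
        echoed.append(await chan.outbox.get())
    await task
    return first, echoed
```

Running `asyncio.run(demo())` drives one "started" handshake followed by three echo round trips, the last of which ends the loop.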


@ -11,18 +11,21 @@ import trio
import tractor import tractor
from tractor import ( # typing from tractor import ( # typing
Actor, Actor,
current_actor,
open_nursery,
Portal,
Context, Context,
ContextCancelled, ContextCancelled,
MsgStream,
Portal,
RemoteActorError, RemoteActorError,
current_actor,
open_nursery,
) )
from tractor._testing import ( from tractor._testing import (
# tractor_test, # tractor_test,
expect_ctxc, expect_ctxc,
) )
from .conftest import cpu_scaling_factor
# XXX TODO cases: # XXX TODO cases:
# - [x] WE cancelled the peer and thus should not see any raised # - [x] WE cancelled the peer and thus should not see any raised
# `ContextCancelled` as it should be reaped silently? # `ContextCancelled` as it should be reaped silently?
@ -200,7 +203,7 @@ async def stream_from_peer(
) -> None: ) -> None:
# sanity # sanity
assert tractor._state.debug_mode() == debug_mode assert tractor.debug_mode() == debug_mode
peer: Portal peer: Portal
try: try:
@ -580,7 +583,7 @@ def test_peer_canceller(
assert ( assert (
re.canceller re.canceller
== ==
root.uid root.aid.uid
) )
else: # the other 2 ctxs else: # the other 2 ctxs
@ -589,7 +592,7 @@ def test_peer_canceller(
and ( and (
re.canceller re.canceller
== ==
canceller.channel.uid canceller.channel.aid.uid
) )
) )
@ -744,7 +747,7 @@ def test_peer_canceller(
# -> each context should have received # -> each context should have received
# a silently absorbed context cancellation # a silently absorbed context cancellation
# in its remote nursery scope. # in its remote nursery scope.
# assert ctx.chan.uid == ctx.canceller # assert ctx.chan.aid.uid == ctx.canceller
# NOTE: when an inter-peer cancellation # NOTE: when an inter-peer cancellation
# occurred, we DO NOT expect this # occurred, we DO NOT expect this
@ -796,12 +799,12 @@ async def basic_echo_server(
) -> None: ) -> None:
''' '''
Just the simplest `MsgStream` echo server which resays what Just the simplest `MsgStream` echo server which resays what you
you told it but with its uid in front ;) told it but with its uid in front ;)
''' '''
actor: Actor = tractor.current_actor() actor: Actor = tractor.current_actor()
uid: tuple = actor.uid uid: tuple = actor.aid.uid
await ctx.started(uid) await ctx.started(uid)
async with ctx.open_stream() as ipc: async with ctx.open_stream() as ipc:
async for msg in ipc: async for msg in ipc:
@ -840,7 +843,7 @@ async def serve_subactors(
async with open_nursery() as an: async with open_nursery() as an:
# sanity # sanity
assert tractor._state.debug_mode() == debug_mode assert tractor.debug_mode() == debug_mode
await ctx.started(peer_name) await ctx.started(peer_name)
async with ctx.open_stream() as ipc: async with ctx.open_stream() as ipc:
@ -856,7 +859,7 @@ async def serve_subactors(
f'|_{peer}\n' f'|_{peer}\n'
) )
await ipc.send(( await ipc.send((
peer.chan.uid, peer.chan.aid.uid,
peer.chan.raddr.unwrap(), peer.chan.raddr.unwrap(),
)) ))
@ -879,7 +882,7 @@ async def client_req_subactor(
) -> None: ) -> None:
# sanity # sanity
if debug_mode: if debug_mode:
assert tractor._state.debug_mode() assert tractor.debug_mode()
# TODO: other cases to do with sub lifetimes: # TODO: other cases to do with sub lifetimes:
# -[ ] test that we can have the server spawn a sub # -[ ] test that we can have the server spawn a sub
@ -966,9 +969,14 @@ async def tell_little_bro(
caller: str = '', caller: str = '',
err_after: float|None = None, err_after: float|None = None,
rng_seed: int = 50, rng_seed: int = 100,
# NOTE, ^ must be large enough (on fast hw anyway)
# that the peer cancel req arrives before the
# echoing dialog completes by itself Bp
): ):
# contact target actor, do a stream dialog. # contact target actor, do a stream dialog.
lb: Portal
echo_ipc: MsgStream
async with ( async with (
tractor.wait_for_actor( tractor.wait_for_actor(
name=actor_name name=actor_name
@ -983,17 +991,17 @@ async def tell_little_bro(
else None else None
), ),
) as (sub_ctx, first), ) as (sub_ctx, first),
sub_ctx.open_stream() as echo_ipc, sub_ctx.open_stream() as echo_ipc,
): ):
actor: Actor = current_actor() actor: Actor = current_actor()
uid: tuple = actor.uid uid: tuple = actor.aid.uid
for i in range(rng_seed): for i in range(rng_seed):
msg: tuple = ( msg: tuple = (
uid, uid,
i, i,
) )
await echo_ipc.send(msg) await echo_ipc.send(msg)
await trio.sleep(0.001)
resp = await echo_ipc.receive() resp = await echo_ipc.receive()
print( print(
f'{caller} => {actor_name}: {msg}\n' f'{caller} => {actor_name}: {msg}\n'
@ -1006,6 +1014,9 @@ async def tell_little_bro(
assert sub_uid != uid assert sub_uid != uid
assert _i == i assert _i == i
# XXX, usually should never get here!
# await tractor.pause()
@pytest.mark.parametrize( @pytest.mark.parametrize(
'raise_client_error', 'raise_client_error',
@ -1020,6 +1031,10 @@ def test_peer_spawns_and_cancels_service_subactor(
raise_client_error: str, raise_client_error: str,
reg_addr: tuple[str, int], reg_addr: tuple[str, int],
raise_sub_spawn_error_after: float|None, raise_sub_spawn_error_after: float|None,
loglevel: str,
test_log: tractor.log.StackLevelAdapter,
# ^XXX, set to 'warning' to see masked-exc warnings
# that may transpire during actor-nursery teardown.
): ):
# NOTE: this tests for the modden `mod wks open piker` bug # NOTE: this tests for the modden `mod wks open piker` bug
# discovered as part of implementing workspace ctx # discovered as part of implementing workspace ctx
@ -1049,6 +1064,7 @@ def test_peer_spawns_and_cancels_service_subactor(
# NOTE: to halt the peer tasks on ctxc, uncomment this. # NOTE: to halt the peer tasks on ctxc, uncomment this.
debug_mode=debug_mode, debug_mode=debug_mode,
registry_addrs=[reg_addr], registry_addrs=[reg_addr],
loglevel=loglevel,
) as an: ) as an:
server: Portal = await an.start_actor( server: Portal = await an.start_actor(
(server_name := 'spawn_server'), (server_name := 'spawn_server'),
@ -1084,7 +1100,7 @@ def test_peer_spawns_and_cancels_service_subactor(
) as (client_ctx, client_says), ) as (client_ctx, client_says),
): ):
root: Actor = current_actor() root: Actor = current_actor()
spawner_uid: tuple = spawn_ctx.chan.uid spawner_uid: tuple = spawn_ctx.chan.aid.uid
print( print(
f'Server says: {first}\n' f'Server says: {first}\n'
f'Client says: {client_says}\n' f'Client says: {client_says}\n'
@ -1103,7 +1119,7 @@ def test_peer_spawns_and_cancels_service_subactor(
print( print(
'Sub-spawn came online\n' 'Sub-spawn came online\n'
f'portal: {sub}\n' f'portal: {sub}\n'
f'.uid: {sub.actor.uid}\n' f'.uid: {sub.actor.aid.uid}\n'
f'chan.raddr: {sub.chan.raddr}\n' f'chan.raddr: {sub.chan.raddr}\n'
) )
@ -1137,7 +1153,7 @@ def test_peer_spawns_and_cancels_service_subactor(
assert isinstance(res, ContextCancelled) assert isinstance(res, ContextCancelled)
assert client_ctx.cancel_acked assert client_ctx.cancel_acked
assert res.canceller == root.uid assert res.canceller == root.aid.uid
assert not raise_sub_spawn_error_after assert not raise_sub_spawn_error_after
# cancelling the spawner sub should # cancelling the spawner sub should
@ -1171,8 +1187,8 @@ def test_peer_spawns_and_cancels_service_subactor(
# little_bro: a `RuntimeError`. # little_bro: a `RuntimeError`.
# #
check_inner_rte(rae) check_inner_rte(rae)
assert rae.relay_uid == client.chan.uid assert rae.relay_uid == client.chan.aid.uid
assert rae.src_uid == sub.chan.uid assert rae.src_uid == sub.chan.aid.uid
assert not client_ctx.cancel_acked assert not client_ctx.cancel_acked
assert ( assert (
@ -1201,12 +1217,12 @@ def test_peer_spawns_and_cancels_service_subactor(
except ContextCancelled as ctxc: except ContextCancelled as ctxc:
_ctxc = ctxc _ctxc = ctxc
print( print(
f'{root.uid} caught ctxc from ctx with {client_ctx.chan.uid}\n' f'{root.aid.uid} caught ctxc from ctx with {client_ctx.chan.aid.uid}\n'
f'{repr(ctxc)}\n' f'{repr(ctxc)}\n'
) )
if not raise_sub_spawn_error_after: if not raise_sub_spawn_error_after:
assert ctxc.canceller == root.uid assert ctxc.canceller == root.aid.uid
else: else:
assert ctxc.canceller == spawner_uid assert ctxc.canceller == spawner_uid
@ -1237,9 +1253,20 @@ def test_peer_spawns_and_cancels_service_subactor(
# assert spawn_ctx.cancelled_caught # assert spawn_ctx.cancelled_caught
async def _main(): async def _main():
headroom: float = cpu_scaling_factor()
this_fast_on_linux: float = 3
this_fast = this_fast_on_linux * headroom
if headroom != 1.:
test_log.warning(
f'Adding latency headroom on linux bc CPU scaling,\n'
f'headroom: {headroom}\n'
f'this_fast_on_linux: {this_fast_on_linux} -> {this_fast}\n'
)
with trio.fail_after( with trio.fail_after(
3 if not debug_mode this_fast
if not debug_mode
else 999 else 999
): ):
await main() await main()
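Review note: the `_main()` hunk above multiplies the fixed 3 second `trio.fail_after()` budget by a CPU-scaling headroom factor unless `debug_mode` is set. A minimal sketch of that selection follows. `scaled_timeout()` is a hypothetical helper named here for illustration only; the real `cpu_scaling_factor()` lives in the suite's `conftest` and its body is not shown in this diff, so the factor is passed in as a plain argument.

```python
def scaled_timeout(
    base: float,
    headroom: float,
    debug_mode: bool = False,
    debug_timeout: float = 999,
) -> float:
    '''
    Hypothetical helper (not part of `tractor`): pick the timeout
    the way `_main()` above does.

    '''
    if debug_mode:
        # effectively "never time out" while a debugger may be active
        return debug_timeout

    # a headroom of 1.0 leaves the base budget unchanged
    return base * headroom
```

With a headroom of `2.0` the 3 second budget becomes 6 seconds, while `debug_mode=True` always yields the large debug value.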


@ -1,9 +1,11 @@
""" """
Streaming via async gen api Streaming via the, now legacy, "async-gen API".
""" """
import time import time
from functools import partial from functools import partial
import platform import platform
from typing import Callable
import trio import trio
import tractor import tractor
@ -19,7 +21,11 @@ def test_must_define_ctx():
async def no_ctx(): async def no_ctx():
pass pass
assert "no_ctx must be `ctx: tractor.Context" in str(err.value) assert (
"no_ctx must be `ctx: tractor.Context"
in
str(err.value)
)
@tractor.stream @tractor.stream
async def has_ctx(ctx): async def has_ctx(ctx):
@ -69,14 +75,14 @@ async def stream_from_single_subactor(
async with tractor.open_nursery( async with tractor.open_nursery(
registry_addrs=[reg_addr], registry_addrs=[reg_addr],
start_method=start_method, start_method=start_method,
) as nursery: ) as an:
async with tractor.find_actor('streamerd') as portals: async with tractor.find_actor('streamerd') as portals:
if not portals: if not portals:
# no brokerd actor found # no brokerd actor found
portal = await nursery.start_actor( portal = await an.start_actor(
'streamerd', 'streamerd',
enable_modules=[__name__], enable_modules=[__name__],
) )
@ -116,11 +122,22 @@ async def stream_from_single_subactor(
@pytest.mark.parametrize( @pytest.mark.parametrize(
'stream_func', [async_gen_stream, context_stream] 'stream_func',
[
async_gen_stream,
context_stream,
],
ids='stream_func={}'.format
) )
def test_stream_from_single_subactor(reg_addr, start_method, stream_func): def test_stream_from_single_subactor(
"""Verify streaming from a spawned async generator. reg_addr: tuple,
""" start_method: str,
stream_func: Callable,
):
'''
Verify streaming from a spawned async generator.
'''
trio.run( trio.run(
partial( partial(
stream_from_single_subactor, stream_from_single_subactor,
@ -132,10 +149,9 @@ def test_stream_from_single_subactor(reg_addr, start_method, stream_func):
# this is the first 2 actors, streamer_1 and streamer_2 # this is the first 2 actors, streamer_1 and streamer_2
async def stream_data(seed): async def stream_data(seed: int):
for i in range(seed): for i in range(seed):
yield i yield i
# trigger scheduler to simulate practical usage # trigger scheduler to simulate practical usage
@ -143,15 +159,17 @@ async def stream_data(seed):
# this is the third actor; the aggregator # this is the third actor; the aggregator
async def aggregate(seed): async def aggregate(seed: int):
"""Ensure that the two streams we receive match but only stream '''
Ensure that the two streams we receive match but only stream
a single set of values to the parent. a single set of values to the parent.
"""
async with tractor.open_nursery() as nursery: '''
async with tractor.open_nursery() as an:
portals = [] portals = []
for i in range(1, 3): for i in range(1, 3):
# fork point # fork point
portal = await nursery.start_actor( portal = await an.start_actor(
name=f'streamer_{i}', name=f'streamer_{i}',
enable_modules=[__name__], enable_modules=[__name__],
) )
@ -164,20 +182,28 @@ async def aggregate(seed):
async with send_chan: async with send_chan:
async with portal.open_stream_from( async with portal.open_stream_from(
stream_data, seed=seed, stream_data,
seed=seed,
) as stream: ) as stream:
async for value in stream: async for value in stream:
# leverage trio's built-in backpressure # leverage trio's built-in backpressure
await send_chan.send(value) await send_chan.send(value)
print(f"FINISHED ITERATING {portal.channel.uid}") print(
f'FINISHED ITERATING!\n'
f'peer: {portal.channel.aid.uid}'
)
# spawn 2 trio tasks to collect streams and push to a local queue # spawn 2 trio tasks to collect streams and push to a local queue
async with trio.open_nursery() as n: async with trio.open_nursery() as tn:
for portal in portals: for portal in portals:
n.start_soon(push_to_chan, portal, send_chan.clone()) tn.start_soon(
push_to_chan,
portal,
send_chan.clone(),
)
# close this local task's reference to send side # close this local task's reference to send side
await send_chan.aclose() await send_chan.aclose()
@ -194,20 +220,21 @@ async def aggregate(seed):
print("FINISHED ITERATING in aggregator") print("FINISHED ITERATING in aggregator")
await nursery.cancel() await an.cancel()
print("WAITING on `ActorNursery` to finish") print("WAITING on `ActorNursery` to finish")
print("AGGREGATOR COMPLETE!") print("AGGREGATOR COMPLETE!")
# this is the main actor and *arbiter* async def a_quadruple_example() -> list[int]:
async def a_quadruple_example(): '''
# a nursery which spawns "actors" Open the root-actor which is also a "registrar".
async with tractor.open_nursery() as nursery:
'''
async with tractor.open_nursery() as an:
seed = int(1e3) seed = int(1e3)
pre_start = time.time() pre_start = time.time()
portal = await nursery.start_actor( portal = await an.start_actor(
name='aggregator', name='aggregator',
enable_modules=[__name__], enable_modules=[__name__],
) )
@ -228,8 +255,14 @@ async def a_quadruple_example():
return result_stream return result_stream
async def cancel_after(wait, reg_addr): async def cancel_after(
async with tractor.open_root_actor(registry_addrs=[reg_addr]): wait: float,
reg_addr: tuple,
) -> list[int]:
async with tractor.open_root_actor(
registry_addrs=[reg_addr],
):
with trio.move_on_after(wait): with trio.move_on_after(wait):
return await a_quadruple_example() return await a_quadruple_example()
@ -240,6 +273,10 @@ def time_quad_ex(
ci_env: bool, ci_env: bool,
spawn_backend: str, spawn_backend: str,
): ):
non_linux: bool = (_sys := platform.system()) != 'Linux'
if ci_env and non_linux:
pytest.skip(f'Test is too flaky on {_sys!r} in CI')
if spawn_backend == 'mp': if spawn_backend == 'mp':
''' '''
no idea but the mp *nix runs are flaking out here often... no idea but the mp *nix runs are flaking out here often...
@ -247,32 +284,59 @@ def time_quad_ex(
''' '''
pytest.skip("Test is too flaky on mp in CI") pytest.skip("Test is too flaky on mp in CI")
timeout = 7 if platform.system() in ('Windows', 'Darwin') else 4 timeout = 7 if non_linux else 4
start = time.time() start = time.time()
results = trio.run(cancel_after, timeout, reg_addr) results: list[int] = trio.run(
diff = time.time() - start cancel_after,
timeout,
reg_addr,
)
diff: float = time.time() - start
assert results assert results
return results, diff return results, diff
def test_a_quadruple_example( def test_a_quadruple_example(
time_quad_ex: tuple, time_quad_ex: tuple[list[int], float],
ci_env: bool, ci_env: bool,
spawn_backend: str, spawn_backend: str,
test_log: tractor.log.StackLevelAdapter,
): ):
''' '''
This also serves as a kind of "we'd like to be this fast test". This also serves as a "we'd like to be this fast" smoke test
given past empirical eval of this suite.
''' '''
non_linux: bool = (_sys := platform.system()) != 'Linux'
this_fast_on_linux: float = 3
this_fast = (
6 if non_linux
else this_fast_on_linux
)
# ^ XXX NOTE,
# i've noticed that tweaking the CPU governor setting
# to not "always" enable "turbo" mode can result in latency
# which causes this limit to be too little. Not sure if it'd
# be worth it to adjust the linux value based on reading the
# CPU conf from the sys?
#
# For ex, see the `auto-cpufreq` docs on such settings,
# https://github.com/AdnanHodzic/auto-cpufreq?tab=readme-ov-file#example-config-file-contents
#
# HENCE this below latency-headroom compensation logic..
from .conftest import cpu_scaling_factor
headroom: float = cpu_scaling_factor()
if headroom != 1.:
this_fast = this_fast_on_linux * headroom
test_log.warning(
f'Adding latency headroom on linux bc CPU scaling,\n'
f'headroom: {headroom}\n'
f'this_fast_on_linux: {this_fast_on_linux} -> {this_fast}\n'
)
results, diff = time_quad_ex results, diff = time_quad_ex
assert results assert results
this_fast = (
6 if platform.system() in (
'Windows',
'Darwin',
)
else 3
)
assert diff < this_fast assert diff < this_fast
@ -281,43 +345,58 @@ def test_a_quadruple_example(
list(map(lambda i: i/10, range(3, 9))) list(map(lambda i: i/10, range(3, 9)))
) )
def test_not_fast_enough_quad( def test_not_fast_enough_quad(
reg_addr, time_quad_ex, cancel_delay, ci_env, spawn_backend reg_addr: tuple,
time_quad_ex: tuple[list[int], float],
cancel_delay: float,
ci_env: bool,
spawn_backend: str,
): ):
"""Verify we can cancel midway through the quad example and all actors '''
cancel gracefully. Verify we can cancel midway through the quad example and all
""" actors cancel gracefully.
'''
results, diff = time_quad_ex results, diff = time_quad_ex
delay = max(diff - cancel_delay, 0) delay = max(diff - cancel_delay, 0)
results = trio.run(cancel_after, delay, reg_addr) results = trio.run(
system = platform.system() cancel_after,
if system in ('Windows', 'Darwin') and results is not None: delay,
reg_addr,
)
system: str = platform.system()
if (
system in ('Windows', 'Darwin')
and
results is not None
):
# In CI environments it seems later runs are quicker than the first # In CI environments it seems later runs are quicker than the first
# so just ignore these # so just ignore these
print(f"Woa there {system} caught your breath eh?") print(f'Woa there {system} caught your breath eh?')
else: else:
# should be cancelled mid-streaming # should be cancelled mid-streaming
assert results is None assert results is None
@tractor_test @tractor_test(timeout=20)
async def test_respawn_consumer_task( async def test_respawn_consumer_task(
reg_addr, reg_addr: tuple,
spawn_backend, spawn_backend: str,
loglevel, loglevel: str,
): ):
"""Verify that ``._portal.ReceiveStream.shield()`` '''
Verify that ``._portal.ReceiveStream.shield()``
successfully protects the underlying IPC channel from being closed successfully protects the underlying IPC channel from being closed
when cancelling and respawning a consumer task. when cancelling and respawning a consumer task.
This also serves to verify that all values from the stream can be This also serves to verify that all values from the stream can be
received despite the respawns. received despite the respawns.
""" '''
stream = None stream = None
async with tractor.open_nursery() as n: async with tractor.open_nursery() as an:
portal = await n.start_actor( portal = await an.start_actor(
name='streamer', name='streamer',
enable_modules=[__name__] enable_modules=[__name__]
) )


@ -1,5 +1,5 @@
""" """
Arbiter and "local" actor api Registrar and "local" actor api
""" """
import time import time
@ -12,11 +12,11 @@ from tractor._testing import tractor_test
@pytest.mark.trio @pytest.mark.trio
async def test_no_runtime(): async def test_no_runtime():
"""An arbitter must be established before any nurseries """A registrar must be established before any nurseries
can be created. can be created.
(In other words ``tractor.open_root_actor()`` must be engaged at (In other words ``tractor.open_root_actor()`` must be
some point?) engaged at some point?)
""" """
with pytest.raises(RuntimeError) : with pytest.raises(RuntimeError) :
async with tractor.find_actor('doggy'): async with tractor.find_actor('doggy'):
@ -25,9 +25,9 @@ async def test_no_runtime():
@tractor_test @tractor_test
async def test_self_is_registered(reg_addr): async def test_self_is_registered(reg_addr):
"Verify waiting on the arbiter to register itself using the standard api." "Verify waiting on the registrar to register itself using the standard api."
actor = tractor.current_actor() actor = tractor.current_actor()
assert actor.is_arbiter assert actor.is_registrar
with trio.fail_after(0.2): with trio.fail_after(0.2):
async with tractor.wait_for_actor('root') as portal: async with tractor.wait_for_actor('root') as portal:
assert portal.channel.uid[0] == 'root' assert portal.channel.uid[0] == 'root'
@ -35,11 +35,11 @@ async def test_self_is_registered(reg_addr):
@tractor_test @tractor_test
async def test_self_is_registered_localportal(reg_addr): async def test_self_is_registered_localportal(reg_addr):
"Verify waiting on the arbiter to register itself using a local portal." "Verify waiting on the registrar to register itself using a local portal."
actor = tractor.current_actor() actor = tractor.current_actor()
assert actor.is_arbiter assert actor.is_registrar
async with tractor.get_registry(reg_addr) as portal: async with tractor.get_registry(reg_addr) as portal:
assert isinstance(portal, tractor._portal.LocalPortal) assert isinstance(portal, tractor.runtime._portal.LocalPortal)
with trio.fail_after(0.2): with trio.fail_after(0.2):
sockaddr = await portal.run_from_ns( sockaddr = await portal.run_from_ns(
@ -57,8 +57,8 @@ def test_local_actor_async_func(reg_addr):
async with tractor.open_root_actor( async with tractor.open_root_actor(
registry_addrs=[reg_addr], registry_addrs=[reg_addr],
): ):
# arbiter is started in-proc if dne # registrar is started in-proc if dne
assert tractor.current_actor().is_arbiter assert tractor.current_actor().is_registrar
for i in range(10): for i in range(10):
nums.append(i) nums.append(i)


@ -0,0 +1,185 @@
'''
`tractor.log`-wrapping unit tests.
'''
from pathlib import Path
import shutil
from types import ModuleType
import pytest
import tractor
from tractor import (
_code_load,
log,
)
def test_root_pkg_not_duplicated_in_logger_name():
'''
When both `pkg_name` and `name` are passed and they have
a common `<root_name>.< >` prefix, ensure that it is not
duplicated in the child's `StackLevelAdapter.name: str`.
'''
project_name: str = 'pylib'
pkg_path: str = 'pylib.subpkg.mod'
assert not tractor.current_actor(
err_on_no_runtime=False,
)
proj_log = log.get_logger(
pkg_name=project_name,
mk_sublog=False,
)
sublog = log.get_logger(
pkg_name=project_name,
name=pkg_path,
)
assert proj_log is not sublog
assert sublog.name.count(proj_log.name) == 1
assert 'mod' not in sublog.name
def test_implicit_mod_name_applied_for_child(
testdir: pytest.Pytester,
loglevel: str,
):
'''
Verify that when `.log.get_logger(pkg_name='pylib')` is called
from a given sub-mod from within the `pylib` pkg-path, we
implicitly set the equiv of `name=__name__` from the caller's
module.
'''
# tractor.log.get_console_log(level=loglevel)
proj_name: str = 'snakelib'
mod_code: str = (
f'import tractor\n'
f'\n'
# if you need to trace `testdir` stuff @ import-time..
# f'breakpoint()\n'
f'log = tractor.log.get_logger(pkg_name="{proj_name}")\n'
)
# create a sub-module for each pkg layer
_lib = testdir.mkpydir(proj_name)
pkg: Path = Path(_lib)
pkg_init_mod: Path = pkg / "__init__.py"
pkg_init_mod.write_text(mod_code)
subpkg: Path = pkg / 'subpkg'
subpkg.mkdir()
subpkgmod: Path = subpkg / "__init__.py"
subpkgmod.touch()
subpkgmod.write_text(mod_code)
_submod: Path = testdir.makepyfile(
_mod=mod_code,
)
pkg_submod = pkg / 'mod.py'
pkg_subpkg_submod = subpkg / 'submod.py'
shutil.copyfile(
_submod,
pkg_submod,
)
shutil.copyfile(
_submod,
pkg_subpkg_submod,
)
testdir.chdir()
# NOTE, to introspect the py-file-module-layout use (in .xsh
# syntax): `ranger @str(testdir)`
# XXX NOTE, once the "top level" pkg mod has been
# imported, we can then use `import` syntax to
# import its sub-pkgs and modules.
subpkgmod: ModuleType = _code_load.load_module_from_path(
Path(pkg / '__init__.py'),
module_name=proj_name,
)
pkg_root_log = log.get_logger(
pkg_name=proj_name,
mk_sublog=False,
)
# the top level pkg-mod, created just now,
# by above API call.
assert pkg_root_log.name == proj_name
assert not pkg_root_log.logger.getChildren()
#
# ^TODO! test this same output but created via a `get_logger()`
# call in the `snakelib/__init__.py`!!
# NOTE, the pkg-level "init mod" should of course
# have the same name as the package ns-path.
import snakelib as init_mod
assert init_mod.log.name == proj_name
# NOTE, a first-pkg-level sub-module should only
# use the package-name since the leaf-node-module
# will be included in log headers by default.
from snakelib import mod
assert mod.log.name == proj_name
from snakelib import subpkg
assert (
subpkg.log.name
==
subpkg.__package__
==
f'{proj_name}.subpkg'
)
from snakelib.subpkg import submod
assert (
submod.log.name
==
submod.__package__
==
f'{proj_name}.subpkg'
)
sub_logs = pkg_root_log.logger.getChildren()
assert len(sub_logs) == 1 # only one nested sub-pkg module
assert submod.log.logger in sub_logs
# TODO, moar tests against existing feats:
# ------ - ------
# - [ ] color settings?
# - [ ] header contents like,
# - actor + thread + task names from various conc-primitives,
# - [ ] `StackLevelAdapter` extensions,
# - our custom levels/methods: `transport|runtime|cancel|pdb|devx`
# - [ ] custom-headers support?
#
# TODO, test driven dev of new-ideas/long-wanted feats,
# ------ - ------
# - [ ] https://github.com/goodboy/tractor/issues/244
# - [ ] @catern mentioned using a sync / deterministic sys
# and in particular `svlogd`?
# |_ https://smarden.org/runit/svlogd.8
# - [ ] using adapter vs. filters?
# - https://stackoverflow.com/questions/60691759/add-information-to-every-log-message-in-python-logging/61830838#61830838
# - [ ] `.at_least_level()` optimization which short circuits wtv
# `logging` is doing behind the scenes when the level filters
# the emission..?
# - [ ] use of `.log.get_console_log()` in subactors and the
# subtleties of ensuring it actually emits from a subproc.
# - [ ] this idea of activating per-subsys emissions with some
# kind of `.name` filter passed to the runtime or maybe configured
# via the root `StackLevelAdapter`?
# - [ ] use of `logging.config.dictConfig()` to simplify the impl
# of any of ^^ ??
# - https://stackoverflow.com/questions/7507825/where-is-a-complete-example-of-logging-config-dictconfig
# - https://docs.python.org/3/library/logging.config.html#configuration-dictionary-schema
# - https://docs.python.org/3/library/logging.config.html#logging.config.dictConfig
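Review note: the new tests above assert that a logger requested with `pkg_name='pylib'` and `name='pylib.subpkg.mod'` contains the root package name only once and drops the leaf module. The naming rule can be sketched as below. `sublogger_name()` is a hypothetical function written for this note and is not the `tractor.log.get_logger()` implementation; it also assumes `name` is a plain module's dotted path, since a package `__init__` (whose `__name__` is the package path itself) cannot be told apart by name alone.

```python
def sublogger_name(
    pkg_name: str,
    name: str,
) -> str:
    '''
    Hypothetical sketch: derive a sub-logger name from a root
    package name and a module's dotted path.

    '''
    if name == pkg_name:
        return pkg_name

    # strip a duplicated `<pkg_name>.` prefix, if any
    prefix: str = pkg_name + '.'
    if name.startswith(prefix):
        name = name[len(prefix):]

    # drop the leaf module; only sub-package parts are kept
    subpkg_parts: list[str] = name.split('.')[:-1]
    return '.'.join([pkg_name, *subpkg_parts])
```

Under these assumptions a first-level module such as `snakelib.mod` maps to the bare package name, while `snakelib.subpkg.submod` maps to `snakelib.subpkg`.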


@ -1,8 +1,13 @@
""" """
Multiple python programs invoking the runtime. Multiple python programs invoking the runtime.
""" """
from __future__ import annotations
import platform import platform
import subprocess
import time import time
from typing import (
TYPE_CHECKING,
)
import pytest import pytest
import trio import trio
@ -10,14 +15,32 @@ import tractor
from tractor._testing import ( from tractor._testing import (
tractor_test, tractor_test,
) )
from tractor import (
current_actor,
Actor,
Context,
Portal,
)
from tractor.runtime import _state
from .conftest import ( from .conftest import (
sig_prog, sig_prog,
_INT_SIGNAL, _INT_SIGNAL,
_INT_RETURN_CODE, _INT_RETURN_CODE,
) )
if TYPE_CHECKING:
from tractor.msg import Aid
from tractor.discovery._addr import (
UnwrappedAddress,
)
def test_abort_on_sigint(daemon):
_non_linux: bool = platform.system() != 'Linux'
def test_abort_on_sigint(
daemon: subprocess.Popen,
):
assert daemon.returncode is None assert daemon.returncode is None
time.sleep(0.1) time.sleep(0.1)
sig_prog(daemon, _INT_SIGNAL) sig_prog(daemon, _INT_SIGNAL)
@ -30,39 +53,131 @@ def test_abort_on_sigint(daemon):
@tractor_test @tractor_test
async def test_cancel_remote_arbiter(daemon, reg_addr): async def test_cancel_remote_registrar(
assert not tractor.current_actor().is_arbiter daemon: subprocess.Popen,
reg_addr: UnwrappedAddress,
):
assert not current_actor().is_registrar
async with tractor.get_registry(reg_addr) as portal: async with tractor.get_registry(reg_addr) as portal:
await portal.cancel_actor() await portal.cancel_actor()
time.sleep(0.1) time.sleep(0.1)
# the arbiter channel server is cancelled but not its main task # the registrar channel server is cancelled but not its main task
assert daemon.returncode is None assert daemon.returncode is None
# no arbiter socket should exist # no registrar socket should exist
with pytest.raises(OSError): with pytest.raises(OSError):
async with tractor.get_registry(reg_addr) as portal: async with tractor.get_registry(reg_addr) as portal:
pass pass
def test_register_duplicate_name(daemon, reg_addr): def test_register_duplicate_name(
daemon: subprocess.Popen,
reg_addr: UnwrappedAddress,
):
async def main(): async def main():
async with tractor.open_nursery( async with tractor.open_nursery(
registry_addrs=[reg_addr], registry_addrs=[reg_addr],
) as n: ) as an:
assert not tractor.current_actor().is_arbiter assert not current_actor().is_registrar
p1 = await n.start_actor('doggy') p1 = await an.start_actor('doggy')
p2 = await n.start_actor('doggy') p2 = await an.start_actor('doggy')
async with tractor.wait_for_actor('doggy') as portal: async with tractor.wait_for_actor('doggy') as portal:
assert portal.channel.uid in (p2.channel.uid, p1.channel.uid) assert portal.channel.uid in (p2.channel.uid, p1.channel.uid)
await n.cancel() await an.cancel()
# run it manually since we want to start **after** # XXX, run manually since we want to start this root **after**
# the other "daemon" program # the other "daemon" program with its own root.
trio.run(main)
@tractor.context
async def get_root_portal(
ctx: Context,
):
'''
Connect back to the root actor manually (using `._discovery` API)
and ensure its contact info is the same as our immediate parent.
'''
sub: Actor = current_actor()
rtvs: dict = _state._runtime_vars
raddrs: list[UnwrappedAddress] = rtvs['_root_addrs']
# await tractor.pause()
# XXX, in case the sub->root discovery breaks you might need
# this (i know i did Xp)!!
# from tractor.devx import mk_pdb
# mk_pdb().set_trace()
assert (
len(raddrs) == 1
and
list(sub._parent_chan.raddr.unwrap()) in raddrs
)
# connect back to our immediate parent which should also
# be the actor-tree's root.
from tractor.discovery._api import get_root
ptl: Portal
async with get_root() as ptl:
root_aid: Aid = ptl.chan.aid
parent_ptl: Portal = current_actor().get_parent()
assert (
root_aid.name == 'root'
and
parent_ptl.chan.aid == root_aid
)
await ctx.started()
def test_non_registrar_spawns_child(
daemon: subprocess.Popen,
reg_addr: UnwrappedAddress,
loglevel: str,
debug_mode: bool,
ci_env: bool,
):
'''
Ensure a non-registrar (serving) root actor can spawn a sub and
that sub can connect back (manually) to its parent, which is the
root, without issue.
More or less this audits the global contact info in
`._state._runtime_vars`.
'''
async def main():
# XXX, since apparently on macos in GH's CI it can be a race
# with the `daemon` registrar on grabbing the socket-addr..
if ci_env and _non_linux:
await trio.sleep(.5)
async with tractor.open_nursery(
registry_addrs=[reg_addr],
loglevel=loglevel,
debug_mode=debug_mode,
) as an:
actor: Actor = tractor.current_actor()
assert not actor.is_registrar
sub_ptl: Portal = await an.start_actor(
name='sub',
enable_modules=[__name__],
)
async with sub_ptl.open_context(
get_root_portal,
) as (ctx, _):
print('Waiting for `sub` to connect back to us..')
await an.cancel()
# XXX, run manually since we want to start this root **after**
# the other "daemon" program with its own root.
trio.run(main) trio.run(main)

@ -0,0 +1,333 @@
'''
Verify that externally registered remote actor error
types are correctly relayed, boxed, and re-raised across
IPC actor hops via `reg_err_types()`.
Also ensure that when custom error types are NOT registered
the framework indicates the lookup failure to the user.
'''
import pytest
import trio
import tractor
from tractor import (
Context,
Portal,
RemoteActorError,
)
from tractor._exceptions import (
get_err_type,
reg_err_types,
)
# -- custom app-level errors for testing --
class CustomAppError(Exception):
'''
A hypothetical user-app error that should be
boxed+relayed by `tractor` IPC when registered.
'''
class AnotherAppError(Exception):
'''
A second custom error for multi-type registration.
'''
class UnregisteredAppError(Exception):
'''
A custom error that is intentionally NEVER
registered via `reg_err_types()` so we can
verify the framework's failure indication.
'''
# -- remote-task endpoints --
@tractor.context
async def raise_custom_err(
ctx: Context,
) -> None:
'''
Remote ep that raises a `CustomAppError`
after sync-ing with the caller.
'''
await ctx.started()
raise CustomAppError(
'the app exploded remotely'
)
@tractor.context
async def raise_another_err(
ctx: Context,
) -> None:
'''
Remote ep that raises `AnotherAppError`.
'''
await ctx.started()
raise AnotherAppError(
'another app-level kaboom'
)
@tractor.context
async def raise_unreg_err(
ctx: Context,
) -> None:
'''
Remote ep that raises an `UnregisteredAppError`
which has NOT been `reg_err_types()`-registered.
'''
await ctx.started()
raise UnregisteredAppError(
'this error type is unknown to tractor'
)
# -- unit tests for the type-registry plumbing --
class TestRegErrTypesPlumbing:
'''
Low-level checks on `reg_err_types()` and
`get_err_type()` without requiring IPC.
'''
def test_unregistered_type_returns_none(self):
'''
An unregistered custom error name should yield
`None` from `get_err_type()`.
'''
result = get_err_type('CustomAppError')
assert result is None
def test_register_and_lookup(self):
'''
After `reg_err_types()`, the custom type should
be discoverable via `get_err_type()`.
'''
reg_err_types([CustomAppError])
result = get_err_type('CustomAppError')
assert result is CustomAppError
def test_register_multiple_types(self):
'''
Registering a list of types should make each
one individually resolvable.
'''
reg_err_types([
CustomAppError,
AnotherAppError,
])
assert (
get_err_type('CustomAppError')
is CustomAppError
)
assert (
get_err_type('AnotherAppError')
is AnotherAppError
)
def test_builtin_types_always_resolve(self):
'''
Builtin error types like `RuntimeError` and
`ValueError` should always be found without
any prior registration.
'''
assert (
get_err_type('RuntimeError')
is RuntimeError
)
assert (
get_err_type('ValueError')
is ValueError
)
def test_tractor_native_types_resolve(self):
'''
`tractor`-internal exc types (e.g.
`ContextCancelled`) should always resolve.
'''
assert (
get_err_type('ContextCancelled')
is tractor.ContextCancelled
)
def test_boxed_type_str_without_ipc_msg(self):
'''
When a `RemoteActorError` is constructed
without an IPC msg (and no resolvable type),
`.boxed_type_str` should return `'<unknown>'`.
'''
rae = RemoteActorError('test')
assert rae.boxed_type_str == '<unknown>'
# -- IPC-level integration tests --
def test_registered_custom_err_relayed(
debug_mode: bool,
tpt_proto: str,
):
'''
When a custom error type is registered via
`reg_err_types()` on BOTH sides of an IPC dialog,
the parent should receive a `RemoteActorError`
whose `.boxed_type` matches the original custom
error type.
'''
reg_err_types([CustomAppError])
async def main():
async with tractor.open_nursery(
debug_mode=debug_mode,
enable_transports=[tpt_proto],
) as an:
ptl: Portal = await an.start_actor(
'custom-err-raiser',
enable_modules=[__name__],
)
async with ptl.open_context(
raise_custom_err,
) as (ctx, sent):
assert not sent
try:
await ctx.wait_for_result()
except RemoteActorError as rae:
assert rae.boxed_type is CustomAppError
assert rae.src_type is CustomAppError
assert 'the app exploded remotely' in str(
rae.tb_str
)
raise
with pytest.raises(RemoteActorError) as excinfo:
trio.run(main)
rae = excinfo.value
assert rae.boxed_type is CustomAppError
def test_registered_another_err_relayed(
debug_mode: bool,
tpt_proto: str,
):
'''
Same as above but for a different custom error
type to verify multi-type registration works
end-to-end over IPC.
'''
reg_err_types([AnotherAppError])
async def main():
async with tractor.open_nursery(
debug_mode=debug_mode,
enable_transports=[tpt_proto],
) as an:
ptl: Portal = await an.start_actor(
'another-err-raiser',
enable_modules=[__name__],
)
async with ptl.open_context(
raise_another_err,
) as (ctx, sent):
assert not sent
try:
await ctx.wait_for_result()
except RemoteActorError as rae:
assert (
rae.boxed_type
is AnotherAppError
)
raise
await an.cancel()
with pytest.raises(RemoteActorError) as excinfo:
trio.run(main)
rae = excinfo.value
assert rae.boxed_type is AnotherAppError
def test_unregistered_err_still_relayed(
debug_mode: bool,
tpt_proto: str,
):
'''
Verify that even when a custom error type is NOT registered via
`reg_err_types()`, the remote error is still relayed as
a `RemoteActorError` with all string-level info preserved
(traceback, type name, source actor uid).
The `.boxed_type` will be `None` (type obj can't be resolved) but
`.boxed_type_str` and `.src_type_str` still report the original
type name from the IPC msg.
This documents the expected limitation: without `reg_err_types()`
the `.boxed_type` property can NOT resolve to the original Python
type.
'''
# NOTE: intentionally do NOT call
# `reg_err_types([UnregisteredAppError])`
async def main():
async with tractor.open_nursery(
debug_mode=debug_mode,
enable_transports=[tpt_proto],
) as an:
ptl: Portal = await an.start_actor(
'unreg-err-raiser',
enable_modules=[__name__],
)
async with ptl.open_context(
raise_unreg_err,
) as (ctx, sent):
assert not sent
await ctx.wait_for_result()
await an.cancel()
with pytest.raises(RemoteActorError) as excinfo:
trio.run(main)
rae = excinfo.value
# the error IS relayed even without
# registration; type obj is unresolvable but
# all string-level info is preserved.
assert rae.boxed_type is None # NOT `UnregisteredAppError`
assert rae.src_type is None
# string names survive the IPC round-trip
# via the `Error` msg fields.
assert (
rae.src_type_str
==
'UnregisteredAppError'
)
assert (
rae.boxed_type_str
==
'UnregisteredAppError'
)
# original traceback content is preserved
assert 'this error type is unknown' in rae.tb_str
assert 'UnregisteredAppError' in rae.tb_str
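Note on the tests above: they rely on a name-to-type lookup, where a registered class resolves to the type object and an unregistered one leaves only its string name. The following is a minimal sketch of that idea, not `tractor`'s actual implementation; `register_types()`, `lookup_type()` and `AppError` are hypothetical names invented for this illustration.

```python
from __future__ import annotations

import builtins

# Hypothetical stand-in for a name -> exception-type registry; this is
# NOT tractor's implementation, only an illustration of the lookup idea.
_registry: dict[str, type[BaseException]] = {}


def register_types(types: list[type[BaseException]]) -> None:
    '''Make each type resolvable by its class name.'''
    for exc_type in types:
        _registry[exc_type.__name__] = exc_type


def lookup_type(name: str) -> type[BaseException] | None:
    '''Resolve a type name; builtin exceptions need no registration.'''
    if (exc_type := _registry.get(name)) is not None:
        return exc_type
    builtin = getattr(builtins, name, None)
    if isinstance(builtin, type) and issubclass(builtin, BaseException):
        return builtin
    return None


class AppError(Exception):
    '''Hypothetical app-level error used only in this sketch.'''


# Before registration a receiver only has the string name to go on.
assert lookup_type('AppError') is None
assert lookup_type('RuntimeError') is RuntimeError

register_types([AppError])
assert lookup_type('AppError') is AppError
```

Because such a registry is process-global state, a "not yet registered" check like `test_unregistered_type_returns_none` presumably depends on running before any test that registers the same type.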

@ -12,14 +12,14 @@ import trio
import tractor import tractor
from tractor.trionics import ( from tractor.trionics import (
maybe_open_context, maybe_open_context,
collapse_eg,
) )
from tractor.log import ( from tractor.log import (
get_console_log, get_console_log,
get_logger, get_logger,
) )
log = get_logger(__name__)
log = get_logger()
_resource: int = 0 _resource: int = 0
@ -213,9 +213,12 @@ def test_open_local_sub_to_stream(
N local tasks using `trionics.maybe_open_context()`. N local tasks using `trionics.maybe_open_context()`.
''' '''
timeout: float = 3.6 from .conftest import cpu_scaling_factor
if platform.system() == "Windows": timeout: float = (
timeout: float = 10 4
if not platform.system() == "Windows"
else 10
) * cpu_scaling_factor()
if debug_mode: if debug_mode:
timeout = 999 timeout = 999
@ -319,7 +322,7 @@ def test_open_local_sub_to_stream(
@acm @acm
async def cancel_outer_cs( async def maybe_cancel_outer_cs(
cs: trio.CancelScope|None = None, cs: trio.CancelScope|None = None,
delay: float = 0, delay: float = 0,
): ):
@ -333,12 +336,31 @@ async def cancel_outer_cs(
if cs: if cs:
log.info('task calling cs.cancel()') log.info('task calling cs.cancel()')
cs.cancel() cs.cancel()
trio.lowlevel.checkpoint()
yield yield
if cs:
await trio.sleep_forever() await trio.sleep_forever()
# XXX, if not cancelled we'll leak this inf-blocking
# subtask to the actor's service tn..
else:
await trio.lowlevel.checkpoint()
@pytest.mark.parametrize(
'delay',
[0.05, 0.5, 1],
ids="pre_sleep_delay={}".format,
)
@pytest.mark.parametrize(
'cancel_by_cs',
[True, False],
ids="cancel_by_cs={}".format,
)
def test_lock_not_corrupted_on_fast_cancel( def test_lock_not_corrupted_on_fast_cancel(
delay: float,
cancel_by_cs: bool,
debug_mode: bool, debug_mode: bool,
loglevel: str, loglevel: str,
): ):
@ -355,17 +377,14 @@ def test_lock_not_corrupted_on_fast_cancel(
due to it having erroneously exited without calling due to it having erroneously exited without calling
`lock.release()`. `lock.release()`.
''' '''
delay: float = 1.
async def use_moc( async def use_moc(
cs: trio.CancelScope|None,
delay: float, delay: float,
cs: trio.CancelScope|None = None,
): ):
log.info('task entering moc') log.info('task entering moc')
async with maybe_open_context( async with maybe_open_context(
cancel_outer_cs, maybe_cancel_outer_cs,
kwargs={ kwargs={
'cs': cs, 'cs': cs,
'delay': delay, 'delay': delay,
@ -376,8 +395,14 @@ def test_lock_not_corrupted_on_fast_cancel(
else: else:
log.info('1st task entered') log.info('1st task entered')
if cs:
await trio.sleep_forever() await trio.sleep_forever()
else:
await trio.sleep(delay)
# ^END, exit shared ctx.
async def main(): async def main():
with trio.fail_after(delay + 2): with trio.fail_after(delay + 2):
async with ( async with (
@ -385,6 +410,7 @@ def test_lock_not_corrupted_on_fast_cancel(
debug_mode=debug_mode, debug_mode=debug_mode,
loglevel=loglevel, loglevel=loglevel,
), ),
# ?TODO, pass this as the parent tn?
trio.open_nursery() as tn, trio.open_nursery() as tn,
): ):
get_console_log('info') get_console_log('info')
@ -392,15 +418,206 @@ def test_lock_not_corrupted_on_fast_cancel(
cs = tn.cancel_scope cs = tn.cancel_scope
tn.start_soon( tn.start_soon(
use_moc, use_moc,
cs,
delay, delay,
cs if cancel_by_cs else None,
name='child', name='child',
) )
with trio.CancelScope() as rent_cs: with trio.CancelScope() as rent_cs:
await use_moc( await use_moc(
cs=rent_cs,
delay=delay, delay=delay,
cs=rent_cs if cancel_by_cs else None,
) )
trio.run(main)
@acm
async def acm_with_resource(resource_id: str):
'''
Yield `resource_id` as the cached value.
Used to verify per-`ctx_key` isolation when the same
`acm_func` is called with different kwargs.
'''
yield resource_id
def test_per_ctx_key_resource_lifecycle(
debug_mode: bool,
loglevel: str,
):
'''
Verify that `maybe_open_context()` correctly isolates resource
lifecycle **per `ctx_key`** when the same `acm_func` is called
with different kwargs.
Previously `_Cache.users` was a single global `int` and
`_Cache.locks` was keyed on `fid` (function ID), so calling
the same `acm_func` with different kwargs (producing different
`ctx_key`s) meant:
- teardown for one key was skipped bc the *other* key's users
kept the global count > 0,
- and re-entry could hit the old
`assert not resources.get(ctx_key)` crash during the
teardown window.
This was the root cause of a long-standing bug in piker's
`brokerd.kraken` backend.
'''
timeout: float = 6
if debug_mode:
timeout = 999
async def main():
a_ready = trio.Event()
a_exit = trio.Event()
async def hold_resource_a():
'''
Open resource 'a' and keep it alive until signalled.
'''
async with maybe_open_context(
acm_with_resource,
kwargs={'resource_id': 'a'},
) as (cache_hit, value):
assert not cache_hit
assert value == 'a'
log.info("resource 'a' entered (holding)")
a_ready.set()
await a_exit.wait()
log.info("resource 'a' exiting")
with trio.fail_after(timeout):
async with (
tractor.open_root_actor(
debug_mode=debug_mode,
loglevel=loglevel,
),
trio.open_nursery() as tn,
):
# Phase 1: bg task holds resource 'a' open.
tn.start_soon(hold_resource_a)
await a_ready.wait()
# Phase 2: open resource 'b' (different kwargs,
# same acm_func) then exit it while 'a' is still
# alive.
async with maybe_open_context(
acm_with_resource,
kwargs={'resource_id': 'b'},
) as (cache_hit, value):
assert not cache_hit
assert value == 'b'
log.info("resource 'b' entered")
log.info("resource 'b' exited, waiting for teardown")
await trio.lowlevel.checkpoint()
# Phase 3: re-open 'b'; must be a fresh cache MISS
# proving 'b' was torn down independently of 'a'.
#
# With the old global `_Cache.users` counter this
# would be a stale cache HIT (leaked resource) or
# trigger `assert not resources.get(ctx_key)`.
async with maybe_open_context(
acm_with_resource,
kwargs={'resource_id': 'b'},
) as (cache_hit, value):
assert not cache_hit, (
"resource 'b' was NOT torn down despite "
"having zero users! (global user count bug)"
)
assert value == 'b'
log.info(
"resource 'b' re-entered "
"(cache miss, correct)"
)
# Phase 4: let 'a' exit, clean shutdown.
a_exit.set()
trio.run(main)
def test_moc_reentry_during_teardown(
debug_mode: bool,
loglevel: str,
):
'''
Reproduce the piker `open_cached_client('kraken')` race:
- same `acm_func`, NO kwargs (identical `ctx_key`)
- multiple tasks share the cached resource
- all users exit -> teardown starts
- a NEW task enters during `_Cache.run_ctx.__aexit__`
- `values[ctx_key]` is gone (popped in inner finally)
but `resources[ctx_key]` still exists (outer finally
hasn't run yet bc the acm cleanup has checkpoints)
- old code: `assert not resources.get(ctx_key)` FIRES
This models the real-world scenario where `brokerd.kraken`
tasks concurrently call `open_cached_client('kraken')`
(same `acm_func`, empty kwargs, shared `ctx_key`) and
the teardown/re-entry race triggers intermittently.
'''
async def main():
in_aexit = trio.Event()
@acm
async def cached_client():
'''
Simulates `kraken.api.get_client()`:
- no params (all callers share one `ctx_key`)
- slow-ish cleanup to widen the race window
between `values.pop()` and `resources.pop()`
inside `_Cache.run_ctx`.
'''
yield 'the-client'
# Signal that we're in __aexit__ — at this
# point `values` has already been popped by
# `run_ctx`'s inner finally, but `resources`
# is still alive (outer finally hasn't run).
in_aexit.set()
await trio.sleep(10)
first_done = trio.Event()
async def use_and_exit():
async with maybe_open_context(
cached_client,
) as (cache_hit, value):
assert value == 'the-client'
first_done.set()
async def reenter_during_teardown():
'''
Wait for the acm's `__aexit__` to start (meaning
`values` is popped but `resources` still exists),
then re-enter triggering the assert.
'''
await in_aexit.wait()
async with maybe_open_context(
cached_client,
) as (cache_hit, value):
assert value == 'the-client'
with trio.fail_after(5):
async with (
tractor.open_root_actor(
debug_mode=debug_mode,
loglevel=loglevel,
),
collapse_eg(),
trio.open_nursery() as tn,
):
tn.start_soon(use_and_exit)
tn.start_soon(reenter_during_teardown)
trio.run(main) trio.run(main)

@ -49,7 +49,7 @@ def test_infected_root_actor(
), ),
to_asyncio.open_channel_from( to_asyncio.open_channel_from(
aio_echo_server, aio_echo_server,
) as (first, chan), ) as (chan, first),
): ):
assert first == 'start' assert first == 'start'
@ -91,13 +91,12 @@ def test_infected_root_actor(
async def sync_and_err( async def sync_and_err(
# just signature placeholders for compat with # just signature placeholders for compat with
# ``to_asyncio.open_channel_from()`` # ``to_asyncio.open_channel_from()``
to_trio: trio.MemorySendChannel, chan: tractor.to_asyncio.LinkedTaskChannel,
from_trio: asyncio.Queue,
ev: asyncio.Event, ev: asyncio.Event,
): ):
if to_trio: if chan:
to_trio.send_nowait('start') chan.started_nowait('start')
await ev.wait() await ev.wait()
raise RuntimeError('asyncio-side') raise RuntimeError('asyncio-side')
@ -174,7 +173,7 @@ def test_trio_prestarted_task_bubbles(
sync_and_err, sync_and_err,
ev=aio_ev, ev=aio_ev,
) )
) as (first, chan), ) as (chan, first),
): ):
for i in range(5): for i in range(5):

@ -94,15 +94,15 @@ def test_runtime_vars_unset(
after the root actor-runtime exits! after the root actor-runtime exits!
''' '''
assert not tractor._state._runtime_vars['_debug_mode'] assert not tractor.runtime._state._runtime_vars['_debug_mode']
async def main(): async def main():
assert not tractor._state._runtime_vars['_debug_mode'] assert not tractor.runtime._state._runtime_vars['_debug_mode']
async with tractor.open_nursery( async with tractor.open_nursery(
debug_mode=True, debug_mode=True,
): ):
assert tractor._state._runtime_vars['_debug_mode'] assert tractor.runtime._state._runtime_vars['_debug_mode']
# after runtime closure, should be reverted! # after runtime closure, should be reverted!
assert not tractor._state._runtime_vars['_debug_mode'] assert not tractor.runtime._state._runtime_vars['_debug_mode']
trio.run(main) trio.run(main)

@ -110,7 +110,7 @@ def test_rpc_errors(
) as n: ) as n:
actor = tractor.current_actor() actor = tractor.current_actor()
assert actor.is_arbiter assert actor.is_registrar
await n.run_in_actor( await n.run_in_actor(
sleep_back_actor, sleep_back_actor,
actor_name=subactor_requests_to, actor_name=subactor_requests_to,

@ -22,6 +22,10 @@ def unlink_file():
async def crash_and_clean_tmpdir( async def crash_and_clean_tmpdir(
tmp_file_path: str, tmp_file_path: str,
error: bool = True, error: bool = True,
rent_cancel: bool = True,
# XXX unused, but do we really need to test these cases?
self_cancel: bool = False,
): ):
global _file_path global _file_path
_file_path = tmp_file_path _file_path = tmp_file_path
@ -32,43 +36,75 @@ async def crash_and_clean_tmpdir(
assert os.path.isfile(tmp_file_path) assert os.path.isfile(tmp_file_path)
await trio.sleep(0.1) await trio.sleep(0.1)
if error: if error:
print('erroring in subactor!')
assert 0 assert 0
else:
elif self_cancel:
print('SELF-cancelling subactor!')
actor.cancel_soon() actor.cancel_soon()
elif rent_cancel:
await trio.sleep_forever()
print('subactor exiting task!')
@pytest.mark.parametrize( @pytest.mark.parametrize(
'error_in_child', 'error_in_child',
[True, False], [True, False],
ids='error_in_child={}'.format,
) )
@tractor_test @tractor_test
async def test_lifetime_stack_wipes_tmpfile( async def test_lifetime_stack_wipes_tmpfile(
tmp_path, tmp_path,
error_in_child: bool, error_in_child: bool,
loglevel: str,
# log: tractor.log.StackLevelAdapter,
# ^TODO, once landed via macos support!
): ):
child_tmp_file = tmp_path / "child.txt" child_tmp_file = tmp_path / "child.txt"
child_tmp_file.touch() child_tmp_file.touch()
assert child_tmp_file.exists() assert child_tmp_file.exists()
path = str(child_tmp_file) path = str(child_tmp_file)
# NOTE, this is expected to cancel the sub
# in the `error_in_child=False` case!
timeout: float = (
1.6 if error_in_child
else 1
)
try: try:
with trio.move_on_after(0.5): with trio.move_on_after(timeout) as cs:
async with tractor.open_nursery() as n: async with tractor.open_nursery(
await ( # inlined portal loglevel=loglevel,
await n.run_in_actor( ) as an:
await ( # inlined `tractor.Portal`
await an.run_in_actor(
crash_and_clean_tmpdir, crash_and_clean_tmpdir,
tmp_file_path=path, tmp_file_path=path,
error=error_in_child, error=error_in_child,
) )
).result() ).result()
except ( except (
tractor.RemoteActorError, tractor.RemoteActorError,
# tractor.BaseExceptionGroup,
BaseExceptionGroup, BaseExceptionGroup,
): ) as _exc:
pass exc = _exc
from tractor.log import get_console_log
log = get_console_log(
level=loglevel,
name=__name__,
)
log.exception(
f'Subactor failed as expected with {type(exc)!r}\n'
)
# tmp file should have been wiped by # tmp file should have been wiped by
# teardown stack. # teardown stack.
assert not child_tmp_file.exists() assert not child_tmp_file.exists()
if error_in_child:
assert not cs.cancel_called
else:
# expect timeout in some cases?
assert cs.cancel_called

@ -2,6 +2,7 @@
Shared mem primitives and APIs. Shared mem primitives and APIs.
""" """
import platform
import uuid import uuid
# import numpy # import numpy
@ -53,7 +54,18 @@ def test_child_attaches_alot():
shm_key=shml.key, shm_key=shml.key,
) as (ctx, start_val), ) as (ctx, start_val),
): ):
assert start_val == key assert (_key := shml.key) == start_val
if platform.system() != 'Darwin':
# XXX, macOS has a char limit..
# see `ipc._shm._shorten_key_for_macos`
assert (
start_val
==
key
==
_key
)
await ctx.result() await ctx.result()
await portal.cancel_actor() await portal.cancel_actor()

@ -1,5 +1,12 @@
""" """
Spawning basics Spawning basics including audit of,
- subproc bootstrap, such as subactor runtime-data/config inheritance,
- basic (and mostly legacy) `ActorNursery` subactor starting and
cancel APIs.
Simple (and generally legacy) examples from the original
API design.
""" """
from functools import partial from functools import partial
@ -33,13 +40,13 @@ async def spawn(
assert actor is None # no runtime yet assert actor is None # no runtime yet
async with ( async with (
tractor.open_root_actor( tractor.open_root_actor(
arbiter_addr=reg_addr, registry_addrs=[reg_addr],
), ),
tractor.open_nursery() as an, tractor.open_nursery() as an,
): ):
# now runtime exists # now runtime exists
actor: tractor.Actor = tractor.current_actor() actor: tractor.Actor = tractor.current_actor()
assert actor.is_arbiter == should_be_root assert actor.is_registrar == should_be_root
# spawns subproc here # spawns subproc here
portal: tractor.Portal = await an.run_in_actor( portal: tractor.Portal = await an.run_in_actor(
@ -68,7 +75,7 @@ async def spawn(
assert result == 10 assert result == 10
return result return result
else: else:
assert actor.is_arbiter == should_be_root assert actor.is_registrar == should_be_root
return 10 return 10
@ -98,7 +105,9 @@ async def movie_theatre_question():
@tractor_test @tractor_test
async def test_movie_theatre_convo(start_method): async def test_movie_theatre_convo(
start_method: str,
):
''' '''
The main ``tractor`` routine. The main ``tractor`` routine.
@ -151,13 +160,16 @@ async def test_most_beautiful_word(
name='some_linguist', name='some_linguist',
) )
print(await portal.result()) res: Any = await portal.wait_for_result()
assert res == return_value
# The ``async with`` will unblock here since the 'some_linguist' # The ``async with`` will unblock here since the 'some_linguist'
# actor has completed its main task ``cellar_door``. # actor has completed its main task ``cellar_door``.
# this should pull the cached final result already captured during # this should pull the cached final result already captured during
# the nursery block exit. # the nursery block exit.
print(await portal.result()) res: Any = await portal.wait_for_result()
assert res == return_value
print(res)
async def check_loglevel(level): async def check_loglevel(level):
@ -168,22 +180,30 @@ async def check_loglevel(level):
log.critical('yoyoyo') log.critical('yoyoyo')
@pytest.mark.parametrize(
'level', [
'debug',
'cancel',
'critical'
],
ids='loglevel={}'.format,
)
def test_loglevel_propagated_to_subactor( def test_loglevel_propagated_to_subactor(
start_method, capfd: pytest.CaptureFixture,
capfd, start_method: str,
reg_addr, reg_addr: tuple,
level: str,
): ):
if start_method == 'mp_forkserver': if start_method == 'mp_forkserver':
pytest.skip( pytest.skip(
"a bug with `capfd` seems to make forkserver capture not work?") "a bug with `capfd` seems to make forkserver capture not work?"
)
level = 'critical'
async def main(): async def main():
async with tractor.open_nursery( async with tractor.open_nursery(
name='arbiter', name='registrar',
start_method=start_method, start_method=start_method,
arbiter_addr=reg_addr, registry_addrs=[reg_addr],
) as tn: ) as tn:
await tn.run_in_actor( await tn.run_in_actor(
@ -197,3 +217,121 @@ def test_loglevel_propagated_to_subactor(
# ensure subactor spits log message on stderr # ensure subactor spits log message on stderr
captured = capfd.readouterr() captured = capfd.readouterr()
assert 'yoyoyo' in captured.err assert 'yoyoyo' in captured.err
async def check_parent_main_inheritance(
expect_inherited: bool,
) -> bool:
'''
Assert that the child actor's ``_parent_main_data`` matches the
``inherit_parent_main`` flag it was spawned with.
With the trio spawn backend the parent's ``__main__`` bootstrap
data is captured and forwarded to each child so it can replay
the parent's ``__main__`` as ``__mp_main__``, mirroring the
stdlib ``multiprocessing`` bootstrap:
https://docs.python.org/3/library/multiprocessing.html#the-spawn-and-forkserver-start-methods
When ``inherit_parent_main=False`` the data dict is empty
(``{}``) so no fixup ever runs and the child keeps its own
``__main__`` untouched.
NOTE: under `pytest` the parent ``__main__`` is
``pytest.__main__`` whose ``_fixup_main_from_name()`` is a no-op
(the name ends with ``.__main__``), so we cannot observe
a difference in ``sys.modules['__main__'].__name__`` between the
two modes. Checking ``_parent_main_data`` directly is the most
reliable verification that the flag is threaded through
correctly; a ``RemoteActorError[AssertionError]`` propagates on
mismatch.
'''
import tractor
actor: tractor.Actor = tractor.current_actor()
has_data: bool = bool(actor._parent_main_data)
assert has_data == expect_inherited, (
f'Expected _parent_main_data to be '
f'{"non-empty" if expect_inherited else "empty"}, '
f'got: {actor._parent_main_data!r}'
)
return has_data
def test_run_in_actor_can_skip_parent_main_inheritance(
start_method: str, # <- only supported on the `trio` backend for now.
):
'''
Verify ``inherit_parent_main=False`` on ``run_in_actor()``
prevents parent ``__main__`` data from reaching the child.
'''
if start_method != 'trio':
pytest.skip(
'parent main-inheritance opt-out only affects the trio backend'
)
async def main():
async with tractor.open_nursery(start_method='trio') as an:
# Default: child receives parent __main__ bootstrap data
replaying = await an.run_in_actor(
check_parent_main_inheritance,
name='replaying-parent-main',
expect_inherited=True,
)
await replaying.result()
# Opt-out: child gets no parent __main__ data
isolated = await an.run_in_actor(
check_parent_main_inheritance,
name='isolated-parent-main',
inherit_parent_main=False,
expect_inherited=False,
)
await isolated.result()
trio.run(main)
def test_start_actor_can_skip_parent_main_inheritance(
start_method: str, # <- only supported on the `trio` backend for now.
):
'''
Verify ``inherit_parent_main=False`` on ``start_actor()``
prevents parent ``__main__`` data from reaching the child.
'''
if start_method != 'trio':
pytest.skip(
'parent main-inheritance opt-out only affects the trio backend'
)
async def main():
async with tractor.open_nursery(start_method='trio') as an:
# Default: child receives parent __main__ bootstrap data
replaying = await an.start_actor(
'replaying-parent-main',
enable_modules=[__name__],
)
result = await replaying.run(
check_parent_main_inheritance,
expect_inherited=True,
)
assert result is True
await replaying.cancel_actor()
# Opt-out: child gets no parent __main__ data
isolated = await an.start_actor(
'isolated-parent-main',
enable_modules=[__name__],
inherit_parent_main=False,
)
result = await isolated.run(
check_parent_main_inheritance,
expect_inherited=False,
)
assert result is False
await isolated.cancel_actor()
trio.run(main)

@ -75,7 +75,7 @@ async def open_sequence_streamer(
) -> tractor.MsgStream: ) -> tractor.MsgStream:
async with tractor.open_nursery( async with tractor.open_nursery(
arbiter_addr=reg_addr, registry_addrs=[reg_addr],
start_method=start_method, start_method=start_method,
) as an: ) as an:

@ -30,21 +30,23 @@ from ._streaming import (
MsgStream as MsgStream, MsgStream as MsgStream,
stream as stream, stream as stream,
) )
from ._discovery import ( from .discovery._api import (
get_registry as get_registry, get_registry as get_registry,
find_actor as find_actor, find_actor as find_actor,
wait_for_actor as wait_for_actor, wait_for_actor as wait_for_actor,
query_actor as query_actor, query_actor as query_actor,
) )
from ._supervise import ( from .runtime._supervise import (
open_nursery as open_nursery, open_nursery as open_nursery,
ActorNursery as ActorNursery, ActorNursery as ActorNursery,
) )
from ._state import ( from .runtime._state import (
RuntimeVars as RuntimeVars,
current_actor as current_actor, current_actor as current_actor,
is_root_process as is_root_process,
current_ipc_ctx as current_ipc_ctx, current_ipc_ctx as current_ipc_ctx,
debug_mode as debug_mode debug_mode as debug_mode,
get_runtime_vars as get_runtime_vars,
is_root_process as is_root_process,
) )
from ._exceptions import ( from ._exceptions import (
ContextCancelled as ContextCancelled, ContextCancelled as ContextCancelled,
@ -65,6 +67,10 @@ from ._root import (
open_root_actor as open_root_actor, open_root_actor as open_root_actor,
) )
from .ipc import Channel as Channel from .ipc import Channel as Channel
from ._portal import Portal as Portal from .runtime._portal import Portal as Portal
from ._runtime import Actor as Actor from .runtime._runtime import Actor as Actor
from .discovery._registry import (
Registrar as Registrar,
Arbiter as Arbiter,
)
# from . import hilevel as hilevel # from . import hilevel as hilevel

@ -22,8 +22,8 @@ import argparse
from ast import literal_eval from ast import literal_eval
from ._runtime import Actor from .runtime._runtime import Actor
from ._entry import _trio_main from .spawn._entry import _trio_main
def parse_uid(arg): def parse_uid(arg):

@ -66,7 +66,7 @@ async def open_actor_cluster(
trio.open_nursery() as tn, trio.open_nursery() as tn,
tractor.trionics.maybe_raise_from_masking_exc() tractor.trionics.maybe_raise_from_masking_exc()
): ):
uid = tractor.current_actor().uid uid = tractor.current_actor().aid.uid
async def _start(name: str) -> None: async def _start(name: str) -> None:
name = f'{uid[0]}.{name}' name = f'{uid[0]}.{name}'

@ -0,0 +1,48 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
(Hot) code (re-)load utils for python.
'''
import importlib.util
from pathlib import Path
import sys
from types import ModuleType
# ?TODO, move this into internal libs?
# -[ ] we already use it in `modden.config._pymod` as well
def load_module_from_path(
path: Path,
module_name: str|None = None,
) -> ModuleType:
'''
Taken from SO,
https://stackoverflow.com/a/67208147
which is based on stdlib docs,
https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly
'''
module_name = module_name or path.stem
spec = importlib.util.spec_from_file_location(
module_name,
str(path),
)
module = importlib.util.module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)
return module
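Usage note for `load_module_from_path()`: a minimal sketch of how the helper might be called, assuming a throwaway source file. The helper body is repeated so the snippet runs on its own; the `demo_plugin.py` file name and its `ANSWER` attribute are hypothetical. The sketch imports `importlib.util` explicitly, since a bare `import importlib` is not guaranteed to load the `util` submodule.

```python
from __future__ import annotations

import importlib.util
import sys
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(
    path: Path,
    module_name: str | None = None,
) -> ModuleType:
    # Mirrors the helper in the diff above; default name is the file stem.
    module_name = module_name or path.stem
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    module = importlib.util.module_from_spec(spec)
    # Register before executing so the module can be found by name.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical module file created only for this demonstration.
    src = Path(tmpdir) / 'demo_plugin.py'
    src.write_text('ANSWER = 42\n')
    mod = load_module_from_path(src)

assert mod.ANSWER == 42
assert mod.__name__ == 'demo_plugin'
assert sys.modules['demo_plugin'] is mod
```

Note that the helper overwrites any existing `sys.modules` entry with the same name, so callers loading files whose stems collide with installed packages would likely want to pass an explicit `module_name`.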

@ -70,6 +70,7 @@ from ._exceptions import (
MsgTypeError, MsgTypeError,
RemoteActorError, RemoteActorError,
StreamOverrun, StreamOverrun,
TransportClosed,
pack_from_raise, pack_from_raise,
unpack_error, unpack_error,
) )
@ -96,7 +97,7 @@ from ._streaming import (
MsgStream, MsgStream,
open_stream_from_ctx, open_stream_from_ctx,
) )
from ._state import ( from .runtime._state import (
current_actor, current_actor,
debug_mode, debug_mode,
_ctxvar_Context, _ctxvar_Context,
@ -106,14 +107,14 @@ from .trionics import (
) )
# ------ - ------ # ------ - ------
if TYPE_CHECKING: if TYPE_CHECKING:
from ._portal import Portal from .runtime._portal import Portal
from ._runtime import Actor from .runtime._runtime import Actor
from .ipc._transport import MsgTransport from .ipc._transport import MsgTransport
from .devx._frame_stack import ( from .devx._frame_stack import (
CallerInfo, CallerInfo,
) )
log = get_logger(__name__) log = get_logger()
class Unresolved: class Unresolved:
@ -462,10 +463,11 @@ class Context:
# self._cancel_called = val # self._cancel_called = val
# TODO, use the `Actor.aid: Aid` instead!
@property @property
def canceller(self) -> tuple[str, str]|None: def canceller(self) -> tuple[str, str]|None:
''' '''
`Actor.uid: tuple[str, str]` of the (remote) `Actor.aid.uid: tuple[str, str]` of the (remote)
actor-process whose task was cancelled thus causing this actor-process whose task was cancelled thus causing this
(side of the) context to also be cancelled. (side of the) context to also be cancelled.
@ -498,12 +500,12 @@ class Context:
if from_uid := re.src_uid: if from_uid := re.src_uid:
from_uid: tuple = tuple(from_uid) from_uid: tuple = tuple(from_uid)
our_uid: tuple = self._actor.uid our_uid: tuple = self._actor.aid.uid
our_canceller = self.canceller our_canceller = self.canceller
return bool( return bool(
isinstance((ctxc := re), ContextCancelled) isinstance((ctxc := re), ContextCancelled)
and from_uid == self.chan.uid and from_uid == self.chan.aid.uid
and ctxc.canceller == our_uid and ctxc.canceller == our_uid
and our_canceller == our_uid and our_canceller == our_uid
) )
@ -514,7 +516,7 @@ class Context:
Records whether the task on the remote side of this IPC Records whether the task on the remote side of this IPC
context acknowledged a cancel request via a relayed context acknowledged a cancel request via a relayed
`ContextCancelled` with the `.canceller` attr set to the `ContextCancelled` with the `.canceller` attr set to the
`Actor.uid` of the local actor whose task entered `Actor.aid.uid` of the local actor whose task entered
`Portal.open_context()`. `Portal.open_context()`.
This will only be `True` when `.cancel()` is called and This will only be `True` when `.cancel()` is called and
@ -788,8 +790,8 @@ class Context:
# appropriately. # appropriately.
log.runtime( log.runtime(
'Setting remote error for ctx\n\n' 'Setting remote error for ctx\n\n'
f'<= {self.peer_side!r}: {self.chan.uid}\n' f'<= {self.peer_side!r}: {self.chan.aid.reprol()}\n'
f'=> {self.side!r}: {self._actor.uid}\n\n' f'=> {self.side!r}: {self._actor.aid.reprol()}\n\n'
f'{error!r}' f'{error!r}'
) )
self._remote_error: BaseException = error self._remote_error: BaseException = error
@ -810,7 +812,7 @@ class Context:
# cancelled. # cancelled.
# #
# !TODO, switching to `Actor.aid` here! # !TODO, switching to `Actor.aid` here!
if (canc := error.canceller) == self._actor.uid: if (canc := error.canceller) == self._actor.aid.uid:
whom: str = 'us' whom: str = 'us'
self._canceller = canc self._canceller = canc
else: else:
@ -1035,7 +1037,7 @@ class Context:
--------- ---------
- after the far end cancels, the `.cancel()` calling side - after the far end cancels, the `.cancel()` calling side
should receive a `ContextCancelled` with the should receive a `ContextCancelled` with the
`.canceller: tuple` uid set to the current `Actor.uid`. `.canceller: tuple` uid set to the current `Actor.aid.uid`.
- timeout (quickly) on failure to rx this ACK error-msg in - timeout (quickly) on failure to rx this ACK error-msg in
an attempt to sidestep 2-generals when the transport an attempt to sidestep 2-generals when the transport
@ -1064,9 +1066,9 @@ class Context:
) )
reminfo: str = ( reminfo: str = (
# ' =>\n' # ' =>\n'
# f'Context.cancel() => {self.chan.uid}\n' # f'Context.cancel() => {self.chan.aid.uid}\n'
f'\n' f'\n'
f'c)=> {self.chan.uid}\n' f'c)=> {self.chan.aid.reprol()}\n'
f' |_[{self.dst_maddr}\n' f' |_[{self.dst_maddr}\n'
f' >> {self.repr_rpc}\n' f' >> {self.repr_rpc}\n'
# f' >> {self._nsf}() -> {codec}[dict]:\n\n' # f' >> {self._nsf}() -> {codec}[dict]:\n\n'
@ -1210,7 +1212,7 @@ class Context:
''' '''
__tracebackhide__: bool = hide_tb __tracebackhide__: bool = hide_tb
peer_uid: tuple = self.chan.uid peer_uid: tuple = self.chan.aid.uid
# XXX NOTE XXX: `ContextCancelled`/`StreamOverrun` absorption # XXX NOTE XXX: `ContextCancelled`/`StreamOverrun` absorption
# for "graceful cancellation" case(s): # for "graceful cancellation" case(s):
@ -1227,7 +1229,7 @@ class Context:
# (`ContextCancelled`) as an expected # (`ContextCancelled`) as an expected
# error-msg-is-cancellation-ack IFF said # error-msg-is-cancellation-ack IFF said
# `remote_error: ContextCancelled` has `.canceller` # `remote_error: ContextCancelled` has `.canceller`
# set to the `Actor.uid` of THIS task (i.e. the # set to the `Actor.aid.uid` of THIS task (i.e. the
# cancellation requesting task's actor is the actor # cancellation requesting task's actor is the actor
# checking whether it should absorb the ctxc). # checking whether it should absorb the ctxc).
self_ctxc: bool = self._is_self_cancelled(remote_error) self_ctxc: bool = self._is_self_cancelled(remote_error)
@ -1678,7 +1680,7 @@ class Context:
elif self._started_called: elif self._started_called:
raise RuntimeError( raise RuntimeError(
f'called `.started()` twice on context with {self.chan.uid}' f'called `.started()` twice on context with {self.chan.aid.uid}'
) )
started_msg = Started( started_msg = Started(
@ -1811,7 +1813,7 @@ class Context:
''' '''
cid: str = self.cid cid: str = self.cid
chan: Channel = self.chan chan: Channel = self.chan
from_uid: tuple[str, str] = chan.uid from_uid: tuple[str, str] = chan.aid.uid
send_chan: trio.MemorySendChannel = self._send_chan send_chan: trio.MemorySendChannel = self._send_chan
nsf: NamespacePath = self._nsf nsf: NamespacePath = self._nsf
@ -1952,20 +1954,22 @@ class Context:
# overrun state and that msg isn't stuck in an # overrun state and that msg isn't stuck in an
# overflow queue what happens?!? # overflow queue what happens?!?
local_uid = self._actor.uid local_aid = self._actor.aid
txt: str = ( txt: str = (
'on IPC context:\n' 'on IPC context:\n'
f'<= sender: {from_uid}\n' f'<= sender: {from_uid}\n'
f' |_ {self._nsf}()\n\n' f' |_ {self._nsf}()\n\n'
f'=> overrun: {local_uid}\n' f'=> overrun: {local_aid.reprol()!r}\n'
f' |_cid: {cid}\n' f' |_cid: {cid}\n'
f' |_task: {self._task}\n' f' |_task: {self._task}\n'
) )
if not self._stream_opened: if not self._stream_opened:
txt += ( txt += (
f'\n*** No stream open on `{local_uid[0]}` side! ***\n\n' f'\n'
f'*** No stream open on `{local_aid.name}` side! ***\n'
f'\n'
f'{msg}\n' f'{msg}\n'
) )
@ -2114,7 +2118,11 @@ async def open_context_from_portal(
# XXX NOTE XXX: currently we do NOT allow opening a context # XXX NOTE XXX: currently we do NOT allow opening a context
# with "self" since the local feeder mem-chan processing # with "self" since the local feeder mem-chan processing
# is not built for it. # is not built for it.
if (uid := portal.channel.uid) == portal.actor.uid: if (
(uid := portal.channel.aid.uid)
==
portal.actor.aid.uid
):
raise RuntimeError( raise RuntimeError(
'** !! Invalid Operation !! **\n' '** !! Invalid Operation !! **\n'
'Can not open an IPC ctx with the local actor!\n' 'Can not open an IPC ctx with the local actor!\n'
@ -2328,7 +2336,7 @@ async def open_context_from_portal(
and and
ctxc is ctx._remote_error ctxc is ctx._remote_error
and and
ctxc.canceller == portal.actor.uid ctxc.canceller == portal.actor.aid.uid
): ):
log.cancel( log.cancel(
f'Context (cid=[{ctx.cid[-6:]}..] cancelled gracefully with:\n' f'Context (cid=[{ctx.cid[-6:]}..] cancelled gracefully with:\n'
@ -2391,19 +2399,21 @@ async def open_context_from_portal(
case trio.Cancelled(): case trio.Cancelled():
logmeth = log.cancel logmeth = log.cancel
cause: str = 'cancelled' cause: str = 'cancelled'
msg: str = (
f'ctx {ctx.side!r}-side {cause!r} with,\n'
f'{ctx.repr_outcome()!r}\n'
)
# XXX explicitly report on any non-graceful-taskc cases # XXX explicitly report on any non-graceful-taskc cases
case _: case _:
cause: str = 'errored' cause: str = 'errored'
logmeth = log.exception logmeth = log.exception
msg: str = f'ctx {ctx.side!r}-side {cause!r} with,\n'
logmeth( logmeth(msg)
f'ctx {ctx.side!r}-side {cause!r} with,\n'
f'{ctx.repr_outcome()!r}\n'
)
if debug_mode(): if debug_mode():
# async with debug.acquire_debug_lock(portal.actor.uid): # async with debug.acquire_debug_lock(portal.actor.aid.uid):
# pass # pass
# TODO: factor ^ into below for non-root cases? # TODO: factor ^ into below for non-root cases?
# #
@ -2426,10 +2436,7 @@ async def open_context_from_portal(
try: try:
# await pause(shield=True) # await pause(shield=True)
await ctx.cancel() await ctx.cancel()
except ( except TransportClosed:
trio.BrokenResourceError,
trio.ClosedResourceError,
):
log.warning( log.warning(
'IPC connection for context is broken?\n' 'IPC connection for context is broken?\n'
f'task: {ctx.cid}\n' f'task: {ctx.cid}\n'
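The hunks above swap `Actor.uid` for `Actor.aid.uid` in comparisons and `Actor.aid.reprol()` in log output. The idea can be sketched as below, assuming an id struct that still exposes the legacy `(name, uuid)` pair. The `Aid` class here is a hypothetical stand-in written for this example, not tractor's real type, and the `reprol()` output format is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aid:
    # Hypothetical stand-in, NOT tractor's real `Aid` type.
    name: str
    uuid: str

    @property
    def uid(self) -> tuple[str, str]:
        # Legacy pair form which the diff's comparisons still use.
        return (self.name, self.uuid)

    def reprol(self) -> str:
        # Assumed one-line repr format, chosen for this sketch only.
        return f'{self.name}@{self.uuid[:6]}'


ours = Aid(name='root', uuid='abcdef123456')

# A `.canceller` value as relayed in a cancel-ack: a plain tuple.
canceller: tuple[str, str] = ('root', 'abcdef123456')

# Equality checks keep working against the tuple form.
is_self_cancel = canceller == ours.uid
log_line = f'=> caller: {ours.reprol()}'
```

Keeping `.uid` as a derived tuple means existing equality checks need only an attribute-path change, while log lines can move to the shorter one-line form.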


@ -43,7 +43,7 @@ from msgspec import (
ValidationError, ValidationError,
) )
from tractor._state import current_actor from tractor.runtime._state import current_actor
from tractor.log import get_logger from tractor.log import get_logger
from tractor.msg import ( from tractor.msg import (
Error, Error,
@ -187,7 +187,31 @@ _body_fields: list[str] = list(
) )
def get_err_type(type_name: str) -> BaseException|None: def reg_err_types(
exc_types: list[Type[Exception]],
) -> None:
'''
Register custom exception types for local lookup.
Such that error types can be registered by an external
`tractor`-use-app code base which are expected to be raised
remotely; enables them being re-raised on the receiver side of
some inter-actor IPC dialog.
'''
for exc_type in exc_types:
log.debug(
f'Register custom exception,\n'
f'{exc_type!r}\n'
)
setattr(
_this_mod,
exc_type.__name__,
exc_type,
)
def get_err_type(type_name: str) -> Type[BaseException]|None:
''' '''
Look up an exception type by name from the set of locally known Look up an exception type by name from the set of locally known
namespaces: namespaces:
@ -301,7 +325,8 @@ class RemoteActorError(Exception):
# also pertains to our long long outstanding issue XD # also pertains to our long long outstanding issue XD
# https://github.com/goodboy/tractor/issues/5 # https://github.com/goodboy/tractor/issues/5
self._boxed_type: BaseException = boxed_type self._boxed_type: BaseException = boxed_type
self._src_type: BaseException|None = None self._src_type: Type[BaseException]|None = None
self._src_type_resolved: bool = False
self._ipc_msg: Error|None = ipc_msg self._ipc_msg: Error|None = ipc_msg
self._extra_msgdata = extra_msgdata self._extra_msgdata = extra_msgdata
@ -410,24 +435,41 @@ class RemoteActorError(Exception):
return self._ipc_msg.src_type_str return self._ipc_msg.src_type_str
@property @property
def src_type(self) -> str: def src_type(self) -> Type[BaseException]|None:
''' '''
Error type raised by original remote faulting actor. Error type raised by original remote faulting
actor.
When the error has only been relayed a single actor-hop When the error has only been relayed a single
this will be the same as the `.boxed_type`. actor-hop this will be the same as
`.boxed_type`.
If the type can not be resolved locally (i.e.
it was not registered via `reg_err_types()`)
a warning is logged and `None` is returned;
all string-level error info (`.src_type_str`,
`.tb_str`, etc.) remains available.
''' '''
if self._src_type is None: if not self._src_type_resolved:
self._src_type_resolved = True
if self._ipc_msg is None:
return None
self._src_type = get_err_type( self._src_type = get_err_type(
self._ipc_msg.src_type_str self._ipc_msg.src_type_str
) )
if not self._src_type: if not self._src_type:
raise TypeError( log.warning(
f'Failed to lookup src error type with ' f'Failed to lookup src error type via\n'
f'`tractor._exceptions.get_err_type()`:\n' f'`tractor._exceptions.get_err_type()`:\n'
f'{self.src_type_str}' f'\n'
f'`{self._ipc_msg.src_type_str}`'
f' is not registered!\n'
f'\n'
f'Call `reg_err_types()` to enable'
f' full type reconstruction.\n'
) )
return self._src_type return self._src_type
@ -435,20 +477,30 @@ class RemoteActorError(Exception):
@property @property
def boxed_type_str(self) -> str: def boxed_type_str(self) -> str:
''' '''
String-name of the (last hop's) boxed error type. String-name of the (last hop's) boxed error
type.
Falls back to the IPC-msg-encoded type-name
str when the type can not be resolved locally
(e.g. unregistered custom errors).
''' '''
# TODO, maybe support also serializing the # TODO, maybe support also serializing the
# `ExceptionGroup.exeptions: list[BaseException]` set under # `ExceptionGroup.exceptions: list[BaseException]`
# certain conditions? # set under certain conditions?
bt: Type[BaseException] = self.boxed_type bt: Type[BaseException] = self.boxed_type
if bt: if bt:
return str(bt.__name__) return str(bt.__name__)
return '' # fallback to the str name from the IPC msg
# when the type obj can't be resolved.
if self._ipc_msg:
return self._ipc_msg.boxed_type_str
return '<unknown>'
@property @property
def boxed_type(self) -> Type[BaseException]: def boxed_type(self) -> Type[BaseException]|None:
''' '''
Error type boxed by last actor IPC hop. Error type boxed by last actor IPC hop.
@ -677,10 +729,22 @@ class RemoteActorError(Exception):
failing actor's remote env. failing actor's remote env.
''' '''
# TODO: better tb insertion and all the fancier dunder # TODO: better tb insertion and all the fancier
# metadata stuff as per `.__context__` etc. and friends: # dunder metadata stuff as per `.__context__`
# etc. and friends:
# https://github.com/python-trio/trio/issues/611 # https://github.com/python-trio/trio/issues/611
src_type_ref: Type[BaseException] = self.src_type src_type_ref: Type[BaseException]|None = (
self.src_type
)
if src_type_ref is None:
# unresolvable type: fall back to
# a `RuntimeError` preserving original
# traceback + type name.
return RuntimeError(
f'{self.src_type_str}: '
f'{self.tb_str}'
)
return src_type_ref(self.tb_str) return src_type_ref(self.tb_str)
# TODO: local reconstruction of nested inception for a given # TODO: local reconstruction of nested inception for a given
@ -982,6 +1046,7 @@ class TransportClosed(Exception):
''' '''
__tracebackhide__: bool = hide_tb __tracebackhide__: bool = hide_tb
message: str = message or self.message message: str = message or self.message
# when a cause is set, slap it onto the log emission. # when a cause is set, slap it onto the log emission.
if cause := self.src_exc: if cause := self.src_exc:
cause_tb_str: str = ''.join( cause_tb_str: str = ''.join(
@ -989,7 +1054,7 @@ class TransportClosed(Exception):
) )
message += ( message += (
f'{cause_tb_str}\n' # tb f'{cause_tb_str}\n' # tb
f' {cause}\n' # exc repr f'{cause!r}\n' # exc repr
) )
getattr( getattr(
@ -1208,14 +1273,31 @@ def unpack_error(
if not isinstance(msg, Error): if not isinstance(msg, Error):
return None return None
# try to lookup a suitable error type from the local runtime # try to lookup a suitable error type from the
# env then use it to construct a local instance. # local runtime env then use it to construct a
# boxed_type_str: str = error_dict['boxed_type_str'] # local instance.
boxed_type_str: str = msg.boxed_type_str boxed_type_str: str = msg.boxed_type_str
boxed_type: Type[BaseException] = get_err_type(boxed_type_str) boxed_type: Type[BaseException]|None = get_err_type(
boxed_type_str
)
# retrieve the error's msg-encoded remotoe-env info if boxed_type is None:
message: str = f'remote task raised a {msg.boxed_type_str!r}\n' log.warning(
f'Failed to resolve remote error type\n'
f'`{boxed_type_str}` - boxing as\n'
f'`RemoteActorError` with original\n'
f'traceback preserved.\n'
f'\n'
f'Call `reg_err_types()` to enable\n'
f'full type reconstruction.\n'
)
# retrieve the error's msg-encoded remote-env
# info
message: str = (
f'remote task raised a '
f'{msg.boxed_type_str!r}\n'
)
# TODO: do we even really need these checks for RAEs? # TODO: do we even really need these checks for RAEs?
if boxed_type_str in [ if boxed_type_str in [


@ -1,116 +0,0 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Helpers pulled mostly verbatim from ``multiprocessing.spawn``
to aid with "fixing up" the ``__main__`` module in subprocesses.
These helpers are needed for any spawning backend that doesn't already
handle this. For example when using ``trio_run_in_process`` it is needed
but obviously not when we're already using ``multiprocessing``.
"""
import os
import sys
import platform
import types
import runpy
ORIGINAL_DIR = os.path.abspath(os.getcwd())
def _mp_figure_out_main() -> dict[str, str]:
"""Taken from ``multiprocessing.spawn.get_preparation_data()``.
Retrieve parent actor `__main__` module data.
"""
d = {}
# Figure out whether to initialise main in the subprocess as a module
# or through direct execution (or to leave it alone entirely)
main_module = sys.modules['__main__']
main_mod_name = getattr(main_module.__spec__, "name", None)
if main_mod_name is not None:
d['init_main_from_name'] = main_mod_name
# elif sys.platform != 'win32' or (not WINEXE and not WINSERVICE):
elif platform.system() != 'Windows':
main_path = getattr(main_module, '__file__', None)
if main_path is not None:
if (
not os.path.isabs(main_path) and (
ORIGINAL_DIR is not None)
):
# process.ORIGINAL_DIR is not None):
# main_path = os.path.join(process.ORIGINAL_DIR, main_path)
main_path = os.path.join(ORIGINAL_DIR, main_path)
d['init_main_from_path'] = os.path.normpath(main_path)
return d
# Multiprocessing module helpers to fix up the main module in
# spawned subprocesses
def _fixup_main_from_name(mod_name: str) -> None:
# __main__.py files for packages, directories, zip archives, etc, run
# their "main only" code unconditionally, so we don't even try to
# populate anything in __main__, nor do we make any changes to
# __main__ attributes
current_main = sys.modules['__main__']
if mod_name == "__main__" or mod_name.endswith(".__main__"):
return
# If this process was forked, __main__ may already be populated
if getattr(current_main.__spec__, "name", None) == mod_name:
return
# Otherwise, __main__ may contain some non-main code where we need to
# support unpickling it properly. We rerun it as __mp_main__ and make
# the normal __main__ an alias to that
# old_main_modules.append(current_main)
main_module = types.ModuleType("__mp_main__")
main_content = runpy.run_module(mod_name,
run_name="__mp_main__",
alter_sys=True) # type: ignore
main_module.__dict__.update(main_content)
sys.modules['__main__'] = sys.modules['__mp_main__'] = main_module
def _fixup_main_from_path(main_path: str) -> None:
# If this process was forked, __main__ may already be populated
current_main = sys.modules['__main__']
# Unfortunately, the main ipython launch script historically had no
# "if __name__ == '__main__'" guard, so we work around that
# by treating it like a __main__.py file
# See https://github.com/ipython/ipython/issues/4698
main_name = os.path.splitext(os.path.basename(main_path))[0]
if main_name == 'ipython':
return
# Otherwise, if __file__ already has the setting we expect,
# there's nothing more to do
if getattr(current_main, '__file__', None) == main_path:
return
# If the parent process has sent a path through rather than a module
# name we assume it is an executable script that may contain
# non-main code that needs to be executed
# old_main_modules.append(current_main)
main_module = types.ModuleType("__mp_main__")
main_content = runpy.run_path(main_path,
run_name="__mp_main__") # type: ignore
main_module.__dict__.update(main_content)
sys.modules['__main__'] = sys.modules['__mp_main__'] = main_module


@ -1,151 +0,0 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Multiaddress parser and utils according to the spec(s) defined by
`libp2p` and used in dependent projects such as `ipfs`:
- https://docs.libp2p.io/concepts/fundamentals/addressing/
- https://github.com/libp2p/specs/blob/master/addressing/README.md
'''
from typing import Iterator
from bidict import bidict
# TODO: see if we can leverage libp2p ecosys projects instead of
# rolling our own (parser) impls of the above addressing specs:
# - https://github.com/libp2p/py-libp2p
# - https://docs.libp2p.io/concepts/nat/circuit-relay/#relay-addresses
# prots: bidict[int, str] = bidict({
prots: bidict[int, str] = {
'ipv4': 3,
'ipv6': 3,
'wg': 3,
'tcp': 4,
'udp': 4,
# TODO: support the next-gen shite Bo
# 'quic': 4,
# 'ssh': 7, # via rsyscall bootstrapping
}
prot_params: dict[str, tuple[str]] = {
'ipv4': ('addr',),
'ipv6': ('addr',),
'wg': ('addr', 'port', 'pubkey'),
'tcp': ('port',),
'udp': ('port',),
# 'quic': ('port',),
# 'ssh': ('port',),
}
def iter_prot_layers(
multiaddr: str,
) -> Iterator[
tuple[
int,
list[str]
]
]:
'''
Unpack a libp2p style "multiaddress" into multiple "segments"
for each "layer" of the protocol stack (in OSI terms).
'''
tokens: list[str] = multiaddr.split('/')
root, tokens = tokens[0], tokens[1:]
assert not root # there is a root '/' on LHS
itokens = iter(tokens)
prot: str | None = None
params: list[str] = []
for token in itokens:
# every prot path should start with a known
# key-str.
if token in prots:
if prot is None:
prot: str = token
else:
yield prot, params
prot = token
params = []
elif token not in prots:
params.append(token)
else:
yield prot, params
def parse_maddr(
multiaddr: str,
) -> dict[str, str | int | dict]:
'''
Parse a libp2p style "multiaddress" into its distinct protocol
segments where each segment is of the form:
`../<protocol>/<param0>/<param1>/../<paramN>`
and is loaded into a (order preserving) `layers: dict[str,
dict[str, Any]` which holds each protocol-layer-segment of the
original `str` path as a separate entry according to its approx
OSI "layer number".
Any `paramN` in the path must be distinctly defined by a str-token in the
(module global) `prot_params` table.
For eg. for wireguard which requires an address, port number and publickey
the protocol params are specified as the entry:
'wg': ('addr', 'port', 'pubkey'),
and are thus parsed from a maddr in that order:
`'/wg/1.1.1.1/51820/<pubkey>'`
'''
layers: dict[str, str | int | dict] = {}
for (
prot_key,
params,
) in iter_prot_layers(multiaddr):
layer: int = prots[prot_key] # OSI layer used for sorting
ep: dict[str, int | str] = {'layer': layer}
layers[prot_key] = ep
# TODO; validation and resolving of names:
# - each param via a validator provided as part of the
# prot_params def? (also see `"port"` case below..)
# - do a resolv step that will check addrs against
# any loaded network.resolv: dict[str, str]
rparams: list = list(reversed(params))
for key in prot_params[prot_key]:
val: str | int = rparams.pop()
# TODO: UGHH, dunno what we should do for validation
# here, put it in the params spec somehow?
if key == 'port':
val = int(val)
ep[key] = val
return layers


@ -37,19 +37,20 @@ import warnings
import trio import trio
from . import _runtime from .runtime import _runtime
from .discovery._registry import Registrar
from .devx import ( from .devx import (
debug, debug,
_frame_stack, _frame_stack,
pformat as _pformat, pformat as _pformat,
) )
from . import _spawn from .spawn import _spawn
from . import _state from .runtime import _state
from . import log from . import log
from .ipc import ( from .ipc import (
_connect_chan, _connect_chan,
) )
from ._addr import ( from .discovery._addr import (
Address, Address,
UnwrappedAddress, UnwrappedAddress,
default_lo_addrs, default_lo_addrs,
@ -88,7 +89,8 @@ async def maybe_block_bp(
bp_blocked: bool bp_blocked: bool
if ( if (
debug_mode debug_mode
and maybe_enable_greenback and
maybe_enable_greenback
and ( and (
maybe_mod := await debug.maybe_init_greenback( maybe_mod := await debug.maybe_init_greenback(
raise_not_found=False, raise_not_found=False,
@ -143,11 +145,16 @@ async def maybe_block_bp(
@acm @acm
async def open_root_actor( async def open_root_actor(
*, *,
# defaults are above tpt_bind_addrs: list[
registry_addrs: list[UnwrappedAddress]|None = None, Address # `Address.get_random()` case
|UnwrappedAddress # registrar case `= uw_reg_addrs`
]|None = None,
# defaults are above # defaults are above
arbiter_addr: tuple[UnwrappedAddress]|None = None, registry_addrs: list[
Address
|UnwrappedAddress
]|None = None,
enable_transports: list[ enable_transports: list[
# TODO, this should eventually be the pairs as # TODO, this should eventually be the pairs as
@ -266,16 +273,7 @@ async def open_root_actor(
if start_method is not None: if start_method is not None:
_spawn.try_set_start_method(start_method) _spawn.try_set_start_method(start_method)
# TODO! remove this ASAP! # XXX expect pre-unwrapped registrar addrs.
if arbiter_addr is not None:
warnings.warn(
'`arbiter_addr` is now deprecated\n'
'Use `registry_addrs: list[tuple]` instead..',
DeprecationWarning,
stacklevel=2,
)
uw_reg_addrs = [arbiter_addr]
uw_reg_addrs = registry_addrs uw_reg_addrs = registry_addrs
if not uw_reg_addrs: if not uw_reg_addrs:
uw_reg_addrs: list[UnwrappedAddress] = default_lo_addrs( uw_reg_addrs: list[UnwrappedAddress] = default_lo_addrs(
@ -288,11 +286,12 @@ async def open_root_actor(
wrap_address(uw_addr) wrap_address(uw_addr)
for uw_addr in uw_reg_addrs for uw_addr in uw_reg_addrs
] ]
loglevel: str = (
loglevel = (
loglevel loglevel
or log._default_loglevel or
).upper() log._default_loglevel
)
loglevel: str = loglevel.upper()
if ( if (
debug_mode debug_mode
@ -323,7 +322,10 @@ async def open_root_actor(
) )
assert loglevel assert loglevel
_log = log.get_console_log(loglevel) _log = log.get_console_log(
level=loglevel,
name='tractor',
)
assert _log assert _log
# TODO: factor this into `.devx._stackscope`!! # TODO: factor this into `.devx._stackscope`!!
@ -380,10 +382,17 @@ async def open_root_actor(
addr, addr,
) )
trans_bind_addrs: list[UnwrappedAddress] = [] if tpt_bind_addrs is None:
tpt_bind_addrs: list[Address] = []
else:
input_bind_addrs = list(tpt_bind_addrs)
tpt_bind_addrs: list[Address] = []
for addr in input_bind_addrs:
addr: Address = wrap_address(addr)
tpt_bind_addrs.append(addr)
# Create a new local root-actor instance which IS NOT THE # ------ NON-REGISTRAR ------
# REGISTRAR # create a new root-actor instance.
if ponged_addrs: if ponged_addrs:
if ensure_registry: if ensure_registry:
raise RuntimeError( raise RuntimeError(
@ -391,7 +400,7 @@ async def open_root_actor(
'registry socket(s) already bound' 'registry socket(s) already bound'
) )
# we were able to connect to an arbiter # we were able to connect to a registrar
logger.info( logger.info(
f'Registry(s) seem(s) to exist @ {ponged_addrs}' f'Registry(s) seem(s) to exist @ {ponged_addrs}'
) )
@ -408,14 +417,26 @@ async def open_root_actor(
# a new NON-registrar, ROOT-actor. # a new NON-registrar, ROOT-actor.
# #
# XXX INSTEAD, bind random addrs using the same tpt # XXX INSTEAD, bind random addrs using the same tpt
# proto. # proto if not already provided.
if not tpt_bind_addrs:
for addr in ponged_addrs: for addr in ponged_addrs:
trans_bind_addrs.append( tpt_bind_addrs.append(
# XXX, these are `Address` NOT `UnwrappedAddress`.
#
# NOTE, in the case of POSIX/Berkeley socket
# protos we allocate port=0 such that the system
# allocates a random value at bind time; this
# happens in the `.ipc.*` stack's backend.
addr.get_random( addr.get_random(
bindspace=addr.bindspace, bindspace=addr.bindspace,
) )
) )
header: str = '-> Contacting existing registry @ '
# ------ REGISTRAR ------
# create a new "registry providing" root-actor instance.
#
# Start this local actor as the "registrar", aka a regular # Start this local actor as the "registrar", aka a regular
# actor who manages the local registry of "mailboxes" of # actor who manages the local registry of "mailboxes" of
# other process-tree-local sub-actors. # other process-tree-local sub-actors.
@ -424,7 +445,11 @@ async def open_root_actor(
# following init steps are taken: # following init steps are taken:
# - the transport layer server is bound to each addr # - the transport layer server is bound to each addr
# pair defined in provided registry_addrs, or the default. # pair defined in provided registry_addrs, or the default.
trans_bind_addrs = uw_reg_addrs tpt_bind_addrs = list(set(
tpt_bind_addrs
+
[wrap_address(a) for a in uw_reg_addrs]
))
# - it is normally desirable for any registrar to stay up # - it is normally desirable for any registrar to stay up
# indefinitely until either all registered (child/sub) # indefinitely until either all registered (child/sub)
@ -435,8 +460,7 @@ async def open_root_actor(
# https://github.com/goodboy/tractor/pull/348 # https://github.com/goodboy/tractor/pull/348
# https://github.com/goodboy/tractor/issues/296 # https://github.com/goodboy/tractor/issues/296
# TODO: rename as `RootActor` or is that even necessary? actor = Registrar(
actor = _runtime.Arbiter(
name=name or 'registrar', name=name or 'registrar',
uuid=mk_uuid(), uuid=mk_uuid(),
registry_addrs=registry_addrs, registry_addrs=registry_addrs,
@ -444,19 +468,10 @@ async def open_root_actor(
enable_modules=enable_modules, enable_modules=enable_modules,
) )
# XXX, in case the root actor runtime was actually run from # XXX, in case the root actor runtime was actually run from
# `tractor.to_asyncio.run_as_asyncio_guest()` and NOt # `tractor.to_asyncio.run_as_asyncio_guest()` and NOT
# `.trio.run()`. # `.trio.run()`.
actor._infected_aio = _state._runtime_vars['_is_infected_aio'] actor._infected_aio = _state._runtime_vars['_is_infected_aio']
header: str = '-> Opening new registry @ '
# NOTE, only set the loopback addr for the
# process-tree-global "root" mailbox since all sub-actors
# should be able to speak to their root actor over that
# channel.
raddrs: list[Address] = _state._runtime_vars['_root_addrs']
raddrs.extend(trans_bind_addrs)
# TODO, remove once we have also removed all usage;
# eventually all (root-)registry apis should expect > 1 addr.
_state._runtime_vars['_root_mailbox'] = raddrs[0]
# Start up main task set via core actor-runtime nurseries. # Start up main task set via core actor-runtime nurseries.
try: try:
@ -468,7 +483,7 @@ async def open_root_actor(
report: str = f'Starting actor-runtime for {actor.aid.reprol()!r}\n' report: str = f'Starting actor-runtime for {actor.aid.reprol()!r}\n'
if reg_addrs := actor.registry_addrs: if reg_addrs := actor.registry_addrs:
report += ( report += (
'-> Opening new registry @ ' header
+ +
'\n'.join( '\n'.join(
f'{addr}' for addr in reg_addrs f'{addr}' for addr in reg_addrs
@ -494,14 +509,39 @@ async def open_root_actor(
# "actor runtime" primitives are SC-compat and thus all # "actor runtime" primitives are SC-compat and thus all
# transitively spawned actors/processes must be as # transitively spawned actors/processes must be as
# well. # well.
await root_tn.start( accept_addrs: list[UnwrappedAddress]
reg_addrs: list[UnwrappedAddress]
(
accept_addrs,
reg_addrs,
) = await root_tn.start(
partial( partial(
_runtime.async_main, _runtime.async_main,
actor, actor,
accept_addrs=trans_bind_addrs, accept_addrs=tpt_bind_addrs,
parent_addr=None parent_addr=None
) )
) )
# NOTE, only set a local-host addr (i.e. like
# `lo`-loopback for TCP) for the process-tree-global
# "root"-process (its tree-wide "mailbox") since all
# sub-actors should be able to speak to their root
# actor over that channel.
#
# ?TODO, per-OS non-network-proto alt options?
# -[ ] on linux we should be able to always use UDS?
#
raddrs: list[UnwrappedAddress] = _state._runtime_vars['_root_addrs']
raddrs.extend(
accept_addrs,
)
# TODO, remove once we have also removed all usage;
# eventually all (root-)registry apis should expect > 1 addr.
_state._runtime_vars['_root_mailbox'] = raddrs[0]
# if 'chart' in actor.aid.name:
# from tractor.devx import mk_pdb
# mk_pdb().set_trace()
try: try:
yield actor yield actor
except ( except (
@ -583,6 +623,13 @@ async def open_root_actor(
): ):
_state._runtime_vars['_debug_mode'] = False _state._runtime_vars['_debug_mode'] = False
# !XXX, clear ALL prior contact info state, this is MEGA
# important if you are opening the runtime multiple times
# from the same parent process (like in our test
# harness)!
_state._runtime_vars['_root_addrs'].clear()
_state._runtime_vars['_root_mailbox'] = None
_state._current_actor = None _state._current_actor = None
_state._last_actor_terminated = actor _state._last_actor_terminated = actor


@ -1,843 +0,0 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Machinery for actor process spawning using multiple backends.
"""
from __future__ import annotations
import multiprocessing as mp
import sys
import platform
from typing import (
Any,
Awaitable,
Literal,
Callable,
TypeVar,
TYPE_CHECKING,
)
import trio
from trio import TaskStatus
from .devx import (
debug,
pformat as _pformat
)
from tractor._state import (
current_actor,
is_main_process,
is_root_process,
debug_mode,
_runtime_vars,
)
from tractor.log import get_logger
from tractor._addr import UnwrappedAddress
from tractor._portal import Portal
from tractor._runtime import Actor
from tractor._entry import _mp_main
from tractor._exceptions import ActorFailure
from tractor.msg import (
types as msgtypes,
pretty_struct,
)
if TYPE_CHECKING:
from ipc import (
_server,
Channel,
)
from ._supervise import ActorNursery
ProcessType = TypeVar('ProcessType', mp.Process, trio.Process)
log = get_logger('tractor')
# placeholder for an mp start context if so using that backend
_ctx: mp.context.BaseContext | None = None
SpawnMethodKey = Literal[
'trio', # supported on all platforms
'mp_spawn',
'mp_forkserver', # posix only
]
_spawn_method: SpawnMethodKey = 'trio'
if platform.system() == 'Windows':
_ctx = mp.get_context("spawn")
async def proc_waiter(proc: mp.Process) -> None:
await trio.lowlevel.WaitForSingleObject(proc.sentinel)
else:
# *NIX systems use ``trio`` primitives as our default as well
async def proc_waiter(proc: mp.Process) -> None:
await trio.lowlevel.wait_readable(proc.sentinel)
def try_set_start_method(
key: SpawnMethodKey
) -> mp.context.BaseContext | None:
'''
Attempt to set the method for process starting, aka the "actor
spawning backend".
If the desired method is not supported this function will error.
On Windows only the ``multiprocessing`` "spawn" method is offered
besides the default ``trio`` which uses async wrapping around
``subprocess.Popen``.
'''
import multiprocessing as mp
global _ctx
global _spawn_method
mp_methods = mp.get_all_start_methods()
if 'fork' in mp_methods:
# forking is incompatible with ``trio``'s global task tree
mp_methods.remove('fork')
match key:
case 'mp_forkserver':
from . import _forkserver_override
_forkserver_override.override_stdlib()
_ctx = mp.get_context('forkserver')
case 'mp_spawn':
_ctx = mp.get_context('spawn')
case 'trio':
_ctx = None
case _:
raise ValueError(
f'Spawn method `{key}` is invalid!\n'
f'Please choose one of {SpawnMethodKey}'
)
_spawn_method = key
return _ctx
async def exhaust_portal(
portal: Portal,
actor: Actor
) -> Any:
'''
Pull final result from portal (assuming it has one).
If the main task is an async generator do our best to consume
what's left of it.
'''
__tracebackhide__ = True
try:
log.debug(
f'Waiting on final result from {actor.uid}'
)
# XXX: streams should never be reaped here since they should
# always be established and shutdown using a context manager api
final: Any = await portal.wait_for_result()
except (
Exception,
BaseExceptionGroup,
) as err:
# we reraise in the parent task via a ``BaseExceptionGroup``
return err
except trio.Cancelled as err:
# lol, of course we need this too ;P
# TODO: merge with above?
log.warning(
'Cancelled portal result waiter task:\n'
f'uid: {portal.channel.aid}\n'
f'error: {err}\n'
)
return err
else:
log.debug(
f'Returning final result from portal:\n'
f'uid: {portal.channel.aid}\n'
f'result: {final}\n'
)
return final
async def cancel_on_completion(
portal: Portal,
actor: Actor,
errors: dict[tuple[str, str], Exception],
) -> None:
'''
Cancel actor gracefully once its "main" portal's
result arrives.
Should only be called for actors spawned via the
`Portal.run_in_actor()` API.
=> and really this API will be deprecated and should be
re-implemented as a `.hilevel.one_shot_task_nursery()`..)
'''
# if this call errors we store the exception for later
# in ``errors`` which will be reraised inside
# an exception group and we still send out a cancel request
result: Any|Exception = await exhaust_portal(
portal,
actor,
)
if isinstance(result, Exception):
errors[actor.uid]: Exception = result
log.cancel(
'Cancelling subactor runtime due to error:\n\n'
f'Portal.cancel_actor() => {portal.channel.uid}\n\n'
f'error: {result}\n'
)
else:
log.runtime(
'Cancelling subactor gracefully:\n\n'
f'Portal.cancel_actor() => {portal.channel.uid}\n\n'
f'result: {result}\n'
)
# cancel the process now that we have a final result
await portal.cancel_actor()
async def hard_kill(
proc: trio.Process,
terminate_after: float = 1.6,
# NOTE: for mucking with `.pause()`-ing inside the runtime
# whilst also hacking on it XD
# terminate_after: int = 99999,
) -> None:
'''
Un-gracefully terminate an OS level `trio.Process` after timeout.
Used in 2 main cases:
- "unknown remote runtime state": a hanging/stalled actor that
isn't responding after sending a (graceful) runtime cancel
request via an IPC msg.
- "cancelled during spawn": a process whose actor runtime was
cancelled before full startup completed (such that
cancel-request-handling machinery was never fully
initialized) and thus a "cancel request msg" is never going
to be handled.
'''
log.cancel(
'Terminating sub-proc\n'
f'>x)\n'
f' |_{proc}\n'
)
# NOTE: this timeout used to do nothing since we were shielding
# the ``.wait()`` inside ``new_proc()`` which will pretty much
# never release until the process exits, now it acts as
# a hard-kill time ultimatum.
with trio.move_on_after(terminate_after) as cs:
# NOTE: code below was copied verbatim from the now deprecated
# (in 0.20.0) ``trio._subprocess.Process.aclose()``, orig doc
# string:
#
# Close any pipes we have to the process (both input and output)
# and wait for it to exit. If cancelled, kills the process and
# waits for it to finish exiting before propagating the
# cancellation.
#
# This code was originally triggered by ``proc.__aexit__()``
# but now must be called manually.
with trio.CancelScope(shield=True):
if proc.stdin is not None:
await proc.stdin.aclose()
if proc.stdout is not None:
await proc.stdout.aclose()
if proc.stderr is not None:
await proc.stderr.aclose()
try:
await proc.wait()
finally:
if proc.returncode is None:
proc.kill()
with trio.CancelScope(shield=True):
await proc.wait()
# XXX NOTE XXX: zombie squad dispatch:
# (should ideally never, but) If we do get here it means
# graceful termination of a process failed and we need to
# resort to OS level signalling to interrupt and cancel the
# (presumably stalled or hung) actor. Since we never allow
# zombies (as a feature) we ask the OS to send in the
# removal squad as the last resort.
if cs.cancelled_caught:
# TODO? attempt at intermediary-rent-sub
# with child in debug lock?
# |_https://github.com/goodboy/tractor/issues/320
#
# if not is_root_process():
# log.warning(
# 'Attempting to acquire debug-REPL-lock before zombie reap!'
# )
# with trio.CancelScope(shield=True):
# async with debug.acquire_debug_lock(
# subactor_uid=current_actor().uid,
# ) as _ctx:
# log.warning(
# 'Acquired debug lock, child ready to be killed ??\n'
# )
# TODO: toss in the skynet-logo face as ascii art?
log.critical(
# 'Well, the #ZOMBIE_LORD_IS_HERE# to collect\n'
'#T-800 deployed to collect zombie B0\n'
f'>x)\n'
f' |_{proc}\n'
)
proc.kill()
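The "wait with a deadline, then kill" shape of `hard_kill()` above can be sketched in isolation. This is a minimal illustration only: it swaps `trio.move_on_after()` for stdlib `asyncio.wait_for()` so it runs without third-party packages, it skips the std-stream closing and cancel shielding the real function does, and `wait_then_kill()` and `demo()` are hypothetical names that are not part of `tractor`.

```python
import asyncio
import sys


async def wait_then_kill(
    proc: asyncio.subprocess.Process,
    terminate_after: float = 1.6,
) -> bool:
    '''
    Wait up to `terminate_after` seconds for `proc` to exit on its
    own; if it does not, kill it at the OS level and reap it.

    Returns `True` when the hard kill was needed.

    '''
    try:
        await asyncio.wait_for(proc.wait(), timeout=terminate_after)
        return False
    except asyncio.TimeoutError:
        # deadline expired: no zombies allowed, so kill and reap.
        proc.kill()
        await proc.wait()
        return True


async def demo() -> tuple[bool, bool]:
    # a child that would outlive the deadline
    slow = await asyncio.create_subprocess_exec(
        sys.executable, '-c', 'import time; time.sleep(30)',
    )
    # a child that exits immediately
    fast = await asyncio.create_subprocess_exec(
        sys.executable, '-c', 'pass',
    )
    return (
        await wait_then_kill(slow, terminate_after=0.2),
        await wait_then_kill(fast, terminate_after=10),
    )
```

Running `asyncio.run(demo())` should report that only the sleeping child needed the hard kill.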
async def soft_kill(
proc: ProcessType,
wait_func: Callable[
[ProcessType],
Awaitable,
],
portal: Portal,
) -> None:
'''
Wait for proc termination but **don't yet** teardown
std-streams since it will clobber any ongoing pdb REPL
session.
This is our "soft"/graceful, and thus itself also cancellable,
join/reap on an actor-runtime-in-process shutdown; it is
**not** the same as a "hard kill" via an OS signal (for that
see `.hard_kill()`).
'''
chan: Channel = portal.channel
peer_aid: msgtypes.Aid = chan.aid
try:
log.cancel(
f'Soft killing sub-actor via portal request\n'
f'\n'
f'c)=> {peer_aid.reprol()}@[{chan.maddr}]\n'
f' |_{proc}\n'
)
# wait on sub-proc to signal termination
await wait_func(proc)
except trio.Cancelled:
with trio.CancelScope(shield=True):
await debug.maybe_wait_for_debugger(
child_in_debug=_runtime_vars.get(
'_debug_mode', False
),
header_msg=(
'Delaying `soft_kill()` subproc reaper while debugger locked..\n'
),
# TODO: need a diff value than default?
# poll_steps=9999999,
)
# if cancelled during a soft wait, cancel the child
# actor before entering the hard reap sequence
# below. This means we try to do a graceful teardown
# via sending a cancel message before getting out the
# zombie killing tools.
async with trio.open_nursery() as n:
n.cancel_scope.shield = True
async def cancel_on_proc_deth():
'''
"Cancel-the-cancel" request: if we detect that the
underlying sub-process exited prior to
a `Portal.cancel_actor()` call completing.
'''
await wait_func(proc)
n.cancel_scope.cancel()
# start a task to wait on the termination of the
# process by itself waiting on a (caller provided) wait
# function which should unblock when the target process
# has terminated.
n.start_soon(cancel_on_proc_deth)
# send the actor-runtime a cancel request.
await portal.cancel_actor()
if proc.poll() is None: # type: ignore
log.warning(
'Subactor still alive after cancel request?\n\n'
f'uid: {peer_aid}\n'
f'|_{proc}\n'
)
n.cancel_scope.cancel()
raise
async def new_proc(
name: str,
actor_nursery: ActorNursery,
subactor: Actor,
errors: dict[tuple[str, str], Exception],
# passed through to actor main
bind_addrs: list[UnwrappedAddress],
parent_addr: UnwrappedAddress,
_runtime_vars: dict[str, Any], # serialized and sent to _child
*,
infect_asyncio: bool = False,
task_status: TaskStatus[Portal] = trio.TASK_STATUS_IGNORED,
proc_kwargs: dict[str, Any] = {}
) -> None:
# lookup backend spawning target
target: Callable = _methods[_spawn_method]
# mark the new actor with the global spawn method
subactor._spawn_method = _spawn_method
await target(
name,
actor_nursery,
subactor,
errors,
bind_addrs,
parent_addr,
_runtime_vars, # run time vars
infect_asyncio=infect_asyncio,
task_status=task_status,
proc_kwargs=proc_kwargs
)
async def trio_proc(
name: str,
actor_nursery: ActorNursery,
subactor: Actor,
errors: dict[tuple[str, str], Exception],
# passed through to actor main
bind_addrs: list[UnwrappedAddress],
parent_addr: UnwrappedAddress,
_runtime_vars: dict[str, Any], # serialized and sent to _child
*,
infect_asyncio: bool = False,
task_status: TaskStatus[Portal] = trio.TASK_STATUS_IGNORED,
proc_kwargs: dict[str, Any] = {}
) -> None:
'''
Create a new ``Process`` using a "spawn method" (as configured using
``try_set_start_method()``).
This routine should be started in an actor runtime task and the logic
here is to be considered the core supervision strategy.
'''
spawn_cmd = [
sys.executable,
"-m",
# Hardcode this (instead of using ``_child.__name__``) to avoid a
# double import warning: https://stackoverflow.com/a/45070583
"tractor._child",
# We provide the child's unique identifier on this exec/spawn
# line for debugging purposes when viewing the process tree from
# the OS; it otherwise can be passed via the parent channel if
# we prefer in the future (for privacy).
"--uid",
# TODO, how to pass this over "wire" encodings like
# cmdline args?
# -[ ] maybe we can add an `msgtypes.Aid.min_tuple()` ?
str(subactor.uid),
# Address the child must connect to on startup
"--parent_addr",
str(parent_addr)
]
if subactor.loglevel:
spawn_cmd += [
"--loglevel",
subactor.loglevel
]
# Tell child to run in guest mode on top of ``asyncio`` loop
if infect_asyncio:
spawn_cmd.append("--asyncio")
cancelled_during_spawn: bool = False
proc: trio.Process|None = None
ipc_server: _server.Server = actor_nursery._actor.ipc_server
try:
try:
proc: trio.Process = await trio.lowlevel.open_process(spawn_cmd, **proc_kwargs)
log.runtime(
f'Started new child subproc\n'
f'(>\n'
f' |_{proc}\n'
)
# wait for actor to spawn and connect back to us
# channel should have handshake completed by the
# local actor by the time we get a ref to it
event, chan = await ipc_server.wait_for_peer(
subactor.uid
)
except trio.Cancelled:
cancelled_during_spawn = True
# we may cancel before the child connects back in which
# case avoid clobbering the pdb tty.
if debug_mode():
with trio.CancelScope(shield=True):
# don't clobber an ongoing pdb
if is_root_process():
await debug.maybe_wait_for_debugger()
elif proc is not None:
async with debug.acquire_debug_lock(subactor.uid):
# soft wait on the proc to terminate
with trio.move_on_after(0.5):
await proc.wait()
raise
# a sub-proc ref **must** exist now
assert proc
portal = Portal(chan)
actor_nursery._children[subactor.uid] = (
subactor,
proc,
portal,
)
# send a "spawning specification" which configures the
# initial runtime state of the child.
sspec = msgtypes.SpawnSpec(
_parent_main_data=subactor._parent_main_data,
enable_modules=subactor.enable_modules,
reg_addrs=subactor.reg_addrs,
bind_addrs=bind_addrs,
_runtime_vars=_runtime_vars,
)
log.runtime(
f'Sending spawn spec to child\n'
f'{{}}=> {chan.aid.reprol()!r}\n'
f'\n'
f'{pretty_struct.pformat(sspec)}\n'
)
await chan.send(sspec)
# track subactor in current nursery
curr_actor: Actor = current_actor()
curr_actor._actoruid2nursery[subactor.uid] = actor_nursery
# resume caller at next checkpoint now that child is up
task_status.started(portal)
# wait for ActorNursery.wait() to be called
with trio.CancelScope(shield=True):
await actor_nursery._join_procs.wait()
async with trio.open_nursery() as nursery:
if portal in actor_nursery._cancel_after_result_on_exit:
nursery.start_soon(
cancel_on_completion,
portal,
subactor,
errors
)
# This is a "soft" (cancellable) join/reap which
# will remote cancel the actor on a ``trio.Cancelled``
# condition.
await soft_kill(
proc,
trio.Process.wait, # XXX, uses `pidfd_open()` below.
portal
)
# cancel result waiter that may have been spawned in
# tandem if not done already
log.cancel(
'Cancelling portal result reaper task\n'
f'c)> {subactor.aid.reprol()!r}\n'
)
nursery.cancel_scope.cancel()
finally:
# XXX NOTE XXX: The "hard" reap since no actor zombies are
# allowed! Do this **after** cancellation/teardown to avoid
# killing the process too early.
if proc:
reap_repr: str = _pformat.nest_from_op(
input_op='>x)',
text=subactor.pformat(),
)
log.cancel(
f'Hard reap sequence starting for subactor\n'
f'{reap_repr}'
)
with trio.CancelScope(shield=True):
# don't clobber an ongoing pdb
if cancelled_during_spawn:
# Try again to avoid TTY clobbering.
async with debug.acquire_debug_lock(subactor.uid):
with trio.move_on_after(0.5):
await proc.wait()
await debug.maybe_wait_for_debugger(
child_in_debug=_runtime_vars.get(
'_debug_mode', False
),
header_msg=(
'Delaying subproc reaper while debugger locked..\n'
),
# TODO: need a diff value than default?
# poll_steps=9999999,
)
# TODO: solve the following issue where we need
# to do a similar wait like this but in an
# "intermediary" parent actor that itself isn't
# in debug but has a child that is, and we need
# to hold off on relaying SIGINT until that child
# is complete.
# https://github.com/goodboy/tractor/issues/320
# -[ ] we need to handle non-root parent-actors specially
# by somehow determining if a child is in debug and then
# avoiding cancel/kill of said child by this
# (intermediary) parent until such a time as the root says
# the pdb lock is released and we are good to tear down
# (our children)..
#
# -[ ] so maybe something like this where we try to
# acquire the lock and get notified of who has it,
# check that uid against our known children?
# this_uid: tuple[str, str] = current_actor().uid
# await debug.acquire_debug_lock(this_uid)
if proc.poll() is None:
log.cancel(f"Attempting to hard kill {proc}")
await hard_kill(proc)
log.debug(f"Joined {proc}")
else:
log.warning('Nursery cancelled before sub-proc started')
if not cancelled_during_spawn:
# pop child entry to indicate we are no longer managing this
# subactor
actor_nursery._children.pop(subactor.uid)
async def mp_proc(
name: str,
actor_nursery: ActorNursery, # type: ignore # noqa
subactor: Actor,
errors: dict[tuple[str, str], Exception],
# passed through to actor main
bind_addrs: list[UnwrappedAddress],
parent_addr: UnwrappedAddress,
_runtime_vars: dict[str, Any], # serialized and sent to _child
*,
infect_asyncio: bool = False,
task_status: TaskStatus[Portal] = trio.TASK_STATUS_IGNORED,
proc_kwargs: dict[str, Any] = {}
) -> None:
# uggh zone
try:
from multiprocessing import semaphore_tracker # type: ignore
resource_tracker = semaphore_tracker
resource_tracker._resource_tracker = resource_tracker._semaphore_tracker # noqa
except ImportError:
# 3.8 introduces a more general version that also tracks shared mems
from multiprocessing import resource_tracker # type: ignore
assert _ctx
start_method = _ctx.get_start_method()
if start_method == 'forkserver':
from multiprocessing import forkserver # type: ignore
# XXX do our hackery on the stdlib to avoid multiple
# forkservers (one at each subproc layer).
fs = forkserver._forkserver
curr_actor = current_actor()
if is_main_process() and not curr_actor._forkserver_info:
# if we're the "main" process start the forkserver
# only once and pass its ipc info to downstream
# children
# forkserver.set_forkserver_preload(enable_modules)
forkserver.ensure_running()
fs_info = (
fs._forkserver_address, # type: ignore # noqa
fs._forkserver_alive_fd, # type: ignore # noqa
getattr(fs, '_forkserver_pid', None),
getattr(
resource_tracker._resource_tracker, '_pid', None),
resource_tracker._resource_tracker._fd,
)
else: # request to forkserver to fork a new child
assert curr_actor._forkserver_info
fs_info = (
fs._forkserver_address, # type: ignore # noqa
fs._forkserver_alive_fd, # type: ignore # noqa
fs._forkserver_pid, # type: ignore # noqa
resource_tracker._resource_tracker._pid,
resource_tracker._resource_tracker._fd,
) = curr_actor._forkserver_info
else:
# spawn method
fs_info = (None, None, None, None, None)
proc: mp.Process = _ctx.Process( # type: ignore
target=_mp_main,
args=(
subactor,
bind_addrs,
fs_info,
_spawn_method,
parent_addr,
infect_asyncio,
),
# daemon=True,
name=name,
)
# `multiprocessing` only (since no async interface):
# register the process before start in case we get a cancel
# request before the actor has fully spawned - then we can wait
# for it to fully come up before sending a cancel request
actor_nursery._children[subactor.uid] = (subactor, proc, None)
proc.start()
if not proc.is_alive():
raise ActorFailure("Couldn't start sub-actor?")
log.runtime(f"Started {proc}")
ipc_server: _server.Server = actor_nursery._actor.ipc_server
try:
# wait for actor to spawn and connect back to us
# channel should have handshake completed by the
# local actor by the time we get a ref to it
event, chan = await ipc_server.wait_for_peer(
subactor.uid,
)
# XXX: monkey patch poll API to match the ``subprocess`` API..
# not sure why they don't expose this but kk.
proc.poll = lambda: proc.exitcode # type: ignore
# except:
# TODO: in the case we were cancelled before the sub-proc
# registered itself back we must be sure to try and clean
# any process we may have started.
portal = Portal(chan)
actor_nursery._children[subactor.uid] = (subactor, proc, portal)
# unblock parent task
task_status.started(portal)
# wait for ``ActorNursery`` block to signal that
# subprocesses can be waited upon.
# This is required to ensure synchronization
# with user code that may want to manually await results
# from nursery spawned sub-actors. We don't want the
# containing nurseries here to collect results or error
# while user code is still doing its thing. Only after the
# nursery block closes do we allow subactor results to be
# awaited and reported upwards to the supervisor.
with trio.CancelScope(shield=True):
await actor_nursery._join_procs.wait()
async with trio.open_nursery() as nursery:
if portal in actor_nursery._cancel_after_result_on_exit:
nursery.start_soon(
cancel_on_completion,
portal,
subactor,
errors
)
# This is a "soft" (cancellable) join/reap which
# will remote cancel the actor on a ``trio.Cancelled``
# condition.
await soft_kill(
proc,
proc_waiter,
portal
)
# cancel result waiter that may have been spawned in
# tandem if not done already
log.warning(
"Cancelling existing result waiter task for "
f"{subactor.uid}")
nursery.cancel_scope.cancel()
finally:
# hard reap sequence
if proc.is_alive():
log.cancel(f"Attempting to hard kill {proc}")
with trio.move_on_after(0.1) as cs:
cs.shield = True
await proc_waiter(proc)
if cs.cancelled_caught:
proc.terminate()
proc.join()
log.debug(f"Joined {proc}")
# pop child entry to indicate we are no longer managing subactor
actor_nursery._children.pop(subactor.uid)
# TODO: prolly report to ``mypy`` how this causes all sorts of
# false errors..
# subactor, proc, portal = actor_nursery._children.pop(subactor.uid)
# proc spawning backend target map
_methods: dict[SpawnMethodKey, Callable] = {
'trio': trio_proc,
'mp_spawn': mp_proc,
'mp_forkserver': mp_proc,
}


@ -38,6 +38,7 @@ import trio
from ._exceptions import ( from ._exceptions import (
ContextCancelled, ContextCancelled,
RemoteActorError, RemoteActorError,
TransportClosed,
) )
from .log import get_logger from .log import get_logger
from .trionics import ( from .trionics import (
@ -54,12 +55,12 @@ from tractor.msg import (
) )
if TYPE_CHECKING: if TYPE_CHECKING:
from ._runtime import Actor from .runtime._runtime import Actor
from ._context import Context from ._context import Context
from .ipc import Channel from .ipc import Channel
log = get_logger(__name__) log = get_logger()
# TODO: the list # TODO: the list
@ -409,10 +410,8 @@ class MsgStream(trio.abc.Channel):
# it). # it).
with trio.CancelScope(shield=True): with trio.CancelScope(shield=True):
await self._ctx.send_stop() await self._ctx.send_stop()
except ( except (
trio.BrokenResourceError, TransportClosed,
trio.ClosedResourceError
) as re: ) as re:
# the underlying channel may already have been pulled # the underlying channel may already have been pulled
# in which case our stop message is meaningless since # in which case our stop message is meaningless since
@ -593,9 +592,8 @@ class MsgStream(trio.abc.Channel):
), ),
) )
except ( except (
trio.ClosedResourceError,
trio.BrokenResourceError,
BrokenPipeError, BrokenPipeError,
TransportClosed,
) as _trans_err: ) as _trans_err:
trans_err = _trans_err trans_err = _trans_err
if ( if (


@ -26,9 +26,7 @@ import random
from typing import ( from typing import (
Type, Type,
) )
from tractor import ( from tractor.discovery import _addr
_addr,
)
def get_rando_addr( def get_rando_addr(
@ -61,7 +59,11 @@ def get_rando_addr(
# NOTE, file-name uniqueness (no-collisions) will be based on # NOTE, file-name uniqueness (no-collisions) will be based on
# the runtime-directory and root (pytest-proc's) pid. # the runtime-directory and root (pytest-proc's) pid.
case 'uds': case 'uds':
testrun_reg_addr = addr_type.get_random().unwrap() from tractor.ipc._uds import UDSAddress
addr: UDSAddress = addr_type.get_random()
assert addr.is_valid
assert addr.sockpath.resolve()
testrun_reg_addr = addr.unwrap()
# XXX, as sanity it should never be the same as the default for the # XXX, as sanity it should never be the same as the default for the
# host-singleton registry actor. # host-singleton registry actor.


@ -25,13 +25,23 @@ from functools import (
) )
import inspect import inspect
import platform import platform
from typing import (
Callable,
get_args,
)
import pytest import pytest
import tractor import tractor
import trio import trio
def tractor_test(fn): def tractor_test(
wrapped: Callable|None = None,
*,
# @tractor_test(<deco-params>)
timeout: float = 30,
hide_tb: bool = True,
):
''' '''
Decorator for async test fns to decorator-wrap them as "native" Decorator for async test fns to decorator-wrap them as "native"
looking sync funcs runnable by `pytest` and auto invoked with looking sync funcs runnable by `pytest` and auto invoked with
@ -45,8 +55,18 @@ def tractor_test(fn):
Basic deco use: Basic deco use:
--------------- ---------------
@tractor_test @tractor_test(
async def test_whatever(): timeout=10,
)
async def test_whatever(
# fixture param declarations
loglevel: str,
start_method: str,
reg_addr: tuple,
tpt_proto: str,
debug_mode: bool,
):
# already inside a root-actor runtime `trio.Task`
await ... await ...
@ -55,7 +75,7 @@ def tractor_test(fn):
If any of the following fixtures are requested by the wrapped test If any of the following fixtures are requested by the wrapped test
fn (via normal func-args declaration), fn (via normal func-args declaration),
- `reg_addr` (a socket addr tuple where arbiter is listening) - `reg_addr` (a socket addr tuple where registrar is listening)
- `loglevel` (logging level passed to tractor internals) - `loglevel` (logging level passed to tractor internals)
- `start_method` (subprocess spawning backend) - `start_method` (subprocess spawning backend)
@ -67,65 +87,96 @@ def tractor_test(fn):
`tractor.open_root_actor()` funcargs. `tractor.open_root_actor()` funcargs.
''' '''
@wraps(fn) __tracebackhide__: bool = hide_tb
def wrapper(
*args,
loglevel=None,
reg_addr=None,
start_method: str|None = None,
debug_mode: bool = False,
**kwargs
):
# __tracebackhide__ = True
# NOTE: inject ant test func declared fixture # handle @tractor_test (no parens) vs @tractor_test(timeout=10)
# names by manually checking! if wrapped is None:
if 'reg_addr' in inspect.signature(fn).parameters: return partial(
# injects test suite fixture value to test as well tractor_test,
# as `run()` timeout=timeout,
kwargs['reg_addr'] = reg_addr hide_tb=hide_tb,
)
if 'loglevel' in inspect.signature(fn).parameters: funcname: str = wrapped.__name__
# allows test suites to define a 'loglevel' fixture if not inspect.iscoroutinefunction(wrapped):
# that activates the internal logging raise TypeError(
kwargs['loglevel'] = loglevel f'Test-fn {funcname!r} must be an async-function !!'
)
# NOTE: we intentionally use `functools.wraps` instead of
# `@wrapt.decorator` here bc wrapt's transparent proxy makes
# `inspect.iscoroutinefunction(wrapper)` return `True` (it
# proxies `__code__` from the wrapped async fn), which causes
# pytest to skip the test as an "unhandled coroutine".
# `functools.wraps` preserves the signature for fixture
# injection (via `__wrapped__`) without leaking the async
# nature.
@wraps(wrapped)
def wrapper(**kwargs):
__tracebackhide__: bool = hide_tb
# NOTE, ensure we inject any test-fn declared fixture
# names.
for kw in [
'reg_addr',
'loglevel',
'start_method',
'debug_mode',
'tpt_proto',
'timeout',
]:
if kw in inspect.signature(wrapped).parameters:
assert kw in kwargs
# Extract runtime settings as locals for
# `open_root_actor()`; these must NOT leak into
# `kwargs` when the test fn doesn't declare them
# (the original pre-wrapt code had the same guard).
reg_addr = kwargs.get('reg_addr')
loglevel = kwargs.get('loglevel')
debug_mode = kwargs.get('debug_mode', False)
start_method = kwargs.get('start_method')
if platform.system() == 'Windows':
if start_method is None: if start_method is None:
if platform.system() == "Windows":
start_method = 'trio' start_method = 'trio'
elif start_method != 'trio':
raise ValueError(
'ONLY the `start_method="trio"` is supported on Windows.'
)
if 'start_method' in inspect.signature(fn).parameters: # Open a root-actor, passing runtime-settings
# set of subprocess spawning backends # extracted above as closure locals, then invoke
kwargs['start_method'] = start_method # the test-fn body as the root-most task.
#
# NOTE: `kwargs` is forwarded as-is to
# `wrapped()` — it only contains what pytest
# injected based on the test fn's signature.
async def _main(**kwargs):
__tracebackhide__: bool = hide_tb
if 'debug_mode' in inspect.signature(fn).parameters: with trio.fail_after(timeout):
# set of subprocess spawning backends
kwargs['debug_mode'] = debug_mode
if kwargs:
# use explicit root actor start
async def _main():
async with tractor.open_root_actor( async with tractor.open_root_actor(
# **kwargs, registry_addrs=(
registry_addrs=[reg_addr] if reg_addr else None, [reg_addr] if reg_addr else None
),
loglevel=loglevel, loglevel=loglevel,
start_method=start_method, start_method=start_method,
# TODO: only enable when pytest is passed --pdb # TODO: only enable when pytest is passed
# --pdb
debug_mode=debug_mode, debug_mode=debug_mode,
): ):
await fn(*args, **kwargs) # invoke test-fn body IN THIS task
await wrapped(**kwargs)
main = _main # invoke runtime via a root task.
return trio.run(
else: partial(
# use implicit root actor start _main,
main = partial(fn, *args, **kwargs) **kwargs,
)
return trio.run(main) )
return wrapper return wrapper
@ -175,7 +226,15 @@ def pytest_addoption(
def pytest_configure(config): def pytest_configure(config):
backend = config.option.spawn_backend backend = config.option.spawn_backend
tractor._spawn.try_set_start_method(backend) from tractor.spawn._spawn import try_set_start_method
try_set_start_method(backend)
# register custom marks to avoid warnings see,
# https://docs.pytest.org/en/stable/how-to/writing_plugins.html#registering-custom-markers
config.addinivalue_line(
'markers',
'no_tpt(proto_key): test will (likely) not behave with tpt backend'
)
@pytest.fixture(scope='session') @pytest.fixture(scope='session')
@ -214,7 +273,8 @@ def tpt_protos(request) -> list[str]:
# XXX ensure we support the protocol by name via lookup! # XXX ensure we support the protocol by name via lookup!
for proto_key in proto_keys: for proto_key in proto_keys:
addr_type = tractor._addr._address_types[proto_key] from tractor.discovery import _addr
addr_type = _addr._address_types[proto_key]
assert addr_type.proto_key == proto_key assert addr_type.proto_key == proto_key
yield proto_keys yield proto_keys
@ -225,13 +285,32 @@ def tpt_protos(request) -> list[str]:
autouse=True, autouse=True,
) )
def tpt_proto( def tpt_proto(
request,
tpt_protos: list[str], tpt_protos: list[str],
) -> str: ) -> str:
proto_key: str = tpt_protos[0] proto_key: str = tpt_protos[0]
from tractor import _state # ?TODO, but needs a fn-scoped tpt_proto fixture..
# @pytest.mark.no_tpt('uds')
# node = request.node
# markers = node.own_markers
# for mark in markers:
# if (
# mark.name == 'no_tpt'
# and
# proto_key in mark.args
# ):
# pytest.skip(
# f'Test {node} normally fails with '
# f'tpt-proto={proto_key!r}\n'
# )
from tractor.runtime import _state
if _state._def_tpt_proto != proto_key: if _state._def_tpt_proto != proto_key:
_state._def_tpt_proto = proto_key _state._def_tpt_proto = proto_key
_state._runtime_vars['_enable_tpts'] = [
proto_key,
]
yield proto_key yield proto_key
@ -263,12 +342,11 @@ def pytest_generate_tests(
# XXX some weird windows bug with `pytest`? # XXX some weird windows bug with `pytest`?
spawn_backend = 'trio' spawn_backend = 'trio'
# TODO: maybe just use the literal `._spawn.SpawnMethodKey`? # drive the valid-backend set from the canonical `Literal` so
assert spawn_backend in ( # adding a new spawn backend (e.g. `'subint'`) doesn't require
'mp_spawn', # touching the harness.
'mp_forkserver', from tractor.spawn._spawn import SpawnMethodKey
'trio', assert spawn_backend in get_args(SpawnMethodKey)
)
# NOTE: used-to-be-used-to dyanmically parametrize tests for when # NOTE: used-to-be-used-to dyanmically parametrize tests for when
# you just passed --spawn-backend=`mp` on the cli, but now we expect # you just passed --spawn-backend=`mp` on the cli, but now we expect
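
The replacement assert derives the valid backend names from a `Literal` type via `typing.get_args()` instead of a hand-maintained tuple. A small sketch of that idea, assuming a stand-in `SpawnMethodKey` with the three names from the removed tuple (the real alias lives in `tractor.spawn._spawn` and may list more); `check_backend` is a hypothetical helper:

```python
from typing import Literal, get_args

# stand-in for the canonical alias; the member list is an assumption
# taken from the old hard-coded tuple in this hunk.
SpawnMethodKey = Literal[
    'trio',
    'mp_spawn',
    'mp_forkserver',
]


def check_backend(name: str) -> str:
    '''
    Hypothetical helper: validate `name` against the `Literal`'s
    members so new backends only need adding in one place.

    '''
    valid: tuple[str, ...] = get_args(SpawnMethodKey)
    if name not in valid:
        raise ValueError(
            f'Unknown spawn backend {name!r}, expected one of {valid}'
        )
    return name
```

Adding another member to the `Literal` then widens the accepted set without touching the check itself.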

@ -49,7 +49,7 @@ from tractor.msg import (
import wrapt import wrapt
log = get_logger(__name__) log = get_logger()
# TODO: yeah, i don't love this and we should prolly just # TODO: yeah, i don't love this and we should prolly just
# write a decorator that actually keeps a stupid ref to the func # write a decorator that actually keeps a stupid ref to the func

@ -45,17 +45,15 @@ from typing import (
) )
import trio import trio
from tractor import ( from tractor.runtime import _state
_state, from tractor import log as logmod
log as logmod,
)
from tractor.devx import debug from tractor.devx import debug
log = logmod.get_logger(__name__) log = logmod.get_logger()
if TYPE_CHECKING: if TYPE_CHECKING:
from tractor._spawn import ProcessType from tractor.spawn._spawn import ProcessType
from tractor import ( from tractor import (
Actor, Actor,
ActorNursery, ActorNursery,

@ -59,7 +59,7 @@ from ._sigint import (
_ctlc_ignore_header as _ctlc_ignore_header _ctlc_ignore_header as _ctlc_ignore_header
) )
log = get_logger(__name__) log = get_logger()
# ---------------- # ----------------
# XXX PKG TODO XXX # XXX PKG TODO XXX

@ -53,8 +53,8 @@ import trio
from tractor._exceptions import ( from tractor._exceptions import (
NoRuntime, NoRuntime,
) )
from tractor import _state from tractor.runtime import _state
from tractor._state import ( from tractor.runtime._state import (
current_actor, current_actor,
debug_mode, debug_mode,
) )
@ -76,7 +76,7 @@ from ._repl import (
if TYPE_CHECKING: if TYPE_CHECKING:
from trio.lowlevel import Task from trio.lowlevel import Task
from tractor._runtime import ( from tractor.runtime._runtime import (
Actor, Actor,
) )
@ -84,7 +84,7 @@ _crash_msg: str = (
'Opening a pdb REPL in crashed actor' 'Opening a pdb REPL in crashed actor'
) )
log = get_logger(__package__) log = get_logger()
class BoxedMaybeException(Struct): class BoxedMaybeException(Struct):

@ -25,7 +25,7 @@ from functools import (
import os import os
import pdbp import pdbp
from tractor._state import ( from tractor.runtime._state import (
is_root_process, is_root_process,
) )

@ -21,12 +21,13 @@ cancellation during REPL interaction.
''' '''
from __future__ import annotations from __future__ import annotations
import platform
from typing import ( from typing import (
TYPE_CHECKING, TYPE_CHECKING,
) )
import trio import trio
from tractor.log import get_logger from tractor.log import get_logger
from tractor._state import ( from tractor.runtime._state import (
current_actor, current_actor,
is_root_process, is_root_process,
) )
@ -43,12 +44,13 @@ if TYPE_CHECKING:
from tractor.ipc import ( from tractor.ipc import (
Channel, Channel,
) )
from tractor._runtime import ( from tractor.runtime._runtime import (
Actor, Actor,
) )
log = get_logger(__name__) log = get_logger()
_is_macos: bool = platform.system() == 'Darwin'
_ctlc_ignore_header: str = ( _ctlc_ignore_header: str = (
'Ignoring SIGINT while debug REPL in use' 'Ignoring SIGINT while debug REPL in use'
) )
@ -300,6 +302,11 @@ def sigint_shield(
# XXX: yah, mega hack, but how else do we catch this madness XD # XXX: yah, mega hack, but how else do we catch this madness XD
if ( if (
repl.shname == 'xonsh' repl.shname == 'xonsh'
or (
repl.shname == 'bash'
and
_is_macos
)
): ):
flush_status += ( flush_status += (
'-> ALSO re-flushing due to `xonsh`..\n' '-> ALSO re-flushing due to `xonsh`..\n'

@ -40,7 +40,7 @@ from trio.lowlevel import (
Task, Task,
) )
from tractor._context import Context from tractor._context import Context
from tractor._state import ( from tractor.runtime._state import (
current_actor, current_actor,
debug_mode, debug_mode,
is_root_process, is_root_process,
@ -58,7 +58,7 @@ from ._sigint import (
_ctlc_ignore_header as _ctlc_ignore_header _ctlc_ignore_header as _ctlc_ignore_header
) )
log = get_logger(__package__) log = get_logger()
async def maybe_wait_for_debugger( async def maybe_wait_for_debugger(

@ -55,12 +55,12 @@ import tractor
from tractor.log import get_logger from tractor.log import get_logger
from tractor.to_asyncio import run_trio_task_in_future from tractor.to_asyncio import run_trio_task_in_future
from tractor._context import Context from tractor._context import Context
from tractor import _state from tractor.runtime import _state
from tractor._exceptions import ( from tractor._exceptions import (
NoRuntime, NoRuntime,
InternalError, InternalError,
) )
from tractor._state import ( from tractor.runtime._state import (
current_actor, current_actor,
current_ipc_ctx, current_ipc_ctx,
is_root_process, is_root_process,
@ -87,13 +87,13 @@ from ..pformat import (
if TYPE_CHECKING: if TYPE_CHECKING:
from trio.lowlevel import Task from trio.lowlevel import Task
from threading import Thread from threading import Thread
from tractor._runtime import ( from tractor.runtime._runtime import (
Actor, Actor,
) )
# from ._post_mortem import BoxedMaybeException # from ._post_mortem import BoxedMaybeException
from ._repl import PdbREPL from ._repl import PdbREPL
log = get_logger(__package__) log = get_logger()
_pause_msg: str = 'Opening a pdb REPL in paused actor' _pause_msg: str = 'Opening a pdb REPL in paused actor'
_repl_fail_msg: str|None = ( _repl_fail_msg: str|None = (
@ -628,7 +628,7 @@ def _set_trace(
log.pdb( log.pdb(
f'{_pause_msg}\n' f'{_pause_msg}\n'
f'>(\n' f'>(\n'
f'|_{actor.uid}\n' f'|_{actor.aid.uid}\n'
f' |_{task}\n' # @ {actor.uid}\n' f' |_{task}\n' # @ {actor.uid}\n'
# f'|_{task}\n' # f'|_{task}\n'
# ^-TODO-^ more compact pformating? # ^-TODO-^ more compact pformating?
@ -1257,3 +1257,26 @@ async def breakpoint(
api_frame=inspect.currentframe(), api_frame=inspect.currentframe(),
**kwargs, **kwargs,
) )
async def maybe_pause_bp():
'''
Internal (ONLY for now) `breakpoint()`-er fn which only tries to
use the multi-actor `.pause()` API when the current actor is the
root.
?! BUT WHY !?
-------
This is useful when debugging cases where the tpt layer breaks
(or is intentionally broken, say during resiliency testing) in
the case where a child can no longer contact the root process to
acquire the process-tree-singleton TTY lock.
'''
import tractor
actor = tractor.current_actor()
if actor.aid.name == 'root':
await tractor.pause(shield=True)
else:
tractor.devx.mk_pdb().set_trace()

@ -55,12 +55,12 @@ import tractor
from tractor.to_asyncio import run_trio_task_in_future from tractor.to_asyncio import run_trio_task_in_future
from tractor.log import get_logger from tractor.log import get_logger
from tractor._context import Context from tractor._context import Context
from tractor import _state from tractor.runtime import _state
from tractor._exceptions import ( from tractor._exceptions import (
DebugRequestError, DebugRequestError,
InternalError, InternalError,
) )
from tractor._state import ( from tractor.runtime._state import (
current_actor, current_actor,
is_root_process, is_root_process,
) )
@ -71,7 +71,7 @@ if TYPE_CHECKING:
from tractor.ipc import ( from tractor.ipc import (
IPCServer, IPCServer,
) )
from tractor._runtime import ( from tractor.runtime._runtime import (
Actor, Actor,
) )
from ._repl import ( from ._repl import (
@ -81,7 +81,7 @@ if TYPE_CHECKING:
BoxedMaybeException, BoxedMaybeException,
) )
log = get_logger(__name__) log = get_logger()
class LockStatus( class LockStatus(
@ -1013,7 +1013,7 @@ async def request_root_stdio_lock(
DebugStatus.req_task = current_task() DebugStatus.req_task = current_task()
req_err: BaseException|None = None req_err: BaseException|None = None
try: try:
from tractor._discovery import get_root from tractor.discovery._api import get_root
# NOTE: we need this to ensure that this task exits # NOTE: we need this to ensure that this task exits
# BEFORE the REPl instance raises an error like # BEFORE the REPl instance raises an error like
# `bdb.BdbQuit` directly, OW you get a trio cs stack # `bdb.BdbQuit` directly, OW you get a trio cs stack

Some files were not shown because too many files have changed in this diff