That is, in `.msg._codec.mk_dec()` to ensure we actually still respect
the provided `spec: Union[Type[Struct]]|Type|None` alongside any
"custom" extension-types expected to be `dec_hook()` pre-processed.
Notes,
- previously when `dec_hook()` was provided we were merging with
a `msgspec.Raw` instead of `spec` which **is entirely wrong**; it was
likely leftover code from the sloppy/naive first draft of extension
types support.
- notice the `spec: Union[Type[Struct]]|Type|None` type annotation (and
it appears as though a `test_ext_types_msgspec` suite actually passes
the value `spec=None` fyi) with a value of `None` to imply merging as
`Union[ext_types]|None` (or equivalently an `Optional[Union]`), due
to the incorrect `Raw`-default usage this was actually being ignored..
-> this case has now been clarified via comment in the fn-signature.
Such that decoded output equivalent to `str|None` can actually be
unpacked from a `type_names = ['str', 'NoneType']` without just
ignoring the null-type entry.. Previously, the loop would fall through
silently ignoring the `None` -> `NoneType` string representation mapped
by `.enc_type_union()` and the output union would be incorrect.
Deats,
- include the stdlib's `types` in the lookup loop, obvi changing the
output var's name to `_types` to not collide.
- add output checking versus input `type_names` such that we raise
a value-error with a case specific `report: str` when either,
* the output `_types: list[Type]` is empty,
* the `len(_types) != len(type_names)`.
- Use `Type[BaseException]` (not bare `BaseException`)
for all err-type references: `get_err_type()` return,
`._src_type`, `boxed_type` in `unpack_error()`.
- Add `|None` where types can be unresolvable
(`get_err_type()`, `.boxed_type` property).
- Add `._src_type_resolved` flag to prevent repeated
lookups and guard against `._ipc_msg is None`.
- Fix `recevier` and `exeptions` typos.
Review: PR #426 (Copilot)
https://github.com/goodboy/tractor/pull/426
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Add a teensie unit test to match.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Make `RemoteActorError` resilient to unresolved
custom error types so that errors from remote actors
always relay back to the caller - even when the user
hasn't called `reg_err_types()` to register the exc type.
Deats,
- `.src_type`: log warning + return `None` instead
of raising `TypeError` which was crashing the
entire `_deliver_msg()` -> `pformat()` chain
before the error could be relayed.
- `.boxed_type_str`: fallback to `_ipc_msg.boxed_type_str`
when the type obj can't be resolved so the type *name* is always
available.
- `unwrap_src_err()`: fallback to `RuntimeError` preserving
original type name + traceback.
- `unpack_error()`: log warning when `get_err_type()` returns
`None` telling the user to call `reg_err_types()`.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Move the `Arbiter` class out of `runtime._runtime` into its
logical home at `discovery._registry` as `Registrar(Actor)`.
This completes the long-standing terminology migration from
"arbiter" to "registrar/registry" throughout the codebase.
Deats,
- add new `discovery/_registry.py` mod with `Registrar`
class + backward-compat `Arbiter = Registrar` alias.
- rename `Actor.is_arbiter` attr -> `.is_registrar`;
old attr now a `@property` with `DeprecationWarning`.
- `_root.py` imports `Registrar` directly for
root-actor instantiation.
- export `Registrar` + `Arbiter` from `tractor.__init__`.
- `_runtime.py` re-imports from `discovery._registry`
for backward compat.
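The backward-compat pieces above could look roughly like the following. This is a stripped-down sketch with stub classes (no runtime machinery), and the warning text is an assumption:

```python
import warnings


class Actor:
    is_registrar: bool = False

    @property
    def is_arbiter(self) -> bool:
        # old attr name kept as a deprecated read-only alias
        warnings.warn(
            '`Actor.is_arbiter` is deprecated, use `.is_registrar`',
            DeprecationWarning,
            stacklevel=2,
        )
        return self.is_registrar


class Registrar(Actor):
    is_registrar: bool = True


# backward-compat alias for the old class name
Arbiter = Registrar
```

Old call sites like `actor.is_arbiter` keep working but emit a `DeprecationWarning` pointing at the caller.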
Also,
- update all test files to use `.is_registrar`
(`test_local`, `test_rpc`, `test_spawning`,
`test_discovery`, `test_multi_program`).
- update "arbiter" -> "registrar" in comments/docstrings
across `_discovery.py`, `_server.py`, `_transport.py`,
`_testing/pytest.py`, and examples.
- drop resolved TODOs from `_runtime.py` and `_root.py`.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Restructure the flat `tractor/` top-level private mods
into (more nested) subpackages:
- `runtime/`: `_runtime`, `_portal`, `_rpc`, `_state`,
`_supervise`
- `spawn/`: `_spawn`, `_entry`, `_forkserver_override`,
`_mp_fixup_main`
- `discovery/`: `_addr`, `_discovery`, `_multiaddr`
Each subpkg `__init__.py` is kept lazy (no eager
imports) to avoid circular import issues.
Also,
- update all intra-pkg imports across ~35 mods to use
the new subpkg paths (e.g. `from .runtime._state`
instead of `from ._state`)
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Refactor the test-fn deco to use `wrapt.decorator`
instead of `functools.wraps` for better fn-sig
preservation and optional-args support via
`PartialCallableObjectProxy`.
Deats,
- add `timeout` and `hide_tb` deco params
- wrap test-fn body with `trio.fail_after(timeout)`
- consolidate per-fixture `if` checks into a loop
- add `iscoroutinefunction()` type-check on wrapped fn
- set `__tracebackhide__` at each wrapper level
Also,
- update imports for new subpkg paths:
`tractor.spawn._spawn`, `tractor.discovery._addr`,
`tractor.runtime._state`
(see upcoming, likely large patch commit ;)
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Export the new `RuntimeVars` struct and `get_runtime_vars()`
from `tractor.__init__` and improve the accessor to
optionally return the struct form.
Deats,
- add `RuntimeVars` and `get_runtime_vars` to
`__init__.py` exports; alphabetize `_state` imports.
- move `get_runtime_vars()` up in `_state.py` to sit
right below `_runtime_vars` dict definition.
- add `as_dict: bool = True` param so callers can get
either the legacy `dict` or the new `RuntimeVars`
struct.
- drop the old stub fn at bottom of `_state.py`.
- rm stale `from .msg.pretty_struct import Struct` comment.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
So we can start the transition from runtime-vars `dict` to a typed struct
for better clarity and wire-ready monitoring potential, as well as
better traceability.
Deats,
- add a new `RuntimeVars(Struct)` with all fields from `_runtime_vars`
dict typed out
- include `__setattr__()` with `breakpoint()` for debugging
any unexpected mutations.
- add `.update()` method for batch-updating compat with `dict`.
- keep old `_runtime_vars: dict` in place (we need to port a ton of
stuff to adjust..).
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Allow external app code to register custom exception types
on `._exceptions` so they can be re-raised on the receiver
side of an IPC dialog via `get_err_type()`.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Expose a copy of the current actor's `_runtime_vars` dict
via a public fn; TODO to convert to `RuntimeVars` struct.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
- Add `LocalPortal` union to `query_actor()` return
type and `reg_portal` var annotation since the
registrar yields a `LocalPortal` instance.
- Update docstring to note the `LocalPortal` case.
- Widen `.delete_addr()` `addr` param to accept
`list[str|int]` bc msgpack deserializes tuples as
lists over IPC.
- Tighten `uid` annotation to `tuple[str, str]|None`.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
`msgpack` deserializes tuples as lists over IPC so
the `bidict.inverse.pop()` needs a `tuple`-cast to
match registry keys.
Regressed-by: 85457cb (`registry_addrs` change)
Found-via: `/run-tests` test_stale_entry_is_deleted
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
- Use `bidict.forceput()` in `register_actor()` to handle
duplicate addr values from stale entries or actor restarts.
- Fix `uid` annotation to `tuple[str, str]|None` in
`maybe_open_portal()` and handle the `None` return from
`delete_addr()` in log output.
- Pass explicit `registry_addrs=[reg_addr]` to `open_nursery()`
and `find_actor()` in `test_stale_entry_is_deleted` to ensure
the test uses the remote registrar.
- Update `query_actor()` docstring to document the new
`(addr, reg_portal)` yield shape.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix potential `AttributeError` when `query_actor()` yields
a `None` portal (peer-found-locally path) and an `OSError`
is raised during transport connect.
Also,
- fix `Arbiter.delete_addr()` return type to
`tuple[str, str]|None` bc it can return `None`.
- fix "registar" typo -> "registrar" in comment.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
In cases where an actor's transport server task (by default handling new
TCP connections) terminates early but does not de-register from the
pertaining registry (aka the registrar) actor's address table, the
trying-to-connect client actor will get a connection error on that
address. In the case where the client handles a (local) `OSError` (meaning
the target actor address is likely being contacted over `localhost`)
exception, make a further call to the registrar to delete the stale
entry and `yield None` gracefully indicating to calling code that no
`Portal` can be delivered to the target address.
This issue was originally discovered in `piker` where the `emsd`
(clearing engine) actor would sometimes crash on rapid client
re-connects and then leave a `pikerd` stale entry. With this fix new
clients will attempt connect via an endpoint which will re-spawn the
`emsd` when a `None` portal is delivered (via `maybe_spawn_em()`).
Since stale addrs can be leaked where the actor transport server task
crashes but doesn't (successfully) unregister from the registrar, we
need a remote way to remove such entries; hence this new (registrar)
method.
To implement this make use of the `bidict` lib for the `._registry`
table thus making it super simple to do reverse uuid lookups from an
input socket-address.
- add `is_valid` and `sockpath.resolve()` asserts in
`get_rando_addr()` for the `'uds'` case plus an
explicit `UDSAddress` type annotation.
- rename no-runtime sockname prefixes from
`'<unknown-actor>'`/`'root'` to
`'no_runtime_root'`/`'no_runtime_actor'` with a proper
if/else branch in `UDSAddress.get_random()`.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Add UDS skip-guard to `test_streaming_to_actor_cluster()`
and plumb `tpt_proto` through the `@tractor_test` wrapper
so transport-parametrized tests can receive it.
Deats,
- skip cluster test when `tpt_proto == 'uds'` with
descriptive msg, add TODO about `@pytest.mark.no_tpt`.
- add `tpt_proto: str|None` param to inner wrapper in
`tractor_test()`, forward to decorated fn when its sig
accepts it.
- register custom `no_tpt` marker via `pytest_configure()`
to avoid unknown-marker warnings.
- add masked todo for `no_tpt` marker-check code in `tpt_proto` fixture
(needs fn-scope to work, left as TODO).
- add `request` param to `tpt_proto` fixture for future
marker inspection.
Also,
- add doc-string to `test_streaming_to_actor_cluster()`.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Namely the workaround expected exc branches added in ef7ed7a for the UDS
parametrization. With the new boxing of the underlying CREs as
tpt-closed, we can expect the same exc outcomes as in the TCP cases.
Also this tweaks some error report logging content used while debugging
this,
- properly `repr()` the `TransportClosed.src_exc`-type from
the maybe emit in `.report_n_maybe_raise()`.
- remove the redundant `chan.raddr` from the "closed abruptly"
header in the tpt-closed handler of `._rpc.process_messages()`,
the `Channel.__repr__()` now contains it by default.
Forward the `tpt_proto` fixture val into spawned daemon
subprocesses via `run_daemon(enable_transports=..)` and
sync `_runtime_vars['_enable_tpts']` in the `tpt_proto`
fixture so sub-actors inherit the transport setting.
Deats,
- add `enable_transports={enable_tpts}` to the daemon
spawn-cmd template in `tests/conftest.py`.
- set `_state._runtime_vars['_enable_tpts']` in the
`tpt_proto` fixture in `_testing/pytest.py`.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
I started getting annoyed by all the warnings from `pytest` during work
on macOS support in CI, so this replaces all `Actor.uid`/`Channel.uid`
accesses with `.aid.uid` (or `.aid.reprol()` for log msgs) across the
core runtime and IPC subsystems to avoid the noise.
This also provides incentive to start the adjustment to all
`.uid`-holding/tracking internal `dict`-tables/data-structures to
instead use `.msg.types.Aid`. Hopefully that will come as a (vibed?) follow
up shortly B)
Deats,
- `._context`: swap all `self._actor.uid`, `self.chan.uid`,
and `portal.actor.uid` refs to `.aid.uid`; use
`.aid.reprol()` for log/error formatting.
- `._rpc`: same treatment for `actor.uid`, `chan.uid` in
log msgs and cancel-scope handling; fix `str(err)` typo
in `ContextCancelled` log.
- `._runtime`: update `chan.uid` -> `chan.aid.uid` in ctx
cache lookups, RPC `Start` msg, registration and
cancel-request handling; improve ctxc log formatting.
- `._spawn`: replace all `subactor.uid` with
`.aid.uid` for child-proc tracking, IPC peer waiting,
debug-lock acquisition, and nursery child dict ops.
- `._supervise`: same for `subactor.uid` in cancel and
portal-wait paths; use `actor.aid.uid` for error dict.
- `._state`: fix `last.uid` -> `last.aid.uid` in
`current_actor()` error msg.
Also,
- `._chan`: make `Channel.aid` a proper `@property` backed
by `._aid` so we can add validation/typing later.
- `.log`: use `current_actor().aid.uuid` instead of
`.uid[1]` for actor-uid log field.
- `.msg.types`: add TODO comment for `Start.aid` field
conversion.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Deliver `(LinkedTaskChannel, Any)` instead of the prior `(first, chan)`
order from `open_channel_from()` to match the type annotation and be
consistent with `trio.open_*_channel()` style where the channel obj
comes first.
- flip `yield first, chan` -> `yield chan, first`
- update type annotation + docstring to match
- swap all unpack sites in tests and examples
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Address valid findings from copilot's PR #413 review
(https://github.com/goodboy/tractor/pull/413#pullrequestreview-3925876037):
- `.get()` docstring referenced non-existent
`._from_trio` attr, correct to `._to_aio`.
- `.send()` docstring falsely claimed error-raising
on missing `from_trio` arg; reword to describe the
actual `.put_nowait()` enqueue behaviour.
- `.open_channel_from()` return type annotation had
`tuple[LinkedTaskChannel, Any]` but `yield` order
is `(first, chan)`; fix annotation + docstring to
match actual `tuple[Any, LinkedTaskChannel]`.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
This change is masked out now BUT i'm leaving it in for reference.
I was debugging a multi-actor fault where the primary source actor was
an infected-aio-subactor (`brokerd.ib`) and it seemed like the REPL was only
entering on the `trio` side (at a `.open_channel_from()`) and not
eventually breaking in the `asyncio.Task`. But, since (changing
something?) it seems to be working now, it's just that the `trio` side
seems to sometimes handle before the (source/causing and more
child-ish) `asyncio`-task, which is a bit odd and not expected..
We could likely refine (maybe with an inter-loop-task REPL lock?) this
at some point and ensure a child-`asyncio` task which errors always
grabs the REPL **first**?
Lowlevel deats/further-todos,
- add (masked) `maybe_open_crash_handler()` block around
`asyncio.Task` execution with notes about weird parent-addr
delivery bug in `test_sync_pause_from_aio_task`
* yeah dunno what that's about but made a bug; seems to be IPC
serialization of the `TCPAddress` struct somewhere??
- add inter-loop lock TODO for avoiding aio-task clobbering
trio-tasks when both crash in debug-mode
Also,
- change import from `tractor.devx.debug` to `tractor.devx`
- adjust `get_logger()` call to use new implicit mod-name detection
added to `.log.get_logger()`, i.e. without `name=__name__`.
- some teensie refinements to `open_channel_from()`:
* swap return type annotation to `tuple[LinkedTaskChannel, Any]`
(was `Any`).
* update doc-string to clarify started-value delivery
* add err-log before `.pause()` in what should be an unreachable path.
* add todo to swap the `(first, chan)` pair to match that of ctx..
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
With comms methods similar to those that exist for the `trio` side,
- `.get()` which proxies verbatim to the `._to_aio: asyncio.Queue`,
- `.send_nowait()` which thin-wraps to `._to_trio: trio.MemorySendChannel`.
Obviously the more correct design is to break up the channel type into
a pair of handle types, one for each "side's" task in each event-loop,
that's hopefully coming shortly in a follow up patch B)
Also,
- fill in some missing doc strings, tweak some explanation comments and
update todos.
- adjust the `test_aio_errors_and_channel_propagates_and_closes()` suite
to use the new `chan` fn-sig-API with `.open_channel_from()` including
the new methods for msg comms; ensures everything added here works e2e.
Skip fields starting with `_` in pretty-printed struct output
to avoid cluttering displays with internal/private state (and/or accessing
private properties which have errors Bp).
Deats,
- add `if k[0] == '_': continue` check to skip private fields
- change nested `if isinstance(v, Struct)` to `elif` since we
now have early-continue for private fields
- mv `else:` comment to clarify it handles top-level fields
- fix indentation of `yield` statement to only output
non-private, non-nested fields
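A minimal sketch of the skip-private iteration, using `dataclasses` as a stand-in for the struct type. The fn name `iter_public_fields()` and the dotted-prefix output format are hypothetical:

```python
from dataclasses import fields, is_dataclass
from typing import Any, Iterator


def iter_public_fields(
    struct: Any,
    prefix: str = '',
) -> Iterator[tuple[str, Any]]:
    for f in fields(struct):
        k: str = f.name
        # never touch private state (it may even raise)
        if k[0] == '_':
            continue

        v = getattr(struct, k)
        if is_dataclass(v):
            # nested struct: recurse with a dotted prefix
            yield from iter_public_fields(
                v,
                prefix=f'{prefix}{k}.',
            )
        else:
            # plain (top-level or leaf) field
            yield f'{prefix}{k}', v
```

Because the `continue` runs before `getattr()`, private attrs are never read at all, which is what avoids errors from broken private properties.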
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Per the questionable `copilot` review which is detailed for follow up in
https://github.com/goodboy/tractor/issues/418. These constants are
directly linked from the kernel sources fwiw.
Though it was a good (vibed) try by @dnks, the previous "fix" was not
actually adding unix socket support but merely sidestepping a crash due
to `get_peer_info()`'s impl never going to work on MacOS (and it was
never intended to).
This patch instead solves the underlying issue by implementing a new
`get_peer_pid()` helper which does in fact retrieve the peer's PID in
a more generic/cross-platform way (:fingers_crossed:); much thanks to
the linked SO answer for this solution!
Impl deats,
- add `get_peer_pid()` and call it from
`MsgpackUDSStream.get_stream_addrs()` when we detect a non-'linux'
platform, OW use the original soln: `get_peer_info()`.
- add a new case for the `match (peername, sockname)` with a
`case (str(), str()):` which seems to at least work on macos.
- drop all the `LOCAL_PEERCRED` dynamic import branching since it was
never needed and was never going to work.
Same problem as for the `ShmArray` tokens, so tweak and reuse
the `_shorten_key_for_macos()` helper and call it from
`open_shm_list()` similarly.
Some tweaks/updates to the various helpers,
- support `prefix/suffix` inputs and, if provided, subtract their lengths
from the known 31 char limit of macOS's `shm_open()` (`PSHMNAMLEN`)
when generating and using the `hashlib.sha256()` value
which overrides (for now..) wtv `key` is passed by the caller.
- pass the appropriate `suffix='_first/_last'` values for the `ShmArray`
token generators case.
- add a `prefix: str = 'shml_'` param to `open_shm_list()`.
- better log formatting with `!r` to report any key shortening.
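The length budgeting described above might be sketched like so. The fn name `shorten_key()` and the exact truncation rule (fill whatever room is left with hex digest chars) are assumptions, and any leading `/` added by the OS-level API is not accounted for here:

```python
import hashlib

# macOS `shm_open()` name length limit
PSHMNAMLEN: int = 31


def shorten_key(
    key: str,
    prefix: str = 't_',
    suffix: str = '',
) -> str:
    # room left for the hash after the fixed parts
    budget: int = PSHMNAMLEN - len(prefix) - len(suffix)
    if budget <= 0:
        raise ValueError(
            f'{prefix!r} + {suffix!r} leave no room for a hash'
        )
    digest: str = hashlib.sha256(key.encode()).hexdigest()
    return f'{prefix}{digest[:budget]}{suffix}'
```

The output is deterministic per `key`, so creator and attacher sides derive the same OS-level name independently.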
Adapt the `PSHMNAMLEN` fix from `piker.data._sharedmem` (orig commit
96fb79ec thx @dnks!) to `tractor.ipc._shm` accounting for the
module-local differences:
- Add `hashlib` import for sha256 key hashing
- Add `key: str|None` field to `NDToken` for storing
the original descriptive key separate from the
(possibly shortened) OS-level `shm_name`
- Add `__eq__()`/`__hash__()` to `NDToken` excluding
the `key` field from identity comparison
- Add `_shorten_key_for_macos()` using `t_` prefix
(vs piker's `p_`) with 16 hex chars of sha256
- Use `platform.system() == 'Darwin'` in `_make_token()`
(tractor already imports the `platform` module vs
piker's `sys.platform`)
- Wrap `shm_unlink()` in `ShmArray.destroy()` with
`try/except FileNotFoundError` for teardown races
(was already done in `SharedInt.destroy()`)
- Move token creation before `SharedMemory()` alloc in
`open_shm_ndarray()` so `token.shm_name` is used
as the OS-level name
- Use `lookup_key` pattern in `attach_shm_ndarray()`
to decouple `_known_tokens` dict key from OS name
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Move the multi-platform-supporting conditional/dynamic `socket` constant
imports to *after* the main cross-platform ones.
Also add constant typing and reformat comments a bit for the macOS case.
Make socket credential imports platform-conditional in `.ipc._uds`.
- Linux: use `SO_PASSCRED`/`SO_PEERCRED` from socket module
- macOS: use `LOCAL_PEERCRED` (0x0001) instead, no need for `SO_PASSCRED`
- Conditionally call `setsockopt(SO_PASSCRED)` only on Linux
Fixes AttributeError on macOS where SO_PASSCRED doesn't exist.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>