Commit Graph

86 Commits (09a61dbd8aa691ee3f66e8e45fdf646560f80b19)

Author SHA1 Message Date
Tyler Goodlet 112ed27cda Move peer-tracking attrs from `Actor` -> `IPCServer`
Namely transferring the `Actor` peer-`Channel` tracking attrs,
- `._peers` which maps the uids to client channels (with duplicates
  apparently..)
- the `._peer_connected: dict[tuple[str, str], trio.Event]` child-peer
  syncing table mostly used by parent actors to wait on subs to connect
  back during spawn.
- the `._no_more_peers = trio.Event()` level triggered state signal.

Further we move over with some minor reworks,
- `.wait_for_peer()` verbatim (adjusting all dependants).
- factor the no-more-peers shielded wait branch-block out of
  the end of `async_main()` into 2 new server meths,
  * `.has_peers()` with optional chan-connected checking flag.
  * `.wait_for_no_more_peers()` which *just* does the
    maybe-shielded `._no_more_peers.wait()`
2025-04-11 18:11:35 -04:00
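
A quick sketch of how those two new server meths might look; the bodies and the `check_chans`/`shield` flag names here are illustrative assumptions, not the actual impl:

```python
from collections import defaultdict

import trio


class IPCServer:
    def __init__(self) -> None:
        # uid -> client channels (duplicates apparently possible..)
        self._peers: defaultdict[tuple[str, str], list] = defaultdict(list)
        # child-peer syncing table, mostly used by parent actors to
        # wait on subs connecting back during spawn
        self._peer_connected: dict[tuple[str, str], trio.Event] = {}
        # level triggered state signal
        self._no_more_peers = trio.Event()
        self._no_more_peers.set()  # no peers connected yet

    def has_peers(self, check_chans: bool = False) -> bool:
        # optionally require at least one live channel per peer uid
        if not check_chans:
            return bool(self._peers)
        return any(chans for chans in self._peers.values())

    async def wait_for_no_more_peers(self, shield: bool = False) -> None:
        # the maybe-shielded `._no_more_peers.wait()` factored out of
        # the tail of `async_main()`
        with trio.CancelScope(shield=shield):
            await self._no_more_peers.wait()
```
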
Tyler Goodlet 42cf9e11a4 Mv `Actor._stream_handler()` to `.ipc._server` func
Call it `handle_stream_from_peer()` and bind in the `actor: Actor` via
a `handler=partial()` to `trio.serve_listeners()`.

With this (minus the `Actor._peers/._peer_connected/._no_more_peers`
attrs ofc) we get nearly full separation of IPC-connection-processing
(concerns) from `Actor` state. Thus it's a first look at modularizing
the low-level runtime into isolated subsystems which will hopefully
improve the entire code base's grok-ability and ease any new feature
design discussions especially pertaining to introducing and/or
composing-together any new transport protocols.
2025-04-11 14:51:52 -04:00
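
For reference, the `partial()`-binding shape described above, sketched standalone (handler body is a placeholder):

```python
from functools import partial

import trio


async def handle_stream_from_peer(
    stream: trio.SocketStream,
    *,
    actor,  # the `Actor` instance bound in at server-start time
) -> None:
    # ...IPC-connection-processing decoupled from `Actor` state...
    await stream.aclose()


async def serve(actor, listeners: list[trio.SocketListener]) -> None:
    # `trio.serve_listeners()` calls its handler with just the stream,
    # so the actor ref is closed over via `partial()`
    await trio.serve_listeners(
        partial(handle_stream_from_peer, actor=actor),
        listeners,
    )
```
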
Tyler Goodlet c208bcbb1b Factor actor-embedded IPC-tpt-server to `ipc` subsys
Primarily moving the `Actor._serve_forever()`-task-as-method and
supporting actor-instance attributes to a new `.ipc._server` sub-mod
which now encapsulates,
- the coupling of various `trio.Nursery`s (and their independent lifetime mgmt)
  to different `trio.serve_listeners()` tasks and `SocketStream`
  handler scopes.
- `Address` and `SocketListener` mgmt and tracking through the idea of
  an "IPC endpoint": each "bound-and-active instance" of a served-listener
  for some (varied transport protocol's socket) address.
- start and shutdown of the entire server's lifetime via an `@acm`.
- delegation of starting/stopping tpt-protocol-specific `trio.abc.Listener`s
  to the corresponding `.ipc._<proto_key>` sub-module (newly defined
  mod-top-level instead of `Address` method) `start/close_listener()`
  funcs.

Impl details of the `.ipc._server` sub-sys,
- add new `IPCServer`, allocated with `open_ipc_server()`, and which
  encapsulates starting multiple-transport-proto-`trio.abc.Listener`s
  from an input set of `._addr.Address`s using,
  |_`IPCServer.listen_on()` which internally spawns tasks that delegate to a new
    `_serve_ipc_eps()`, a rework of what was (effectively)
    `Actor._serve_forever()` and which now,
    * allocates a new `IPCEndpoint`-struct (see below) for each
      address-listener pair alongside the specified
      listener-serving/stream-handling `trio.Nursery`s provided by the
      caller.
    * starts and stops each transport (socket's) listener by calling
      `IPCEndpoint.start/close_listener()` which in turn delegates to
      the underlying `inspect.getmodule(IPCEndpoint.addr)` backend tpt
      module's equivalent impl.
    * tracks all created endpoints in a `._endpoints: list[IPCEndpoint]`
      which is further exposed through public properties for
      introspection of served transport-protocols and their addresses.
  |_`IPCServer._[parent/stream_handler]_tn: Nursery`s which are either
     allocated (in which case, as the same instance) or provided by the
     caller of `open_ipc_server()` such that the same nursery-cancel-scope
     controls offered by `trio.serve_listeners(handler_nursery=)` are
     offered where the `._parent_tn` is used to spawn `_serve_ipc_eps()`
     tasks, and `._stream_handler_tn` is passed verbatim as `handler_nursery`.
- a new `IPCEndpoint`-struct (as mentioned) which wraps each
  transport-proto's address + listener + allocated-supervising-nursery
  to encapsulate the "lifetime of a server IPC endpoint" such that
  eventually we can track and manage per-protocol/address/`.listen_on()`-call
  scoped starts/stops/restarts for the purposes of filtering/banning
  peer traffic.
  |_ also included is an unused `.peer_tpts` table which we can
    hopefully use to replace `Actor._peers` in a `Channel`-tracking
    transport-proto-aware way!

Surrounding changes to `.ipc.*` primitives to match,
- make `[TCP|UDS]Address` types `msgspec.Struct(frozen=True)` and thus
  drop any-and-all `addr._host =` style mutation throughout.
  |_ as such also drop their `.__init__()` and `.__eq__()` meths.
  |_ UDS tweaks to field names and thus `.__repr__()`.
- move `[TCP|UDS]Address.[start/close]_listener()` meths to be mod-level
  equiv `start|close_listener()` funcs.
- just hard code the `.ipc._types._key_to_transport/._addr_to_transport`
  table entries instead of all the prior fancy dynamic class property
  reading stuff (remember, "explicit is better than implicit").

Modified in `._runtime.Actor` internals,
- drop the `._serve_forever()` and `.cancel_server()` methods and
  `._server_down` waiting logic from `.cancel_soon()`
- add `.[_]ipc_server` which is opened just after the `._service_n` and
  delegate to it for any equivalent publicly exposed instance
  attributes/properties.
2025-04-10 23:18:32 -04:00
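
A rough usage sketch of the new sub-sys surface per the above; the kwarg names (`parent_tn=`, `accept_addrs=`) are assumptions based on the commit text:

```python
import trio
from tractor.ipc._server import open_ipc_server  # per this commit


async def main(addrs: list) -> None:
    async with (
        trio.open_nursery() as parent_tn,
        # `@acm` controlling the entire server's lifetime
        open_ipc_server(parent_tn=parent_tn) as server,
    ):
        # spawns a `_serve_ipc_eps()` task per address and tracks an
        # `IPCEndpoint` for each address-listener pair
        await server.listen_on(accept_addrs=addrs)

        # introspect served tpt-protos/addrs via the public props
        print(server._endpoints)
```
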
Tyler Goodlet 3d3a1959ed s/`._addr.preferred_transport`/`_state._def_tpt_proto`
Such that the "global-ish" setting (actor-local) is managed with the
others per actor-process and type it as a `Literal['tcp', 'uds']` of the
currently supported protocol keys.

Here obvi `_tpt` is some kinda shorthand for "transport" and `_proto` is
for "protocol" Bp

Change imports and refs in all dependent modules.

Oh right, and disable UDS in `wrap_address()` for the moment while
i figure out how to avoid the unwrapped type collision..
2025-04-06 22:06:42 -04:00
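
The relocated setting more or less amounts to (default value assumed):

```python
from typing import Literal

# the currently supported transport-protocol keys; lives in
# `tractor._state` alongside the other per-actor-process runtime vars
_def_tpt_proto: Literal['tcp', 'uds'] = 'tcp'
```
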
Tyler Goodlet 9e812d7793 Add `Arbiter.is_registry()` in prep for proper `.discovery._registry` 2025-04-06 22:06:42 -04:00
Tyler Goodlet c85606075d Mv `Actor._do_handshake()` to `Channel`, add `.aid`
Finally.. i've been meaning to do this for ages since the
actor-id-swap-as-handshake is better layered as part of the IPC msg-ing
machinery and lets us encapsulate the connection-time-assignment
of a remote peer's `Aid` as a new `Channel.aid: Aid`. For now we
continue to offer the `.uid: tuple[str, str]` attr (by delegating to
the new `.aid` fields) since there's still a few things relying on it
in the runtime and ctx layers.

Nice bonuses from this,
- it's very easy to get the peer's `Aid.pid: int` from anywhere in an
  IPC ctx by just reading it from the chan.
- we aren't saving more than the wire struct-msg received.

Also add deprecation warnings around usage to get us moving on porting
the rest of consuming runtime code to the new attr!
2025-04-06 22:06:42 -04:00
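
A hedged sketch of the back-compat shim: `.uid` delegates to the connection-time-assigned `Channel.aid` while warning consumers to port over. The exact `Aid` field names (other than `.pid`) are assumptions here:

```python
import warnings
from dataclasses import dataclass


@dataclass
class Aid:
    name: str
    uuid: str
    pid: int | None = None  # the new `.pid` addition


class Channel:
    # assigned at connection time by the handshake now done here
    aid: Aid | None = None

    @property
    def uid(self) -> tuple[str, str]:
        warnings.warn(
            '`Channel.uid` is deprecated, use `Channel.aid` instead!',
            DeprecationWarning,
            stacklevel=2,
        )
        return (self.aid.name, self.aid.uuid)
```
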
Tyler Goodlet 674a33e3b1 Add an `Actor.pformat()`
And map `.__repr__/__str__` to it and add various new fields to fill it
out,
- drop `self.uid` as var and instead add `Actor._aid: Aid` and proxy to
  it for the various `.name/.uid/.pid` properties as well as a new
  `.aid` field.
 |_ the `Aid.pid` addition is also included.

Other improvements,
- flip to a sync call to `Address.close_listener()`.
- track the `async_main()` parent task as `Actor._task`.
- add exception logging around failure to bind due to already-in-use
  when calling `addr.open_listener()` in `._serve_forever()`; sometimes
  the error might be overridden by something else during the
  runtime-failure unwind..
2025-04-06 22:06:42 -04:00
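
The `._aid` proxy layout roughly amounts to the below; the `pformat()` output format and `Aid` field names are illustrative assumptions:

```python
class Actor:
    def __init__(self, aid) -> None:
        self._aid = aid  # an `Aid` struct

    # proxy properties over the new `._aid` field
    @property
    def name(self) -> str:
        return self._aid.name

    @property
    def uid(self) -> tuple[str, str]:
        return (self._aid.name, self._aid.uuid)

    @property
    def pid(self) -> int:
        return self._aid.pid

    def pformat(self) -> str:
        # the real output carries more fields; this is just the shape
        return f'<Actor {self.name!r} uid={self.uid} pid={self.pid}>'

    __repr__ = pformat
    __str__ = pformat
```
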
Tyler Goodlet ddeab1355a Allocate bind-addrs in subactors
Previously whenever an `ActorNursery.start_actor()` call did not receive
a `bind_addrs` arg we would allocate the default `(localhost, 0)` pairs
in the parent; for UDS this obviously won't work, nor is it ideal bc
it's nicer to have the actor that will be the socket server (the one
calling `Address.open_listener()`) define the socket-file-name
containing its unique ID info such as pid, actor-uuid etc.

As such this moves "random" generation of server addresses to the
child-side of a subactor's spawn-sequence when it's spawned without
`bind_addrs`; i.e. we do the allocation of the `Address.get_random()`
addrs inside `._runtime.async_main()` instead of
`ActorNursery.start_actor()` and **only when**
`accept_addrs`/`bind_addrs` was **not provided by the spawning
parent**.

Further this patch gets way more rigorous about the `SpawnSpec`
processing in the child inside `Actor._from_parent()` such that we
handle any invalid msgs **very loudly and pedantically!**

Impl deats,
- do the "random addr generation" in an explicit `for` loop (instead of
  prior comprehension) to allow for more detailed typing of the layered
  calls to the new `._addr` mod.
- use a `match:/case:` to process any invalid `SpawnSpec` payload, in
  which case we can instead receive a `MsgTypeError` from the
  `chan.recv()` call in `Actor._from_parent()` and raise it immediately
  rather than triggering downstream type-errors XD
  |_ as per the big `#TODO` we prolly want to take from other callers
     of `Channel.recv()` (like in the `._rpc.process_messages()` loop).
  |_ always raise `InternalError` on non-match/fall-through case!
  |_ add a note about not being able to use `breakpoint()` in this
     section due to causality of `SpawnSpec._runtime_vars` not having
     been processed yet..
  |_ always return a third element from `._from_parent()` eventually to be
     the `preferred_transports: list[str]` from the spawning parent.
- use new `._addr.mk_uuid()` and pass to new `Actor.__init__(uuid: str)`
  for all actor creation (including in all the mods tweaked here).
- Move to new type-alias-name `UnwrappedAddress` throughout.
2025-04-06 22:03:07 -04:00
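
An illustrative shape for that pedantic `match:`/`case:` handling; the `SpawnSpec`/`MsgTypeError`/`InternalError` types exist in `tractor` but are mocked below so the control flow stands alone:

```python
class InternalError(RuntimeError):
    ...


class MsgTypeError(Exception):
    ...


class SpawnSpec:
    ...


def process_first_msg(msg: object) -> SpawnSpec:
    match msg:
        case MsgTypeError():
            # raise immediately instead of triggering downstream
            # type-errors from an invalid payload
            raise msg
        case SpawnSpec():
            return msg
        case _:
            # always raise on the non-match/fall-through case!
            raise InternalError(f'Unexpected first msg: {msg!r}')
```
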
Guillermo Rodriguez 1762b3eb64 Trying to make full suite pass with uds 2025-04-06 22:02:24 -04:00
Guillermo Rodriguez 486f4a3843 Finally switch to using address protocol in all runtime 2025-04-06 22:02:18 -04:00
Guillermo Rodriguez f80a47571a Starting to make `.ipc.Channel` work with multiple MsgTransports 2025-04-06 21:58:45 -04:00
Guillermo Rodriguez 7b8b9d6805 move tractor._ipc.py into tractor.ipc._chan.py 2025-03-27 20:36:45 -03:00
Tyler Goodlet efcf81bcad Add `.runtime()`-emit to `._invoke()` to report final result msg in the child 2025-03-27 15:58:03 -04:00
Tyler Goodlet c453623b9b Go to loose egs in `Actor` root & service nurseries (for now..) 2025-03-27 13:38:47 -04:00
Tyler Goodlet aa80b55567 Log format tweaks for sclang reprs
A space here, a newline there..
2025-03-27 13:38:47 -04:00
Tyler Goodlet 5cdfee3bcf Pass `infect_asyncio` setting via runtime-vars
The reason for this "duplication" with the `--asyncio` CLI flag (passed
to the child during spawn) is 2-fold:
- allows verifying inside `Actor._from_parent()` that the `trio` runtime was
  started via `.start_guest_run()` as well as if the
  `Actor._infected_aio` spawn-entrypoint value has been set (by the
  `._entry.<spawn-backend>_main()` whenever `--asyncio` is passed)
  such that any mismatch can be signaled via an `InternalError`.
- enables checking the `._state._runtime_vars['_is_infected_aio']` value
  directly (say from a non-actor/`trio`-thread) instead of calling
  `._state.current_actor(err_on_no_runtime=False)` in certain edge
  cases.

Impl/testing deats:
- add `._state._runtime_vars['_is_infected_aio'] = False` default.
- raise `InternalError` on any `--asyncio`-flag-passed vs.
  `_runtime_vars`-value-relayed-from-parent inside
  `Actor._from_parent()` and include a `Runner.is_guest` assert for good
  measure B)
- set and relay `infect_asyncio: bool` via runtime-vars to child in
  `ActorNursery.start_actor()`.
- verify `actor.is_infected_aio()`, `actor._infected_aio` and
  `_state._runtime_vars['_is_infected_aio']` are all set in test suite's
  `asyncio_actor()` endpoint.
2025-03-27 13:24:25 -04:00
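
A condensed sketch of the 2-fold consistency check; the accessor/param names below are simplifying assumptions, the real check lives in `Actor._from_parent()`:

```python
_runtime_vars: dict = {
    '_is_infected_aio': False,  # the new default per this commit
}


class InternalError(RuntimeError):
    ...


def verify_infected_aio(
    cli_flag_passed: bool,    # was `--asyncio` passed at spawn?
    entrypoint_value: bool,   # `Actor._infected_aio` set by `._entry`
) -> None:
    relayed: bool = _runtime_vars['_is_infected_aio']
    if (
        cli_flag_passed != relayed
        or entrypoint_value != relayed
    ):
        # any mismatch between flag, entrypoint value and the
        # parent-relayed runtime-var is an internal bug
        raise InternalError(
            f'`--asyncio` flag vs runtime-vars mismatch: {relayed!r}'
        )
```
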
Tyler Goodlet 4b92e14c92 Denoise duplicate chan logging for now 2025-03-24 14:04:52 -04:00
Tyler Goodlet 32f7742e53 Finally implement peer-lookup optimization..
There's been a todo for soo long for this XD

Since all `Actor`'s store a set of `._peers` we can try a lookup on that
table as a shortcut before pinging the registry Bo

Impl deats:
- add a new `._discovery.get_peer_by_name()` routine which attempts the
  `._peers` lookup by combining a copy of that `dict` + an entry added
  for `Actor._parent_chan` (since all subs have a parent and often the
  desired contact is just that connection).
- change `.find_actor()` (for the `only_first == True` case),
  `.query_actor()` and `.wait_for_actor()` to call the new helper and
  deliver appropriate outputs if possible.

Other,
- deprecate `get_arbiter()` def and all usage in tests and examples.
- drop lingering use of `arbiter_sockaddr` arg to various routines.
- tweak the `Actor` doc str as well as some code fmting and a tweak to
  the `._stream_handler()`'s initial `con_status: str` logging value
  since as written it could never be reached.. oh and `.warning()` on
  any new connections which already have a `_pre_chan: Channel` entry in
  `._peers` so we can start minimizing IPC duplications.
2025-03-24 14:04:52 -04:00
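
The shortcut lookup boils down to something like this sketch; the return semantics and uid-unpacking are assumptions:

```python
def get_peer_by_name(actor, name: str) -> list | None:
    # combine a copy of `._peers` + an entry for `._parent_chan`
    # (all subs have a parent and often that's the desired contact)
    peers: dict = dict(actor._peers)
    if actor._parent_chan is not None:
        peers[actor._parent_chan.uid] = [actor._parent_chan]

    for (peer_name, _uuid), chans in peers.items():
        if peer_name == name and chans:
            return chans

    # fall back to pinging the registry
    return None
```
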
Tyler Goodlet 46066c02e4 More-n-more scops annots in logging 2025-03-24 14:04:52 -04:00
Tyler Goodlet 950a2ec30f Use `._entry` proto-ed "lifetime ops" in logging
As per a WIP scribbled out TODO in `._entry.nest_from_op()`, change
a bunch of "supervisor/lifetime mgmt ops" related log messages to
contain some supervisor-annotation "headers" in an effort to give
a terser "visual indication" of how some execution/scope/storage
primitive entity (like an actor/task/ctx/connection) is being operated
on (like, opening/started/closed/cancelled/erroring) from a "supervisor
action" POV.

Also tweak a bunch more emissions to lower levels to reduce noise around
normal inter-actor operations like process and IPC ctx supervision.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 8ff682440d Further formalize `greenback` integration
Since we more or less require it for `tractor.pause_from_sync()` this
refines enable toggles and their relay down the actor tree as well as
more explicit logging around init and activation.

Tweaks summary:
- `.info()` report the module if discovered during root boot.
- use a `._state._runtime_vars['use_greenback']: bool` activation flag
  inside `Actor._from_parent()` to determine if the sub should try to
  use it and set to `False` if mod-loading fails / not installed.
- expose `maybe_init_greenback()` from `.devx` sugpkg.
- comment out RTE in `._pause()` for now since we already have it in
  `.pause_from_sync()`.
- always `.exception()` on `maybe_init_greenback()` import errors to
  clarify the underlying failure deats.
- always explicitly report if `._state._runtime_vars['use_greenback']`
  was NOT set when `.pause_from_sync()` is called.

Other `._runtime.async_main()` adjustments:
- combine the "internal error call ur parents" message and the failed
  registry contact status into one new `err_report: str`.
- drop the final exception handler's call to
  `Actor.lifetime_stack.close()` since we're already doing it in the
  `finally:` block and the earlier call has no currently known benefit.
- only report on the `.lifetime_stack()` callbacks if any are detected
  as registered.
2025-03-24 14:04:52 -04:00
Tyler Goodlet f64447148e Avoid actor-nursery-exit warns on registrees
Since a local-actor-nursery-parented subactor might also use the root as
its registry, we need to avoid warning when short lived IPC `Channel`
connections establish and then disconnect (quickly, bc apparently the
subactor isn't re-using an already cached parent-peer<->child conn as
you'd expect for efficiency..) since such cases are currently considered
normal operation of our super shoddy/naive "discovery sys" XD

As such, (un)guard the whole local-actor-nursery OR channel-draining
waiting blocks with the additional `or Actor._cancel_called` branch
since really we should also be waiting on the parent nurse to exit (at
least, for sure and always) when the local `Actor` indeed has been
"globally" cancelled-called. Further add separate timeout warnings for
channel-draining vs. local-actor-nursery-exit waiting since they are
technically orthogonal cases (at least, afaik).

Also,
- adjust the `Actor._stream_handler()` connection status log-emit to
  `.runtime()`, especially to reduce noise around the aforementioned
  ephemeral registree connection-requests.
- if we do wait on a local actor-nurse to exit, report its `._children`
  table (which should help figure out going forward how useful the
  warning is, if at all).
2025-03-24 14:04:52 -04:00
Tyler Goodlet b3387aca61 Don't (noisily) log about runtime cancel RPC tasks
Since in the case of the `Actor._cancel_task()` related runtime eps we
actually don't EVER register them in `Actor._rpc_tasks`.. logging about
them is just needless noise, though maybe we should track them in a diff
table; something like a `._runtime_rpc_tasks`?

Drop the cancel-request-for-stale-RPC-task (`KeyError` case in
`Actor._cancel_task()`) log-emit level down to `.runtime()`; it's
generally not useful info other than for granular race condition eval
when hacking the runtime.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 50ed461996 Port `Actor._stream_handler()` to use `.has_outcome`, fix indent bug.. 2025-03-24 14:04:51 -04:00
Tyler Goodlet 429f8f4e13 Adjust `._runtime` to report `DebugStatus.req_ctx`
- inside the `Actor.cancel()`'s maybe-wait-on-debugger delay,
  report the full debug request status and its affiliated lock request
  IPC ctx.
- use the new `.req_ctx.chan.uid` to do the local nursery lookup during
  channel teardown handling.
- another couple log fmt tweaks.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 74b6871bfd Mk `process_messages()` return last msg; summary logging
Not sure it's **that** useful (yet) but in theory would allow avoiding
certain log level usage around transient RPC requests for discovery methods
(like `.register_actor()` and friends); can't hurt to be able to
introspect that last message for other future cases I'd imagine as well.
Adjust the calling code in `._runtime` to match; other spots are using
the `trio.Nursery.start()` schedule style and are fine as is.

Improve a bunch more log messages throughout a few mods mostly by going
to a "summary" single-emission style where possible/appropriate:
- in `._runtime` more "single summary" status style log emissions:
 |_mk `Actor.load_modules()` render a single mod loaded summary.
 |_use a summary `con_status: str` for `Actor._stream_handler()` conn
   setup and an equiv (`con_teardown_status`) for connection teardowns.
 |_similar thing in `Actor.wait_for_actor()`.
- generally more usage of `.msg.pretty_struct` apis throughout `._runtime`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet efb69f9bf9 Flip back `StartAck` timeout to `inf`.. 2025-03-24 14:04:51 -04:00
Tyler Goodlet e4e04c516f First draft "payload receiver in a new `.msg._ops`
As per much tinkering, re-designs and preceding rubber-ducking via many
"commit msg novelas", **finally** this adds the (hopefully) final
missing layer for typed msg safety: `tractor.msg._ops.PldRx`

(or `PayloadReceiver`? haven't decided how verbose to go..)

Design justification summary:
      ------ - ------
- need a way to be as-close-as-possible to the `tractor`-application
  such that when `MsgType.pld: PayloadT` validation takes place, it is
  straightforward and obvious how user code can decide to handle any
  resulting `MsgTypeError`.
- there should be a common and optional-yet-modular way to modify
  **how** data delivered via IPC (possibly embedded as user defined,
  type-constrained `.pld: msgspec.Struct`s) can be handled and processed
  during fault conditions and/or IPC "msg attacks".
- support for nested type constraints within a `MsgType.pld` field
  should be simple to define, implement and understand at runtime.
- a layer between the app-level IPC primitive APIs
  (`Context`/`MsgStream`) and application-task code (consumer code of
  those APIs) should be easily customized and prove-to-be-as-such
  through demonstrably rigorous internal (sub-sys) use!
  -> eg. via seamless runtime RPC eps support like `Actor.cancel()`
  -> by correctly implementing our `.devx._debug.Lock` REPL TTY mgmt
    dialog prot, via a dead simple payload-as-ctl-msg-spec.

There are some fairly detailed doc strings included so I won't duplicate
that content, the majority of the work here is actually somewhat of
a factoring of many similar blocks that are doing more or less the same
`msg = await Context._rx_chan.receive()` with boilerplate for
`Error`/`Stop` handling via `_raise_from_no_key_in_msg()`. The new
`PldRx` basically provides a shim layer for this common "receive msg,
decode its payload, yield it up to the consuming app task" by pairing
the RPC feeder mem-chan with a msg-payload decoder and expecting IPC API
internals to use **one** API instead of re-implementing the same pattern
all over the place XD

`PldRx` breakdown
 ------ - ------
- for now only expects a `._msgdec: MsgDec` which allows for
  override-able `MsgType.pld` validation and most obviously used in
  the impl of `.dec_msg()`, the decode message method.
- provides multiple mem-chan receive options including:
 |_ `.recv_pld()` which does the e2e operation of receiving a payload
    item.
 |_ a sync `.recv_pld_nowait()` version.
 |_ a `.recv_msg_w_pld()` which optionally allows retrieving both the
    shuttling `MsgType` as well as its `.pld` body for use cases where
    info on both is important (eg. draining a `MsgStream`).

Dirty internal changeover/implementation deatz:
             ------ - ------
- obvi move over all the IPC "primitives" that previously had the duplicate recv-n-yield
  logic:
 - `MsgStream.receive[_nowait]()` delegating instead to the equivalent
   `PldRx.recv_pld[_nowait]()`.
 - add `Context._pld_rx: PldRx`, created and passed in by
   `mk_context()`; use it for the `.started()` -> `first: Started`
   retrieval inside `open_context_from_portal()`.
 - all the relevant `Portal` invocation methods: `.result()`,
   `.run_from_ns()`, `.run()`; also allows for dropping `_unwrap_msg()`
    and `Portal._return_once()` outright Bo
- rename `Context._recv_chan` -> `._rx_chan`.
- add detailed `Context._scope` info for logging whether or not it's
  cancelled inside `_maybe_cancel_and_set_remote_error()`.
- move `._context._drain_to_final_msg()` -> `._ops.drain_to_final_msg()`
  since it's really not necessarily ctx specific per se, and it does
  kinda fit with "msg operations" more abstractly ;)
2025-03-24 14:04:51 -04:00
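
Very rough shape of the `PldRx` shim: pair the RPC feeder mem-chan with a payload decoder so every IPC primitive shares one "receive msg, decode `.pld`, yield it up" path. The `MsgDec` interface and the chan-passing style below are simplifying assumptions:

```python
import trio


class PldRx:
    def __init__(
        self,
        msgdec,  # a `MsgDec`: override-able `MsgType.pld` validation
    ) -> None:
        self._msgdec = msgdec

    def dec_msg(self, msg):
        # the decode message method: validate the payload field,
        # raising a `MsgTypeError` on spec violations
        return self._msgdec.decode(msg.pld)

    async def recv_msg_w_pld(self, rx_chan: trio.MemoryReceiveChannel):
        # deliver both the shuttling msg and its decoded `.pld`;
        # handy when draining a `MsgStream`
        msg = await rx_chan.receive()
        return msg, self.dec_msg(msg)

    async def recv_pld(self, rx_chan: trio.MemoryReceiveChannel):
        # the e2e operation of receiving one payload item
        _msg, pld = await self.recv_msg_w_pld(rx_chan)
        return pld
```
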
Tyler Goodlet dc31f0dac9 Use `DebugStatus` around subactor lock requests
Breaks out all the (sub)actor local conc primitives from `Lock` (which
is now only used in and by the root actor) such that there's an explicit
distinction between a task that's "consuming" the `Lock` (remotely) vs.
the root-side service tasks which do the actual acquire on behalf of the
requesters.

`DebugStatus` changeover deats:
------ - ------
- move all the actor-local vars over to `DebugStatus` including:
  - move `_trio_handler` and `_orig_sigint_handler`
  - `local_task_in_debug` now `repl_task`
  - `_debugger_request_cs` now `req_cs`
  - `local_pdb_complete` now `repl_release`
- drop all ^ fields from `Lock.repr()` obvi..
- move over the `.[un]shield_sigint()` and
  `.is_main_trio_thread()` methods.
- add some new attrs/meths:
  - `DebugStatus.repl` for the currently running `Pdb` in-actor
    singleton.
  - `.repr()` for pprint of state (like `Lock`).
- Note: that even when a root-actor task is in REPL, the `DebugStatus`
  is still used for certain actor-local state mgmt, such as SIGINT
  handler shielding.
- obvi change all lock-requester code bits to now use a `DebugStatus` in
  their local actor-state instead of `Lock`, i.e. change usage from
  `Lock` in `._runtime` and `._root`.
- use the new `Lock.get_locking_task_cs()` API when checking for
  sub-in-debug from `._runtime.Actor._stream_handler()`.

Unrelated to topic-at-hand tweaks:
------ - ------
- drop the commented bits about hiding `@[a]cm` stack frames from
  `_debug.pause()` and simplify to only one block with the `shield`
  passthrough since we already solved the issue with cancel-scopes using
  `@pdbp.hideframe` B)
  - this includes all the extra logging about the extra frame for the
    user (good thing i put in that wasted effort back then eh..)
- put the `try/except BaseException` with `log.exception()` around the
  whole of `._pause()` to ensure we don't miss in-func errors which can
  cause hangs..
- allow passing in `portal: Portal` to
  `Actor.start_remote_task()` such that `Portal` task spawning methods
  are always denoted correctly in terms of `Context.side`.
- lotsa logging tweaks, decreasing a bit of noise from `.runtime()`s.
2025-03-24 14:04:51 -04:00
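
The actor-local container roughly looks like the sketch below per the rename list above; the types and classmethod style are assumptions:

```python
import trio


class DebugStatus:
    # in-actor `Pdb` singleton currently running (if any)
    repl = None
    repl_task = None                            # was `local_task_in_debug`
    req_cs: trio.CancelScope | None = None      # was `_debugger_request_cs`
    repl_release: trio.Event | None = None      # was `local_pdb_complete`

    @classmethod
    def repr(cls) -> str:
        # pprint of state, mirroring what `Lock.repr()` offered
        return (
            f'DebugStatus(repl={cls.repl!r}, repl_task={cls.repl_task!r})'
        )
```
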
Tyler Goodlet 648695a325 Start tidying up `._context`, use `pack_from_raise()`
Mostly removing commented (and replaced) code blocks lingering from the
ctxc semantics work and new typed-msg-spec `MsgType`s handling AND use
the new `._exceptions.pack_from_raise()` helper to construct
`StreamOverrun` msgs.

Deaterz:
- clean out the drain loop now that it's implemented to handle our
  struct msg types including the `dict`-msg bits left in as
  fallback-reminders, any notes/todos better summarized at the top of
  their blocks, remove any `_final_result_is_set()` related duplicate/legacy
  tidbits.
- use a `case Error()` block in drain loop with fallthrough to `_:`
  always resulting in an rte raise.
- move "XXX" notes into the doc-string for `._deliver_msg()` as
  a "rules" section.
- use `match:` syntax for logging the `result_or_err: MsgType` outcome
  from the final `.result()` call inside `open_context_from_portal()`.
- generally speaking use `MsgType` type annotations throughout!
2025-03-24 14:04:51 -04:00
Tyler Goodlet fb94ecd729 Rename `Actor._push_result()` -> `._deliver_ctx_payload()`
Better describes the internal RPC impl/latest-architecture with the msgs
delivered being those which either define a `.pld: PayloadT` that gets
passed up to user code, or the error-msg subset that similarly is raised
in a ctx-linked task.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 8ac9ccf65d Finally drop masked `chan.send(None)` related code blocks 2025-03-24 14:04:51 -04:00
Tyler Goodlet 939f198dd9 Drop `None`-sentinel cancels RPC loop mechanism
Pretty sure we haven't *needed it* for a while, it was always generally
hazardous in terms of IPC msg types, AND it's definitely incompatible
with a dynamically applied typed msg spec: you can't just expect
a `None` to be willy nilly handled all the time XD

For now I'm masking out all the code and leaving very detailed
surrounding notes but am not removing it quite yet in case for some
strange reason it is needed by some edge case (though I haven't found
one according to the test suite).

Backstory:
------ - ------
Originally (i'm pretty sure anyway) it was added as a super naive
"remote cancellation" mechanism (back before there were specific `Actor`
methods for such things) that was mostly (only?) used before IPC
`Channel` closures to "more gracefully cancel" the connection's parented
RPC tasks. Since we now have explicit runtime-RPC endpoints for
conducting remote cancellation of both tasks and full actors, it should
really be removed anyway, because:
- a `None`-msg sentinel is inconsistent with other RPC endpoint handling
  input patterns which (even prior to typed msging) had specific
  msg-value triggers.
- the IPC endpoint's (block) implementation should use
  `Actor.cancel_rpc_tasks(parent_chan=chan)` instead of a manual loop
  through an `Actor._rpc_tasks.copy()`..

Deats:
- mask the `Channel.send(None)` calls from both the `Actor._stream_handler()` tail
  as well as from the `was_connected` block in `._portal.open_portal()`.
- mask the msg loop endpoint block and toss in lotsa notes.

Unrelated tweaks:
- drop `Actor._debug_mode`; unused.
- make `Actor.cancel_server()` return a `bool`.
- use `.msg.pretty_struct.Struct.pformat()` to show any msg that is
  ignored (bc invalid) in `._push_result()`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 09eed9d7e1 WIP porting runtime to use `Msg`-spec 2025-03-24 14:04:51 -04:00
Tyler Goodlet 4621c8c1b9 Change all `| None` -> `|None` in `._runtime` 2025-03-20 22:37:51 -04:00
Tyler Goodlet 9082efbe68 Add a `._state._runtime_vars['_registry_addrs']`
Such that it's set to whatever `Actor.reg_addrs: list[tuple]` is during
the actor's init-after-spawn guaranteeing each actor has at least the
registry infos from its parent. Ensure we read this if defined over
`_root._default_lo_addrs` in `._discovery` routines, namely
`.find_actor()` since it's the one API normally used without expecting
the runtime's `current_actor()` to be up.

Update the latest inter-peer cancellation test to use the `reg_addr`
fixture (and thus test this new runtime-vars value via `find_actor()`
usage) since it was failing if run *after* the infected `asyncio` suite
due to registry contact failure.
2025-03-20 19:50:31 -04:00
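
The fallback order for `.find_actor()` amounts to the below sketch; the helper name and the default addr value are placeholders:

```python
_runtime_vars: dict = {
    # set to the parent's `Actor.reg_addrs` during init-after-spawn
    '_registry_addrs': [],
}

# lib defaults from `._root` (value here is a placeholder)
_default_lo_addrs: list[tuple[str, int]] = [('127.0.0.1', 1616)]


def resolve_registry_addrs() -> list[tuple[str, int]]:
    # prefer the parent-relayed runtime-vars value if defined,
    # else fall back to the lib defaults
    return _runtime_vars['_registry_addrs'] or _default_lo_addrs
```
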
Tyler Goodlet dbd79d8beb Log chan-server-startup failures via `.exception()` 2025-03-20 19:50:31 -04:00
Tyler Goodlet 51bd38976f Expose per-actor registry addrs via `.reg_addrs`
Since it's handy to be able to debug the *writing* of this instance var
(particularly when checking state passed down to a child in
`Actor._from_parent()`), rename and wrap the underlying
`Actor._reg_addrs` as a settable `@property` and add validation to
the `.setter` for sanity - actor discovery is a critical functionality.

Other tweaks:
- fix `.cancel_soon()` to pass expected argument..
- update internal runtime error message to be simpler and link to GH issues.
- use new `Actor.reg_addrs` throughout core.
2025-03-20 19:50:31 -04:00
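
Sketch of the settable `@property` wrap; the exact validation behavior is an assumption (the commit only says it's there "for sanity"):

```python
import logging

log = logging.getLogger(__name__)


class Actor:
    def __init__(self) -> None:
        self._reg_addrs: list[tuple[str, int]] = []

    @property
    def reg_addrs(self) -> list[tuple[str, int]]:
        return self._reg_addrs

    @reg_addrs.setter
    def reg_addrs(self, addrs: list[tuple[str, int]]) -> None:
        # sanity check: actor discovery is critical functionality
        if not addrs:
            log.warning(f'No registry addresses provided?\n{addrs}')
            return
        self._reg_addrs = addrs
```
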
Tyler Goodlet 7246749137 Add post-mortem catch around failed transport addr binds to aid with runtime debugging 2025-03-20 19:50:31 -04:00
Tyler Goodlet 7ce4bc489e Init-support for "multi homed" transports
Since we'd like to eventually allow a diverse set of transport
(protocol) methods and stacks, and a multi-peer discovery system for
distributed actor-tree applications, this reworks all runtime internals
to support multi-homing for any given tree on a logical host. In other
words any actor can now bind its transport server (currently only
unsecured TCP + `msgspec`) to more than one address available in its
(linux) network namespace. Further, registry actors (now dubbed
"registars" instead of "arbiters") can also similarly bind to multiple
network addresses and provide discovery services to remote actors via
multiple addresses which can now be provided at runtime startup.

Deats:
- adjust `._runtime` internals to use a `list[tuple[str, int]]` (and
  thus pluralized) socket address sequence where applicable for transport
  server socket binds, now exposed via `Actor.accept_addrs`:
  - `Actor.__init__()` now takes a `registry_addrs: list`.
  - `Actor.is_arbiter` -> `.is_registrar`.
  - `._arb_addr` -> `._reg_addrs: list[tuple]`.
  - always reg and de-reg from all registrars in `async_main()`.
  - only set the global runtime var `'_root_mailbox'` to the loopback
    address since normally all in-tree processes should have access to
    it, right?
  - `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]`
- make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]`
  and defaults when not passed.
- change `ActorNursery.start_..()` methods take `bind_addrs: list` and
  pass down through the spawning layer(s) via the parent-seed-msg.
- generalize all `._discovery()` APIs to accept `registry_addrs`-like
  inputs and move all relevant subsystems to adopt the "registry" style
  naming instead of "arbiter":
  - make `find_actor()` support batched concurrent portal queries over
    all provided input addresses using `.trionics.gather_contexts()` Bo
  - syntax: move to using `async with <tuples>` 3.9+ style chained
    @acms.
  - a general modernization of the code to a python 3.9+ style.
  - start deprecation and change to "registry" naming / semantics:
    - `._discovery.get_arbiter()` -> `.get_registry()`
2025-03-20 19:50:31 -04:00
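
The batched-query style for `find_actor()` roughly follows this sketch using `tractor.trionics.gather_contexts()`; `query_registrar()` is a hypothetical per-address @acm stand-in:

```python
from contextlib import asynccontextmanager as acm

from tractor.trionics import gather_contexts


@acm
async def query_registrar(addr: tuple[str, int]):
    # hypothetical: portal-connect to the registrar at `addr` and
    # yield the target actor's sockaddr (or `None` if unknown)
    yield None


@acm
async def find_actor(name: str, registry_addrs: list[tuple[str, int]]):
    # batched concurrent portal queries over all input addresses Bo
    async with gather_contexts(
        [query_registrar(addr) for addr in registry_addrs],
    ) as results:
        # first non-null hit wins for the `only_first == True` case
        yield next((r for r in results if r is not None), None)
```
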
Tyler Goodlet b00ba158f1 Kick off `.devx` subpkg for our dev tools B)
Where `.devx` is "developer experience", a hopefully broad enough subpkg
name for all the slick stuff planned to augment working on the actor
runtime 💥

Move the `._debug` module into the new subpkg and adjust rest of core
code base to reflect import path change. Also add a new
`.devx._debug.open_crash_handler()` manager for wrapping any sync code
outside a `trio.run()` which is handy for eventual CLI addons for
popular frameworks like `click`/`typer`.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 1deed8dbee ._runtime: log level tweaks, use crit for stale debug lock detection 2025-03-20 15:07:27 -04:00
Tyler Goodlet f0417d802b First proto: use `greenback` for sync func breakpointing
This works now for supporting a new `tractor.pause_from_sync()`
`tractor`-aware-replacement for `Pdb.set_trace()` from sync functions
which are also scheduled from our runtime. Uses `greenback` to do all
the magic of scheduling the bg `tractor._debug._pause()` task and
engaging the normal TTY locking machinery triggered by `await
tractor.breakpoint()`

Further this starts some public API renaming, making a switch to
`tractor.pause()` from `.breakpoint()` which IMO much better expresses
the semantics of the runtime intervention required to suffice
multi-process "breakpointing"; it also is an alternate name for the same
in computer science more generally: https://en.wikipedia.org/wiki/Breakpoint
It also avoids using the same name as the `breakpoint()` built-in which
is important since there **is a lot more going on** when you call our
equivalent API.

Deats of that:
- add deprecation warning for `tractor.breakpoint()`
- add `tractor.pause()` and a shorthand, easier-to-type, alias `.pp()`
  for "pause-point" B)
- add `pause_from_sync()` as the new `breakpoint()`-from-sync-function
  hack which does all the `greenback` stuff for the user.

Still TODO:
- figure out where in the runtime and when to call
  `greenback.ensure_portal()`.
- fix the frame selection issue where
  `trio._core._ki._ki_protection_decorator:wrapper` seems to be always
  shown on REPL start as the selected frame..
2025-03-20 15:07:27 -04:00
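
The `greenback` hop at the heart of the hack reduces to the below; `greenback.await_()`/`.ensure_portal()` are the real lib APIs while `_pause()` here is a stand-in stub for `tractor._debug._pause()`:

```python
import greenback
import trio


async def _pause() -> None:
    # stand-in for `tractor._debug._pause()`: schedule the debug
    # request and engage the normal TTY-locking machinery
    await trio.lowlevel.checkpoint()


def pause_from_sync() -> None:
    # only valid in a task where `await greenback.ensure_portal()`
    # ran earlier (exactly where/when is still a listed TODO)
    greenback.await_(_pause())
```
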
Tyler Goodlet 11bab13a06 Various adjustments to fix breakage after rebase
- Remove `exceptiongroup` import,
- pin to py 3.11 in `setup.py`
- revert any lingering `tractor.devx` imports; sub-pkg is coming in
  a downstream PR!
- remove weird double `@property` lingering from conflict reso..
- modern `pytest` requires `conftest` mod imports to be relative.
2025-03-19 15:30:59 -04:00
Tyler Goodlet 9a8cd13894 Another cancel-req-invalid log msg fmt tweak 2025-03-16 16:06:26 -04:00
Tyler Goodlet a87df3009f Drop now-deprecated deps on modern `trio`/Python
- `trio_typing` is nearly obsolete since `trio >= 0.23`
- `exceptiongroup` is built-in to python 3.11
- `async_generator` primitives have lived in `contextlib` for quite
  a while!
2025-03-16 16:06:24 -04:00
Tyler Goodlet a5bc113fde Start a `._rpc` module
Since `._runtime` was getting pretty long (> 2k LOC) and much of the RPC
low-level machinery is fairly isolated to a handful of task-funcs, it
makes sense to re-org the RPC task scheduling and driving msg loop to
its own code space.

The move includes:
- `process_messages()` which is the main IPC business logic.
- `try_ship_error_to_remote()` helper, to box local errors for the wire.
- `_invoke()`, the core task scheduler entrypoint used in the msg loop.
- `_invoke_non_context()`, holds impls for non-`@context` task starts.
- `_errors_relayed_via_ipc()` which does all error catch-n-boxing for
   wire-msg shipment using `try_ship_error_to_remote()` internally.

Also inside `._runtime` improve some `Actor` methods docs.
2025-03-16 15:52:53 -04:00
Tyler Goodlet 544cb40533 Attempt at better internal traceback hiding
Previously i was trying to approach this using lots of
`__tracebackhide__`'s in various internal funcs but since it's not
exactly straightforward to do this inside core deps like `trio` and the
stdlib, it makes a bit more sense to optionally catch and re-raise
certain classes of errors from their originals using `raise from` syntax
as per:
https://docs.python.org/3/library/exceptions.html#exception-context

Deats:
- litter `._context` methods with `__tracebackhide__`/`hide_tb` which
  were previously being shown but that don't need to be to application
  code now that cancel semantics testing is finished up.
- i originally did the same but later commented it all out in `._ipc`
  since catching and re-raising instead in higher level layers
  (above the transport) seems to be a much saner approach.
- add catch-n-reraise-from in `MsgStream.send()`/.`receive()` to avoid
  seeing the depths of `trio` and/or our `._ipc` layers on comms errors.

Further this patch adds some refactoring to use the
same remote-error shipper routine from both the actor-core in the RPC
invoker:
- rename it as `try_ship_error_to_remote()` and call it from
  `._invoke()` as well as it's prior usage.
- make it optionally accept a `cid: str`, a `remote_descr: str` and of
  course a `hide_tb: bool`.

Other misc tweaks:
- add some todo notes around `Actor.load_modules()` debug hooking.
- tweak the zombie reaper log msg and timeout value ;)
2025-03-16 15:30:08 -04:00
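
A minimal illustration of the catch-n-reraise-from approach for `MsgStream.send()`/`.receive()`; `TransportClosed` below is a stand-in name for whatever low-level comms error actually gets boxed:

```python
class TransportClosed(Exception):
    # stand-in for the underlying `._ipc`/`trio` comms error
    ...


class MsgStream:
    def __init__(self, transport) -> None:
        self._transport = transport

    async def send(self, data) -> None:
        __tracebackhide__: bool = True  # hide this frame from app tbs
        try:
            await self._transport.send(data)
        except TransportClosed as src_err:
            # `raise from` (per the linked exception-context docs) so
            # app code never sees the depths of `trio`/`._ipc`
            raise ConnectionError('IPC transport is down?') from src_err
```
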
Tyler Goodlet cbaf4fc05b Add a open-ctx-with-self test
Found exactly why trying this won't work when playing around with
opening workspaces in `modden` using a `Portal.open_context()` back to
the 'bigd' root actor: the RPC machinery only registers one entry in
`Actor._contexts` which will get overwritten by each task's side and
then experience race-based IPC msging errors (eg. rxing `{'started': _}`
on the callee side..). Instead make opening a ctx back to the self-actor
a runtime error describing it as an invalid op.

To match:
- add a new test `test_ctx_with_self_actor()` to the context semantics
  suite.
- tried out adding a new `side: str` to the `Actor.get_context()` (and
  callers) but ran into not being able to determine the value from within
  `._push_result()` where it's needed to figure out which side to push
  to.. So, just leaving the commented arg (passing) in the runtime core
  for now in case we can come back to trying to make it work, tho i'm
  thinking it's not the right hack anyway XD
2025-03-16 15:19:51 -04:00
Tyler Goodlet 8e3a2a9297 Make `Actor._cancel_task(requesting_uid: tuple)` required arg 2025-03-16 14:01:50 -04:00