Namely inside various bootup-sequences in `._root` and `._runtime`
particularly in the root actor to support both better tpt-address
denoting in our logging and as part of clarifying logic around setting
the root's registry addresses which is soon to be much better factored
out of the core and into an explicit subsystem + API.
Some `_root.open_root_actor()` deats,
- set `registry_addrs` to a new `uw_reg_addrs` (uw: unwrapped) to be
more explicit about wrapped addr types thoughout.
- instead ensure `registry_addrs` are the wrapped types and pass down
into the root `Actor` singleton-instance.
- factor the root-actor check + rt-vars update (updating the `'_root_addrs'`)
out of `._runtime.async_main()` into this fn.
- as previous, set `trans_bind_addrs = uw_reg_addrs` in unwrapped form since it will
be passed down both through rt-vars as `'_root_addrs'` and to
`._runtim.async_main()` as `accept_addrs` (which is then passed to the
IPC server).
- adjust/simplify much logging.
- shield the `await actor.cancel(None) # self cancel` to avoid any
finally-footguns.
- as mentioned convert the
For `_runtime.async_main()` tweaks,
- expect `registry_addrs: list[Address]|None = None` with appropriate
unwrapping prior to setting both `.reg_addrs` and the equiv rt-var.
- add a new `.registry_addrs` prop for the wrapped form.
- convert a final loose-eg for the `service_nursery` to use
`collapse_eg()`.
- simplify teardown report logging.
Since we're gonna prolly start using it for serious..
- drop `back_from_op`.
- rename `tree_str` -> `text`
- move the huge comment-doc for "sclang" into the fn body.
- change all usage to reflect.
Actually applying the input it in the root as well as all sub-actors by
passing it down to sub-actors through runtime-vars as delivered by the
initial `SpawnSpec` msg during child runtime init.
Impl deats,
- add a new `_state._runtime_vars['_enable_tpts']: list[str]` field set
by the input param (if provided) to `.open_root_actor()`.
- mk `current_ipc_protos()` return the runtime-var entry with instead
the default in the `_runtime_vars: dict` set to `[_def_tpt_proto]`.
- in `.open_root_actor()`, still error on this being a >1 `list[str]`
until we have more testing infra/suites to audit multi-protos per
actor.
- return the new value (as 3rd element) from `Actor._from_parent()` as
per the todo note; means `_runtime.async_main()` will allocate
`accept_addrs` as tpt-specific `Address` entries and pass them to
`IPCServer.listen_on()`.
Also,
- also add a new `_state._runtime_vars['_root_addrs']: list = []` field
with the intent of fully replacing the `'_root_mailbox'` field since,
* it will need to be a collection to support multi-tpt,
* it's a more cohesive field name alongside `_registry_addrs`,
* the root actor of every tree needs to have a dedicated addr set
(separate from any host-singleton registry actor) so that all its
subs can contact it for capabilities mgmt including debugger
access/locking.
- in the root, populate the field in `._runtime.async_main()` and for
now just set '_root_mailbox' to the first entry in that list in
anticipation of future multi-homing/transport support.
Since it's for beg filtering, the current impl should be renamed anyway;
it's not just for filtering cancelled excs.
Deats,
- added a real doc string, links to official eg docs and fixed the
return typing.
- adjust all internal imports to match.
Opting for performance over broad multi-actor "debug-ability" from
sync-function-contexts when `debug_mode=True` is set;
IOW prefer no behind-the-scenes `greenlet` perf impact over being
able to use an actor-safe `breakpoint()` wherever as per,
https://greenback.readthedocs.io/en/latest/principle.html#performance
Adjust the breakpoint restore ex script to match.
Discovered this bug while testing `modden`'s daemon under various
cancelled-while-booting race conditions where sequential tests would
fail a lingering `assert 0` inside `.to_asyncio.run_as_asyncio_guest()`
to (oddly) catch redundant greenback-re-inits..
XD
Needs a test likely ;P
Such that the "global-ish" setting (actor-local) is managed with the
others per actor-process and type it as a `Literal['tcp', 'uds']` of the
currently support protocol keys.
Here obvi `_tpt` is some kinda shorthand for "transport" and `_proto` is
for "protocol" Bp
Change imports and refs in all dependent modules.
Oh right, and disable UDS in `wrap_address()` for the moment while
i figure out how to avoid the unwrapped type collision..
There was a very strange legacy test
`test_spawning.test_local_arbiter_subactor_global_state` which was
causing unforseen hangs/errors on the UDS tpt and looking deeper this
test was already doing root-actor things that should never have been
valid XD
So rework that test to properly demonstrate something of value
(i guess..) and add a new suite which start more rigorously auditing our
`open_root_actor()` permitted usage.
For the old test,
- since the main point of this test seemed to be the ability to invoke
the same function in both the parent and child actor (using the very
legacy `ActorNursery.run_in_actor()`.. due to be deprecated) rename it
to `test_run_in_actor_same_func_in_child`,
- don't re-enter `.open_root_actor()` since that's invalid usage (tested
in new suite see below),
- adjust some `spawn()` arg/var naming and ensure we only return in the
child.
For the new suite add tests for,
- ensuring the implicit `open_root_actor()` call under `open_nursery()`.
- double open of `open_root_actor()` from within the same process tree
both from a root and sub.
Intro some new `_exceptions` used in the new suite,
- a top level `RuntimeFailure` for generically expressing faults not of
our own doing that prevent successful operation; this is what we now
(changed in this commit) raise on attempts to open a 2nd root.
- mk `ActorFailure` derive from the former; it's already used from
`._spawn` when subprocs fail to boot.
Call it `maybe_block_bp()` can wrap the `open_root_actor()` body with
it. Main reason is to guarantee we can bp inside actor runtime bootup as
needed when debugging internals! Prolly should factor this to another
module tho?
ALSO, ensure we RTE on recurrent entries to `open_root_actor()` from
within an existing tree! There was actually `test_spawning` test somehow
getting away with this!? Should never be possible or allowed!
Previously whenever an `ActorNursery.start_actor()` call did not receive
a `bind_addrs` arg we would allocate the default `(localhost, 0)` pairs
in the parent, for UDS this obviously won't work nor is it ideal bc it's
nicer to have the actor to be a socket server (who calls
`Address.open_listener()`) define the socket-file-name containing their
unique ID info such as pid, actor-uuid etc.
As such this moves "random" generation of server addresses to the
child-side of a subactor's spawn-sequence when it's sin-`bind_addrs`;
i.e. we do the allocation of the `Address.get_random()` addrs inside
`._runtime.async_main()` instead of `Portal.start_actor()` and **only
when** `accept_addrs`/`bind_addrs` was **not provided by the spawning
parent**.
Further this patch get's way more rigorous about the `SpawnSpec`
processing in the child inside `Actor._from_parent()` such that we
handle any invalid msgs **very loudly and pedantically!**
Impl deats,
- do the "random addr generation" in an explicit `for` loop (instead of
prior comprehension) to allow for more detailed typing of the layered
calls to the new `._addr` mod.
- use a `match:/case:` for process any invalid `SpawnSpec` payload case
where we can instead receive a `MsgTypeError` from the `chan.recv()`
call in `Actor._from_parent()` to raise it immediately instead of
triggering downstream type-errors XD
|_ as per the big `#TODO` we prolly want to take from other callers
of `Channel.recv()` (like in the `._rpc.process_messages()` loop).
|_ always raise `InternalError` on non-match/fall-through case!
|_ add a note about not being able to use `breakpoint()` in this
section due to causality of `SpawnSpec._runtime_vars` not having
been processed yet..
|_ always return a third element from `._from_rent()` eventually to be
the `preferred_transports: list[str]` from the spawning rent.
- use new `._addr.mk_uuid()` and pass to new `Actor.__init__(uuid: str)`
for all actor creation (including in all the mods tweaked here).
- Move to new type-alias-name `UnwrappedAddress` throughout.
Such that it gets passed through to `.open_root_actor()` in the
`implicit_runtime==True` case - useful for debugging cases where
`.devx._debug` APIs might be used to avoid REPL clobbering in subactors.
Since it'll likely need a bit of detailing to get the test suite running
identically with strict egs (exception groups), i've opted to just flip
the switch on a few core nursery scopes for now until as such a time
i can focus enough to port the matching internals.. Xp
Such that you can use,
```python
tractor.to_asyncio.run_as_asyncio_guest(
trio_main=_trio_main,
)
```
to boostrap the root actor (and thus main parent process) to embed
the actor-rumtime into an `asyncio` loop. Prove it all works with an
subactor-free version of the aio echo-server test suite B)
By re-purposing our `pexpect`-based console matching with a new
`debugging/shield_hang_in_sub.py` example, this tests a few "hanging
actor" conditions more formally:
- that despite a hanging actor's task we can dump
a `stackscope.extract()` tree on relay of `SIGUSR1`.
- the actor tree will terminate despite a shielded forever-sleep by our
"T-800" zombie reaper machinery activating and hard killing the
underlying subprocess.
Some test deats:
- simulates the expect actions of a real user by manually using
`os.kill()` to send both signals to the actor-tree program.
- `pexpect`-matches against `log.devx()` emissions under normal
`debug_mode == True` usage.
- ensure we get the actual "T-800 deployed" `log.error()` msg and
that the actor tree eventually terminates!
Surrounding (re-org/impl/test-suite) changes:
- allow disabling usage via a `maybe_enable_greenback: bool` to
`open_root_actor()` but enable by def.
- pretty up the actual `.devx()` content from `.devx._stackscope`
including be extra pedantic about the conc-primitives for each signal
event.
- try to avoid double handles of `SIGUSR1` even though it seems the
original (what i thought was a) problem was actually just double
logging in the handler..
|_ avoid double applying the handler func via `signal.signal()`,
|_ use a global to avoid double handle func calls and,
|_ a `threading.RLock` around handling.
- move common fixtures and helper routines from `test_debugger` to
`tests/devx/conftest.py` and import them for use in both test mods.
Since we more or less require it for `tractor.pause_from_sync()` this
refines enable toggles and their relay down the actor tree as well as
more explicit logging around init and activation.
Tweaks summary:
- `.info()` report the module if discovered during root boot.
- use a `._state._runtime_vars['use_greenback']: bool` activation flag
inside `Actor._from_parent()` to determine if the sub should try to
use it and set to `False` if mod-loading fails / not installed.
- expose `maybe_init_greenback()` from `.devx` sugpkg.
- comment out RTE in `._pause()` for now since we already have it in
`.pause_from_sync()`.
- always `.exception()` on `maybe_init_greenback()` import errors to
clarify the underlying failure deats.
- always explicitly report if `._state._runtime_vars['use_greenback']`
was NOT set when `.pause_from_sync()` is called.
Other `._runtime.async_main()` adjustments:
- combine the "internal error call ur parents" message and the failed
registry contact status into one new `err_report: str`.
- drop the final exception handler's call to
`Actor.lifetime_stack.close()` since we're already doing it in the
`finally:` block and the earlier call has no currently known benefit.
- only report on the `.lifetime_stack()` callbacks if any are detected
as registered.
From both `open_root_actor()` and `._entry._trio_main()`.
Other `breakpoint()`-from-sync-func fixes:
- properly disable the default hook using `"0"` XD
- offer a `hide_tb: bool` from `open_root_actor()`.
- disable hiding the `._trio_main()` frame, bc pretty sure it doesn't
help anyone (either way) when REPL-ing/tb-ing from a subactor..?
- start tossing in `__tracebackhide__`s to various eps which don't need
to show in tbs or in the pdb REPL.
- port final `._maybe_enter_pm()` to pass a `api_frame`.
- start comment-marking up some API eps with `@api_frame`
in prep for actually using the new frame-stack tracing.
Breaks out all the (sub)actor local conc primitives from `Lock` (which
is now only used in and by the root actor) such that there's an explicit
distinction between a task that's "consuming" the `Lock` (remotely) vs.
the root-side service tasks which do the actual acquire on behalf of the
requesters.
`DebugStatus` changeover deats:
------ - ------
- move all the actor-local vars over `DebugStatus` including:
- move `_trio_handler` and `_orig_sigint_handler`
- `local_task_in_debug` now `repl_task`
- `_debugger_request_cs` now `req_cs`
- `local_pdb_complete` now `repl_release`
- drop all ^ fields from `Lock.repr()` obvi..
- move over the `.[un]shield_sigint()` and
`.is_main_trio_thread()` methods.
- add some new attrs/meths:
- `DebugStatus.repl` for the currently running `Pdb` in-actor
singleton.
- `.repr()` for pprint of state (like `Lock`).
- Note: that even when a root-actor task is in REPL, the `DebugStatus`
is still used for certain actor-local state mgmt, such as SIGINT
handler shielding.
- obvi change all lock-requester code bits to now use a `DebugStatus` in
their local actor-state instead of `Lock`, i.e. change usage from
`Lock` in `._runtime` and `._root`.
- use new `Lock.get_locking_task_cs()` API in when checking for
sub-in-debug from `._runtime.Actor._stream_handler()`.
Unrelated to topic-at-hand tweaks:
------ - ------
- drop the commented bits about hiding `@[a]cm` stack frames from
`_debug.pause()` and simplify to only one block with the `shield`
passthrough since we already solved the issue with cancel-scopes using
`@pdbp.hideframe` B)
- this includes all the extra logging about the extra frame for the
user (good thing i put in that wasted effort back then eh..)
- put the `try/except BaseException` with `log.exception()` around the
whole of `._pause()` to ensure we don't miss in-func errors which can
cause hangs..
- allow passing in `portal: Portal` to
`Actor.start_remote_task()` such that `Portal` task spawning methods
are always denoted correctly in terms of `Context.side`.
- lotsa logging tweaks, decreasing a bit of noise from `.runtime()`s.
Such that the top level `maybe_enable_greenback` from
`open_root_actor()` can toggle the entire actor tree's usage.
Read the rtv in `._rpc` tasks and only enable if set.
Also, rigor up the `._rpc.process_messages()` loop to handle `Error()`
and `case _:` separately such that we now raise an explicit rte for
unknown / invalid msgs. Use "parent" / "child" for side descriptions in
loop comments and put a fat comment before the `StartAck` in `_invoke()`.
Not sure if this is a good tactic (yet) but it at least covers us from
getting user's confused by `breakpoint()` usage causing REPL clobbering.
Always set an explicit rte raising breakpoint hook such that the user
realizes they can't use `.pause_from_sync()` without enabling debug
mode.
** CHERRY-PICK into `pause_from_sync_w_greenback` branch! **
Now supports use from any `trio` task, any sync thread started with
`trio.to_thread.run_sync()` AND also via `breakpoint()` builtin API!
The only bit missing now is support for `asyncio` tasks when in infected
mode.. Bo
`greenback` setup/API adjustments:
- move `._rpc.maybe_import_gb()` to -> `devx._debug` and factor out the cached
import checking into a sync func whilst placing the async `.ensure_portal()`
bootstrapping into a new async `maybe_init_greenback()`.
- use the new init-er func inside `open_root_actor()` with the output
predicating whether we override the `breakpoint()` hook.
core `devx._debug` implementation deatz:
- make `mk_mpdb()` only return the `pdp.Pdb` subtype instance since
the sigint unshielding func is now accessible from the `Lock`
singleton from anywhere.
- add non-main thread support (at least for `trio.to_thread` use cases)
to our `Lock` with a new `.is_trio_thread()` predicate that delegates
directly to `trio`'s internal version.
- do `Lock.is_trio_thread()` checks inside any methods which require
special provisions when invoked from a non-main `trio` thread:
- `.[un]shield_sigint()` methods since `signal.signal` usage is only
allowed from cpython's main thread.
- `.release()` since `trio.StrictFIFOLock` can only be called from
a `trio` task.
- rework `.pause_from_sync()` itself to directly call `._set_trace()`
and don't bother with `greenback._await()` when we're already calling
it from a `.to_thread.run_sync()` thread, oh and try to use the
thread/task name when setting `Lock.local_task_in_debug`.
- make it an RTE for now if you try to use `.pause_from_sync()` from any
infected-`asyncio` task, but support is (hopefully) coming soon!
For testing we add a new `test_debugger.py::test_pause_from_sync()`
which includes a ctrl-c parametrization around the
`examples/debugging/sync_bp.py` script which includes all currently
supported/working usages:
- `tractor.pause_from_sync()`.
- via `breakpoint()` overload.
- from a `trio.to_thread.run_sync()` spawn.
Since we'd like to eventually allow a diverse set of transport
(protocol) methods and stacks, and a multi-peer discovery system for
distributed actor-tree applications, this reworks all runtime internals
to support multi-homing for any given tree on a logical host. In other
words any actor can now bind its transport server (currently only
unsecured TCP + `msgspec`) to more then one address available in its
(linux) network namespace. Further, registry actors (now dubbed
"registars" instead of "arbiters") can also similarly bind to multiple
network addresses and provide discovery services to remote actors via
multiple addresses which can now be provided at runtime startup.
Deats:
- adjust `._runtime` internals to use a `list[tuple[str, int]]` (and
thus pluralized) socket address sequence where applicable for transport
server socket binds, now exposed via `Actor.accept_addrs`:
- `Actor.__init__()` now takes a `registry_addrs: list`.
- `Actor.is_arbiter` -> `.is_registrar`.
- `._arb_addr` -> `._reg_addrs: list[tuple]`.
- always reg and de-reg from all registrars in `async_main()`.
- only set the global runtime var `'_root_mailbox'` to the loopback
address since normally all in-tree processes should have access to
it, right?
- `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]`
- make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]`
and defaults when not passed.
- change `ActorNursery.start_..()` methods take `bind_addrs: list` and
pass down through the spawning layer(s) via the parent-seed-msg.
- generalize all `._discovery()` APIs to accept `registry_addrs`-like
inputs and move all relevant subsystems to adopt the "registry" style
naming instead of "arbiter":
- make `find_actor()` support batched concurrent portal queries over
all provided input addresses using `.trionics.gather_contexts()` Bo
- syntax: move to using `async with <tuples>` 3.9+ style chained
@acms.
- a general modernization of the code to a python 3.9+ style.
- start deprecation and change to "registry" naming / semantics:
- `._discovery.get_arbiter()` -> `.get_registry()`
If `stackscope` is importable and debug_mode is enabled then we by
default call and report `.devx.enable_stack_on_sig()` is set B)
This makes debugging unexpected (SIGINT ignoring) hangs a cinch!
Where `.devx` is "developer experience", a hopefully broad enough subpkg
name for all the slick stuff planned to augment working on the actor
runtime 💥
Move the `._debug` module into the new subpkg and adjust rest of core
code base to reflect import path change. Also add a new
`.devx._debug.open_crash_handler()` manager for wrapping any sync code
outside a `trio.run()` which is handy for eventual CLI addons for
popular frameworks like `click`/`typer`.