As per a WIP scribbled out TODO in `._entry.nest_from_op()`, change
a bunch of "supervisor/lifetime mgmt ops" related log messages to
contain some supervisor-annotation "headers" in an effort to give
a terser "visual indication" of how some execution/scope/storage
primitive entity (like an actor/task/ctx/connection) is being operated
on (like, opening/started/closed/cancelled/erroring) from a "supervisor
action" POV.
Also tweak a bunch more emissions to lower levels to reduce noise around
normal inter-actor operations like process and IPC ctx supervision.
This change is adding commentary about the upcoming API removal and
simplification of nursery + portal internals; no actual code changes are
included.
The plan to (re)move the old RPC methods:
- `ActorNursery.run_in_actor()`
- `Portal.run()`
- `Portal.run_from_ns()`
and any related impl internals out of each conc-primitive and instead
into something like a `.hilevel.rpc` set of APIs which then are all
implemented using the newer and more lowlevel `Context`/`MsgStream`
primitives instead Bo
Further,
- formally deprecate the `Portal.result()` meth for
`.wait_for_result()`.
- only `log.info()` about runtime shutdown in the implicit root case.
And IFF the `await wait_func(proc)` is cancelled such that we avoid
clobbering some subactor that might be REPL-ing even though its parent
actor is in the midst of (gracefully) cancelling it.
- start tossing in `__tracebackhide__`s to various eps which don't need
to show in tbs or in the pdb REPL.
- port final `._maybe_enter_pm()` to pass a `api_frame`.
- start comment-marking up some API eps with `@api_frame`
in prep for actually using the new frame-stack tracing.
Since we'd like to eventually allow a diverse set of transport
(protocol) methods and stacks, and a multi-peer discovery system for
distributed actor-tree applications, this reworks all runtime internals
to support multi-homing for any given tree on a logical host. In other
words any actor can now bind its transport server (currently only
unsecured TCP + `msgspec`) to more then one address available in its
(linux) network namespace. Further, registry actors (now dubbed
"registars" instead of "arbiters") can also similarly bind to multiple
network addresses and provide discovery services to remote actors via
multiple addresses which can now be provided at runtime startup.
Deats:
- adjust `._runtime` internals to use a `list[tuple[str, int]]` (and
thus pluralized) socket address sequence where applicable for transport
server socket binds, now exposed via `Actor.accept_addrs`:
- `Actor.__init__()` now takes a `registry_addrs: list`.
- `Actor.is_arbiter` -> `.is_registrar`.
- `._arb_addr` -> `._reg_addrs: list[tuple]`.
- always reg and de-reg from all registrars in `async_main()`.
- only set the global runtime var `'_root_mailbox'` to the loopback
address since normally all in-tree processes should have access to
it, right?
- `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]`
- make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]`
and defaults when not passed.
- change `ActorNursery.start_..()` methods take `bind_addrs: list` and
pass down through the spawning layer(s) via the parent-seed-msg.
- generalize all `._discovery()` APIs to accept `registry_addrs`-like
inputs and move all relevant subsystems to adopt the "registry" style
naming instead of "arbiter":
- make `find_actor()` support batched concurrent portal queries over
all provided input addresses using `.trionics.gather_contexts()` Bo
- syntax: move to using `async with <tuples>` 3.9+ style chained
@acms.
- a general modernization of the code to a python 3.9+ style.
- start deprecation and change to "registry" naming / semantics:
- `._discovery.get_arbiter()` -> `.get_registry()`
Allow callers to stick in a header to the `.pdb()` level emitted msg(s)
such that any "waiting status" content is only shown if the caller
actually get's blocked waiting for the debug lock; use it inside the
`._spawn` sub-process reaper call.
Also, return early if `Lock.global_actor_in_debug == None` and thus
only enter the poll loop when actually needed, consequently raise
if we fall through the loop without acquisition.
Where `.devx` is "developer experience", a hopefully broad enough subpkg
name for all the slick stuff planned to augment working on the actor
runtime 💥
Move the `._debug` module into the new subpkg and adjust rest of core
code base to reflect import path change. Also add a new
`.devx._debug.open_crash_handler()` manager for wrapping any sync code
outside a `trio.run()` which is handy for eventual CLI addons for
popular frameworks like `click`/`typer`.
- `trio_typing` is nearly obsolete since `trio >= 0.23`
- `exceptiongroup` is built-in to python 3.11
- `async_generator` primitives have lived in `contextlib` for quite
a while!
Previously i was trying to approach this using lots of
`__tracebackhide__`'s in various internal funcs but since it's not
exactly straight forward to do this inside core deps like `trio` and the
stdlib, it makes a bit more sense to optionally catch and re-raise
certain classes of errors from their originals using `raise from` syntax
as per:
https://docs.python.org/3/library/exceptions.html#exception-context
Deats:
- litter `._context` methods with `__tracebackhide__`/`hide_tb` which
were previously being shown but that don't need to be to application
code now that cancel semantics testing is finished up.
- i originally did the same but later commented it all out in `._ipc`
since error catch and re-raise instead in higher level layers
(above the transport) seems to be a much saner approach.
- add catch-n-reraise-from in `MsgStream.send()`/.`receive()` to avoid
seeing the depths of `trio` and/or our `._ipc` layers on comms errors.
Further this patch adds some refactoring to use the
same remote-error shipper routine from both the actor-core in the RPC
invoker:
- rename it as `try_ship_error_to_remote()` and call it from
`._invoke()` as well as it's prior usage.
- make it optionally accept `cid: str` a `remote_descr: str` and of
course a `hide_tb: bool`.
Other misc tweaks:
- add some todo notes around `Actor.load_modules()` debug hooking.
- tweak the zombie reaper log msg and timeout value ;)
- rename `.soft_wait()` -> `.soft_kill()`
- rename `.do_hard_kill()` -> `.hard_kill()`
- adjust any `trio.Process.__repr__()` log msg contents to have the
little tree branch prefix: `'|_'`
Given i just similarly revamped a buncha `._runtime` log msg formatting,
might as well do something similar inside the spawning machinery such
that groking teardown sequences of each supervising task is much more
sane XD
Mostly this includes doing similar `'<field>: <value>\n'` multi-line
formatting when reporting various subproc supervision steps as well as
showing a detailed `trio.Process.__repr__()` as appropriate.
Also adds a detailed #TODO according to the needs of #320 for which
we're going to need some internal mechanism for intermediary parent
actors to determine if a given debug tty locker (sub-actor) is one of
*their* (transitive) children and thus stall the normal
cancellation/teardown sequence until that locker is complete.
Instead of the logic branching create a table `._spawn._methods`
which is used to lookup the desired backend framework (in this case
still only one of `multiprocessing` or `trio`) and make the top level
`.new_proc()` do the lookup and any common logic. Use a `typing.Literal`
to define the lookup table's key set.
Repair and ignore a bunch of type-annot related stuff todo with `mypy`
updates and backend-specific process typing.
Adjust the `soft_wait()` strategy to avoid sending needless cancel
requests if it is known that a child process is already terminated or
does so before the cancel request times out. This should be no slower
and should avoid needless waits on either closure-in-progress or already
closed channels.
Basic strategy is,
- request child actor to cancel
- if process termination is detected, cancel the cancel
- if the process is still alive after a cancel request timeout warn the
user and yield back to the hard reap handling
This commit obviously denotes a re-license of all applicable parts of
the code base. Acknowledgement of this change was completed in #274 by
the majority of the current set of contributors. From here henceforth
all changes will be AGPL licensed and distributed. This is purely an
effort to maintain the same copy-left policy whilst closing the
(perceived) SaaS loophole the GPL allows for. It is merely for this
loophole: to avoid code hiding by any potential "network providers" who
are attempting to use the project to make a profit without either
compensating the authors or re-distributing their changes.
I thought quite a bit about this change and can't see a reason not to
close the SaaS loophole in our current license. We still are (hard)
copy-left and I plan to keep the code base this way for a couple
reasons:
- The code base produces income/profit through parent projects and is
demonstrably of high value.
- I believe firms should not get free lunch for the sake of
"contributions from their employees" or "usage as a service" which
I have found to be a dubious argument at best.
- If a firm who intends to profit from the code base wants to use it
they can propose a secondary commercial license to purchase with the
proceeds going to the project's authors under some form of well
defined contract.
- Many successful projects like Qt use this model; I see no reason it
can't work in this case until such a time as the authors feel it
should be loosened.
There has been detailed discussion in #103 on licensing alternatives.
The main point of this AGPL change is to protect the code base for the
time being from exploitation while it grows and as we move into the next
phase of development which will include extension into the multi-host
distributed software space.
We don't need to any more presuming you get ideal remote cancellation
conditions where the remote actor should teardown and kill the streams
from its end.
It's definitely possible to have a nursery spawn task be cancelled
before a `trio.Process` handle is ever returned; we now handle this
case as a cancelled-during-spawn scenario. Zombie collection logic
also is bypassed in this case.