tractor/tractor/_runtime.py

# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
The fundamental core machinery implementing every "actor"
including the process-local, or "python-interpreter (aka global)
singleton", `Actor` primitive(s) and its internal `trio` machinery
implementing the low level runtime system supporting the
discovery, communication, spawning, supervision and cancellation
of other actors in a hierarchical process tree.

The runtime's main entry point: `async_main()` opens the top level
supervision and service `trio.Nursery`s which manage the tasks
responsible for running all lower level spawning, supervision and
msging layers:

- lowlevel transport-protocol init and persistent connectivity on
  top of `._ipc` primitives; the transport layer.
- bootstrapping of connection/runtime config from the spawning
  parent (actor).
- starting and supervising IPC-channel msg processing loops around
  transport connections from parent/peer actors in order to deliver
  SC-transitive RPC via scheduling of `trio` tasks.
- registration of newly spawned actors with the discovery sys.

'''
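# NOTE: an orientation-only sketch (not part of this module's impl) of how
# the runtime below is normally entered: end users call the public
# `tractor.open_root_actor()`/`tractor.open_nursery()` APIs which in turn
# drive `async_main()`; the spawn target `some_fn` is hypothetical.
#
#   import trio
#   import tractor
#
#   async def some_fn() -> str:
#       return 'hello from a subactor'
#
#   async def main():
#       async with tractor.open_nursery() as an:
#           portal = await an.run_in_actor(some_fn)
#           print(await portal.result())
#
#   trio.run(main)
#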
from __future__ import annotations
from contextlib import (
    ExitStack,
)
from functools import partial
import importlib
import importlib.util
import os
from pathlib import Path
from pprint import pformat
import signal
import sys
from typing import (
    Any,
    Callable,
    Type,
    TYPE_CHECKING,
)
import uuid
from types import ModuleType
import warnings
import trio
from trio._core import _run as trio_runtime
from trio import (
    CancelScope,
    Nursery,
    TaskStatus,
)
from tractor.msg import (
    MsgType,
    NamespacePath,
    Stop,
    pretty_struct,
    types as msgtypes,
)
from .ipc import (
    Channel,
    # IPCServer,  # causes cycles atm..
    _server,
)
from ._addr import (
    UnwrappedAddress,
    Address,
    # default_lo_addrs,
    get_address_cls,
    wrap_address,
)
from ._context import (
    mk_context,
    Context,
)
from .log import get_logger
from ._exceptions import (
    ContextCancelled,
    InternalError,
    ModuleNotExposed,
    MsgTypeError,
    unpack_error,
)
from .devx import debug
from ._discovery import get_registry
from ._portal import Portal
from . import _state
from . import _mp_fixup_main
from . import _rpc
if TYPE_CHECKING:
    from ._supervise import ActorNursery
    from trio._channel import MemoryChannelState
log = get_logger('tractor')

def _get_mod_abspath(module: ModuleType) -> Path:
    return Path(module.__file__).absolute()

def get_mod_nsps2fps(mod_ns_paths: list[str]) -> dict[str, str]:
    '''
    Deliver a table of py module namespace-path-`str`s mapped to
    their "physical" `.py` file paths in the file-sys.

    '''
    nsp2fp: dict[str, str] = {}
    for nsp in mod_ns_paths:
        mod: ModuleType = importlib.import_module(nsp)
        nsp2fp[nsp] = str(_get_mod_abspath(mod))

    return nsp2fp
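
# For example (a hypothetical call; the emitted path depends on the local
# install location):
#
#   >>> get_mod_nsps2fps(['tractor.msg'])
#   {'tractor.msg': '/.../tractor/msg/__init__.py'}
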
class Actor:
    '''
    The fundamental "runtime" concurrency primitive.

    An "actor" is the combination of a regular Python process
    executing a `trio.run()` task tree, communicating with other
    "actors" through "memory boundary portals": `Portal`, which
    provide a high-level async API around IPC "channels" (`Channel`)
    which themselves encapsulate various (swappable) network
    transport protocols for sending msgs between said memory domains
    (processes, hosts, non-GIL threads).

    Each "actor" is a `trio.run()`-scheduled "runtime" composed of many
    concurrent tasks in a single thread. The "runtime" tasks conduct
    a slew of low(er) level functions to make it possible for message
    passing between actors as well as the ability to create new
    actors (aka new "runtimes" in new processes which are supervised
    via an "actor-nursery" construct). Each task which sends messages
    to a task in a "peer" actor (not necessarily a parent-child,
    depth hierarchy) is able to do so via an "address", which maps
    IPC connections across memory boundaries, and a task request id
    which allows for per-actor tasks to send and receive messages to
    specific peer-actor tasks with which there is an ongoing RPC/IPC
    dialog.

    '''
    # ugh, we need to get rid of this and replace with a "registry" sys
    # https://github.com/goodboy/tractor/issues/216
    is_arbiter: bool = False

    @property
    def is_registrar(self) -> bool:
        return self.is_arbiter

    msg_buffer_size: int = 2**6  # default IPC msg-buffer size (2**6 == 64 msgs)

    # nursery placeholders filled in by `async_main()` after fork
    _root_n: Nursery|None = None
    _service_n: Nursery|None = None

    _ipc_server: _server.IPCServer|None = None

    @property
    def ipc_server(self) -> _server.IPCServer:
        '''
        The IPC transport-server for this actor; normally
        a process-singleton.

        '''
        return self._ipc_server

    # Information about `__main__` from parent
    _parent_main_data: dict[str, str]
    _parent_chan_cs: CancelScope|None = None
    _spawn_spec: msgtypes.SpawnSpec|None = None

    # if started on ``asyncio`` running ``trio`` in guest mode
    _infected_aio: bool = False

    # TODO: nursery tracking like `trio` does?
    # _ans: dict[
    #     tuple[str, str],
    #     list[ActorNursery],
    # ] = {}

    # Process-global stack closed at end on actor runtime teardown.
    # NOTE: this is currently an undocumented public api.
    lifetime_stack: ExitStack = ExitStack()
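    #
    # For example, user code could register an arbitrary teardown hook to
    # run at actor-runtime exit (a hypothetical usage sketch relying only
    # on the stdlib `ExitStack.callback()` API):
    #
    #   tractor.current_actor().lifetime_stack.callback(
    #       lambda: print('actor torn down'),
    #   )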

    def __init__(
        self,
        name: str,
        uuid: str,
        *,
        enable_modules: list[str] = [],
        loglevel: str|None = None,
        registry_addrs: list[UnwrappedAddress]|None = None,
        spawn_method: str|None = None,

        # TODO: remove!
arbiter_addr: UnwrappedAddress|None = None,
Init-support for "multi homed" transports Since we'd like to eventually allow a diverse set of transport (protocol) methods and stacks, and a multi-peer discovery system for distributed actor-tree applications, this reworks all runtime internals to support multi-homing for any given tree on a logical host. In other words any actor can now bind its transport server (currently only unsecured TCP + `msgspec`) to more then one address available in its (linux) network namespace. Further, registry actors (now dubbed "registars" instead of "arbiters") can also similarly bind to multiple network addresses and provide discovery services to remote actors via multiple addresses which can now be provided at runtime startup. Deats: - adjust `._runtime` internals to use a `list[tuple[str, int]]` (and thus pluralized) socket address sequence where applicable for transport server socket binds, now exposed via `Actor.accept_addrs`: - `Actor.__init__()` now takes a `registry_addrs: list`. - `Actor.is_arbiter` -> `.is_registrar`. - `._arb_addr` -> `._reg_addrs: list[tuple]`. - always reg and de-reg from all registrars in `async_main()`. - only set the global runtime var `'_root_mailbox'` to the loopback address since normally all in-tree processes should have access to it, right? - `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]` - make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]` and defaults when not passed. - change `ActorNursery.start_..()` methods take `bind_addrs: list` and pass down through the spawning layer(s) via the parent-seed-msg. - generalize all `._discovery()` APIs to accept `registry_addrs`-like inputs and move all relevant subsystems to adopt the "registry" style naming instead of "arbiter": - make `find_actor()` support batched concurrent portal queries over all provided input addresses using `.trionics.gather_contexts()` Bo - syntax: move to using `async with <tuples>` 3.9+ style chained @acms. - a general modernization of the code to a python 3.9+ style. - start deprecation and change to "registry" naming / semantics: - `._discovery.get_arbiter()` -> `.get_registry()`
2023-09-27 19:19:30 +00:00
2018-08-31 21:16:24 +00:00
) -> None:
'''
This constructor is called in the parent actor **before** the spawning
phase (aka before a new process is executed).
'''
self._aid = msgtypes.Aid(
name=name,
uuid=uuid,
pid=os.getpid(),
)
self._task: trio.Task|None = None
# state
self._cancel_complete = trio.Event()
self._cancel_called_by_remote: tuple[str, tuple]|None = None
self._cancel_called: bool = False
# retrieve and store parent `__main__` data which
# will be passed to children
self._parent_main_data = _mp_fixup_main._mp_figure_out_main()
# TODO? only add this when `is_debug_mode() == True` no?
# always include debugging tools module
if _state.is_root_process():
enable_modules.append('tractor.devx.debug._tty_lock')
self.enable_modules: dict[str, str] = get_mod_nsps2fps(
mod_ns_paths=enable_modules,
)
self._mods: dict[str, ModuleType] = {}
Init-support for "multi homed" transports Since we'd like to eventually allow a diverse set of transport (protocol) methods and stacks, and a multi-peer discovery system for distributed actor-tree applications, this reworks all runtime internals to support multi-homing for any given tree on a logical host. In other words any actor can now bind its transport server (currently only unsecured TCP + `msgspec`) to more then one address available in its (linux) network namespace. Further, registry actors (now dubbed "registars" instead of "arbiters") can also similarly bind to multiple network addresses and provide discovery services to remote actors via multiple addresses which can now be provided at runtime startup. Deats: - adjust `._runtime` internals to use a `list[tuple[str, int]]` (and thus pluralized) socket address sequence where applicable for transport server socket binds, now exposed via `Actor.accept_addrs`: - `Actor.__init__()` now takes a `registry_addrs: list`. - `Actor.is_arbiter` -> `.is_registrar`. - `._arb_addr` -> `._reg_addrs: list[tuple]`. - always reg and de-reg from all registrars in `async_main()`. - only set the global runtime var `'_root_mailbox'` to the loopback address since normally all in-tree processes should have access to it, right? - `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]` - make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]` and defaults when not passed. - change `ActorNursery.start_..()` methods take `bind_addrs: list` and pass down through the spawning layer(s) via the parent-seed-msg. - generalize all `._discovery()` APIs to accept `registry_addrs`-like inputs and move all relevant subsystems to adopt the "registry" style naming instead of "arbiter": - make `find_actor()` support batched concurrent portal queries over all provided input addresses using `.trionics.gather_contexts()` Bo - syntax: move to using `async with <tuples>` 3.9+ style chained @acms. - a general modernization of the code to a python 3.9+ style. - start deprecation and change to "registry" naming / semantics: - `._discovery.get_arbiter()` -> `.get_registry()`
2023-09-27 19:19:30 +00:00
self.loglevel: str = loglevel
if arbiter_addr is not None:
warnings.warn(
'`Actor(arbiter_addr=<blah>)` is now deprecated.\n'
'Use `registry_addrs: list[tuple]` instead.',
DeprecationWarning,
stacklevel=2,
)
registry_addrs: list[UnwrappedAddress] = [arbiter_addr]
# marked by the process-spawning backend at startup;
# will be `None` for the parent-most process started manually
# by the user (currently called the "arbiter")
Init-support for "multi homed" transports Since we'd like to eventually allow a diverse set of transport (protocol) methods and stacks, and a multi-peer discovery system for distributed actor-tree applications, this reworks all runtime internals to support multi-homing for any given tree on a logical host. In other words any actor can now bind its transport server (currently only unsecured TCP + `msgspec`) to more then one address available in its (linux) network namespace. Further, registry actors (now dubbed "registars" instead of "arbiters") can also similarly bind to multiple network addresses and provide discovery services to remote actors via multiple addresses which can now be provided at runtime startup. Deats: - adjust `._runtime` internals to use a `list[tuple[str, int]]` (and thus pluralized) socket address sequence where applicable for transport server socket binds, now exposed via `Actor.accept_addrs`: - `Actor.__init__()` now takes a `registry_addrs: list`. - `Actor.is_arbiter` -> `.is_registrar`. - `._arb_addr` -> `._reg_addrs: list[tuple]`. - always reg and de-reg from all registrars in `async_main()`. - only set the global runtime var `'_root_mailbox'` to the loopback address since normally all in-tree processes should have access to it, right? - `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]` - make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]` and defaults when not passed. - change `ActorNursery.start_..()` methods take `bind_addrs: list` and pass down through the spawning layer(s) via the parent-seed-msg. - generalize all `._discovery()` APIs to accept `registry_addrs`-like inputs and move all relevant subsystems to adopt the "registry" style naming instead of "arbiter": - make `find_actor()` support batched concurrent portal queries over all provided input addresses using `.trionics.gather_contexts()` Bo - syntax: move to using `async with <tuples>` 3.9+ style chained @acms. - a general modernization of the code to a python 3.9+ style. - start deprecation and change to "registry" naming / semantics: - `._discovery.get_arbiter()` -> `.get_registry()`
2023-09-27 19:19:30 +00:00
self._spawn_method: str = spawn_method
# RPC state
self._ongoing_rpc_tasks = trio.Event()
self._ongoing_rpc_tasks.set()
self._rpc_tasks: dict[
tuple[Channel, str], # (chan, cid)
tuple[Context, Callable, trio.Event] # (ctx=>, fn(), done?)
] = {}
# map {(actor uid, cid, side) -> Context}
self._contexts: dict[
tuple[
tuple[str, str], # .uid
str, # .cid
str, # .side
],
Context
] = {}
self._parent_chan: Channel|None = None
self._forkserver_info: tuple|None = None
# track each child/sub-actor in its locally
# supervising nursery
self._actoruid2nursery: dict[
tuple[str, str], # sub-`Actor.uid`
ActorNursery|None,
] = {}
# when provided, init the registry addresses property from
# input via the validator.
self._reg_addrs: list[UnwrappedAddress] = []
if registry_addrs:
self.reg_addrs: list[UnwrappedAddress] = registry_addrs
_state._runtime_vars['_registry_addrs'] = registry_addrs
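# Illustrative-only construction sketch (names and addresses are made
# up, not part of the runtime): prefer passing `registry_addrs` over
# the deprecated `arbiter_addr` kwarg handled above,
#
#   actor = Actor(
#       name='worker',
#       uuid=mk_uuid(),  # assumed helper ala `._addr.mk_uuid()`
#       registry_addrs=[('127.0.0.1', 1616)],
#   )
#
# a non-`None` `arbiter_addr` still works but emits the
# `DeprecationWarning` above and is folded into `registry_addrs`.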
@property
def aid(self) -> msgtypes.Aid:
'''
This process-singleton-actor's "unique actor ID" in struct form.
See the `tractor.msg.Aid` struct for details.
'''
return self._aid
@property
def name(self) -> str:
return self._aid.name
@property
def uid(self) -> tuple[str, str]:
'''
This process-singleton's "unique (cross-host) ID".
Delivered from the `.Aid.name`/`.uuid` fields as a `tuple`
pair and should remain unique even across a large,
multi-host distributed process plane.
'''
msg: str = (
f'`{type(self).__name__}.uid` is now deprecated.\n'
'Use the new `.aid: tractor.msg.Aid` (struct) instead '
'which also provides additional named (optional) fields '
'beyond just the `.name` and `.uuid`.'
)
warnings.warn(
msg,
DeprecationWarning,
stacklevel=2,
)
return (
self._aid.name,
self._aid.uuid,
)
@property
def pid(self) -> int:
return self._aid.pid
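# Usage sketch (illustrative only): the struct-form `.aid` is the
# preferred id accessor; the legacy `.uid` tuple still works but warns,
#
#   aid = actor.aid                    # `msgtypes.Aid` struct
#   assert aid.name == actor.name
#   assert aid.pid == actor.pid
#   name, uuid = actor.uid             # emits `DeprecationWarning`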
def pformat(self) -> str:
ds: str = '='
parent_uid: tuple|None = None
if rent_chan := self._parent_chan:
parent_uid = rent_chan.uid
peers: list = []
server: _server.IPCServer = self.ipc_server
if server:
peers: list[tuple] = list(server._peer_connected)
fmtstr: str = (
f' |_id: {self.aid!r}\n'
# f" aid{ds}{self.aid!r}\n"
f" parent{ds}{parent_uid}\n"
f'\n'
f' |_ipc: {len(peers)!r} connected peers\n'
f" peers{ds}{peers!r}\n"
f" ipc_server{ds}{self._ipc_server}\n"
f'\n'
f' |_rpc: {len(self._rpc_tasks)} tasks\n'
f" ctxs{ds}{len(self._contexts)}\n"
f'\n'
f' |_runtime: ._task{ds}{self._task!r}\n'
f' _spawn_method{ds}{self._spawn_method}\n'
f' _actoruid2nursery{ds}{self._actoruid2nursery}\n'
f' _forkserver_info{ds}{self._forkserver_info}\n'
f'\n'
f' |_state: "TODO: .repr_state()"\n'
f' _cancel_complete{ds}{self._cancel_complete}\n'
f' _cancel_called_by_remote{ds}{self._cancel_called_by_remote}\n'
f' _cancel_called{ds}{self._cancel_called}\n'
)
return (
'<Actor(\n'
+
fmtstr
+
')>\n'
)
__repr__ = pformat
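# NB: since `__repr__` aliases `.pformat()`, both `repr(actor)` and any
# f-string `{actor!r}` interpolation render the multi-line
# '<Actor(...)>' summary built above.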
@property
def reg_addrs(self) -> list[UnwrappedAddress]:
'''
List of (socket) addresses for all known (and contactable)
registry actors.
'''
return self._reg_addrs
@reg_addrs.setter
def reg_addrs(
self,
addrs: list[UnwrappedAddress],
) -> None:
if not addrs:
log.warning(
'Empty registry address list is invalid:\n'
f'{addrs}'
)
return
self._reg_addrs = addrs
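# Illustrative only (addresses made up): swapping the known registry
# addrs at runtime; an empty list is ignored with a warning rather than
# clobbering the current value,
#
#   actor.reg_addrs = [('10.0.0.1', 1616), ('10.0.0.2', 1616)]
#   actor.reg_addrs = []   # only logs a warning, `._reg_addrs` unchanged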
def load_modules(
self,
) -> None:
'''
Load explicitly enabled python modules from local fs after
process spawn.
Since this actor may be spawned on a different machine from
the original nursery, we need to try to load the local module
code manually (presuming it exists).
'''
try:
if self._spawn_method == 'trio':
parent_data = self._parent_main_data
if 'init_main_from_name' in parent_data:
_mp_fixup_main._fixup_main_from_name(
parent_data['init_main_from_name'])
elif 'init_main_from_path' in parent_data:
_mp_fixup_main._fixup_main_from_path(
parent_data['init_main_from_path'])
status: str = 'Attempting to import enabled modules:\n'
modpath: str
filepath: str
for modpath, filepath in self.enable_modules.items():
# XXX append the allowed module to the python path which
# should allow for relative (at least downward) imports.
sys.path.append(os.path.dirname(filepath))
status += (
f'|_{modpath!r} -> {filepath!r}\n'
)
mod: ModuleType = importlib.import_module(modpath)
self._mods[modpath] = mod
if modpath == '__main__':
self._mods['__mp_main__'] = mod
log.runtime(status)
except ModuleNotFoundError:
# it is expected the corresponding `ModuleNotExposed` error
# will be raised later
log.error(
f"Failed to import {modpath} in {self.name}"
)
raise
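# Expected flow sketch (module/func names hypothetical): the parent
# enables a module at spawn time,
#
#   ActorNursery.start_actor('worker', enable_modules=['mypkg.rpc'])
#
# then this method re-imports 'mypkg.rpc' in the child post-spawn so
# that `._get_rpc_func('mypkg.rpc', 'some_func')` (below) can resolve
# the RPC target.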
def _get_rpc_func(self, ns, funcname):
'''
Try to lookup and return a target RPC func from the
post-fork enabled module set.
'''
try:
return getattr(self._mods[ns], funcname)
except KeyError as err:
mne = ModuleNotExposed(*err.args)
if ns == '__main__':
modpath = '__name__'
else:
modpath = f"'{ns}'"
msg = (
"\n\nMake sure you exposed the target module, `{ns}`, "
"using:\n"
"ActorNursery.start_actor(<name>, enable_modules=[{mod}])"
).format(
ns=ns,
mod=modpath,
)
mne.msg += msg
raise mne
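# Note: funcs defined in the spawning script itself resolve under the
# '__main__' namespace, which is why the hint above renders as
# `enable_modules=[__name__]` for that case.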
# TODO: rename to `._deliver_payload()` since this handles
# more than just `result` msgs now obvi XD
async def _deliver_ctx_payload(
self,
chan: Channel,
cid: str,
msg: MsgType|MsgTypeError,
) -> None|bool:
'''
Push an RPC msg-payload to the local consumer peer-task's
queue.
'''
uid: tuple[str, str] = chan.uid
assert uid, f"`chan.uid` can't be {uid}"
try:
ctx: Context = self._contexts[(
uid,
cid,
# TODO: how to determine this tho?
# side,
)]
except KeyError:
report: str = (
'Ignoring invalid IPC msg!?\n'
f'Ctx seems to not/no-longer exist??\n'
f'\n'
f'<=? {uid}\n'
f' |_{pretty_struct.pformat(msg)}\n'
)
match msg:
case Stop():
log.runtime(report)
case _:
log.warning(report)
return
# if isinstance(msg, MsgTypeError):
# return await ctx._deliver_bad_msg()
return await ctx._deliver_msg(msg)
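# NB: the ctx lookup above keys on `(chan.uid, cid)` only; per the
# inline TODO, the `.side` component of `._contexts` keys is not (yet)
# used to disambiguate which end of the task-pair is targeted.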
def get_context(
self,
chan: Channel,
cid: str,
nsf: NamespacePath,
# TODO: support lookup by `Context.side: str` ?
# -> would allow making a self-context which might have
# certain special use cases where RPC isolation is wanted
# between 2 tasks running in the same process?
# => prolly needs some deeper thought on the real use cases
# and whether or not such things should be better
# implemented using a `TaskManager` style nursery..
#
# side: str|None = None,
msg_buffer_size: int|None = None,
allow_overruns: bool = False,
) -> Context:
'''
Look-up (existing) or create a new
inter-actor-SC-linked task "context" (a `Context`) which
        encapsulates the local RPC task's execution environment
around `Channel` relayed msg handling including,
- a dedicated `trio` cancel scope (`Context._scope`),
- a pair of IPC-msg-relay "feeder" mem-channels
(`Context._recv/send_chan`),
- and a "context id" (cid) unique to the task-pair
msging session's lifetime.
'''
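        # NOTE: this is used from both "sides" of a ctx: the
        # requesting side via `.start_remote_task()` below and
        # (presumably) the rpc-dispatch side when a peer's `Start`
        # request comes in; both key the lookup on the
        # `(peer_uid, cid)` pair so each msging session shares a
        # single `Context` instance within this actor.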
actor_uid = chan.uid
assert actor_uid
try:
ctx = self._contexts[(
actor_uid,
cid,
# side,
)]
log.debug(
f'Retreived cached IPC ctx for\n'
f'peer: {chan.uid}\n'
f'cid:{cid}\n'
)
ctx._allow_overruns: bool = allow_overruns
# adjust buffer size if specified
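            # (NB: this pokes at `trio`'s internal
            #  `MemoryChannelState`, hence the `type: ignore` below.)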
state: MemoryChannelState = ctx._send_chan._state # type: ignore
if (
msg_buffer_size
and
state.max_buffer_size != msg_buffer_size
):
state.max_buffer_size = msg_buffer_size
except KeyError:
log.debug(
f'Allocate new IPC ctx for\n'
f'peer: {chan.uid}\n'
f'cid: {cid}\n'
)
ctx = mk_context(
chan,
cid,
nsf=nsf,
msg_buffer_size=msg_buffer_size or self.msg_buffer_size,
_allow_overruns=allow_overruns,
)
self._contexts[(
actor_uid,
cid,
# side,
)] = ctx
return ctx
async def start_remote_task(
self,
chan: Channel,
nsf: NamespacePath,
kwargs: dict,
# determines `Context.side: str`
portal: Portal|None = None,
# IPC channel config
msg_buffer_size: int|None = None,
allow_overruns: bool = False,
load_nsf: bool = False,
ack_timeout: float = float('inf'),
) -> Context:
'''
        Send a `Start` msg to a remote actor, which requests the
        start and scheduling of a remote task-as-function's
        entrypoint.
Synchronously validates the endpoint type and returns
a (caller side) `Context` that can be used to accept
delivery of msg payloads from the local runtime's
processing loop: `._rpc.process_messages()`.
'''
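        # a rough usage sketch (illustrative only; the target func
        # and kwargs here are hypothetical) - `Portal`'s task
        # spawning methods normally drive this call:
        #
        #   ctx: Context = await actor.start_remote_task(
        #       chan=chan,  # an already connected IPC `Channel`
        #       nsf=NamespacePath('some.mod:some_func'),
        #       kwargs={'x': 1},
        #       portal=portal,
        #   )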
cid: str = str(uuid.uuid4())
assert chan.uid
ctx = self.get_context(
chan=chan,
cid=cid,
nsf=nsf,
# side='caller',
msg_buffer_size=msg_buffer_size,
allow_overruns=allow_overruns,
)
ctx._portal = portal
if (
'self' in nsf
or
not load_nsf
):
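            # ('self' as the ns addresses a method on the peer's
            #  `Actor` instance itself, eg. runtime eps such as
            #  `Actor.cancel()`.)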
ns, _, func = nsf.partition(':')
else:
# TODO: pass nsf directly over wire!
# -[ ] but, how to do `self:<Actor.meth>`??
ns, func = nsf.to_tuple()
msg = msgtypes.Start(
ns=ns,
func=func,
kwargs=kwargs,
uid=self.uid,
cid=cid,
)
log.runtime(
'Sending RPC `Start`\n\n'
f'=> peer: {chan.uid}\n'
f' |_ {ns}.{func}({kwargs})\n\n'
f'{pretty_struct.pformat(msg)}'
)
await chan.send(msg)
# NOTE wait on first `StartAck` response msg and validate;
# this should be immediate and does not (yet) wait for the
# remote child task to sync via `Context.started()`.
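        # (with the default `ack_timeout=float('inf')` this waits
        #  indefinitely for that ack.)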
with trio.fail_after(ack_timeout):
first_msg: msgtypes.StartAck = await ctx._rx_chan.receive()
try:
functype: str = first_msg.functype
except AttributeError:
raise unpack_error(first_msg, chan)
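        # the ack's `.functype` declares (roughly) which kind of
        # endpoint was scheduled remotely:
        # - 'asyncfunc': a plain async function, single return value
        # - 'asyncgen': an async generator, one-way streaming
        # - 'context': a two-way, `@tractor.context` SC-linked ep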
if functype not in (
'asyncfunc',
'asyncgen',
'context',
):
raise ValueError(
f'Invalid `StartAck.functype: str = {first_msg!r}` ??'
)
ctx._remote_func_type = functype
return ctx
async def _from_parent(
self,
parent_addr: UnwrappedAddress|None,
) -> tuple[
Channel,
list[UnwrappedAddress]|None,
list[str]|None, # preferred tpts
]:
'''
Bootstrap this local actor's runtime config from its parent by
connecting back via the IPC transport, handshaking and then
`Channel.recv()`-ing seeded data.
'''
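# NOTE, the return is a 3-tuple: the (handshaken) parent
# channel, any parent-provided accept/bind addrs (`None`
# when the child should allocate its own) and the
# parent's preferred transport protos for our IPC server.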
try:
# Connect back to the parent actor and conduct initial
# handshake. From this point on if we error, we
# attempt to ship the exception back to the parent.
chan = await Channel.from_addr(
addr=wrap_address(parent_addr)
)
assert isinstance(chan, Channel)
# init handshake: swap actor-IDs.
await chan._do_handshake(aid=self.aid)
accept_addrs: list[UnwrappedAddress]|None = None
if self._spawn_method == "trio":
# Receive post-spawn runtime state from our parent.
spawnspec: msgtypes.SpawnSpec = await chan.recv()
match spawnspec:
case MsgTypeError():
raise spawnspec
case msgtypes.SpawnSpec():
self._spawn_spec = spawnspec
log.runtime(
'Received runtime spec from parent:\n\n'
# TODO: eventually all these msgs as
# `msgspec.Struct` with a special mode that
# pformats them in multi-line mode, BUT only
# if "trace"/"util" mode is enabled?
f'{pretty_struct.pformat(spawnspec)}\n'
)
case _:
raise InternalError(
f'Received invalid non-`SpawnSpec` payload !?\n'
f'{spawnspec}\n'
)
# ^^XXX TODO XXX^^^
# when the `SpawnSpec` fails to decode the above will
# raise a `MsgTypeError` which, if we do NOT ALSO
# RAISE it, will instead end up pprinted in the
# `log.runtime()` call below..
#
# SO we gotta look at how other `chan.recv()` calls
# are wrapped and do the same for this spec receive!
# -[ ] see `._rpc` likely has the answer?
# ^^^XXX NOTE XXX^^^, can't be called here!
#
# breakpoint()
# import pdbp; pdbp.set_trace()
#
# => bc we haven't yet received the
# `spawnspec._runtime_vars` which contains
# `debug_mode: bool`..
# `SpawnSpec.bind_addrs`
# ---------------------
accept_addrs: list[UnwrappedAddress] = spawnspec.bind_addrs
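# NOTE, when the parent didn't pass explicit bind addrs
# this can remain `None`, in which case random addr
# allocation is deferred to `async_main()` in this
# (child) actor.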
# `SpawnSpec._runtime_vars`
# -------------------------
# => update process-wide globals
# TODO! -[ ] another `Struct` for rtvs..
rvs: dict[str, Any] = spawnspec._runtime_vars
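# keys consumed below include `_debug_mode`,
# `use_greenback`, `_is_infected_aio` and `_is_root`;
# the full dict is merged into `_state._runtime_vars`.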
if rvs['_debug_mode']:
from .devx import (
enable_stack_on_sig,
maybe_init_greenback,
)
try:
# TODO: maybe return some status msgs upward
# so that we can emit them in `con_status`
# instead?
log.devx(
'Enabling `stackscope` traces on SIGUSR1'
)
enable_stack_on_sig()
except ImportError:
log.warning(
'`stackscope` not installed for use in debug mode!'
)
if rvs.get('use_greenback', False):
maybe_mod: ModuleType|None = await maybe_init_greenback()
if maybe_mod:
log.devx(
'Activated `greenback` '
'for `tractor.pause_from_sync()` support!'
)
else:
rvs['use_greenback'] = False
log.warning(
'`greenback` not installed for use in debug mode!\n'
'`tractor.pause_from_sync()` not available!'
)
# XXX ensure the "infected `asyncio` mode" setting
# passed down from our spawning parent is consistent
# with `trio`-runtime initialization:
# - during sub-proc boot, the entrypoint func
# (`._entry.<spawn_backend>_main()`) should set
# `._infected_aio = True` before calling
# `run_as_asyncio_guest()`,
# - the value of `infect_asyncio: bool = True` as
# passed to `ActorNursery.start_actor()` must be
# the same as `_runtime_vars['_is_infected_aio']`
if (
(aio_rtv := rvs['_is_infected_aio'])
!=
(aio_attr := self._infected_aio)
):
raise InternalError(
'Parent sent runtime-vars that mismatch for the '
'"infected `asyncio` mode" settings ?!?\n\n'
f'rvs["_is_infected_aio"] = {aio_rtv}\n'
f'self._infected_aio = {aio_attr}\n'
)
if aio_rtv:
assert (
trio_runtime.GLOBAL_RUN_CONTEXT.runner.is_guest
# and
# ^TODO^ possibly add a `sniffio` or
# `trio` pub-API for `is_guest_mode()`?
)
rvs['_is_root'] = False # obvi XD
_state._runtime_vars.update(rvs)
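# ^ from here on any actor-local reads of
# `_state._runtime_vars` see the parent-provided values
# (including the `_is_root = False` override above).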
# `SpawnSpec.reg_addrs`
# ---------------------
# => update parent provided registrar contact info
#
self.reg_addrs = [
# TODO: we don't really NEED these as tuples?
# so we can probably drop this casting since
# apparently in python lists are "more
# efficient"?
tuple(val)
for val in spawnspec.reg_addrs
]
# `SpawnSpec.enable_modules`
# ---------------------
# => extend RPC-python-module (capabilities) with
# those permitted by parent.
#
# NOTE, only the root actor should have
# a pre-permitted entry for `.devx.debug._tty_lock`.
assert not self.enable_modules
self.enable_modules.update(
spawnspec.enable_modules
)
self._parent_main_data = spawnspec._parent_main_data
# XXX QUESTION(s)^^^
# -[ ] already set in `.__init__()` right, but how is
# it diff from this blatant parent copy?
# -[ ] do we need/want the .__init__() value in
# just the root case orr?
return (
chan,
accept_addrs,
_state._runtime_vars['_enable_tpts']
)
# failed to connect back?
except (
OSError,
ConnectionError,
):
log.warning(
f'Failed to connect to spawning parent actor!?\n'
f'\n'
f'x=> {parent_addr}\n'
f' |_{self}\n\n'
)
await self.cancel(req_chan=None) # self cancel
raise
def cancel_soon(self) -> None:
'''
Cancel this actor asap; can be called from a sync context.
Schedules runtime cancellation via `Actor.cancel()` inside
the RPC service nursery.
'''
assert self._service_n
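# the "service" nursery must already be up; scheduling
# via `.start_soon()` lets this sync method return
# immediately while the `.cancel()` request runs in the
# background.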
self._service_n.start_soon(
self.cancel,
None, # self cancel all rpc tasks
)
async def cancel(
self,
# chan whose lifetime limits the lifetime of its remotely
# requested and locally spawned RPC tasks - similar to the
# supervision semantics of a nursery wherein the actual
# implementation does start all such tasks in a sub-nursery.
req_chan: Channel|None,
) -> bool:
'''
Cancel this actor's runtime, eventually resulting in
termination of its containing OS process.
The ideal "deterministic" teardown sequence in order is:
- cancel all ongoing rpc tasks by cancel scope.
- cancel the channel server to prevent new inbound
connections.
- cancel the "service" nursery reponsible for
spawning new rpc tasks.
- return control the parent channel message loop.
'''
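# NOTE: a hedged usage sketch (not part of the runtime impl): a cancel
# request is normally scheduled through the msg loop via `._invoke()`,
# roughly equivalent to,
#
#   await actor.cancel(req_chan=peer_chan)  # peer-requested cancel
#   await actor.cancel(req_chan=None)       # internal "self cancel"
#
# where `req_chan=None` marks a teardown requested by the actor
# itself; the conditional unpack below selects the matching
# requester-uid, type-label, requester object and log level.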
(
requesting_uid,
requester_type,
req_chan,
log_meth,
) = (
req_chan.uid,
'peer',
req_chan,
log.cancel,
) if req_chan else (
# a self cancel of ALL rpc tasks
self.uid,
'self',
self,
log.runtime,
)
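# ^ a single conditional-expression unpack: when a `req_chan` is
# provided the request came from an IPC peer (reported via
# `log.cancel()`), otherwise this is a "self cancel" of the whole
# actor runtime (reported via `log.runtime()`).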
# TODO: just use the new `Context.repr_rpc: str` (and
# other) repr fields instead of doing this all manual..
msg: str = (
f'Actor-runtime cancel request from {requester_type}\n\n'
f'<=c) {requesting_uid}\n'
f' |_{self}\n'
f'\n'
)
# TODO: what happens here when we self-cancel tho?
self._cancel_called_by_remote: tuple = requesting_uid
self._cancel_called = True
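# ^ record the requester's uid and latch the "cancel requested"
# state so the rest of the runtime can detect that a full actor
# teardown is underway.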
# cancel all ongoing rpc tasks
with CancelScope(shield=True):
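# ^ shield the teardown sequence from any surrounding cancellation
# so the steps below always run to completion, even if the task
# running this method is itself being cancelled.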
# kill any debugger request task to avoid deadlock
# with the root actor in this tree
debug_req = debug.DebugStatus
lock_req_ctx: Context = debug_req.req_ctx
if (
lock_req_ctx
and
lock_req_ctx.has_outcome
):
msg += (
f'\n'
f'-> Cancelling active debugger request..\n'
f'|_{debug.Lock.repr()}\n\n'
f'|_{lock_req_ctx}\n\n'
)
# lock_req_ctx._scope.cancel()
# TODO: wrap this in a method-API..
debug_req.req_cs.cancel()
# if lock_req_ctx:
# self-cancel **all** ongoing RPC tasks
await self.cancel_rpc_tasks(
req_uid=requesting_uid,
parent_chan=None,
)
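# ^ passing `parent_chan=None` applies no per-channel filter, so rpc
# tasks spawned by ANY peer channel are cancelled.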
# stop channel server
if ipc_server := self.ipc_server:
ipc_server.cancel()
await ipc_server.wait_for_shutdown()
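# ^ cancelling the ipc server (and waiting for its shutdown) ensures
# no new inbound connections, and thus no new rpc tasks, can arrive
# during the remaining teardown steps.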
# cancel all rpc tasks permanently
if self._service_n:
self._service_n.cancel_scope.cancel()
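# ^ cancelling the "service" nursery's scope guarantees no further
# rpc tasks can be spawned and permanently tears down any remaining
# runtime tasks parented there.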
log_meth(msg)
self._cancel_complete.set()
return True
# XXX: hard kill logic if needed?
# def _hard_mofo_kill(self):
# # If we're the root actor or zombied kill everything
# if self._parent_chan is None: # TODO: more robust check
# root = trio.lowlevel.current_root_task()
# for n in root.child_nurseries:
# n.cancel_scope.cancel()
async def _cancel_task(
self,
cid: str,
parent_chan: Channel,
requesting_uid: tuple[str, str]|None,
ipc_msg: dict|None|bool = False,
) -> bool:
'''
Cancel a local (RPC) task by context-id/channel by calling
`trio.CancelScope.cancel()` on its surrounding cancel
scope.
'''
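# An illustrative sketch (not the actual call site) of how the
# runtime's msg loop is expected to call this method when an IPC
# cancel request arrives; `actor`, `chan` and `msg` are placeholder
# names and the exact kwargs/plumbing may differ:
#
#   await actor._cancel_task(
#       cid,                      # target ctx's context-id
#       parent_chan=chan,         # IPC channel which spawned the task
#       requesting_uid=chan.uid,  # peer requesting the cancel
#       ipc_msg=msg,              # raw cancel request msg
#   )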
# this ctx-based lookup ensures that the task requested for
# cancellation was indeed spawned by a request from its
# parent (or some grandparent's) channel
ctx: Context
func: Callable
is_complete: trio.Event
try:
(
ctx,
func,
is_complete,
) = self._rpc_tasks[(
parent_chan,
cid,
)]
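# the `trio.CancelScope` wrapping the target task; calling its
# `.cancel()` is what actually tears the task down (per the
# docstring above).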
scope: CancelScope = ctx._scope
except KeyError:
# NOTE: during msging race conditions this will often
# emit; some examples:
# - child returns a result before the cancel-msg/ctxc is raised,
# - child self-raises a ctxc before the parent sends its request,
# - child errors prior to the cancel req.
log.runtime(
'Cancel request for invalid RPC task.\n'
'The task likely already completed or was never started!\n\n'
f'<= canceller: {requesting_uid}\n'
f'=> {cid}@{parent_chan.uid}\n'
f' |_{parent_chan}\n'
)
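# an already-complete (or never-started) task trivially
# satisfies the cancel request, so report it as handled.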
return True
log.cancel(
'Rxed cancel request for RPC task\n'
f'<=c) {requesting_uid}\n'
f' |_{ctx._task}\n'
f' >> {ctx.repr_rpc}\n'
# f'=> {ctx._task}\n'
# f' >> Actor._cancel_task() => {ctx._task}\n'
# f' |_ {ctx._task}\n\n'
# TODO: better ascii repr for "supervisor" like
# a nursery or context scope?
# f'=> {parent_chan}\n'
# f' |_{ctx._task}\n'
# TODO: simplified `Context.__repr__()` fields output
# shows only application state-related stuff like,
# - ._stream
# - .closed
# - .started_called
# - .. etc.
# f' >> {ctx.repr_rpc}\n'
# f' |_ctx: {cid}\n'
# f' >> {ctx._nsf}()\n'
)
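# record the requesting actor as this ctx's "canceller" (only if
# not already set) so any eventual `ContextCancelled.canceller`
# can be derived from it.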
if (
ctx._canceller is None
and requesting_uid
):
ctx._canceller: tuple = requesting_uid
# TODO: pack the RPC `{'cmd': <blah>}` msg into a ctxc and
# then raise and pack it here?
if (
ipc_msg
and ctx._cancel_msg is None
):
# assign the RPC msg directly from the loop, which is usually
# the case with `ctx.cancel()` on the other side.
ctx._cancel_msg = ipc_msg
# don't allow cancelling this function mid-execution
# (is this necessary?)
if func is self._cancel_task:
log.error('Do not cancel a cancel!?')
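# ack the request as handled but never cancel the
# cancel-handler task itself.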
return True
# TODO: shouldn't we eventually be calling ``Context.cancel()``
# directly here instead (since that method can handle both
        # sides' calls into it)?
# await ctx.cancel()
scope.cancel()
# wait for _invoke to mark the task complete
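        # NOTE: `flow_info` below is a purely-informational,
        # pre-formatted summary of this cancel request (requesting
        # uid, parenting IPC chan and the target ctx) which is reused
        # by both `log.runtime()` emissions around the completion
        # wait further down.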
flow_info: str = (
f'<= canceller: {requesting_uid}\n'
f'=> ipc-parent: {parent_chan}\n'
f'|_{ctx}\n'
)
log.runtime(
'Waiting on RPC task to cancel\n\n'
f'{flow_info}'
)
await is_complete.wait()
log.runtime(
            f'Successfully cancelled RPC task\n\n'
f'{flow_info}'
)
return True
async def cancel_rpc_tasks(
self,
req_uid: tuple[str, str],
# NOTE: when None is passed we cancel **all** rpc
# tasks running in this actor!
parent_chan: Channel|None,
) -> None:
'''
Cancel all ongoing RPC tasks owned/spawned for a given
`parent_chan: Channel` or simply all tasks (inside
`._service_n`) when `parent_chan=None`.
'''
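        # For example (a rough usage sketch only; `canceller_uid` is
        # a stand-in name, not a var defined in this scope), a
        # "cancel everything" request looks roughly like:
        #
        #   await actor.cancel_rpc_tasks(
        #       req_uid=canceller_uid,  # (name, uuid) of the requester
        #       parent_chan=None,       # None => cancel **all** rpc tasks
        #   )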
tasks: dict = self._rpc_tasks
if not tasks:
log.runtime(
'Actor has no cancellable RPC tasks?\n'
f'<= canceller: {req_uid}\n'
)
return
# TODO: seriously factor this into some helper funcs XD
tasks_str: str = ''
for (ctx, func, _) in tasks.values():
# TODO: std repr of all primitives in
# a hierarchical tree format, since we can!!
# like => repr for funcs/addrs/msg-typing:
#
# -[ ] use a proper utf8 "arm" like
# `stackscope` has!
# -[ ] for typed msging, show the
# py-type-annot style?
# - maybe auto-gen via `inspect` / `typing` type-sig:
# https://stackoverflow.com/a/57110117
# => see ex. code pasted into `.msg.types`
#
# -[ ] proper .maddr() for IPC primitives?
# - `Channel.maddr() -> str:` obvi!
# - `Context.maddr() -> str:`
tasks_str += (
f' |_@ /ipv4/tcp/cid="{ctx.cid[-16:]} .."\n'
f' |>> {ctx._nsf}() -> dict:\n'
)
descr: str = (
'all' if not parent_chan
else
"IPC channel's "
)
rent_chan_repr: str = (
f' |_{parent_chan}\n\n'
if parent_chan
else ''
)
log.cancel(
f'Cancelling {descr} RPC tasks\n\n'
f'<=c) {req_uid} [canceller]\n'
f'{rent_chan_repr}'
f'c)=> {self.uid} [cancellee]\n'
f' |_{self} [with {len(tasks)} tasks]\n'
# f' |_tasks: {len(tasks)}\n'
# f'{tasks_str}'
)
for (
(task_caller_chan, cid),
(ctx, func, is_complete),
) in tasks.copy().items():
if (
# maybe filter to specific IPC channel?
(parent_chan
and
task_caller_chan != parent_chan)
# never "cancel-a-cancel" XD
or (func == self._cancel_task)
):
continue
            # TODO: this may block on the task cancellation
            # and so should really be done in a nursery batch?
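            # a hedged sketch (not the current impl) of such a batched
            # version, assuming the per-task cancels are safe to run
            # concurrently:
            #
            #   async with trio.open_nursery() as tn:
            #       tn.start_soon(partial(
            #           self._cancel_task,
            #           cid,
            #           task_caller_chan,
            #           requesting_uid=req_uid,
            #       ))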
await self._cancel_task(
cid,
task_caller_chan,
requesting_uid=req_uid,
)
if tasks:
log.cancel(
'Waiting for remaining rpc tasks to complete\n'
f'|_{tasks_str}'
)
await self._ongoing_rpc_tasks.wait()
@property
def accept_addrs(self) -> list[UnwrappedAddress]:
'''
All addresses to which the transport-channel server binds
and listens for new connections.
'''
return self._ipc_server.accept_addrs
@property
def accept_addr(self) -> UnwrappedAddress:
'''
Primary address to which the IPC transport server is
bound and listening for new connections.
'''
return self.accept_addrs[0]
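    # a hedged usage sketch for the two `accept_addr[s]` properties
    # above (assuming a running actor runtime):
    #
    #   actor = _state.current_actor()
    #   all_addrs = actor.accept_addrs   # every bound transport addr
    #   primary = actor.accept_addr      # just the first/primary one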
def get_parent(self) -> Portal:
'''
Return a `Portal` to our parent.
'''
assert self._parent_chan, "No parent channel for this actor?"
return Portal(self._parent_chan)
def get_chans(
self,
uid: tuple[str, str],
) -> list[Channel]:
'''
Return all IPC channels to the actor with provided `uid`.
'''
return self._peers[uid]
def is_infected_aio(self) -> bool:
'''
If `True`, this actor is running `trio` in guest mode on
the `asyncio` event loop and thus can use the APIs in
`.to_asyncio` to coordinate tasks running in each
framework but within the same actor runtime.
'''
return self._infected_aio
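    # a hedged sketch of how task code might branch on the above,
    # assuming the actor was spawned with asyncio guest-mode enabled
    # (eg. via an `infect_asyncio=True` style spawn flag):
    #
    #   if _state.current_actor().is_infected_aio():
    #       ...  # delegate asyncio-bound work via `.to_asyncio` APIs
    #   else:
    #       ...  # pure-`trio` code path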
async def async_main(
actor: Actor,
    accept_addrs: list[UnwrappedAddress]|None = None,
# XXX: currently ``parent_addr`` is only needed for the
# ``multiprocessing`` backend (which pickles state sent to
# the child instead of relaying it over the connect-back
# channel). Once that backend is removed we can likely just
# change this to a simple ``is_subactor: bool`` which will
    # be False when running as the root actor and True when
    # running as a subactor.
parent_addr: UnwrappedAddress|None = None,
task_status: TaskStatus[None] = trio.TASK_STATUS_IGNORED,
) -> None:
'''
Main `Actor` runtime entrypoint; start the transport-specific
IPC channel server, (maybe) connect back to parent (to receive
    additional config), start up all core `trio` machinery for
    delivering RPCs, and register with the discovery system.

    The "root" (or "top-level") and "service" `trio.Nursery`s are
    opened here and, when cancelled/terminated, effectively shut down
    the actor's "runtime" and thus all ongoing RPC tasks.

'''
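    # NOTE: a hedged orientation note, not a spec: this entrypoint is
    # normally scheduled by the root-actor bootstrap (eg. via
    # `open_root_actor()`) or by the child-side spawn target in
    # subactors; it is not meant to be called directly by app code.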
# XXX NOTE, `_state._current_actor` **must** be set prior to
# calling this core runtime entrypoint!
assert actor is _state.current_actor()
actor._task: trio.Task = trio.lowlevel.current_task()
    # attempt to retrieve ``trio``'s sigint handler and stash it
# on our debugger state.
debug.DebugStatus._trio_handler = signal.getsignal(signal.SIGINT)
is_registered: bool = False
try:
# establish primary connection with immediate parent
actor._parent_chan: Channel|None = None
if parent_addr is not None:
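        # handshake with the parent actor: this returns the (already
        # connected) channel back to the parent, any parent-designated
        # accept/bind addrs, and (a hedged read of the 3rd element)
        # a preferred-transports hint from the spawning parent.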
(
actor._parent_chan,
set_accept_addr_says_rent,
maybe_preferred_transports_says_rent,
) = await actor._from_parent(parent_addr)
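# ^ the spawning parent may (or may not) have provided
# explicit bind addrs and/or a preferred-transports list
# (via its `SpawnSpec`); when it didn't, defaults are
# allocated below.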
accept_addrs: list[UnwrappedAddress] = []
# either it's passed in because we're not a child or
# because we're running in mp mode
if (
set_accept_addr_says_rent
and
set_accept_addr_says_rent is not None
):
accept_addrs = set_accept_addr_says_rent
else:
enable_transports: list[str] = (
maybe_preferred_transports_says_rent
or
[_state._def_tpt_proto]
)
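# no explicit bind addrs from the spawning parent, so
# allocate a "random" (tpt-specific) bind address for
# each enabled transport protocol.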
for transport_key in enable_transports:
transport_cls: Type[Address] = get_address_cls(
transport_key
)
addr: Address = transport_cls.get_random()
accept_addrs.append(addr.unwrap())
assert accept_addrs
# The "root" nursery ensures the channel with the immediate
# parent is kept alive as a resilient service until
# cancellation steps have (mostly) occurred in
# a deterministic way.
async with trio.open_nursery(
strict_exception_groups=False,
) as root_nursery:
actor._root_n = root_nursery
assert actor._root_n
ipc_server: _server.IPCServer
async with (
trio.open_nursery(
strict_exception_groups=False,
) as service_nursery,
_server.open_ipc_server(
parent_tn=service_nursery,
stream_handler_tn=service_nursery,
) as ipc_server,
# ) as actor._ipc_server,
# ^TODO? prettier?
):
# This nursery is used to handle all inbound
# connections to us such that if the TCP server
# is killed, connections can continue to process
# in the background until this nursery is cancelled.
actor._service_n = service_nursery
actor._ipc_server = ipc_server
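# sanity: the service nursery must be the exact nursery
# wired into the IPC server as both its parent-task and
# stream-handler nursery (as passed above).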
assert (
actor._service_n
and (
actor._service_n
is
actor._ipc_server._parent_tn
is
ipc_server._stream_handler_tn
)
)
# load exposed/allowed RPC modules
# XXX: do this **after** establishing a channel to the parent
# but **before** starting the message loop for that channel
# such that import errors are properly propagated upwards
actor.load_modules()
# XXX TODO XXX: figuring out debugging of this
# would somewhat guarantee "self-hosted" runtime
# debugging (since it hits all the edge cases?)
#
# `tractor.pause()` right?
# try:
# actor.load_modules()
# except ModuleNotFoundError as err:
# debug.pause_from_sync()
# import pdbp; pdbp.set_trace()
# raise
# Start up the transport(-channel) server with,
# - subactor: the bind address is sent by our parent
# over our established channel
# - root actor: the ``accept_addr`` passed to this method
# TODO: why is this not with the root nursery?
try:
log.runtime(
'Booting IPC server'
)
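# bind a listener for each requested transport address;
# `.listen_on()` returns the list of newly activated
# IPC "endpoints", one per addr-listener pair.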
eps: list = await ipc_server.listen_on(
accept_addrs=accept_addrs,
stream_handler_nursery=service_nursery,
)
log.runtime(
f'Booted IPC server\n'
f'{ipc_server}\n'
)
assert (
(eps[0].listen_tn)
is not service_nursery
)
except OSError as oserr:
# NOTE: always allow runtime hackers to debug
# transport address bind errors - normally it's
# something silly like the wrong socket-address
# passed via a config or CLI Bo
entered_debug: bool = await debug._maybe_enter_pm(
oserr,
)
if not entered_debug:
log.exception('Failed to init IPC server !?\n')
else:
log.runtime('Exited debug REPL..')
raise
# TODO, just read direct from ipc_server?
accept_addrs: list[UnwrappedAddress] = actor.accept_addrs
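# ^ re-read the addrs as reported by the (now bound)
# server since any OS-assigned port/socket-name is
# presumably only known after bind.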
# NOTE: only set the loopback addr for the
# process-tree-global "root" mailbox since
# all sub-actors should be able to speak to
# their root actor over that channel.
if _state._runtime_vars['_is_root']:
raddrs: list[Address] = _state._runtime_vars['_root_addrs']
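# publish all our bound accept addrs tree-globally; the
# `for-else` runs once the loop completes and (for now)
# picks the first entry as the legacy '_root_mailbox'.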
for addr in accept_addrs:
waddr: Address = wrap_address(addr)
raddrs.append(addr)
else:
_state._runtime_vars['_root_mailbox'] = raddrs[0]
# Register with the registrar(s) if we're told their addrs
log.runtime(
f'Registering `{actor.name}` => {pformat(accept_addrs)}\n'
# ^-TODO-^ we should instead show the maddr here^^
)
# TODO: ideally we don't fan out to all registrars
# if addresses point to the same actor..
# So we need a way to detect that? maybe iterate
# only on unique actor uids?
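# (Purely illustrative, not wired in) one possible shape for such
# de-duplication, assuming the peer registrar's uid is readable from
# the portal's channel after connecting; `seen_uids` is a
# hypothetical name:
#
#   seen_uids: set[tuple[str, str]] = set()
#   for addr in actor.reg_addrs:
#       async with get_registry(addr) as reg_portal:
#           reg_uid = reg_portal.channel.uid
#           if reg_uid in seen_uids:
#               continue
#           seen_uids.add(reg_uid)
#           ...  # then register exactly as in the loop below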
for addr in actor.reg_addrs:
try:
waddr = wrap_address(addr)
assert waddr.is_valid
except AssertionError:
await debug.pause()
async with get_registry(addr) as reg_portal:
for accept_addr in accept_addrs:
accept_addr = wrap_address(accept_addr)
if not accept_addr.is_valid:
breakpoint()
await reg_portal.run_from_ns(
'self',
'register_actor',
uid=actor.uid,
addr=accept_addr.unwrap(),
)
is_registered: bool = True
# init steps complete
task_status.started()
# Begin handling our new connection back to our
# parent. This is done last since we don't want to
# start processing parent requests until our channel
# server is 100% up and running.
if actor._parent_chan:
await root_nursery.start(
partial(
_rpc.process_messages,
chan=actor._parent_chan,
shield=True,
)
)
log.runtime(
'Actor runtime is up!'
# 'Blocking on service nursery to exit..\n'
)
log.runtime(
"Service nursery complete\n"
"Waiting on root nursery to complete"
)
# Blocks here as expected until the root nursery is
# killed (i.e. this actor is cancelled or signalled by the parent)
except Exception as internal_err:
if not is_registered:
err_report: str = (
'\n'
"Actor runtime (internally) failed BEFORE contacting the registry?\n"
f'registrars -> {actor.reg_addrs} ?!?!\n\n'
'^^^ THIS IS PROBABLY AN INTERNAL `tractor` BUG! ^^^\n\n'
'\t>> CALMLY CANCEL YOUR CHILDREN AND CALL YOUR PARENTS <<\n\n'
'\tIf this is a sub-actor hopefully its parent will keep running '
'and cancel/reap this sub-process..\n'
'(well, presuming this error was propagated upward)\n\n'
'\t---------------------------------------------\n'
'\tPLEASE REPORT THIS TRACEBACK IN A BUG REPORT @ ' # oneline
'https://github.com/goodboy/tractor/issues\n'
'\t---------------------------------------------\n'
)
# TODO: I guess we could try to connect back
# to the parent through a channel and engage a debugger
# once we have that all working with std streams locking?
log.exception(err_report)
if actor._parent_chan:
await _rpc.try_ship_error_to_remote(
actor._parent_chan,
internal_err,
)
# always!
match internal_err:
case ContextCancelled():
log.cancel(
f'Actor: {actor.uid} was task-context-cancelled with,\n'
f'{internal_err}'
)
case _:
log.exception(
'Main actor-runtime task errored\n'
f'<x)\n'
f' |_{actor}\n'
)
raise internal_err
finally:
teardown_report: str = (
'Main actor-runtime task completed\n'
)
# ?TODO? should this be in `._entry`/`._root` mods instead?
#
# teardown any actor-lifetime-bound contexts
ls: ExitStack = actor.lifetime_stack
# only report if there are any registered
cbs: list[str] = [
repr(tup[1].__wrapped__)
for tup in ls._exit_callbacks
]
if cbs:
cbs_str: str = '\n'.join(cbs)
teardown_report += (
'-> Closing actor-lifetime-bound callbacks\n\n'
f'}}>\n'
f' |_{ls}\n'
f' |_{cbs_str}\n'
)
# XXX NOTE XXX this will cause an error which
# prevents any `infected_aio` actor from continuing
# and any callbacks in the `ls` here WILL NOT be
# called!!
# await debug.pause(shield=True)
ls.close()
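# (Purely illustrative) user code can schedule such
# actor-lifetime-bound teardown via the stdlib `ExitStack` API on
# the local actor, e.g. with a hypothetical cleanup fn:
#
#   from tractor import current_actor
#
#   def _remove_tmp_files() -> None:
#       ...
#
#   current_actor().lifetime_stack.callback(_remove_tmp_files)
#
# any callback registered this way runs in the `ls.close()` call
# above during final runtime teardown.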
# XXX TODO but hard XXX
# we can't actually do this bc the debugger uses the
# _service_n to spawn the lock task, BUT, in theory if we had
# the root nursery surround this finally block it might be
# actually possible to debug THIS machinery in the same way
# as user task code?
#
# if actor.name == 'brokerd.ib':
# with CancelScope(shield=True):
# await debug.breakpoint()
# Unregister actor from the registry-sys / registrar.
if (
is_registered
and not actor.is_registrar
):
failed: bool = False
for addr in actor.reg_addrs:
waddr = wrap_address(addr)
assert waddr.is_valid
with trio.move_on_after(0.5) as cs:
cs.shield = True
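# NOTE: shielding keeps this best-effort unregister RPC from
# being interrupted by any in-progress teardown cancellation
# while `move_on_after()` still bounds it to ~0.5s.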
try:
async with get_registry(
addr,
) as reg_portal:
await reg_portal.run_from_ns(
'self',
'unregister_actor',
uid=actor.uid
)
except OSError:
failed = True
if cs.cancelled_caught:
failed = True
if failed:
teardown_report += (
f'-> Failed to unregister {actor.name} from '
f'registrar @ {addr}\n'
)
# Ensure all peers (actors connected to us as clients) are finished
if (
(ipc_server := actor.ipc_server)
and
ipc_server.has_peers(check_chans=True)
):
teardown_report += (
f'-> Waiting for remaining peers {ipc_server._peers} to clear..\n'
)
log.runtime(teardown_report)
await ipc_server.wait_for_no_more_peers(
shield=True,
)
teardown_report += (
'-> All peer channels are complete\n'
)
teardown_report += (
'Actor runtime exiting\n'
f'>)\n'
f'|_{actor}\n'
)
log.info(teardown_report)
# TODO: rename to `Registry` and move to `.discovery._registry`!
class Arbiter(Actor):
'''
A special registrar (and for now..) `Actor` who can contact all
other actors within its immediate process tree and possibly keeps
a registry of others meant to be discoverable in a distributed
application. Normally the registrar is also the "root actor" and
thus always has access to the top-most-level actor (process)
nursery.
By default, the registrar is always initialized when and if no
other registrar socket addrs have been specified to runtime
init entry-points (such as `open_root_actor()` or
`open_nursery()`). Any time a new main process is launched (and
thus thus a new root actor created) and, no existing registrar
can be contacted at the provided `registry_addr`, then a new
one is always created; however, if one can be reached it is
used.
Normally a distributed app requires at least one registrar per
logical host where for that given "host space" (aka localhost
IPC domain of addresses) it is responsible for making all other
host (local address) bound actors *discoverable* to external
actor trees running on remote hosts.
'''
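# (Purely illustrative) typical user-side discovery against
# a registrar goes through the public `tractor.find_actor()` API;
# the 'data-feed' name below is hypothetical and this assumes an
# already-running root actor/registrar:
#
#   import tractor
#
#   async def query() -> None:
#       async with tractor.find_actor('data-feed') as portal:
#           if portal is None:
#               print('no such actor registered')
#           else:
#               ...  # issue RPCs through the portal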
is_arbiter = True
# TODO: implement this as a check for an existing `._state` of
# some sort, setup whenever we impl this all as
# a `.discovery._registry.open_registry()` API
def is_registry(self) -> bool:
return self.is_arbiter
def __init__(
self,
*args,
**kwargs,
) -> None:
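# map each registered actor's uid (name, uuid) to an advertised
# (unwrapped) transport addr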
self._registry: dict[
tuple[str, str],
UnwrappedAddress,
] = {}
self._waiters: dict[
str,
# either an event to sync to receiving an actor uid (which
            # is filled in once the actor has successfully registered),
# or that uid after registry is complete.
list[trio.Event | tuple[str, str]]
] = {}
super().__init__(*args, **kwargs)
async def find_actor(
self,
name: str,
) -> UnwrappedAddress|None:
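        # NOTE: registry keys are `(name, hash)` string pairs, so the
        # `in` check below matches on the name element of the uid.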
for uid, addr in self._registry.items():
if name in uid:
return addr
return None
async def get_registry(
self
) -> dict[str, UnwrappedAddress]:
'''
Return current name registry.
This method is async to allow for cross-actor invocation.
'''
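        # NOTE: normally invoked remotely by other actors (e.g. via
        # a portal to this registrar from `._discovery`), hence the
        # `async def` even though nothing here awaits.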
# NOTE: requires ``strict_map_key=False`` to the msgpack
        # unpacker since we have tuples as keys (note this makes the
        # arbiter susceptible to hashdos):
# https://github.com/msgpack/msgpack-python#major-breaking-changes-in-msgpack-10
return {
'.'.join(key): val
for key, val in self._registry.items()
}
async def wait_for_actor(
self,
name: str,
) -> list[UnwrappedAddress]:
'''
Wait for a particular actor to register.
This is a blocking call if no actor by the provided name is currently
registered.
'''
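        # e.g. a rough usage sketch from a peer actor (assuming it
        # already holds a portal to this registrar):
        #   addrs = await portal.run_from_ns('self', 'wait_for_actor', name='foo')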
addrs: list[UnwrappedAddress] = []
addr: UnwrappedAddress
mailbox_info: str = 'Actor registry contact infos:\n'
for uid, addr in self._registry.items():
mailbox_info += (
f'|_uid: {uid}\n'
f'|_addr: {addr}\n\n'
)
if name == uid[0]:
addrs.append(addr)
if not addrs:
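            # no actor by this name has registered yet; park the task
            # on an event which `register_actor()` sets once a matching
            # registration arrives.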
waiter = trio.Event()
self._waiters.setdefault(name, []).append(waiter)
await waiter.wait()
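            # once woken, the waiters-list for this name also contains
            # the uid(s) appended by `register_actor()`; resolve their
            # addresses from the registry.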
for uid in self._waiters[name]:
if not isinstance(uid, trio.Event):
addrs.append(self._registry[uid])
log.runtime(mailbox_info)
return addrs
async def register_actor(
self,
uid: tuple[str, str],
addr: UnwrappedAddress
) -> None:
uid = name, hash = (str(uid[0]), str(uid[1]))
waddr: Address = wrap_address(addr)
if not waddr.is_valid:
            # should never be an unset/0-port (dynamically OS-allocated)
            # address at registration time
await debug.pause()
self._registry[uid] = addr
# pop and signal all waiter events
events = self._waiters.pop(name, [])
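        # stash the uid under this name so any task blocked in
        # `wait_for_actor()` can resolve the address once woken below.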
self._waiters.setdefault(name, []).append(uid)
for event in events:
if isinstance(event, trio.Event):
event.set()
async def unregister_actor(
self,
uid: tuple[str, str]
) -> None:
uid = (str(uid[0]), str(uid[1]))
entry: tuple = self._registry.pop(uid, None)
if entry is None:
log.warning(f'Request to de-register {uid} failed?')