WIP: Create `examples/multihost/` script set to show some distributed cases! #3

Draft

goodboy wants to merge 305 commits from multihost_exs into aio_abandons

Author	SHA1	Message	Date
Jad Abou-Chakra	afb2501ceb	Decouple registry addresses from binding addresses	2025-03-23 00:58:51 -04:00
Tyler Goodlet	90f48512d1	Add in depth comment about module naming when used without pkg	2025-03-23 00:58:51 -04:00
Tyler Goodlet	87619e1b3f	Add a super naive multi-host-capable web-req proxier for @jc211	2025-03-23 00:58:51 -04:00
Tyler Goodlet	3d54885981	Continue supporting py3.11+ Apparently the only thing needing a guard was use of `asyncio.Queue.shutdown()` and the paired `QueueShutDown` exception? Cool.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	bd19942328	Bump up to `pytest>=8.3.5` to match "GH actions" Ensure it's only for the `--dev` optional deps.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	9919edc4bb	Mask top level import of `.hilevel` Since it isn't required until the landing of the new service-manager stuff in #12; was an oversight from commit `0607a31dddeba032a2cf7d9fe605edd9d7bb4846`.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	888a3ae760	Add `.runtime()`-emit to `._invoke()` to report final result msg in the child	2025-03-22 14:36:12 -04:00
Tyler Goodlet	68d71c2df1	Add `MsgStream._stop_msg` use new `PldRx` API In particular ensuring we use `ctx._pld_rx.recv_msg_nowait()` from `.receive_nowait()` (which is called from `.aclose()`) such that we ALWAYS (can) set the surrounding `Context._result/._outcome_msg` attrs on reception of a final `Return`!! This fixes a final stream-teardown-race-condition-bug where prior we normally didn't set the `Context._result/._outcome_msg` in such cases. This is precisely because `.receive_nowait()` only returns the `pld` and when called from `.aclose()` this value is discarded, meaning so is its boxing `Return` despite consuming it from the underlying `._rx_chan`.. Longer term this should be solved differently by ensuring such race cases are handled at a higher scope like inside `Context._deliver_msg()` or the `Portal.open_context()` enter/exit blocks? Add a detailed warning note and todos for all this around the special case block!	2025-03-22 14:36:12 -04:00
Tyler Goodlet	f0c5b6fb18	Add `Context._outcome_msg` use new `PldRx` API Such that any `Return` is always captured for each ctx instance and set in `._deliver_msg()` normally; ensures we can at least introspect for it when missing (like in a recently discovered stream teardown race bug). Yes this augments the already existing `._result` which is dedicated for the `._outcome_msg.pld` in the non-error case; we might want to see if there's a nicer way to directly proxy ref to that without getting the pre-pld-decoded `Raw` form with `msgspec`? Also use the new `ctx._pld_rx.recv_msg()` and drop assigning `pld_rx._ctx`.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	7d19c58373	Slight `PldRx` rework to simplify Namely renaming and tweaking the `MsgType` receiving methods, - `.recv_msg()` from what was `.recv_msg_w_pld()` which both receives the IPC msg from the underlying `._rx_chan` and then decodes its payload with `.decode_pld()`; it now also log reports on the different "stage of SC dialog protocol" msg types via a `match/case`. - a new `.recv_msg_nowait()` sync equivalent of ^ (was `.recv_pld_nowait()`) whose use was the source of a recently discovered bug where any final `Return.pld` is being consumed-n-discarded by `MsgStream.aclose()` depending on ctx/stream teardown race conditions.. Also, - remove all the "instance persistent" ipc-ctx attrs, specifically the optional `_ipc`, `_ctx` and the `.wraps_ipc()` cm, since none of them were ever really needed/used; all methods which require a `Context/MsgStream` are explicitly always passed. - update a buncha typing namely to use the more generic-styled `PayloadT` over `Any` and obviously `MsgType[PayloadT]`.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	830be005ea	Rename ext-types with `msgspec` suite module	2025-03-22 14:36:12 -04:00
Tyler Goodlet	5018284db2	Complete rename to parent->child IPC ctx peers Now changed in all comments docs and test-code content such that we aren't using the "caller"->"callee" semantics anymore.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	0a56f62748	Mk `tests/__init__.py`, not sure where it went? I must have had a local touched file but never committed or something? Seems that new `pytest` requires a top level `tests` pkg in order for relative `.conftest` imports to work.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	f999f8228a	Fix msg-draining on `parent_never_opened_stream`! Repairs a bug in `drain_to_final_msg()` where in the `Yield()` case block we weren't guarding against the `ctx._stream is None` edge case which should be treated as a `continue`-draining (not a `break` or attr-error!!) situation since the peer task may be continuing to send `Yield` but has not yet sent an outcome msg (one of `Return/Error/ContextCancelled`) to terminate the loop. Ensure we explicitly warn about this case as well as `.cancel()` emit on a taskc. Thanks again to @guille for discovering this! Also add temporary `.info()`s around rxed `Return` msgs as part of trying to debug a different bug discovered while updating the context-semantics test suite (in a prior commit).	2025-03-22 14:36:12 -04:00
Tyler Goodlet	87e04c9311	Extend ctx semantics suite for streaming edge cases! Muchas grax to @guilledk for finding the first issue which kicked off this further scrutiny of the `tractor.Context` and `MsgStream` semantics test suite with a strange edge case where, - if the parent opened and immediately closed a stream while the remote child task started and continued (without terminating) to send msgs the parent's `open_context().__aexit__()` would not block on the child to complete! => this was seemingly due to a bug discovered inside the `.msg._ops.drain_to_final_msg()` stream handling case logic where we are NOT checking if `Context._stream` is non-`None`! As such this, - extends the `test_caller_closes_ctx_after_callee_opens_stream` (now renamed, see below) to include cases for all combinations of the child and parent sending before receiving on the stream as well as all placements of `Context.cancel()` in the parent before, around and after the stream open. - uses the new `expect_ctxc()` for expecting the taskc (`trio.Task` cancelled) cases. - also extends the `test_callee_closes_ctx_after_stream_open` (also renamed) to include the case where the parent sends a msg before it receives. => this case has unveiled yet-another-bug where somehow the underlying `MsgStream._rx_chan: trio.ReceiveMemoryChannel` is allowing the child's `Return[None]` msg to be consumed and NOT in a place where it is correctly set as `Context._result` resulting in the parent hanging forever inside `._ops.drain_to_final_msg()`.. Alongside, - start renaming using the new "remote-task-peer-side" semantics throughout the test module: "caller" -> "parent", "callee" -> "child".	2025-03-22 14:36:12 -04:00
Tyler Goodlet	e7cc91763c	Deliver a `MaybeBoxedError` from `.expect_ctxc()` Just like we do from the `.devx._debug.open_crash_handler()`, this allows checking various attrs on the raised `ContextCancelled` much like `with pytest.raises() as excinfo:`.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	723a25b74d	Support `ctx: UnionType` annots for `@tractor.context` eps	2025-03-22 14:36:12 -04:00
Tyler Goodlet	49ecdc4d73	Avoid attr-err when `._ipc_msg==None` Seems this can happen in particular when we raise a `MessageTypeError` on the sender side of a `Context`, since there isn't any msg relayed from the other side (though i'm wondering if MTE should derive from RAE then considering this case?). Means `RemoteActorError.boxed_type = None` in such cases instead of raising an attr-error for the `None.boxed_type_str`.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	defae151ec	Facepalm, fix logic misstep on child side Namely that `add_hooks: bool` should be the same as on the rent side.. Also, just drop the now unused `iter_maybe_sends`. This makes the suite entire greeeeen btw, including the new sub-suite which i hadn't runt before Bo	2025-03-22 14:36:12 -04:00
Tyler Goodlet	c48d153375	Rework IPC-using `test_caps_based_msging` tests Namely renaming and massively simplifying it to a new `test_ext_types_over_ipc` which avoids all the wacky "parent dictates what sender should be able to send beforehand".. Instead keep it simple and just always try to send the same small set of types over the wire with expect-logic to handle each case, - use the new `dec_hook`/`ext_types` args to `mk_[co]dec()` routines for pld-spec ipc transport. - always try to stream a small set of types from the child with logic to handle the cases expected to error. Other, - draft a `test_pld_limiting_usage` to check runtime raising of bad API usage; haven't run it yet tho. - move `test_custom_extension_types` to top of mod so that the `enc/dec_nsp()` hooks can be reffed from test parametrizations. - comment out (and maybe remove) the old routines for `iter_maybe_sends`, `test_limit_msgspec`, `chk_pld_type`. XXX TODO, turns out the 2 failing cases from this suite have exposed an actual bug with `MsgTypeError` unpacking where the `ipc_msg=` input is being set to `None` ?? -> see the comment at the bottom of `._exceptions._mk_recv_mte()` which seems to describe the likely culprit?	2025-03-22 14:36:12 -04:00
Tyler Goodlet	123683d442	Raise RTE from `limit_plds()` on no `curr_ctx` Since it should only be used from within a `Portal.open_context()` scope, make sure the caller knows that! Also don't hide the frame in tb if the immediate function errors..	2025-03-22 14:36:12 -04:00
Tyler Goodlet	fbbecff394	Offer a `mods: list` to `dec_type_union()`; drop importing this-mod	2025-03-22 14:36:12 -04:00
Tyler Goodlet	9199913f70	Tweak type-error messages for when `ext_types` is missing	2025-03-22 14:36:12 -04:00
Tyler Goodlet	84be5cc549	Move `Union` serializers to new `msg.` mod Namely moving `enc/dec_type_union()` from the test mod to a new `tractor.msg._exts` for general use outside the test suite.	2025-03-22 14:36:12 -04:00
Tyler Goodlet	4a566546a3	Finally get type-extended `msgspec` fields workinn By using our new `PldRx` design we can, - pass through the pld-spec & a `dec_hook()` to our `MsgDec` which is used to configure the underlying `.dec: msgspec.msgpack.Decoder` - pass through a `enc_hook()` to `mk_codec()` and use it to conf the equiv `MsgCodec.enc` such that sent msg-plds are converted prior to transport. The trick ended up being just to always union the `mk_dec()` extension-types spec with the `msgspec.Raw` pld-spec such that the `dec_hook()` is only invoked for payload types tagged by the encoder/sender side B) A variety of impl tweaks to make it all happen as well as various cleanups in the `.msg._codec` mod include, - `mk_dec()` no default `spec` arg, better doc string, accept the new `ext_types` arg, doing the union of that with `msgspec.Raw`. - proto-ed a now unused `mk_boxed_ext_struct()` which will likely get removed since it ended up that our `PayloadMsg` structs already cover the ext-type-hook requirement that the decoder is passed a `.type=msgspec.Struct` of some sort in order for `.dec_hook` to be used. - add a `unpack_spec_types()` util fn for getting the `set[Type]` from a `Union[Type]` annotation instance. - mk the default `mk_codec(pc_pld_spec = Raw,)` since the `PldRx` design was already passing/overriding it and it doesn't make much sense to use `Any` anymore for the same reason; it will cause various `Context` apis to now break. \|_ also accept a `enc_hook()` and `ext_types` which are used to maybe config the `.msgpack.Encoder` - generally tweak a bunch of comments-as-docs and todos namely the ones that are completed after the pld-rx design was implemented. Also, - mask the non-functioning `'defstruct'` approach inside `.msg.types.mk_msg_spec()` to prep for its removal.
Adjust the test suite (rn called `test_caps_based_msging`), - add a new suite `test_custom_extension_types` and move and use the `enc/dec_nsp()` hooks to the mod level for its use. - prolly planning to drop the `test_limit_msgspec` suite since it's mostly replaced by the `test_pldrx_limiting` mod's version? - originally was tweaking a bunch in `test_codec_hooks_mod` but likely it will get mostly rewritten to be simpler and simply verify that ext-typed fields can be used over IPC `Context`s between actors (as originally intended for this sub-suite).	2025-03-22 14:36:12 -04:00
Tyler Goodlet	1c2e174406	Bump to `msgspec>=0.19.0` for py 3.13 support!	2025-03-22 14:33:46 -04:00
Tyler Goodlet	c19f6e3c6a	Bind another `_bexc` for debuggin	2025-03-22 14:32:27 -04:00
Tyler Goodlet	7e78223fb5	Mask ctlc borked REPL tests Namely the `tractor.pause_from_sync()` examples using both bg threads and `asyncio` which seem to go into bad states where SIGINT is ignored.. Deats, - add `maybe_expect_timeout()` cm to ensure the EOF hangs get `.xfail()`ed instead. - `@pytest.mark.ctlcs_bish` `test_pause_from_sync` and don't expect the greenback prompt msg. - also mark `test_sync_pause_from_aio_task`.	2025-03-22 14:32:27 -04:00
Tyler Goodlet	e313cb5e30	Repair/update `stackscope` test Seems that on 3.13 it's not showing our script code in the output now? Gotta get an example for @oremanj to see what's up but really it'd be nice to just custom format stuff above `trio`'s runtime by def.. Anyway, update the `.devx._stackscope`, - log formatting to be a little more "sclangy" lookin. - change the per-actor "delimiter" lines style. - report the `signal.getsignal(SIGINT)` which i needed in the `sync_bp.py` with ctl-c causing a hang.. - mask the `_tree_dumped` duplicator log report as well as the "dumped fine" one. - add an example `pkill --signal SIGUSR1` cmdline. Tweak the test to cope with, - not showing our script lines now.. which i've commented in the `assert_before()` patts.. - to expect the newly formatted delimiter (ascii) lines to separate the root vs. hanger sub-actor sections.	2025-03-22 14:32:27 -04:00
Tyler Goodlet	b9febe6826	Comment-tag pause points in `asyncio_bp.py` Thought i already did this but, obvi needed these to make the expect matches pass in our test.	2025-03-22 14:32:27 -04:00
Tyler Goodlet	92d07233b0	Unpack errors from `pdb.bdb` Like any `bdb.BdbQuit` that might be relayed from a remote context after a REPL exit with the `quit` cmd. This fixes various issues while debugging where it may not be clear to the parent task that the child was terminated with a purposefully unrecoverable error.	2025-03-22 14:32:27 -04:00
Tyler Goodlet	5ff2740b9d	Add a mark to `pytest.xfail()` questionably conc py stuff (ur mam `.xfail()`s bish!)	2025-03-22 14:32:27 -04:00
Tyler Goodlet	915b5a5a86	Show frames when decode is handed bad input	2025-03-22 14:32:27 -04:00
Tyler Goodlet	60eca816e7	Be extra sure to re-raise EoCs from translator That is whenever `trio.EndOfChannel` is raised (presumably from the `._to_trio.receive()` call inside `LinkedTaskChannel.receive()`) we need to be extra certain that we let it bubble upward transparently DESPITE special exc-as-signal handling that is normally suppressed from the aio side; REPEAT we want to ALWAYS bubble any `trio_err == trio.EndOfChannel` in the `finally:` handler of `translate_aio_errors()` despite `chan._trio_to_raise == AsyncioTaskExited` such that the caller's iterable machinery will operate as normal when the inter-task stream is stopped (again, presumably by the aio side task terminating the inter-task stream). Main impl deats for this, - in the EoC handler block ensure we assign both `chan._trio_err` and the local `trio_err` as well as continue to re-raise. - add a case to the match block in the `finally:` handler which FOR SURE re-raises any `type(trio_err) is EndOfChannel`! Additionally fix a bad bug, - a ref bug where we were NOT using the `except BaseException as _trio_err` to assign to `chan._trio_err` (by accident was missing the leading `_`..) Unrelated impl tweak, - move all `maybe_raise_aio_side_err()` content back to inline with its parent func - makes it easier to use `tractor.pause()` mostly Bp - go back to trying to use `aio_task.set_exception(aio_taskc)` for now even though i'm pretty sure we're going to move to a try-fute-first style helper for this in the future. Adjust some tests to match/mk-them-green, - break from `aio_echo_server()` recv loop on `to_asyncio.TrioTaskExited` much like how you'd expect to (implicitly with a `for`) with a `trio.EndOfChannel`. - toss in a masked `value is None` pause point i needed for debugging inf looping caused by not re-raising EoCs per the main patch description. - add a debug-mode sized delay to root-infected test.	2025-03-22 14:32:27 -04:00
Tyler Goodlet	ef96833d6c	Fix an `aio_err` ref bug	2025-03-22 14:32:27 -04:00
Tyler Goodlet	2078bea7f7	Another loosie in the trioisms suite	2025-03-22 14:32:27 -04:00
Tyler Goodlet	adcb0272e5	Match `maybe_open_crash_handler()` to non-maybe version Such that it will deliver a `BoxedMaybeException` to the caller regardless whether `pdb` is set, and proxy through all `**kwargs`.	2025-03-22 14:32:27 -04:00
Tyler Goodlet	058f8f4ef8	Use `collapse_eg()` in broadcaster suite Around the test embedded `trio.open_nursery()` calls as expected. Also tidy up the various nursery var names.	2025-03-22 14:32:27 -04:00
Tyler Goodlet	d874513448	Draft some eg collapsing helpers Inside a new `.trionics._beg` and exposed from the subpkg ns in anticipation of the `strict_exception_groups=False` being removed by `trio` in py 3.15. Notes, - mk an embedded single-exc "extractor" using a `BaseExceptionGroup.exceptions` length check, when 1 return the lone child. - use the above in a new `@acm`, async bc it's most likely to be composed in an `async with` tuple-style sequence block, called `collapse_eg()` which acts as a one line "absorber" for when the above mentioned flag is no longer supported by `trio.open_nursery()`. All untested atm fwiw.. but soon to be used in our test suite(s) likely!	2025-03-22 14:32:27 -04:00
Tyler Goodlet	b84088c364	Fix docs tests with yet another loosie-goosie So the KBI propagates up to the actor nursery scope and also avoid running any `examples/multihost/` subdir scripts.	2025-03-22 14:32:26 -04:00
Tyler Goodlet	1143dc2862	Another couple loose-ifies for discovery and advanced fault suites	2025-03-22 14:29:54 -04:00
Tyler Goodlet	4bbb1c363a	Add (masked) meta-debug-fixture for determining if `debug_mode` is set in harness..	2025-03-22 14:29:54 -04:00
Tyler Goodlet	7fb6e28307	Various test tweaks related to 3.13 egs Including changes like, - loose eg flagging in various test embedded `trio.open_nursery()`s. - changes to eg handling (like using `except*`). - added `debug_mode` integration to tests that needed some REPLin in order to figure out appropriate updates.	2025-03-22 14:29:54 -04:00
Tyler Goodlet	e8b78ae27a	Go to loose egs in `Actor` root & service nurseries (for now..)	2025-03-22 14:29:54 -04:00
Tyler Goodlet	36bca2844d	Fix `roundtripped` ref error in `validate_payload_msg()`	2025-03-22 14:29:54 -04:00
Tyler Goodlet	2008372693	Hide `open_nursery()` frame by def	2025-03-22 14:29:54 -04:00
Tyler Goodlet	0f103f49d4	Moar sclang log fmting tweaks	2025-03-22 14:29:54 -04:00
Tyler Goodlet	ea0643eab6	Add equiv of `AsyncioCancelled` for aio side Such that a `TrioCancelled` is raised in the aio task via `.set_exception()` to explicitly indicate and allow that task to handle a taskc request from the parent `trio.Task`.	2025-03-22 14:29:54 -04:00
Tyler Goodlet	985c5a4af7	More `debug_mode` test support, better nursery var names	2025-03-22 14:29:54 -04:00
Tyler Goodlet	08fa266de4	Add per-side graceful-exit/cancel excs-as-signals Such that any combination of task terminations/exits can be explicitly handled and "dual side independent" crash cases re-raised in egs. The main error-or-exit impl changes include, - use of new per-side "signaling exceptions": - TrioTaskExited\|TrioCancelled for signalling aio. - AsyncioTaskExited\|AsyncioCancelled for signalling trio. - NOT overloading the `LinkedTaskChannel._trio/aio_err` fields for err-as-signal relay and instead add a new pair of `._trio/aio_to_raise` maybe-exc-attrs which allow each side's task to specify what it would want the other side to raise to signal its/a termination outcome: - `._trio_to_raise: AsyncioTaskExited\|AsyncioCancelled` to signal, \|_ the aio task having returned while the trio side was still reading from the `asyncio.Queue` or is just not `.done()`. \|_ the aio task being self or trio-request cancelled where a `asyncio.CancelledError` is raised and caught but NOT relayed as is back to trio; instead signal a "more explicit" exc type. - `._aio_to_raise: TrioTaskExited\|TrioCancelled` to signal, \|_ the trio task having returned while the aio side was still reading from the mem chan and indicating that the trio side might not care any more about future streamed values (like the `Stop/EndOfChannel` equivs for ipc `Context`s). \|_ when the trio task is cancelled we do a `asyncio.Future.set_exception(TrioTaskExited())` to indicate to the aio side verbosely that it should cancel due to the trio parent. - `_aio/trio_err` are now left to only capturing the actual per-side task excs for introspection / other side's handling logic. - supporting "graceful exits" depending on API in use from `translate_aio_errors()` such that if either side exits but the other side isn't expected to consume the final `return`ed value, we just exit silently, which required: - adding a `suppress_graceful_exits: bool` flag.
- adjusting the `maybe_raise_aio_side_err()` logic to use that flag and suppress only on certain combos of `._trio_to_raise/._trio_err`. - prefer to raise `._trio_to_raise` when the aio-side is the src and vice versa. - filling out pedantic logging for cancellation cases indicating which side is the cause. - add a `LinkedTaskChannel._aio_result` modelled after our `Context._result` and a similar `.wait_for_result()` interface which allows maybe accessing the aio task's final return value if desired when using the `open_channel_from()` API. - rename `cancel_trio()` done handler -> `signal_trio_when_done()` Also some fairly major test suite updates, - add a `delay: int` producing fixture which delivers a much larger timeout whenever `debug_mode` is set so that the REPL can be used without a surrounding cancel firing. - add a new `test_aio_exits_early_relays_AsyncioTaskExited` including a paired `exit_early: bool` flag to `push_from_aio_task()`. - adjust `test_trio_closes_early_causes_aio_checkpoint_raise` to expect a `to_asyncio.TrioTaskExited`.	2025-03-22 14:29:54 -04:00
Tyler Goodlet	266d8e0feb	Expose `._state.debug_mode()` predicate at top level	2025-03-22 14:29:54 -04:00
Tyler Goodlet	04bc7cbfa4	Another loose-egs flag in `test_child_manages_service_nursery`	2025-03-22 14:29:54 -04:00
Tyler Goodlet	cd1628e3a3	Handle egs on failed `request_root_stdio_lock()` Namely when the subactor fails to lock the root, in which case we try to be very verbose about how/what failed in logging as well as ensure we cancel the employed IPC ctx. Implement the outer `BaseException` handler to handle both styles, - match on an eg (or the prior std cancel excs) only raising a lone sub-exc from the former. - always `as _req_err:` and assign to a new func-global `req_err` to enable the above matching. Other, - raise `DebugStateError` on `status.subactor_uid != actor_uid`. - fix a `_repl_fail_report` ref error due to making silly assumptions about the `_repl_fail_msg` global; now copy from global as default. - various log-fmt and logic expression styling tweaks. - ignore `trio.Cancelled` by default in `open_crash_handler()`.	2025-03-22 14:29:54 -04:00
Tyler Goodlet	3a9a15ceb2	A couple more loose-egs flag flips Namely inside, - `ActorNursery.open_portal()` which uses `.trionics.maybe_open_nursery()` and is now adjusted to pass-through `**kwargs` for at least this flag. - inside the `.trionics.gather_contexts()`.	2025-03-22 14:29:54 -04:00
Tyler Goodlet	a25f093ba5	Disable tb colors in `._testing.mk_cmd()` Unset the appropriate cpython osenv var such that our `pexpect` script runs in the test suite can maintain original matching logic.	2025-03-22 14:29:54 -04:00
Tyler Goodlet	18528dde33	Log format tweaks for sclang reprs A space here, a newline there..	2025-03-22 14:29:54 -04:00
Tyler Goodlet	747f89c3ef	Expose `hide_tb: bool` from `.open_nursery()` Such that it gets passed through to `.open_root_actor()` in the `implicit_runtime==True` case - useful for debugging cases where `.devx._debug` APIs might be used to avoid REPL clobbering in subactors.	2025-03-22 14:29:54 -04:00
Tyler Goodlet	1f951a94f3	Another `is` fix..	2025-03-22 14:29:54 -04:00
Tyler Goodlet	4de48972aa	Unset `$PYTHON_COLORS` for test debugger suite.. Since obvi all our `pexpect` patterns aren't going to match with a heck-ton of terminal color escape sequences in the output XD	2025-03-22 14:29:54 -04:00
Tyler Goodlet	de4c33d158	Flip to `strict_exception_groups=False` in core tns Since it'll likely need a bit of detailing to get the test suite running identically with strict egs (exception groups), i've opted to just flip the switch on a few core nursery scopes for now until such a time as i can focus enough to port the matching internals.. Xp	2025-03-22 14:29:54 -04:00
Tyler Goodlet	c6ef88a4b2	Clean up some imports in `._clustering`	2025-03-22 14:29:54 -04:00
Tyler Goodlet	9a44c67728	Drop `asyncio`-canc error from `._exceptions`	2025-03-22 14:29:54 -04:00
Tyler Goodlet	8573cd3263	Tweak some test asserts to better `is` style	2025-03-22 14:29:54 -04:00
Tyler Goodlet	97b3b98893	Bump various (dev) deps and prefer sys python Since it turns out there's a few gotchas moving to python 3.13, - we need to pin to new(er) `trio` which now flips to strict exception groups (something to be handled in a follow up patch). - since we're now using `uv` we should (at least for now) prefer the system `python` (over astral's distis) since they compile for `libedit` in terms of what the (new) `readline.backend: str` will read as; this will break our tab-completion and vi-mode settings in the `pdbp` REPL without a user configuring a `~/.editrc` appropriately. - go back to using latest `pdbp` (not a local dev version) since it should work fine presuming the previous bullet is addressed. Lock bumps, - for now use latest `trio==0.29.0` (which i gotta feeling might have broken some existing attempts at strict-eg handling i've tried..) - update to latest `xonsh`, `pdbp` and its dep `tabcompleter` Other cleaning, - put back in various deps "comments" from `poetry` content. - drop the `xonsh-vox` and `xontrib-vox` dev deps; no `vox` support with `uv` rn anyway..	2025-03-22 14:29:52 -04:00
Tyler Goodlet	b7aa72465d	Draft test-doc for "out-of-band" `asyncio.Task`.. Since there's no way to activate `greenback`'s portal in such cases, we should at least have a test verifying our very loud error about the inability to support this usage..	2025-03-22 14:24:53 -04:00
Tyler Goodlet	1ff79f86b7	Raise "independent" task errors in an eg The (rare) condition is heavily detailed in new comments in the `cancel_trio()` callback but, more or less the idea here is to be extra pedantic in raising an `ExceptionGroup` of errors from each task (both `asyncio` and `trio`) whenever the 2 tasks raise "independently" - in the sense that it's not obviously one side's task causing an error (or cancellation) in the other. In this case we set the error for each side on the `LinkedTaskChannel` (via new attrs described later). As a synopsis, most of this work was refined out of supporting `infected_aio=True` mode in the root actor and in particular as part of getting that to work inside the `modden` daemon which at the time of writing was still using the `i3ipc` lib and thus `asyncio`. Impl deats, - extend the `LinkedTaskChannel` field/API set (and type it), - `._trio_task: trio.Task` for test/user introspection. - also "stage" some ideas for a more refined interface, - `.started()` to deliver the value yielded to the `trio.Task` parent. \|_ also includes some todos for how to implement this design underneath. - `._aio_first: Any\|None = None` to hold that value ^. - `.wait_aio_complete()` for syncing to the asyncio task. - some detailed logging around "asyncio cancelled trio" case. - Move `AsyncioCancelled` in this module. Styling changes, - generally more explicit var naming. - some todos for getting modern and fancy with typing.. NB, Let it be known this commit msg was written on a friday with the help of various "mr. white" solns.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	f26d487000	Add a `tests/test_root_infect_asyncio` Might as well break apart the specific test set since there are some (minor) subtleties and the orig test mod is already getting pretty big XD Includes both the new "independent"-event-loops test as well as the std usage base case suite.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	1075ea3687	Impl a proto "unmasker" `@acm` alongside our test Such that the suite verifies the wip `maybe_raise_from_masking_exc()` will raise from a `trio.Cancelled.__context__` since I can't think of any reason a `Cancelled` should ever be raised in-place of a non-`Cancelled` XD Not sure what should be raised instead (or maybe just a `log.warning()` emitted?) but this starts a draft for refinement at the least. Use the new `@pytest.mark.parametrize` explicit tuple-of-params form with a `pytest.param` + `.mark.xfail()` for the default behaviour case.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	2bd4cc9727	Add a "raise-from-`finally:`" example test Since i wasted 2 days just to find an example of this inside an `@acm`, figured I better reproduce for the purposes of maybe implementing a warning sys (inside our wip proto `open_taskman()`) when a nursery detects a single `Cancelled` in an eg where the `.__context__` is set to some non-cancel error (which likely means a cancel-causing source exception was suppressed by accident). Left in a buncha commented code using `maybe_open_nursery()` which i thought might be part of the issue but didn't end up being required; will likely remove on a follow up refinement.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	a60837550e	Yield a boxed-maybe-error from `open_crash_handler()` Along the lines of something like `pytest.raises()` where the handled exception can be inspected from the `pdbp` REPL using its `.value` field B) This is super handy in particular for understanding `BaseException[Group]`s without manually adding surrounding handler code to assign the `except[*] Exception as exc_var:` particularly when trying to understand multi-cancelled eg trees.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	72035a20d7	Add an inter-leaved-task error test Trying to replicate cases where errors are raised in both `trio` and `asyncio` tasks independently (at least in `.to_asyncio` API terms) with a new `test_trio_prestarted_task_bubbles` that generates 3 cases inside a `@acm` calls stack composing a `trio.Nursery` with a `to_asyncio.open_channel_from()` call where a set of `trio` tasks are started in a loop using `.start()` with various exc raising sequences, - the aio task raising before the last `trio` task spawns. - the aio task raising just after the last trio task spawns, but before it starts. - after the last trio task `.start()` call returns control to the parent - but (for now) did not error. TODO, still more cases to discover as i'm still fighting a `modden` bug of this sort atm.. Other, - tweak some other tests to have timeouts since some recent hangs were found.. - started mucking with py3.13 and thus adjustments for strict egs in some tests; full patchset to test suite likely coming soon!	2025-03-22 14:24:53 -04:00
Tyler Goodlet	32e760284f	Hm, `asyncio.Task._fut_waiter.set_exception()`? Since we can't use it to `Task.set_exception()` (since that task method never seems to work.. XD) and setting the private/internal always seems to do the desired raising in the task? I realize it's an internal `asyncio` runtime field but i'd rather take the risk of it breaking than having to rely on our own equivalent hack.. Also, it seems like the case where the task's associated (and internal) future-waiter field is null, we won't run into the (same?) prior hanging issues (maybe since there's nothing for `asyncio` internals to use to wait XD ??) when `Task.cancel()` is used..?? Main deats, - add and `Future.set_exception()` a new signal-exception `class TrioTaskExited(AsyncioCancelled):` whenever the trio-task exits gracefully and the asyncio-side task is still doing blocking work (of some sort) which seem to be predicated by a check that `._fut_waiter is not None`. - always call `asyncio.Queue.shutdown()` for the same^ as well as whenever we decide to call `Task.cancel()`; in that case the shutdown relays correctly? Some further refinements, - only warn about `Task.cancel()` usage when actually used ;) - more local scope vars setting in the exit phase of `translate_aio_errors()`. - also in ^ use explicit caught-exc var names for each error-type.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	14fb56329b	Much more limited `asyncio.Task.cancel()` use Since it can not only cause the guest-mode run to abandon but also in some edge cases prevent `trio`-errors from propagating (at least on py3.12-13?) as discovered as part of supporting this mode officially in the root actor. As such try to avoid that method as much as possible instead opting to pass the `trio`-side error via the iter-task channel ref. Deats, - add a `LinkedTaskChannel._trio_err: BaseException\|None` which gets set whenever the `trio.Task` error is caught; ONLY set `AsyncioCancelled` when the `trio` task was for sure the cause, whether itself cancelled or errored. - always check for this error when exiting the `asyncio` side (even when terminated via a call to `asyncio.Task.cancel()` or during any other `CancelledError` handling) such that the `asyncio`-task can expect to handle `AsyncioCancelled` due to the above^^ cases. - never `cs.cancel()` the `trio` side unless that cancel scope has not yet been `.cancel_called` whatsoever; it's a noop anyway. - only raise any exc from `asyncio.Task.result()` when `chan._aio_err` does not already match it since the existence of the pre-existing `task_err` means `asyncio` prolly intends (or has already) raised and interrupted the task elsewhere. Various supporting tweaks, - don't bother maybe-init-ing `greenback` from the actor entrypoint since we already need to (and do) bestow the portals to each `asyncio` task spawned using the `run_task()`/`open_channel_from()` API; further the init-ing should be done already by client code that enables infected mode (even in the root actor). \|_we should prolly also codify it from any `run_daemon(infected_aio=True, debug_mode=True)` usage we offer. - pass all the `_<field>`s to `LinkedTaskChannel` explicitly in named kwarg style. - better sclang-style log reports throughout, particularly on teardowns. - generally more/better comments and docs around (not well understood) edge cases. 
- prep to just inline `maybe_raise_aio_side_err()` closure..	2025-03-22 14:24:53 -04:00
Tyler Goodlet	46f644e748	Expose `debug_filter` from `open_root_actor()` also Such that actor-runtime graceful cancel handling can be used throughout any process tree.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	cdd0c5384a	Drop extra nl from boxed error fmt	2025-03-22 14:24:53 -04:00
Tyler Goodlet	1afef149d4	Raise explicitly on missing `greenback` portal When `.pause_from_sync()` is called from an `asyncio.Task` which was never bestowed a portal we want to be mega pedantic about it; indicate that the task was NOT spawned from our `.to_asyncio` API and likely by some out-of-our-control code (normally using `asyncio.ensure_future()/.create_task()`). Though `greenback` already errors on such usage, it's not always clear why no portal exists; explaining the situation of a 3rd-party-bg-spawned-task should avoid dev confusion for most cases. Impl deats, - distinguish between an actor in infected mode versus the actual caller of `.pause_from_sync()` being an `asyncio.Task` with more explicit `asyncio_task` and `is_infected_aio` vars. - ONLY in the case of being both an infected-mode-actor AND detecting that the caller is an `asyncio.Task`, check `greenback.has_portal()` such that when not bestowed we presume the aforementioned 3rd-party-bg-task case above and raise a new explicit RTE with a detailed explanatory message. - add some masked draft code for handling the special case of a root actor `asyncio.Task` caller which could (in theory) not actually require gb portal use since the `Lock` can be acquired directly without IPC. \|_this will likely require factoring of various pause machinery funcs into a `_pause_from_root_task()` to mk the impl sane XD Other, - expose a new `debug_filter: Callable` which can be provided by the caller of `_maybe_enter_pm()` to predicate whether to enter the debugger REPL based on the caught `BaseException\|BaseExceptionGroup`; this is handy for customizing the meaning of "graceful cancellations" so as to avoid crash handling on expected egs of more than `trio.Cancelled`. 
\|_ make the default as it was implemented: `not is_multi_cancelled(err)` - pass-through a new `ignore: set[BaseException]` as `open_crash_handler(ignore_nested=ignore)` to allow for the same silent-cancellation-egs-swallowing as desired from outside the actor runtime.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	11d4c83aed	Accept err-type override in `is_multi_cancelled()` Such that equivalents of `trio.Cancelled` from other runtimes such as `asyncio.CancelledError` and `subprocess.CalledProcessError` (with a `.returncode == -2`) can be gracefully ignored as needed by the caller. For example this is handy if you want to avoid debug-mode REPL entry on an exception-group full of only some subset of exception types since you expect certain tasks to raise such errors after having been cancelled by a request from some parent supervision sys (some "higher up" `trio.CancelScope`, a remote triggered `ContextCancelled` or just from an OS SIGINT). Impl deats, - offer a new `ignore_nested: set[BaseException]` param which by default we add `trio.Cancelled` to when no other types are provided. - use `ExceptionGroup.subgroup(tuple(ignore_nested))` to filter to egs of the "ignored sub-errors set" and return any such match (instead of `True`). - detail a comment on exclusion case.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	72fc6fce24	Support passing pre-conf-ed `Logger` Such that we can hook into 3rd-party-libs more easily to monkey them and use our (prettier/hipper) console logging with something like (an example from the client project `modden`), ```python connection_mod = i3ipc.connection tractor_style_i3ipc_logger: logging.LoggerAdapter = tractor.log.get_console_log( _root_name=connection_mod.__name__, logger=connection_mod.logger, level='info', ) # monkey the instance-ref in 3rd-party module connection_mod.logger = tractor_style_i3ipc_logger ``` Impl deats, - expose as `get_console_log(logger: logging.Logger)` and add default failover logic. - toss in more typing, also for mod-global instance.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	4a195eef4c	Support and test infected-`asyncio`-mode for root Such that you can use, ```python tractor.to_asyncio.run_as_asyncio_guest( trio_main=_trio_main, ) ``` to bootstrap the root actor (and thus main parent process) to embed the actor-runtime into an `asyncio` loop. Prove it all works with a subactor-free version of the aio echo-server test suite B)	2025-03-22 14:24:53 -04:00
Tyler Goodlet	a5b8e009fd	TOSQUASH: `9002f60` howtorelease.md file	2025-03-22 14:24:53 -04:00
Tyler Goodlet	ddf6222eb6	Draft a (pretty)`Struct.fields_diff()` For comparing a `msgspec.Struct` against an input `dict` presumably to be used as input for struct instantiation. The main diff with `.__sub__()` is that non-existing fields on either are reported (loudly).	2025-03-22 14:24:53 -04:00
Tyler Goodlet	9412745aaf	Spitballing how to expose custom `msgspec` type hooks Such that maybe we can eventually offer a nicer higher-level API which implements much of the boilerplate required by `msgspec` (like type-matched branching to serialization logic) via a type-table interface or something? Not sure if the idea is that useful so leaving it all as TODOs for now obviously.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	4a5ab155e2	Add `notes_to_self/howtorelease.md` reminder doc	2025-03-22 14:24:53 -04:00
Tyler Goodlet	526187d1a0	Add TODO for a runtime-vars passing mechanism	2025-03-22 14:24:53 -04:00
Tyler Goodlet	c738f8b540	Change masked `.pause()` line	2025-03-22 14:24:53 -04:00
Tyler Goodlet	962941c56c	Type the inter-loop chans	2025-03-22 14:24:53 -04:00
Tyler Goodlet	b692979dda	Add TODO for a tb frame "filterer" sys..	2025-03-22 14:24:53 -04:00
Tyler Goodlet	5fcb46bbb9	Set `RemoteActorError.pformat(boxer_header=self.relay_uid)` by def	2025-03-22 14:24:53 -04:00
Tyler Goodlet	ec6b2e8738	Support custom `boxer_header: str` provided by `pformat_boxed_tb()` caller	2025-03-22 14:24:53 -04:00
Tyler Goodlet	e1575051f0	Expose a `_ctlc_ignore_header: str` for use in `sigint_shield()`	2025-03-22 14:24:53 -04:00
Tyler Goodlet	5f8ec63b0c	Change `tractor.breakpoint()` to new `.pause()` in test suite	2025-03-22 14:24:53 -04:00
Tyler Goodlet	a356233b47	Wrap `asyncio_bp.py` ex into test suite Ensuring we can at least use `breakpoint()` from an infected actor's `asyncio.Task` spawned via a `.to_asyncio` API. Also includes a little `tests/devx/` reorging, - start splitting out non-`tractor.pause()` tests into a new `test_pause_from_non_trio.py` for all the `.pause_from_sync()` use in bg-threaded or `asyncio` applications. - factor harness commonalities to the `devx/conftest` (namely the `do_ctlc()` masher). - mv `test_pause_from_sync` to the new non`-trio` mod. NOTE, the `ctlc=True` is still failing for `test_pause_from_asyncio_task` which is a user-happiness bug but not anything fundamentally broken - just need to handle the `asyncio` case in `.devx._debug.sigint_shield()`!	2025-03-22 14:24:53 -04:00
Tyler Goodlet	9af6271e99	Add `breakpoint()` hook restoration example + test	2025-03-22 14:24:53 -04:00
Tyler Goodlet	36021d1f2b	Rename `n: trio.Nursery` -> `tn` (task nursery)	2025-03-22 14:24:53 -04:00
Tyler Goodlet	7443e387b5	Messy-teardown `DebugStatus` related fixes Mostly fixing edge cases with `asyncio` and/or bg threads where the `.repl_release: trio.Event` needs to be used from the main `trio` thread OW confusing-but-valid teardown tracebacks can show under various races. Also improve, - log reporting for such internal bugs to make them more obvious on console via `log.exception()`. - only restore the SIGINT handler when runtime is (still) active. - reporting when `tractor.pause(shield=True)` should be used and unhiding the internal frames from the tb in that case. - for `pause_from_sync()` some deep fixes.. \|_add a `allow_no_runtime: bool = False` flag to allow not requiring the actor runtime to be active. \|_fix the `greenback` case-branch to only trigger on `not is_trio_thread`. \|_add a scope-global `repl_owner: Task\|Thread\|None = None` to avoid ref errors..	2025-03-22 14:24:53 -04:00
Tyler Goodlet	d9662d9b34	More `.pause_from_sync()` in bg-threads "polish" Various `try`/`except` blocks around external APIs that raise when not running inside a `tractor` runtime and/or some async framework (mostly to avoid too-late/benign error tbs on certain classes of actor tree teardown): - for the `log.pdb()` prompts emitted before REPL console entry. - inside `DebugStatus.is_main_trio_thread()`'s call to `sniffio`. - in `_post_mortem()` by catching `NoRuntime` when called from a thread still active after the `.open_root_actor()` has already exited. Also, - create a dedicated `DebugStateError` for raising instead of `assert`s when we have actual debug-request inconsistencies (as seem to be most likely with bg thread usage of `breakpoint()`). - show the `open_crash_handler()` frame on `bdb.BdbQuit` (for now?)	2025-03-22 14:24:53 -04:00
Tyler Goodlet	84dbf53817	Hide `[maybe]_open_crash_handler()` frame by default	2025-03-22 14:24:53 -04:00
Tyler Goodlet	e898a41e22	Use our `._post_mortem` from `open_crash_handler()` Since it seems that `pdbp.xpm()` can sometimes lose the up-stack traceback info/frames? Not sure why but ours seems to work just fine from an `asyncio`-handler in `modden`'s use of `i3ipc` B) Also call `DebugStatus.shield_sigint()` from `pause_from_sync()` in the infected-`asyncio` case to get the same shielding behaviour as in all other usage!	2025-03-22 14:24:53 -04:00
Tyler Goodlet	46c9ee2551	Drop `asyncio_bp` loglevel setting by default	2025-03-22 14:24:53 -04:00
Tyler Goodlet	e7adeee549	First draft, `asyncio`-task, sync-pausing Bo Mostly due to magic from @oremanj where we slap in a little bit of `.from_asyncio`-type stuff to run a `trio`-task from `asyncio.Task` code! I'm not gonna go into tooo too much detail but basically the primary thing needed was a way to (blocking-ly) invoke a `trio.lowlevel.Task` from an `asyncio` one (which we now have with a new `run_trio_task_in_future()` thanks to draft code from the aforementioned jefe) which we now invoke from a dedicated aio case-branch inside `.devx._debug.pause_from_sync()`. Further include a case inside `DebugStatus.release()` to handle using the same func to set the `repl_release: trio.Event` from the aio side when releasing the REPL on exit cmds. Prolly more refinements to come ;{o	2025-03-22 14:24:53 -04:00
Tyler Goodlet	e10616fa4d	Fix multi-daemon debug test `break` signal.. It was expecting `AssertionError` as a proceed-in-test signal (by breaking from a continue loop), but `in_prompt_msg(raise_on_err=True)` was changed to raise `ValueError`; so instead just use as a predicate for the `break`. Also rework `in_prompt_msg()` to accept the `child: BaseSpawn` as input instead of `before: str`, remove the casting boilerplate, and adjust all usage to match.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	f24e6f6e48	Use "sclang"-style syntax in `to_asyncio` task logging Just like we've started doing throughout the rest of the actor runtime for reporting (and where "sclang" = "structured conc (s)lang", our little supervision-focused operations syntax i've been playing with in log msg content). Further tweaks: - report the `trio_done_fute` alongside the `main_outcome` value. - add a todo list for supporting `greenback` for pause points.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	aac013ae5c	Pass `infect_asyncio` setting via runtime-vars The reason for this "duplication" with the `--asyncio` CLI flag (passed to the child during spawn) is 2-fold: - allows verifying inside `Actor._from_parent()` that the `trio` runtime was started via `.start_guest_run()` as well as if the `Actor._infected_aio` spawn-entrypoint value has been set (by the `._entry.<spawn-backend>_main()` whenever `--asyncio` is passed) such that any mismatch can be signaled via an `InternalError`. - enables checking the `._state._runtime_vars['_is_infected_aio']` value directly (say from a non-actor/`trio`-thread) instead of calling `._state.current_actor(err_on_no_runtime=False)` in certain edge cases. Impl/testing deats: - add `._state._runtime_vars['_is_infected_aio'] = False` default. - raise `InternalError` on any `--asyncio`-flag-passed vs. `_runtime_vars`-value-relayed-from-parent inside `Actor._from_parent()` and include a `Runner.is_guest` assert for good measure B) - set and relay `infect_asyncio: bool` via runtime-vars to child in `ActorNursery.start_actor()`. - verify `actor.is_infected_aio()`, `actor._infected_aio` and `_state._runtime_vars['_is_infected_aio']` are all set in test suite's `asyncio_actor()` endpoint.	2025-03-22 14:24:53 -04:00
Tyler Goodlet	ccbd35f273	Officially test proto-ed `stackscope` integration By re-purposing our `pexpect`-based console matching with a new `debugging/shield_hang_in_sub.py` example, this tests a few "hanging actor" conditions more formally: - that despite a hanging actor's task we can dump a `stackscope.extract()` tree on relay of `SIGUSR1`. - the actor tree will terminate despite a shielded forever-sleep by our "T-800" zombie reaper machinery activating and hard killing the underlying subprocess. Some test deats: - simulates the expected actions of a real user by manually using `os.kill()` to send both signals to the actor-tree program. - `pexpect`-matches against `log.devx()` emissions under normal `debug_mode == True` usage. - ensure we get the actual "T-800 deployed" `log.error()` msg and that the actor tree eventually terminates! Surrounding (re-org/impl/test-suite) changes: - allow disabling usage via a `maybe_enable_greenback: bool` to `open_root_actor()` but enable by def. - pretty up the actual `.devx()` content from `.devx._stackscope` including being extra pedantic about the conc-primitives for each signal event. - try to avoid double handles of `SIGUSR1` even though it seems the original (what i thought was a) problem was actually just double logging in the handler.. \|_ avoid double applying the handler func via `signal.signal()`, \|_ use a global to avoid double handle func calls and, \|_ a `threading.RLock` around handling. - move common fixtures and helper routines from `test_debugger` to `tests/devx/conftest.py` and import them for use in both test mods.	2025-03-22 14:24:51 -04:00
Tyler Goodlet	346e009730	Start a new `tests/devx/` tooling-subsuite-pkg	2025-03-22 14:24:01 -04:00
Tyler Goodlet	4ada92d2f7	Move `mk_cmd()` to `._testing` Since we're going to need it more generally for `.devx` sub-sys tooling tests. Also, up the sync-pause ctl-c delay another 10ms..	2025-03-22 14:23:58 -04:00
Tyler Goodlet	5cdd012417	Get multi-threaded sync-pausing fully workin! The final issue was making sure we do the same thing on ctl-c/SIGINT from the user. That is, if there's already a bg-thread in REPL, we `log.pdb()` about SIGINT shielding and re-draw the prompt; the same UX as normal actor-runtime-task behaviour. Reasons this wasn't workin.. and the fix: - `.pause_from_sync()` was overriding the local `repl` var with `None` delivered by (transitive) calls to `_pause(debug_func=None)`.. so remove all that and only assign it OAOO prior to thread-type case branching. - always call `DebugStatus.shield_sigint()` as needed from all requesting threads/tasks: - in `_pause_from_bg_root_thread()` BEFORE calling `._pause()` AND BEFORE yielding back to the bg-thread via `.started(out)` to ensure we're definitely overriding the handler in the `trio`-main-thread task before unblocking the requesting bg-thread. - from any requesting bg-thread in the root actor such that both its main-`trio`-thread scheduled task (as per above bullet) AND it are SIGINT shielded. - always call `.shield_sigint()` BEFORE any `greenback._await()` case (don't entirely grok why yet, but it works)? - for `greenback._await()` case always set `bg_task` to the current one.. - tweaks to the `SIGINT` handler, now renamed `sigint_shield()` so as not to name-collide with the methods when editor-searching: - always try to `repr()` the REPL thread/task "owner" as well as the active `PdbREPL` instance. - add `.devx()` notes around the prompt flushing deats and comments for any root-actor-bg-thread edge cases. Related/supporting refinements: - add `get_lock()`/`get_debug_req()` factory funcs since the plan is to eventually implement both as `@singleton` instances per actor. - fix `acquire_debug_lock()`'s call-sig-bug for scheduling `request_root_stdio_lock()`.. - in `._pause()` only call `mk_pdb()` when `debug_func != None`. 
- add some todo/warning notes around the `cls.repl = None` in `DebugStatus.release()` `test_pause_from_sync()` tweaks: - don't use a `attach_patts.copy()`, since we always `break` on match. - do `pytest.fail()` on that ^ loop's fallthrough.. - pass `do_ctlc(child, patt=attach_key)` such that we always match the current thread's name with the ctl-c triggered `.pdb()` emission. - oh yeah, return the last `before: str` from `do_ctlc()`. - in the script, flip `abandon_on_cancel=True` since when `False` it seems to cause `trio.run()` to hang on exit from the last bg-thread case?!?	2025-03-22 14:22:33 -04:00
Tyler Goodlet	701dd135eb	Another tweak to REPL entry `.pdb()` headers	2025-03-22 14:22:33 -04:00
Tyler Goodlet	060ee1457e	More failed REPL-lock-request refinements In `lock_stdio_for_peer()` better internal-error handling/reporting: - only `Lock._blocked.remove(ctx.cid)` if that same cid was added on entry to avoid needless key-errors. - drop all `Lock.release(force: bool)` usage remnants. - if `req_ctx.cancel()` fails mention it with `ctx_err.add_note()`. - add more explicit internal-failed-request log messaging via a new `fail_reason: str`. - use new `x)<=\n\|_` annots in any failure logging. Other cleanups/niceties: - drop `force: bool` flag entirely from the `Lock.release()`. - use more supervisor-op-annots in `.pdb()` logging with both `_pause/crash_msg: str` instead of double '\|' lines when `.pdb()`-reported from `._set_trace()`/`._post_mortem()`.	2025-03-22 14:22:20 -04:00
Tyler Goodlet	32e12c8b03	Todo a test for sync-pausing from non-main-root-tasks	2025-03-22 14:22:06 -04:00
Tyler Goodlet	50ba23e602	Use `delay=0` in pump loop.. Turns out it does work XD Prior presumption was from before I had the fute poll-loop so makes sense we needed more than one sched-tick's worth of context switch vs. now we can just keep looping-n-pumping as fast as possible until the guest-run's main task completes. Also, - minimize the preface commentary (as per todo) now that we have tests codifying all the edge cases :finger_crossed: - parameter-ize the pump-loop-cycle delay and default it to 0.	2025-03-22 14:21:53 -04:00
Tyler Goodlet	ddbda17338	Solve our abandonment issues.. To make the recent set of tests pass this (hopefully) finally solves all `asyncio` embedded `trio` guest-run abandonment by ensuring we "pump the event loop" until the guest-run future is fully complete. Accomplished via simple poll loop of the form `while not trio_done_fut.done(): await asyncio.sleep(.1)` in the `aio_main()` task's exception teardown sequence. The loop does a naive 100ms "pump-via-sleep & poll" for the `trio` side to complete before finally exiting (and presumably raising) from the SIGINT cancellation. Other related cleanups and refinements: - use `asyncio.Task.result()` inside `cancel_trio()` since it also inline-raises any exception outcome and we can also log-report the result in non-error cases. - comment out buncha not-sure-we-need-it stuff in `cancel_trio()`. - remove the botched `AsyncioCancelled(CancelledError):` idea obvi XD - comment `greenback` init for now in `aio_main()` since (pretty sure) we don't ever want to actually REPL in that specific func-as-task? - always capture any `fute_err: BaseException` from the `main_outcome: Outcome` delivered by the `trio` side guest-run task. - add and raise a new super noisy `AsyncioRuntimeTranslationError` whenever we detect that the guest-run `trio_done_fut` has not completed before task exit; should avoid abandonment issues ever happening again without knowing!	2025-03-22 14:20:39 -04:00
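The "pump-via-sleep & poll" loop from `ddbda17338` can be tried out in isolation with plain `asyncio`. A rough sketch only: `pump_until_done()` is a hypothetical name, and the future resolved by `call_later()` merely stands in for the `trio_done_fut` that the real guest-run would complete.

```python
import asyncio


async def pump_until_done(
    done_fut: asyncio.Future,
    delay: float = 0.1,
) -> int:
    '''
    Keep yielding to the event loop until `done_fut` completes;
    deliver the number of pump cycles that were needed.

    '''
    cycles: int = 0
    while not done_fut.done():
        # each sleep hands control back to the loop so that
        # other callbacks (here the "guest run") can progress.
        await asyncio.sleep(delay)
        cycles += 1

    return cycles


async def main() -> tuple[int, str]:
    loop = asyncio.get_running_loop()
    fut: asyncio.Future = loop.create_future()

    # stand-in for the guest-run finishing some time later
    loop.call_later(0.05, fut.set_result, 'trio-side done')

    cycles: int = await pump_until_done(fut, delay=0.01)
    return cycles, fut.result()


cycles, result = asyncio.run(main())
```

The point of the loop is only that the waiting task never exits (and so never lets the loop close) while the future is still pending.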
Tyler Goodlet	199247309e	Demo-abandonment on shielded `trio`-side work Finally this reproduces the issue as it (originally?) exhibited inside `piker` where the `Actor.lifetime_stack` wasn't closed in cases where during `infected_aio`-actor cancellation/shutdown `trio` side tasks which are doing shielded (teardown) work are NOT being watched/waited on from the `aio_main()` task-closure inside `run_as_asyncio_guest()`! This is then the root cause of the guest-run being abandoned since if our `aio_main()` task-closure doesn't know it should allow the run to finish, it's going to call `loop.close()` eventually resulting in the `GeneratorExit` thrown into `trio._core._run.unrolled_run()`.. So, this extends the `test_sigint_closes_lifetime_stack()` suite to include cases for such shielded `trio`-task ops: - add a new `trio_side_is_shielded: bool` which will toggle whether to add a shielded 0.5s `trio.sleep()` loop to `manage_file()` which should outlive the `asyncio` event-loop shutdown sequence and result in an abandoned guest-run and thus a leaked file. - parametrize the existing suite with this case resulting in a total 16 test set B) This patch demonstrates the problem with our `aio_main()` task-closure impl via the now 4 failing tests, a fix is coming in a follow up commit!	2025-03-22 14:20:39 -04:00
Tyler Goodlet	10558b0986	Lel, revert `AsyncioCancelled` inherit, module.. Turns out it somehow breaks our `to_asyncio` error relay since obvi `asyncio`'s runtime seems to specially handle it (prolly via `isinstance()` ?) and it caused our `test_aio_cancelled_from_aio_causes_trio_cancelled()` to hang.. Further, obvi `unpack_error()` won't be able to find the type def if not kept inside `._exceptions`.. So given all that, revert the change/move as well as: - tweak the aio-from-aio cancel test to timeout. - do `trio.sleep()` conc with any bg aio task by moving out nursery block. - add a `send_sigint_to: str` parameter to `test_sigint_closes_lifetime_stack()` such that we test the SIGINT being relayed to just the parent or the child.	2025-03-22 14:20:38 -04:00
Tyler Goodlet	eaa5d23543	Hack `asyncio` to not abandon a guest-mode run? Took me a while to figure out what the heck was going on but, turns out `asyncio` changed their SIGINT handling in 3.11 as per: https://docs.python.org/3/library/asyncio-runner.html#handling-keyboard-interruption I'm not entirely sure if it's the 3.11 changes or possibly wtv further updates were made in 3.12 but more or less due to the way our current main task was written the `trio` guest-run was getting abandoned on SIGINTs sent from the OS to the infected child proc.. Note that much of the bug and soln cases are laid out in very detailed comment-notes both in the new test and `run_as_asyncio_guest()`, right above the final "fix" lines. Add new `test_infected_aio.test_sigint_closes_lifetime_stack()` test suite which reliably triggers all abandonment issues with multiple cases of different parent behaviour post-sending-SIGINT-to-child: 1. briefly sleep then raise a KBI in the parent which was originally demonstrating the file leak not being cleaned up by `Actor.lifetime_stack.close()` and simulates a ctl-c from the console (relayed in tandem by the OS to the parent and child processes). 2. do `Context.wait_for_result()` on the child context which would hang and timeout since the actor runtime would never complete and thus never relay a `ContextCancelled`. 3. both with and without running an `asyncio` task in the `manage_file` child actor; originally it seemed that with an aio task scheduled in the child actor the guest-run abandonment always was the "loud" case where there seemed to be some actor teardown but with tbs from python failing to gracefully exit the `trio` runtime.. The (seemingly working) "fix" required 2 lines of code to be run inside an `asyncio.CancelledError` handler around the call to `await trio_done_fut`: - `Actor.cancel_soon()` which schedules the actor runtime to cancel on the next `trio` runner cycle and results in a "self cancellation" of the actor. 
- "pumping the `asyncio` event loop" with a non-0 `.sleep(0.1)` XD \|_ seems that a "shielded" pump with some actual `delay: float >= 0` did the trick to get `asyncio` to allow the `trio` runner/loop to fully complete its guest-run without abandonment. Other supporting changes: - move `._exceptions.AsyncioCancelled`, our renamed `asyncio.CancelledError` error-sub-type-wrapper, to `.to_asyncio` and make it derive from `CancelledError` so as to be sure when raised by our `asyncio` x-> `trio` exception relay machinery that `asyncio` is getting the specific type it expects during cancellation. - do "summary status" style logging in `run_as_asyncio_guest()` wherein we compile the eventual `startup_msg: str` emitted just before waiting on the `trio_done_fut`. - shield-wait with `out: Outcome = await asyncio.shield(trio_done_fut)` even though it seems to do nothing in the SIGINT handling case..(I presume it might help avoid abandonment in a `asyncio.Task.cancel()` case maybe?)	2025-03-22 14:20:38 -04:00
Tyler Goodlet	904d8ce8ff	Denoise duplicate chan logging for now	2025-03-21 15:25:55 -04:00
Tyler Goodlet	f14fb53958	Report any external-rent-task-canceller during msg-drain As in whenever `Context.cancel()` is not (runtime internally) called (i.e. `._cancel_called` is not set), we can attempt to detect the parent `trio` nursery/cancel-scope that is the source. Emit the report with a `.cancel()` level and attempt to repr in "sclang" form as well as unhide the stack frame for debug/traceback-in.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	49cd00232e	Add `indent: str` support to `Context.pformat()` using `textwrap`	2025-03-21 15:25:55 -04:00
Tyler Goodlet	ae16368949	Add `tb_hide: bool` ctl flag to `_open_and_supervise_one_cancels_all_nursery()`	2025-03-21 15:25:55 -04:00
Tyler Goodlet	aa7448793a	Adjusts advanced fault tests to match new `TransportClosed` semantics	2025-03-21 15:25:55 -04:00
Tyler Goodlet	2df7ffd702	Finally implement peer-lookup optimization.. There's been a todo for soo long for this XD Since all `Actor`'s store a set of `._peers` we can try a lookup on that table as a shortcut before pinging the registry Bo Impl deats: - add a new `._discovery.get_peer_by_name()` routine which attempts the `._peers` lookup by combining a copy of that `dict` + an entry added for `Actor._parent_chan` (since all subs have a parent and often the desired contact is just that connection). - change `.find_actor()` (for the `only_first == True` case), `.query_actor()` and `.wait_for_actor()` to call the new helper and deliver appropriate outputs if possible. Other, - deprecate `get_arbiter()` def and all usage in tests and examples. - drop lingering use of `arbiter_sockaddr` arg to various routines. - tweak the `Actor` doc str as well as some code fmting and a tweak to the `._stream_handler()`'s initial `con_status: str` logging value since the way it was could never be reached.. oh and `.warning()` on any new connections which already have a `_pre_chan: Channel` entry in `._peers` so we can start minimizing IPC duplications.	2025-03-21 15:25:55 -04:00
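The lookup shortcut from `2df7ffd702` (copy the peer table, add the parent's entry, then match by name before falling back to a registry query) could look roughly like the following. Everything here is an illustrative assumption: `lookup_peer()`, the `(name, uuid)` key shape and the `str` stand-ins for channel objects are not taken from the actual `._discovery.get_peer_by_name()` impl.

```python
from __future__ import annotations

# assumed key shape: (actor-name, unique-id)
Uid = tuple[str, str]


def lookup_peer(
    name: str,
    peers: dict[Uid, list[str]],
    parent: tuple[Uid, str] | None = None,
) -> list[str] | None:
    '''
    Hypothetical local-table lookup: deliver the channels for the
    first peer matching `name`, else `None` so that the caller can
    fall back to asking the registry.

    '''
    # shallow-copy so the live table is never mutated
    table: dict[Uid, list[str]] = dict(peers)
    if parent is not None:
        uid, chan = parent
        # build a new list instead of appending in place since the
        # copied `dict` still shares its `list` values.
        table[uid] = [*table.get(uid, []), chan]

    for (peer_name, _), chans in table.items():
        if peer_name == name and chans:
            return chans

    return None


peers = {('worker', 'a1'): ['chan-1']}
from_peers = lookup_peer('worker', peers)
from_parent = lookup_peer(
    'root',
    peers,
    parent=(('root', 'r0'), 'parent-chan'),
)
missing = lookup_peer('nope', peers)
```

A `None` result is the signal to do the slower registry round-trip; a hit avoids opening any new connection at all.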
Tyler Goodlet	dba2d87baf	More-n-more scops annots in logging	2025-03-21 15:25:55 -04:00
Tyler Goodlet	276f88fd0c	Quieter `Stop` handling on ctx result capture In the `drain_to_final_msg()` impl, since a stream terminating gracefully requires this msg, there's really no reason to `log.cancel()` about it; go `.runtime()` level instead since we're trying to de-noise under "normal operation". Also, - passthrough `hide_tb` to taskc-handler's `ctx.maybe_raise()` call. - raise `MessagingError` for the `MsgType` unmatched `case _:`. - detail the doc string motivation a little more.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	b2087404e3	Use `._entry` proto-ed "lifetime ops" in logging As per a WIP scribbled out TODO in `._entry.nest_from_op()`, change a bunch of "supervisor/lifetime mgmt ops" related log messages to contain some supervisor-annotation "headers" in an effort to give a terser "visual indication" of how some execution/scope/storage primitive entity (like an actor/task/ctx/connection) is being operated on (like, opening/started/closed/cancelled/erroring) from a "supervisor action" POV. Also tweak a bunch more emissions to lower levels to reduce noise around normal inter-actor operations like process and IPC ctx supervision.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	9bc7be30bf	Reraise RAEs in `MsgStream.receive()`; truncate tbs To avoid showing lowlevel details of exception handling around the underlying call to `return await self._ctx._pld_rx.recv_pld(ipc=self)`, any time a `RemoteActorError` is unpacked (and raised locally) we re-raise it directly from the captured `src_err` so as to present to the user/app caller-code an exception raised directly from the `.receive()` frame. This simplifies traceback call-stacks for any `log.exception()` or `pdb`-REPL output filtering out the lower `PldRx` frames by default.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	1d9e60626c	Add `Portal.chan` property, to wrap `._chan` attr	2025-03-21 15:25:55 -04:00
Tyler Goodlet	ef7f34ca1c	More formal `TransportClosed` reporting/raising Since it was all ad-hoc defined inside `._ipc.MsgpackTCPStream._iter_pkts()` more or less, this starts formalizing a way for particular transport backends to indicate whether a disconnect condition should be re-raised in the RPC msg loop and if not what log level to report it at (if any). Based on our lone transport currently we try to suppress any logging noise from ephemeral connections expected during normal actor interaction and discovery subsys ops: - any short lived discovery related TCP connects are only logged as `.transport()` level. - both `.error()` and raise on any underlying `trio.ClosedResource` cause since that normally means some task touched transport layer internals that it shouldn't have. - do a `.warning()` on anything else unexpected. Impl deats: - extend the `._exceptions.TransportClosed` to accept an input log level, raise-on-report toggle and custom reporting & raising via a new `.report_n_maybe_raise()` method. - construct the TCs with inputs per case in (the newly named) `._iter_pkts()`. - call ^ this method from the `TransportClosed` handler block inside the RPC msg loop thus delegating reporting levels and/or raising to the backend's per-case TC instantiating. Related `._ipc` changes: - mask out all the `MsgpackTCPStream._codec` debug helper stuff and drop any lingering cruft from the initial proto-ing of msg-codecs. - rename some attrs/methods: \|_`MsgpackTCPStream._iter_packets()` -> `._iter_pkts()` and `._agen` -> `_aiter_pkts`. \|_`Channel._aiter_recv()` -> `._aiter_msgs()` and `._agen` -> `_aiter_msgs`. - add `hide_tb: bool` support to `Channel.send()` and only show the frame on non-MTEs.	2025-03-21 15:25:55 -04:00
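The "exception carries its own log level + raise toggle" idea from `ef7f34ca1c` can be sketched with stdlib `logging`. This is an assumption-laden illustration: the `ClosedSketch` type, its constructor params and the stdlib level names are stand-ins, only the `.report_n_maybe_raise()` method name is borrowed from the commit message and its real signature may differ.

```python
import logging


class ClosedSketch(Exception):
    '''
    Hypothetical stand-in for a transport-closed error which knows
    how loudly it should be reported and whether to re-raise.

    '''
    def __init__(
        self,
        message: str,
        loglevel: str = 'warning',
        raise_on_report: bool = False,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.loglevel = loglevel
        self.raise_on_report = raise_on_report

    def report_n_maybe_raise(self, log: logging.Logger) -> None:
        # emit at whatever level the raising backend chose
        getattr(log, self.loglevel)(self.message)
        if self.raise_on_report:
            raise self


class _Collect(logging.Handler):
    '''Capture emitted records for inspection.'''
    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


log = logging.getLogger('transport_sketch')
log.setLevel(logging.DEBUG)
log.propagate = False
collector = _Collect()
log.addHandler(collector)

# ephemeral disconnect: quiet report, no raise
ClosedSketch('peer hung up', loglevel='debug').report_n_maybe_raise(log)

# unexpected closure: loud report AND re-raise to the handler
reraised: bool = False
try:
    ClosedSketch(
        'resource closed under us',
        loglevel='error',
        raise_on_report=True,
    ).report_n_maybe_raise(log)
except ClosedSketch:
    reraised = True
```

With this shape the msg-loop handler stays a one-liner while each backend decides per-case how noisy a disconnect should be.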
Tyler Goodlet	417f4f7255	Refine some `.trionics` docs and logging - allow passing and report the lib name (`trio` or `tractor`) from `maybe_open_nursery()`. - use `.runtime()` level when reporting `_Cache`-hits in `maybe_open_context()`. - tidy up some doc strings.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	8de79372b7	Woops, set `.cancel()` level in custom levels table..	2025-03-21 15:25:55 -04:00
Tyler Goodlet	d105da0fcf	(Re)type annot some tests - For the (still not finished) `test_caps_based_msging`, switch to using the new `PayloadMsg`. - add `testdir` fixture type.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	3eef9aeac5	Use `msgspec.Struct.__repr__()` failover impl In case the struct doesn't import a field type (which will cause the `.pformat()` to raise) just report the issue and try to fall back to the original `repr()` version.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	521a2e353d	Don't use pretty struct stuff in `._invoke` It's too fragile to put inside core RPC machinery since `msgspec.Struct` defs can fail if a field type can't be looked up at creation time (like can easily happen if you conditionally import using `if TYPE_CHECKING:`) Also, - rename `cs` to `rpc_ctx_cs: CancelScope` since it's literally the wrapping RPC `Context._scope`. - report self cancellation via `explain: str` and add tail case for "unknown cause". - put a ?TODO? around what to do about KBIs if a context is opened from an `infected_aio`-actor task. - similar to our nursery and portal add TODO list for moving all `_invoke_non_context()` content out of the RPC core and instead implement them as `.hilevel` endpoint helpers (maybe as decorators?) which underneath define `@context`-funcs.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	6927767d39	Update `._entry` actor status log Log-report the different types of actor exit conditions including cancel via KBI, error or normal return with varying levels depending on case. Also, start proto-ing out this weird ascii-syntax idea for describing conc system states and implement the first bit in a `nest_from_op()` log-message fmter that joins and indents an obj `repr()` with a tree-like `'>)\n\|_'` header.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	bd66450a79	Update `MsgTypeError` content matching to latest	2025-03-21 15:25:55 -04:00
Tyler Goodlet	9811db9ac5	Further formalize `greenback` integration Since we more or less require it for `tractor.pause_from_sync()` this refines enable toggles and their relay down the actor tree as well as more explicit logging around init and activation. Tweaks summary: - `.info()` report the module if discovered during root boot. - use a `._state._runtime_vars['use_greenback']: bool` activation flag inside `Actor._from_parent()` to determine if the sub should try to use it and set to `False` if mod-loading fails / not installed. - expose `maybe_init_greenback()` from `.devx` subpkg. - comment out RTE in `._pause()` for now since we already have it in `.pause_from_sync()`. - always `.exception()` on `maybe_init_greenback()` import errors to clarify the underlying failure deats. - always explicitly report if `._state._runtime_vars['use_greenback']` was NOT set when `.pause_from_sync()` is called. Other `._runtime.async_main()` adjustments: - combine the "internal error call ur parents" message and the failed registry contact status into one new `err_report: str`. - drop the final exception handler's call to `Actor.lifetime_stack.close()` since we're already doing it in the `finally:` block and the earlier call has no currently known benefit. - only report on the `.lifetime_stack()` callbacks if any are detected as registered.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	6af320273b	Always reset `._state._ctxvar_Context` to prior Not sure how I forgot this but, obviously it's correct context-var semantics to revert the current IPC `Context` (set in the latest `.open_context()` block) such that any prior instance is reset.. This ensures the sanity `assert`s pass inside `.msg._ops.maybe_limit_plds()` and just in general ensures for any task that the last opened `Context` is the one returned from `current_ipc_ctx()`.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	74048b06a7	Prep for legacy RPC API factor-n-remove This change is adding commentary about the upcoming API removal and simplification of nursery + portal internals; no actual code changes are included. The plan to (re)move the old RPC methods: - `ActorNursery.run_in_actor()` - `Portal.run()` - `Portal.run_from_ns()` and any related impl internals out of each conc-primitive and instead into something like a `.hilevel.rpc` set of APIs which then are all implemented using the newer and more lowlevel `Context`/`MsgStream` primitives instead Bo Further, - formally deprecate the `Portal.result()` meth for `.wait_for_result()`. - only `log.info()` about runtime shutdown in the implicit root case.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	5b9a2642f6	Add a `Context.portal`, more cancel tooling Might as well add a public maybe-getter for use on the "parent" side since it can be handy to check out-of-band cancellation conditions (like from `Portal.cancel_actor()`). Buncha bitty tweaks for more easily debugging cancel conditions: - add a `@.cancel_called.setter` for hooking into `.cancel_called = True` being set in hard to decipher "who cancelled us" scenarios. - use a new `self_ctxc: bool` var in `.cancel()` to capture the output state from `._is_self_cancelled(remote_error)` at call time so it can be compared against the measured value at crash-time (when REPL-ing it can often have already changed due to runtime teardown sequencing vs. the crash handler hook entry). - proxy `hide_tb` to `.drain_to_final_msg()` from `.wait_for_result()`. - use `remote_error.sender` attr directly instead of through `RAE.msgdata: dict` lookup. - change var name `our_uid` -> `peer_uid`; it's not "ours".. Other various docs/comment updates: - extend the main class doc to include some other name ideas. - change over all remaining `.result()` refs to `.wait_for_result()`. - doc more details on how we want `.outcome` to eventually signature.	2025-03-21 15:25:55 -04:00
Tyler Goodlet	778710efbb	Flip `infected_asyncio` status msg to `.runtime()`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	4792ffcc04	Avoid actor-nursery-exit warns on registrees Since a local-actor-nursery-parented subactor might also use the root as its registry, we need to avoid warning when short lived IPC `Channel` connections establish and then disconnect (quickly, bc apparently the subactor isn't re-using an already cached parent-peer<->child conn as you'd expect for efficiency..) since such cases are currently considered normal operation of our super shoddy/naive "discovery sys" XD As such, (un)guard the whole local-actor-nursery OR channel-draining waiting blocks with the additional `or Actor._cancel_called` branch since really we should also be waiting on the parent nurse to exit (at least, for sure and always) when the local `Actor` indeed has been "globally" cancelled-called. Further add separate timeout warnings for channel-draining vs. local-actor-nursery-exit waiting since they are technically orthogonal cases (at least, afaik). Also, - adjust the `Actor._stream_handler()` connection status log-emit to `.runtime()`, especially to reduce noise around the aforementioned ephemeral registree connection-requests. - if we do wait on a local actor-nurse to exit, report its `._children` table (which should help figure out going forward how useful the warning is, if at all).	2025-03-21 15:25:42 -04:00
Tyler Goodlet	3c1f56f8d9	Change `_Cache` reuse emit to `.runtime()`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	682cf884c4	Expand `PayloadMsg` doc-str	2025-03-21 15:25:42 -04:00
Tyler Goodlet	8dcc49fce2	Break `_mk_msg_type_err()` into recv/send side funcs Name them `_mk_send_mte()`/`_mk_recv_mte()` and change the runtime to call each appropriately depending on location/usage. Also add some dynamic call-frame "unhide" blocks such that when we expect a raised MTE from the above calls but we get a different unexpected error from the runtime, we ensure the call stack downward is shown in tbs/pdb. \|_ ideally in the longer run we come up with a fancier dynamic sys for this, prolly something in `.devx._frame_stack`?	2025-03-21 15:25:42 -04:00
Tyler Goodlet	b517dacf0a	Don't pass `ipc_msg` for send side MTEs Just pass `_bad_msg` such that it gets injected to `.msgdata` since with a send-side `MsgTypeError` we don't have a remote `._ipc_msg: Error` per se to include.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	d3680bfe6a	Add note about using `@acm` as decorator in 3.10	2025-03-21 15:25:42 -04:00
Tyler Goodlet	e863159c7f	Update pld-rx limiting test(s) to use deco input The tests only use one input spec (conveniently) so there's not much to change in the logic, - only pass the `maybe_msg_spec` to the child-side decorator and obvi drop the surrounding `msgops.limit_plds()` block in the child. - tweak a few `MsgDec` asserts, mostly dropping the `msg._ops._def_any_spec` state checks since the child-side won't have any pre pld-spec state given the runtime now applies the `pld_spec` before running the task's func body. - also allowed dropping the `finally:` which did a similar check outside the `.limit_plds()` block.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	ed42aa7e65	Proxy through `dec_hook` in `.limit_plds()` APIs	2025-03-21 15:25:42 -04:00
Tyler Goodlet	e8fee54534	Port debug request ep to use `@context(pld_spec)` Namely passing the `.__pld_spec__` directly to the `lock_stdio_for_peer()` decorator B) Also, allows dropping `apply_debug_pldec()` (which was a todo) and removing a `lock_stdio_for_peer()` indent level.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	aee1bf8456	Offer a `@context(pld_spec=<TypeAlias>)` API Instead of the WIP/prototyped `Portal.open_context()` offering a `pld_spec` input arg, this changes to a proper decorator API for specifying the "payload spec" on `@context` endpoints. The impl change details actually cover 2-birds: - monkey patch decorated functions with a new `._tractor_context_meta: dict[str, Any]` and insert any provided input `@context` kwargs: `_pld_spec`, `enc_hook`, `enc_hook`. - use `inspect.get_annotations()` to scan for a `func` arg type-annotated with `tractor.Context` and use the name of that arg as the RPC task-side injected `Context`, thus injecting the needed arg by type instead of by name (a longstanding TODO); raise a type-error when not found. - pull the `pld_spec` from the `._tractor_context_meta` attr both in the `.open_context()` parent-side and child-side `._invoke()`-cation of the RPC task and use the `msg._ops.maybe_limit_plds()` API to apply it internally in the runtime for each case.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	69fb7beff8	Log tbs from non-RAE `._invoke()`-RPC-task errors `RemoteActorError`s show this by default in their `.__repr__()`, and we obvi capture and embed the src traceback in an `Error` msg prior to transit, but for logging it's also handy to see the tb of any set `Context._remote_error` on console especially when trying to decipher remote error details at their origin actor. Also improve the log message description using `ctx.repr_state` and show any `ctx.outcome`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	f5b1d0179e	Add `@context(pld_spec=<TypeAlias>)` TODO list Longer run we don't want `tractor` app devs having to call `msg._ops.limit_plds()` from every child endpoint.. so this starts a list of decorator API ideas and obviously ties in with an ideal final API design that will come with py3.13 and typed funcs. Obviously this is directly fueled by, - https://github.com/goodboy/tractor/issues/365 Other, - type with direct `trio.lowlevel.Task` import. - use `log.exception()` to show tbs for all error-terminations in `.open_context()` (for now) and always explicitly mention the `.side`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	dee312cae1	Use `_debug._sync_pause_from_builtin()` as `breakpoint()` override	2025-03-21 15:25:42 -04:00
Tyler Goodlet	85fd312c22	Use new `._debug._repl_fail_msg` inside `test_pause_from_sync`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	6754a80186	Make big TODO: for `devx._debug` refinements Hopefully would make grok-ing this fairly sophisticated sub-sys possible for any up-and-coming `tractor` hacker XD A lot of internal API and re-org ideas I discovered/realized as part of finishing the `__pld_spec__` and multi-threaded support. Particularly better isolation between root-actor vs subactor task APIs and generally less globally-state-ful stuff like `DebugStatus` and `Lock` method APIs would likely make a lot of the hard to follow edge cases more clear?	2025-03-21 15:25:42 -04:00
Tyler Goodlet	d3f7b83ea0	First proto: multi-threaded synced `pdb`-REPLs Functionally working for multi-threaded (via cpython threads spawned from `to_trio.to_thread.run_sync()`) alongside subactors, tested (for now) only with threads started inside the root actor (which seemed to have the most issues in terms of the impl and special cases..) using the new `tractor.pause_from_sync()` API! Main implementation changes to `.pause_from_sync()` ------ - ------ - from the root actor, we need to ensure bg thread case is handled specially since no IPC is used to request the TTY stdio mutex and `Lock` (API) usage is conducted entirely from a local task or thread; dedicated `Lock` usage for the root-actor already is branched inside `._pause()` and needs similar handling from a root bg-thread: \|_for the special case of a root bg thread we need to `trio`-main-thread schedule a bg task inside a new `_pause_from_bg_root_thread()`. The new task needs to implement most of what was is handled inside `._pause()` manually, mostly because in this root-actor-bg-thread case we have 2 constraints: 1. to enter `PdbREPL.interaction()` from the bg thread directly, 2. the task that `Lock._debug_lock.acquire()`s has to be the same that calls `.release() (a `trio.FIFOLock` constraint) \|_impl deats of this `_pause_from_bg_root_thread()` include: - (for now) calling `._pause()` to acquire the `Lock._debug_lock`. - setting its own `DebugStatus.repl_release`. - calling `.DebugStatus.shield_sigint()` to ensure the root's main thread uses the right handler when the bg one is REPL-ing. - wait manually on the `.repl_release()` to be set by the thread's dedicated `PdbREPL` exit. - manually calling `Lock.release()` from the same task that acquired it. - expect calls to `._pause()` to deliver a `tuple[Task, PdbREPL]` such that we always get the handle both to any newly created REPl instance and the (maybe) the scheduled bg task within which is runs. 
- add a single `message: str` style to `log.devx()` based on branching style for logging. - ensure both `DebugStatus.repl` and `.repl_task` are set just before calling `._set_trace()` to ensure the correct `Task\|Thread` is set when the REPL is finally entered from sync code. - add a wrapping caller `_sync_pause_from_builtin()` which passes in the new `called_from_builtin=True` to indicate `breakpoint()` caller usage, obvi pass in `api_frame`. Changes to `._pause()` in support of ^ ------ - ------ - `TaskStatus.started()` and return the `tuple[Task, PdbREPL]` to callers / starters. - only call `DebugStatus.shield_sigint()` when no `repl` passed bc some callers (like bg threads) may need to apply it at some specific point themselves. - tweak some asserts for the `debug_func == None` / non-`trio`-thread case. - add a mod-level `_repl_fail_msg: str` to be used when there's an internal `._pause()` failure for testing, easier to pexpect match. - more comprehensive logging for the root-actor branched case to (attempt to) indicate any of the 3 cases: - remote ctx from subactor has the `Lock`, - already existing root task or thread has it or, - some kinda stale `.locked()` situation where the root has the lock but we don't know why. - for root usage, revert to always `await Lock._debug_lock.acquire()`-ing despite `called_from_sync` since `.pause_from_sync()` was reworked to instead handle the special bg thread case in the new `_pause_from_bg_root_thread()` task. - always do `return _enter_repl_sync(debug_func)`. - try to report any `repl_task: Task\|Thread` set by the caller (particularly for the bg thread cases) as being the thread or task `._pause()` was called "on behalf of" Changes to `DebugStatus`/`Lock` in support of ^ ------ - ------ - only call `Lock.release()` from `DebugStatus.set_[quit/continue]()` when called from the main `trio` thread and always call `DebugStatus.release()` after to ensure `.repl_released()` is set after `._debug_lock.release()`. 
- only call `.repl_release.set()` from `trio` thread otherwise use `.from_thread.run()`. - much more refinements in `Lock.release()` for threading cases: - return `bool` to indicate whether lock was released by caller. - mask (in prep to drop) `_pause()` usage of `Lock.release.force=True)` since forcing a release can't ever avoid the RTE from `trio`.. same task must acquire/release. - don't allow usage from non-`trio`-main-threads, ever; there's no point since the same-task-needs-to-manage-`FIFOLock` constraint. - much more detailed logging using `message`-building-style for all caller (edge) cases. \|_ use a `we_released: bool` to determine failed-to-release edge cases which can happen if called from bg threads, ensure we `log.exception()` on any incorrect usage resulting in release failure. \|_ complain loudly if the release fails and some other task/thread still holds the lock. \|_ be explicit about "who" (which task or thread) the release is "on behalf of" by reading `DebugStatus.repl_task` since the caller isn't the REPL operator in many sync cases. - more or less drop `force` support, as mentioned above. - ensure we unset `._owned_by_root` if the caller is a root task. Other misc ------ - ------ - rename `lock_tty_for_child()` -> `lock_stdio_for_peer()`. - rejig `Lock.repr()` to show lock and event stats. - stage `Lock.stats` and `.owner` methods in prep for doing a singleton instance and `@property`s.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	d8dd0c0a81	Drop thread logging to make `log.pdb()` patts match in test	2025-03-21 15:25:42 -04:00
Tyler Goodlet	0c8bb88cc5	Catch `.pause_from_sync()` in root bg thread bugs! Originally discovered as while using `tractor.pause_from_sync()` from the `i3ipc` client running in a bg-thread that uses `asyncio` inside `modden`. Turns out we definitely aren't correctly handling `.pause_from_sync()` from the root actor when called from a `trio.to_thread.run_sync()` bg thread: - root-actor bg threads which can't `Lock._debug_lock.acquire()` since they aren't in `trio.Task`s. - even if scheduled via `.to_thread.run_sync(_debug._pause)` the acquirer won't be the task/thread which calls `Lock.release()` from `PdbREPL` hooks; this results in a RTE raised by `trio`.. - multiple threads will step on each other's stdio since cpython's GIL seems to ctx switch threads on every input from the user to the REPL loop.. Reproduce via reworking our example and test so that they catch and fail for all edge cases: - rework the `/examples/debugging/sync_bp.py` example to demonstrate the above issues, namely the stdio clobbering in the REPL when multiple threads and/or a subactor try to debug simultaneously. \|_ run one thread using a task nursery to ensure it runs conc with the nursery's parent task. \|_ ensure the bg threads run conc a subactor usage of `.pause_from_sync()`. \|_ gravely detail all the special cases inside a TODO comment. \|_ add some control flags to `sync_pause()` helper and don't use `breakpoint()` by default. - extend and adjust `test_debugger.test_pause_from_sync` to match (and thus currently fail) by ensuring exclusive `PdbREPL` attachment when the 2 bg root-actor threads are concurrently interacting alongside the subactor: \|_ should only see one of the `_pause_msg` logs at a time for either one of the threads or the subactor. \|_ ensure each attaches (in no particular order) before expecting the script to exit. Impl adjustments to `.devx._debug`: - drop `Lock.repl`, no longer used. 
- add `Lock._owned_by_root: bool` for the `.ctx_in_debug == None` root-actor-task active case. - always `log.exception()` for any `._debug_lock.release()` ownership RTE emitted by `trio`, like we used to.. - add special `Lock.release()` log message for the stale lock but `._owned_by_root == True` case; oh yeah and actually `log.devx(message)`.. - rename `Lock.acquire()` -> `.acquire_for_ctx()` since it's only ever used from subactor IPC usage; well that and for local root-task usage we should prolly add a `.acquire_from_root_task()`? - buncha `._pause()` impl improvements: \|_ type `._pause()`'s `debug_func` as a `partial` as well. \|_ offer `called_from_sync: bool` and `called_from_bg_thread: bool` for the special case handling when called from `.pause_from_sync()` \|_ only set `DebugStatus.repl/repl_task` when `debug_func != None` (OW ensure the `.repl_task` is not the current one). \|_ handle error logging even when `debug_func is None`.. \|_ lotsa detailed commentary around root-actor-bg-thread special cases. - when `._set_trace(hide_tb=False)` do `pdbp.set_trace(frame=currentframe())` so the `._debug` internal frames are always included. - by default always hide tracebacks for `.pause[_from_sync]()` internals. - improve `.pause_from_sync()` to avoid root-bg-thread crashes: \|_ pass new `called_from_xxx_` flags and ensure `DebugStatus.repl_task` is actually set to the `threading.current_thread()` when needed. \|_ manually call `Lock._debug_lock.acquire_nowait()` for the non-bg thread case. \|_ TODO: still need to implement the bg-thread case using a bg `trio.Task`-in-thread with an `trio.Event` set by thread REPL exit.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	0687dac97a	Move `Context.open_stream()` impl to `._streaming` Exactly like how it's organized for `Portal.open_context()`, put the main streaming API `@acm` with the `MsgStream` code and bind the method to the new module func. Other, - rename `Context.result()` -> `.wait_for_result()` to better match the blocking semantics and rebind `.result()` as deprecated. - add doc-str for `Context.maybe_raise()`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	4589ff307c	Use `Context` repr APIs for RPC outcome logs Delegate to the new `.repr_state: str` and adjust log level based on error vs. cancel vs. result.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	c39427dc15	Drop sub-decoder proto-cruft from `.msg._codec` It ended up getting necessarily implemented as the `PldRx` though at a different layer and won't be needed as part of `MsgCodec` most likely, though this original idea did provide the source of inspiration for how things work now! Also move the commented TODO proto for a codec hook factory from `.types` to `._codec` where it prolly better fits and update some msg related todo/questions.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	dc5d622e70	Woops, set `post_mortem=False` by default again!	2025-03-21 15:25:42 -04:00
Tyler Goodlet	319dda77b4	Finally, officially support shielded REPL-ing! It's been a long time prepped and now finally implemented! Offer a `shield: bool` argument from our async `._debug` APIs: - `await tractor.pause(shield=True)`, - `await tractor.post_mortem(shield=True)` ^-These-^ can now be used inside cancelled `trio.CancelScope`s, something very handy when introspecting complex (distributed) system tear/shut-downs particularly under remote error or (inter-peer) cancellation conditions B) Thanks to previous prepping in a prior attempt and various patches from the rigorous rework of `.devx._debug` internals around typed msg specs, there ain't much that was needed! Impl deats - obvi passthrough `shield` from the public API endpoints (was already done from a prior attempt). - put ad-hoc internal `with trio.CancelScope(shield=shield):` around all checkpoints inside `._pause()` for both the root-process and subactor case branches. Add a fairly rigorous example, `examples/debugging/shielded_pause.py` with a wrapping `pexpect` test, `test_debugger.test_shield_pause()` and ensure it covers as many cases as i can think of offhand: - multiple `.pause()` entries in a loop despite parent scope cancellation in a subactor RPC task which itself spawns a sub-task. - a `trio.Nursery.parent_task` which raises, is handled and tries to enter an unshielded `.post_mortem()`, which of course internally raises `Cancelled` in a `._pause()` checkpoint, so we catch the `Cancelled` again and then debug the debugger's internal cancellation with specific checks for the particular raising checkpoint-LOC. - do ^- the latter -^ for both subactor and root cases to ensure we can debug `._pause()` itself when it tries to REPL engage from a cancelled task scope Bo	2025-03-21 15:25:42 -04:00
Tyler Goodlet	59a3449455	Rename `PldRx.dec_msg()` -> `.decode_pld()` Keep the old alias, but i think it's better form to use longer names for internal public APIs and this name better reflects the functionality: decoding and returning a `PayloadMsg.pld` field.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	1ef1ebfa99	Add a `tractor.post_mortem()` API test + example Since turns out we didn't have a single example using that API Bo The test granular-ly checks all use cases: - `.post_mortem()` manual calls in both subactor and root. - ensuring built-in RPC crash handling activates after each manual one from ^. - drafted some call-stack frame checking that i commented out for now since we need to first do ANSI escape code removal due to the colorization that `pdbp` does by default. \|_ added a TODO with SO link on `assert_before()`. Also todo-staged a shielded-pause test to match with the already existing-but-needs-refinement example B)	2025-03-21 15:25:42 -04:00
Tyler Goodlet	a95b84e4fb	Change `reraise` to `post_mortem: bool` in `maybe_expect_raises()`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	54d397b726	Always `.exception()` in `try_ship_error_to_remote()` on internal error	2025-03-21 15:25:42 -04:00
Tyler Goodlet	33e646fd6a	Pass `boxed_type` from `_mk_msg_type_err()` Such that we're boxing the interchange lib's specific error (`msgspec.ValidationError` in this case) type much like how a `ContextCancelled[trio.Cancelled]` is composed; allows for seamless multi-backend-codec support later as well B) Pass `ctx.maybe_raise(from_src_exc=src_err)` where needed in a couple spots; as `None` in the send-side `Started` MTE case to avoid showing the `._scope1.cancel_called` result in the traceback from the `.open_context()` child-sync phase.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	f120ee72f5	Add `from_src_exc: BaseException` to maybe raisers That is as a control to `Context._maybe_raise_remote_err()` such that if set to anything other than the default (`False` value), we do `raise remote_error from from_src_exc` such that the caller can choose to suppress or override the `.__cause__` tb. Also tidy up an old masked TODO regarding calling `.maybe_raise()` after the caller exits from the `yield` in `.open_context()`..	2025-03-21 15:25:42 -04:00
Tyler Goodlet	08dc32fbb7	Better RAE `.pformat()`-ing for send-side MTEs Send-side `MsgTypeError`s actually shouldn't have any "boxed" traceback per se since they're raised in the transmitting actor's local task env and we (normally) don't want the ascii decoration added around the error's `._message: str`, that is not until the exc is `pack_error()`-ed before transit. As such, the presentation of an embedded traceback (and its ascii box) gets bypassed when only a `._message: str` is set (as we now do for pld-spec failures in `_mk_msg_type_err()`). Further this tweaks the `.pformat()` output to include the `._message` part to look like `<RemoteActorError( <._message> ) ..` instead of jamming it implicitly to the end of the embedded `.tb_str` (as was done implicitly by `unpack_error()`) and also adds better handling for the `with_type_header == False` case including forcing that case when we detect that the currently handled exc is the RAE in `.pformat()`. Toss in a lengthier doc-str explaining it all. Surrounding/supporting changes, - better `unpack_error()` message which just briefly reports the remote task's error type. - add public `.message: str` prop. - always set a `._extra_msgdata: dict` since some MTE props rely on it. - handle `.boxed_type == None` for `.boxed_type_str`. - maybe pack any detected input or `exc.message` in `pack_error()`. - comment cruft cleanup in `_mk_msg_type_err()`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	fd0c14df80	Add `Error.message: str` Allows passing a custom error msg other than the traceback-str over the wire. Make `.tb_str` optional (in the blank `''` sense) since it's treated that way thus far in `._exceptions.pack_error()`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	a1779a8fa9	Fix missing newline in task-cancel log-message	2025-03-21 15:25:42 -04:00
Tyler Goodlet	d154afd678	Don't need to pack an `Error` with send-side MTEs	2025-03-21 15:25:42 -04:00
Tyler Goodlet	f05abbcfee	Ensure only a boxed traceback for MTE on parent side	2025-03-21 15:25:42 -04:00
Tyler Goodlet	9330a75255	Ensure ctx error-state matches the MTE scenario Namely checking that `Context._remote_error` is set to the raised MTE in the invalid started and return value cases since prior to the recent underlying changes to the `Context.result()` impl, it would not match. Further, - do asserts for non-MTE raising cases in both the parent and child. - add todos for testing ctx-outcomes for per-side-validation policies i anticipate supporting and implied msg-dialog race cases therein.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	235db17c9c	Raise remote errors rxed during `Context` child-sync More specifically, if `.open_context()` is cancelled when awaiting the first `Context.started()` during the child task sync phase, check to see if it was due to `._scope.cancel_called` and raise any remote error via `.maybe_raise()` instead of the `trio.Cancelled` like in every other remote-error handling case. Ensure we set `._scope[_nursery]` only after the `Started` has arrived and been audited.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	f227ce6080	Don't (noisily) log about runtime cancel RPC tasks Since in the case of the `Actor._cancel_task()` related runtime eps we actually don't EVER register them in `Actor._rpc_tasks`.. logging about them is just needless noise, though maybe we should track them in a diff table; something like a `._runtime_rpc_tasks`? Drop the cancel-request-for-stale-RPC-task (`KeyError` case in `Actor._cancel_task()`) log-emit level down to `.runtime()`; it's generally not useful info other than for granular race condition eval when hacking the runtime.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	aa17635c4b	Raise send-side MTEs inline in `PldRx.dec_msg()` So when `is_started_send_side is True` we raise the newly created `MsgTypeError` (MTE) directly instead of doing all the `Error`-msg pack and unpack to raise stuff via `_raise_from_unexpected_msg()` since the raise should happen send side anyway and so doesn't emulate any remote fault like in a bad `Return` or `Started` without send-side pld-spec validation. Oh, and proxy-through the `hide_tb: bool` input from `.drain_to_final_msg()` to `.recv_msg_w_pld()`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	b673d10e1b	Set remote errors in `_raise_from_unexpected_msg()` By calling `Context._maybe_cancel_and_set_remote_error(exc)` on any unpacked `Error` msg; provides for `Context.maybe_error` consistency to match all other error delivery cases.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	46a1a54aeb	Factor `.started()` validation into `.msg._ops` Filling out the helper `validate_payload_msg()` staged in a prior commit and adjusting all imports to match. Also add a `raise_mte: bool` flag for potential usage where the caller wants to handle the MTE instance themselves.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	d7ca1dfd94	Fix `test_basic_payload_spec` bad msg matching Expecting `Started` or `Return` with respective bad `.pld` values depending on what type of failure the test is parametrized with. This makes the suite run green it seems B)	2025-03-21 15:25:42 -04:00
Tyler Goodlet	deb61423c4	Drop `msg.types.Msg` for new replacement types The `TypeAlias` for the msg type-group is now `MsgType` and any user touching shuttle messages can now be typed as `PayloadMsg`. Relatedly, add MTE specific `Error._bad_msg[_as_dict]` fields which are handy for introspection of remote decode failures.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	ea5eeba0a0	Parameterize the `return_msg_type` in `._invoke()` Since we also handle a runtime-specific `CancelAck`, allow the caller-scheduler to pass in the expected return-type msg per the RPC msg endpoint loop.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	3ea4617120	Add `MsgTypeError` "bad msg" capture Such that if caught by user code and/or the runtime we can introspect the original msg which caused the type error. Previously this was kinda half-baked with a `.msg_dict` which was delivered from an `Any`-decode of the shuttle msg in `_mk_msg_type_err()` but now this more explicitly refines the API and supports both `PayloadMsg`-instance or the msg-dict style injection: - allow passing either of `bad_msg: PayloadMsg\|None` or `bad_msg_as_dict: dict\|None` to `MsgTypeError.from_decode()`. - expose public props for both ^ whilst dropping prior `.msgdict`. - rework `.from_decode()` to explicitly accept `extra_msgdata: dict` \|_ only overriding it from any `bad_msg_as_dict` if the keys are found in `_ipcmsg_keys`, except** for `_bad_msg` when `bad_msg` is passed. \|_ drop `.ipc_msg` passthrough. \|_ drop `msgdict` input. - adjust `.cid` to only pull from the `.bad_msg` if set. Related fixes/adjustments: - `pack_from_raise()` should pull `boxed_type_str` from `boxed_type.__name__`, not the `type()` of it.. also add a `hide_tb: bool` flag. - don't include `_msg_dict` and `_bad_msg` in the `_body_fields` set. - allow more granular boxed traceback-str controls: \|_ allow passing a `tb_str: str` explicitly in which case we use it verbatim and presume caller knows what they're doing. \|_ when not provided, use the more explicit `traceback.format_exception(exc)` since the error instance is a required input (we still fail back to the old `.format_exc()` call if for some reason the caller passes `None`; but that should be a bug right?). \|_ if a `tb: TracebackType` and a `tb_str` is passed, concat them. - in `RemoteActorError.pformat()` don't indent the `._message` part used for the `body` when `with_type_header == False`. - update `_mk_msg_type_err()` to use `bad_msg`/`bad_msg_as_dict` appropriately and drop passing `ipc_msg`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	6819ec01d0	More correct/explicit `.started()` send-side validation In the sense that we handle it as a special case that is exposed through to `PldRx.dec_msg()` with a new `is_started_send_side: bool`. (Non-ideal) `Context.started()` impl deats: - only do send-side pld-spec validation when a new `validate_pld_spec` is set (by default it's not). - call `self.pld_rx.dec_msg(is_started_send_side=True)` to validate the payload field from the just codec-ed `Started` msg's `msg_bytes` by passing the `roundtripped` msg (with its `.pld: Raw`) directly. - add a `hide_tb: bool` param and proxy it to the `.dec_msg()` call. (Non-ideal) `PldRx.dec_msg()` impl deats: - for now we're packing the MTE inside an `Error` via a manual call to `pack_error()` and then setting that as the `msg` passed to `_raise_from_unexpected_msg()` (though really we should just raise inline?). - manually set the `MsgTypeError._ipc_msg` to the above.. Other, - more comprehensive `Context` type doc string. - various `hide_tb: bool` kwarg additions through `._ops.PldRx` meths. - proto a `.msg._ops.validate_payload_msg()` helper planned to get the logic from this version of `.started()`'s send-side validation so as to be useful more generally elsewhere.. (like for raising back `Return` values on the child side?). Warning: this commit may have been made out of order from required changes to `._exceptions` which will come in a follow up!	2025-03-21 15:25:42 -04:00
Tyler Goodlet	71518ea94a	Add basic payload-spec test suite Starts with some very basic cases: - verify both subactor-as-child-ctx-task send side validation (failures) as well as relay and raise on root-parent-side-task. - wrap failure expectation cases that bubble out of `@acm`s with a `maybe_expect_raises()` equiv wrapper with an embedded timeout. - add `Return` cases including invalid by `str` and valid by a `None`. Still ToDo: - commit impl changes to make the bulk of this suite pass. - adjust how `MsgTypeError`s format the local (`.started()`) send side `.tb_str` such that we don't do a "boxed" error prior to `pack_error()` being called normally prior to `Error` transit.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	4520183cdc	Even smarter `RemoteActorError.pformat()`-ing Related to the prior patch, re the new `with_type_header: bool`: - in the `with_type_header == True` use case make sure we keep the first `._message: str` line non-indented since it'll show just after the header-line's type path with ':'. - when `False` drop the `)>` `repr()`-instance style as well so that we just get the ascii boxed traceback as though it's the error message-`str` not the `repr()` of the error obj. Other, - hide `pack_from_raise()` call frame since it'll show in debug mode crash handling.. - mk `MsgTypeError.from_decode()` explicitly accept and proxy an optional `ipc_msg` and change `msgdict` to also be optional, only reading out the `**extra_msgdata` when provided. - expose a `_mk_msg_type_err(src_err_msg: Error\|None = None,)` for callers who wish to inject a `._ipc_msg: MsgType` to the MTE. \|_ add a note how we can't use it due to a causality-dilemma when pld validating `Started` on the send side..	2025-03-21 15:25:42 -04:00
Tyler Goodlet	5b14baaf58	Add debug check-n-wait inside `._spawn.soft_kill()` And IFF the `await wait_func(proc)` is cancelled such that we avoid clobbering some subactor that might be REPL-ing even though its parent actor is in the midst of (gracefully) cancelling it.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	18de9c1693	Mk `MsgDec.spec_str` have a more compact `	2025-03-21 15:25:42 -04:00
Tyler Goodlet	eb88511a8c	Call `.devx._debug.hide_runtime_frames()` by default From both `open_root_actor()` and `._entry._trio_main()`. Other `breakpoint()`-from-sync-func fixes: - properly disable the default hook using `"0"` XD - offer a `hide_tb: bool` from `open_root_actor()`. - disable hiding the `._trio_main()` frame, bc pretty sure it doesn't help anyone (either way) when REPL-ing/tb-ing from a subactor..?	2025-03-21 15:25:42 -04:00
Tyler Goodlet	66048da832	Port `Actor._stream_handler()` to use `.has_outcome`, fix indent bug..	2025-03-21 15:25:42 -04:00
Tyler Goodlet	6c992a2fea	Update debugger tests to expect new pformatting Mostly the result of the `RemoteActorError.pformat()` and our new `_pause/crash_msg: str`s which include the `trio.Task.__repr__()` in the `log.pdb()` message. Obvi use the `in_prompt_msg()` to accomplish where not used prior. ToDo later: -[ ] still some outstanding questions on how detailed inceptions should look, eg. in `test_multi_nested_subactors_error_through_nurseries()` \|_maybe we should be more pedantic at checking `.src_uid` vs. `.relay_uid` fields? -[ ] staged a placeholder test for verifying correct call-stack frame on crash handler REPL entry. -[ ] also need a test to verify that you can't pause from an already paused actor task such as can happen if you try to step through runtime code that has a recurrent entry to `._debug.pause()`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	d530002d66	Move runtime frame hiding into helper func Call it `hide_runtime_frames()` and stick all the lines from the top of the `._debug` mod in there along with a little `log.devx()` emission on what gets hidden by default ;) Other, - fix ref-error where internal-error handler might trigger despite the debug `req_ctx` not yet having init-ed, such that we don't try to cancel or log about it when it never was fully created/initialized.. - fix assignment typo inside `_set_trace()` for `task`.. lel	2025-03-21 15:25:42 -04:00
Tyler Goodlet	904c6895f7	Better context aware `RemoteActorError.pformat()` Such that when displaying with `.__str__()` we do not show the type header (style) since normally python's raising machinery already prints the type path like `'tractor._exceptions.RemoteActorError:'`, so doing it 2x is a bit ugly ;p In support, - include `.relay_uid` in `RemoteActorError.extra_body_fields`. - offer a `with_type_header: bool` to `.pformat()` and only put the opening type path and closing `')>'` tail line when `True`. - add `.is_inception() -> bool:` for an easy way to determine if the error is multi-hop relayed. - only repr the `'\|_relay_uid=<uid>'` field when an error is an inception. - tweak the invalid-payload case in `_mk_msg_type_err()` to explicitly state in the `message` how the `any_pld` value does not match the `MsgDec.pld_spec` by decoding the invalid `.pld` with an any-dec. - allow `_mk_msg_type_err(**mte_kwargs)` passthrough. - pass `boxed_type=cls` inside `MsgTypeError.from_decode()`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	f0912c9859	Resolve remaining debug-request race causing hangs More or less by pedantically separating and managing root and subactor request syncing events to always be managed by the locking IPC context task-funcs: - for the root's "child"-side, `lock_tty_for_child()` directly creates and sets a new `Lock.req_handler_finished` inside a `finally:` - for the sub's "parent"-side, `request_root_stdio_lock()` does the same with a new `DebugStatus.req_finished` event and separates it from the `.repl_release` event (which indicates a "c" or "q" from user and thus exit of the REPL session) as well as sets a new `.req_task: trio.Task` to explicitly distinguish from the app-user-task that enters the REPL vs. the paired bg task used to request the global root's stdio mutex alongside it. - apply the `__pld_spec__` on "child"-side of the ctx using the new `Portal.open_context(pld_spec)` parameter support; drops use of any `ContextVar` malarky used prior for `PldRx` mgmt. - removing `Lock.no_remote_has_tty` since it was a nebulous name and from the prior "everything is in a `Lock`" design.. ------ - ------ More rigorous impl to handle various edge cases in `._pause()`: - rejig `_enter_repl_sync()` to wrap the `debug_func == None` case inside maybe-internal-error handler blocks. - better logic for recurrent vs. multi-task contention for REPL entry in subactors, by guarding using `DebugStatus.req_task` and by now waiting on the new `DebugStatus.req_finished` for the multi-task contention case. - even better internal error handling and reporting for when this code is hacked on and possibly broken ;p ------ - ------ Updates to `.pause_from_sync()` support: - add optional `actor`, `task` kwargs to `_set_trace()` to allow compat with the new explicit `debug_func` calling in `._pause()` and pass a `threading.Thread` for `task` in the `.to_thread()` usage case. - add an `except` block that tries to show the frame on any internal error. 
------ - ------ Relatedly includes a buncha cleanups/simplifications somewhat in prep for some coming refinements (around `DebugStatus`): - use all the new attrs mentioned above as needed in the SIGINT shielder. - wait on `Lock.req_handler_finished` in `maybe_wait_for_debugger()`. - dropping a ton of masked legacy code left in during the recent reworks. - better comments, like on the use of `Context._scope` for shielding on the "child"-side to avoid the need to manage yet another cs. - add/change-to lotsa `log.devx()` level emissions for those infos which are handy while hacking on the debugger but not ideal/necessary to be user visible. - obvi add lotsa follow up todo notes!	2025-03-21 15:25:42 -04:00
Tyler Goodlet	3b5970f12b	Show runtime nursery frames on internal errors Much like other recent changes attempt to detect runtime-bug-causing crashes and only show the runtime-endpoint frame when present. Adds a `ActorNursery._scope_error: BaseException\|None` attr to aid with detection. Also toss in some todo notes for removing and replacing the `.run_in_actor()` method API.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	5668328c8f	Set `_ctxvar_Context` for child-side RPC tasks Just inside `._invoke()` after the `ctx: Context` is retrieved. Also try our best to not hide internal frames when a non-user-code crash happens, normally either due to a runtime RPC EP bug or a transport failure.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	e133911a44	Add error suppress flag to `current_ipc_ctx()`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	09948d71c6	Shield channel closing in `_connect_chan()`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	452094df27	Adjust `Portal` usage of `Context.pld_rx` Pass the new `ipc` arg and try to show api frames when an unexpected internal error is detected.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	e0dc1d73b2	Expose `tractor.current_ipc_ctx()` at pkg level	2025-03-21 15:25:42 -04:00
Tyler Goodlet	8881219eae	Allocate a `PldRx` per `Context`, new pld-spec API Since the state mgmt becomes quite messy with multiple sub-tasks inside an IPC ctx, AND bc generally speaking the payload-type-spec should map 1-to-1 with the `Context`, it doesn't make a lot of sense to be using `ContextVar`s to modify the `Context.pld_rx: PldRx` instance. Instead, always allocate a full instance inside `mk_context()` with the default `.pld_rx: PldRx` set to use the `msg._ops._def_any_pldec: MsgDec` In support, simplify the `.msg._ops` impl and APIs: - drop `_ctxvar_PldRx`, `_def_pld_rx` and `current_pldrx()`. - rename `PldRx._pldec` -> `._pld_dec`. - rename the unused `PldRx.apply_to_ipc()` -> `.wraps_ipc()`. - add a required `PldRx._ctx: Context` attr since it is needed internally in some meths and each pld-rx now maps to a specific ctx. - modify all recv methods to accept a `ipc: Context\|MsgStream` (instead of a `ctx` arg) since both have a ref to the same `._rx_chan` and there are only a couple spots (in `.dec_msg()`) where we need the `ctx` explicitly (which can now be easily accessed via a new `MsgStream.ctx` property, see below). - always show the `.dec_msg()` frame in tbs if there's a reference error when calling `_raise_from_unexpected_msg()` in the fallthrough case. - implement `limit_plds()` as light wrapper around getting the `current_ipc_ctx()` and mutating its `MsgDec` via `Context.pld_rx.limit_plds()`. - add a `maybe_limit_plds()` which just provides an `@acm` equivalent of `limit_plds()` handy for composing in a `async with ():` style block (avoiding additional indent levels in the body of async funcs). Obvi extend the `Context` and `MsgStream` interfaces as needed to match the above: - add a `Context.pld_rx` pub prop. - new private refs to `Context._started_msg: Started` and a `._started_pld` (mostly for internal debugging / testing / logging) and set inside `.open_context()` immediately after the syncing phase. 
- a `Context.has_outcome() -> bool:` predicate which can be used to more easily determine if the ctx errored or has a final result. - pub props for `MsgStream.ctx: Context` and `.chan: Channel` providing full `ipc`-arg compat with the `PldRx` method signatures.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	26d3ba7cc7	Make `request_root_stdio_lock()` post-mortem-able Finally got this working so that if/when an internal bug is introduced to this request task-func, we can actually REPL-debug the lock request task itself B) As in, if the subactor's lock request task internally errors we, - ensure the task always terminates (by calling `DebugStatus.release()`) and explicitly reports (via a `log.exception()`) the internal error. - capture the error instance and set as a new `DebugStatus.req_err` and always check for it on final teardown - in which case we also, - ensure it's reraised from a new `DebugRequestError`. - unhide the stack frames for `_pause()`, `_enter_repl_sync()` so that the dev can upward inspect the `_pause()` call stack sanely. Supporting internal impl changes, - add `DebugStatus.cancel()` and `.req_err`. - don't ever cancel the request task from `PdbREPL.set_[continue/quit]()` only when there's some internal error that would likely result in a hang and stale lock state with the root. - only release the root's lock when the current task is also the owner (avoids bad release errors). - also show internal `._pause()`-related frames on any `repl_err`. Other temp-dev-tweaks, - make pld-dec change log msgs info level again while solving this final context-vars race stuff.. - drop the debug pld-dec instance match asserts for now since the problem is already caught (and now debug-able B) by an attr-error on the decoded-as-`dict` started msg, and instead add in a `log.exception()` trace to see which task is triggering the case where the debug `MsgDec` isn't set correctly vs. when we think it's being applied.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	6734dbb3cd	Always release debug request from `._post_mortem()` Since obviously the thread is likely expected to halt and raise after the REPL session exits; this was a regression from the prior impl. The main reason for this is that otherwise the request task will never unblock if the user steps through the crashed task using 'next' since the `.do_next()` handler doesn't by default release the request since in the `.pause()` case this would end the session too early. Other, - toss in draft `Pdb.user_exception()`, though doesn't seem to ever trigger? - only release `Lock._debug_lock` when already locked.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	29a001c4ef	Rename `.msg.types.Msg` -> `PayloadMsg`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	2ddfe11d71	Modernize streaming example script - add typing, - apply multi-line call style, - use 'cancel' log level, - enable debug mode.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	316afdec55	Update tests for `PldRx` and `Context` changes Mostly adjustments for the new pld-receiver semantics/shim-layer which results more often in the direct delivery of `RemoteActorError`s from IPC API primitives (like `Portal.result()`) instead of being embedded in an `ExceptionGroup` bundled from an embedded nursery. Tossed usage of the `debug_mode: bool` fixture to a couple problematic tests while i was working on them. Also includes detailed assertion updates to the inter-peer cancellation suite in terms of, - `Context.canceller` state correctly matching the true src actor when expecting a ctxc. - any rxed `ContextCancelled` should instance match the `Context._local/remote_error` as should the `.msgdata` and `._ipc_msg`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	bc660a533c	Hide some API frames, port to new `._debug` apis - start tossing in `__tracebackhide__`s to various eps which don't need to show in tbs or in the pdb REPL. - port final `._maybe_enter_pm()` to pass a `api_frame`. - start comment-marking up some API eps with `@api_frame` in prep for actually using the new frame-stack tracing.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	61183f6a97	Use `.recv_msg_w_pld()` for final `Portal.result()` Woops, due to a `None` test against the `._final_result`, any actual final `None` result would be received but not acked as such causing a spawning test to hang. Fix it by instead receiving and assigning both a `._final_result_msg: PayloadMsg` and `._final_result_pld`. NB: as mentioned in many recent comments surrounding this API layer, really this whole `Portal`-has-final-result interface/semantics should be entirely removed as should the `ActorNursery.run_in_actor()` API(s). Instead it should all be replaced by a wrapping "high level" API (`tractor.hilevel` ?) which combines a task nursery, `Portal.open_context()` and underlying `Context` APIs + an `outcome.Outcome` to accomplish the same "run a single task in a spawned actor and return its result"; aka a "one-shot-task-actor".	2025-03-21 15:25:42 -04:00
Tyler Goodlet	8d5b40507c	Rename `.msg.types.Msg` -> `PayloadMsg`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	194bb8f7fb	Adjust `._runtime` to report `DebugStatus.req_ctx` - inside the `Actor.cancel()`'s maybe-wait-on-debugger delay, report the full debug request status and its affiliated lock request IPC ctx. - use the new `.req_ctx.chan.uid` to do the local nursery lookup during channel teardown handling. - another couple log fmt tweaks.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	c1747a290a	Rework and first draft of `.devx._frame_stack.py` Proto-ing a little suite of call-stack-frame annotation-for-scanning sub-systems for the purposes of both, - the `.devx._debug`er and its traceback and frame introspection needs when entering the REPL, - detailed trace-style logging such that we can explicitly report on "which and where" `tractor`'s APIs are used in the "app" code. Deats: - change mod name obvi from `._code` and adjust client mod imports. - using `wrapt` (for perf) implement a `@api_frame` annot decorator which both stashes per-call-stack-frame instances of `CallerInfo` in a table and marks the function such that API endpoints can be easily found via runtime stack scanning despite any internal impl changes. - add a global `_frame2callerinfo_cache: dict[FrameType, CallerInfo]` table for providing the per func-frame info caching. - Re-implement `CallerInfo` to require less (types of) inputs: \|_ `_api_func: Callable`, a ref to the (singleton) func def. \|_ `_api_frame: FrameType` taken from the `@api_frame` marked `tractor`-API func's runtime call-stack, from which we can determine the app code's `.caller_frame`. \|_`_caller_frames_up: int\|None` allowing the specific `@api_frame` to determine "how many frames up" the application / calling code is. And, a better set of derived attrs: \|_`caller_frame: FrameType` which finds and caches the API-eps calling frame. - add a new attempt at "getting a method ref from its runtime frame" with `get_ns_and_func_from_frame()` using a heuristic that the `CodeType.co_qualname: str` should have a "." in it for methods. - main issue is still that the func-ref lookup will require searching for the method's instance type by name, and that name isn't guaranteed to be defined in any particular ns.. 
\|_rn we try to read it from the `FrameType.f_locals` but that is going to obvi fail any time the method is called in a module where its type is not also defined/imported. - returns both the ns and the func ref FYI.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	0c57e1a808	Even moar bitty `Context` refinements - set `._state._ctxvar_Context` just after `StartAck` inside `open_context_from_portal()` so that `current_ipc_ctx()` always works on the 'parent' side. - always set `.canceller` to any `MsgTypeError.src_uid` and otherwise to any maybe-detected `.src_uid` (i.e. for RAEs). - always set `.canceller` to us when we rx a ctxc which reports us as its canceller; this is a sanity check on definite "self cancellation". - adjust `._is_self_cancelled()` logic to only be `True` when `._remote_error` is both a ctxc with a `.canceller` set to us AND when `Context.canceller` is also set to us (since the change above) as a little bit of extra rigor. - fill-in/fix some `.repr_state` edge cases: - merge self-vs.-peer ctxc cases to one block and distinguish via nested `._is_self_cancelled()` check. - set 'errored' for all exception matched cases despite `.canceller`. - add pre-`Return` phase statuses: \|_'pre-started' and 'syncing-to-child' depending on side and when `._stream` has not (yet) been set. \|_'streaming' and 'streaming-finished' depending on side when `._stream` is set and whether it was stopped/closed. - tweak drainage log-message to use "outcome" instead of "result". - use new `.devx.pformat.pformat_cs()` inside `_maybe_cancel_and_set_remote_error()` but, IFF the log level is at least 'cancel'.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	17cf3d45ba	Move `_debug.pformat_cs()` into `devx.pformat`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	04bd53ff10	Big debugger rework, more tolerance for internal err-hangs Since i was running into them (internal errors) during lock request machinery dev and was getting all sorts of difficult to understand hangs whenever i intro-ed a bug to either side of the ipc ctx; this all while trying to get the msg-spec working for `Lock` requesting subactors.. Deats: - hideframes for `@acm`s and `trio.Event.wait()`, `Lock.release()`. - better detail out the `Lock.acquire/release()` impls - drop `Lock.remote_task_in_debug`, use new `.ctx_in_debug`. - add a `Lock.release(force: bool)`. - move most of what was `_acquire_debug_lock_from_root_task()` and some of the `lock_tty_for_child().__a[enter/exit]()` logic into `Lock.[acquire/release]()` including a bunch more logging. - move `lock_tty_for_child()` up in the module to below `Lock`, with some rework: - drop `subactor_uid: tuple` arg since we can just use the `ctx`.. - add exception handler blocks for reporting internal (impl) errors and always force release the lock in such cases. - extend `DebugStatus` (prolly will rename to `DebugRequest` btw): - add `.req_ctx: Context` for subactor side. - add `.req_finished: trio.Event` to sub to signal request task exit. - extend `.shield_sigint()` doc-str. - add `.release()` to encaps all the state mgmt previously strewn about inside `._pause()`.. - use new `DebugStatus.release()` to replace all the duplication: - inside `PdbREPL.set_[continue/quit]()`. - inside `._pause()` for the subactor branch on internal repl-invocation error cases, - in the `_enter_repl_sync()` closure on error, - replace `apply_debug_codec()` -> `apply_debug_pldec()` in tandem with the new `PldRx` sub-sys which handles the new `__pld_spec__`. - add a new `pformat_cs()` helper orig to help debug a cs stack corruption; going to move to `.devx.pformat` obvi. 
- rename `wait_for_parent_stdin_hijack()` -> `request_root_stdio_lock()` with improvements: - better doc-str and add todos, - use `DebugStatus` more stringently to encaps all subactor req state. - error handling blocks for cancellation and straight up impl errors directly around the `.open_context()` block with the latter doing a `ctx.cancel()` to avoid hanging in the shielded `.req_cs` scope. - similar exc blocks for the func's overall body with explicit `log.exception()` reporting. - only set the new `DebugStatus.req_finished: trio.Event` in `finally`. - rename `mk_mpdb()` -> `mk_pdb()` and don't call `.shield_sigint()` implicitly since the caller usage does matter for this. - factor out `any_connected_locker_child()` from the SIGINT handler. - rework SIGINT handler to better handle any stale-lock/hang cases: - use new `Lock.ctx_in_debug: Context` to detect subactor-in-debug. and use it to cancel any lock request instead of the lower level - use `problem: str` summary approach to log emissions. - rework `_pause()` given all of the above, stuff not yet mentioned: - don't take `shield: bool` input and proxy to `debug_func()` (for now). - drop `extra_frames_up_when_async: int` usage, expect `**debug_func_kwargs` to passthrough an `api_frame: FrameType` (more on this later). - lotsa asserts around the request ctx vs. task-in-debug ctx using new `current_ipc_ctx()`. - asserts around `DebugStatus` state. - rework and simplify the `debug_func` hooks, `_set_trace()`/`_post_mortem()`: - make them accept a non-optional `repl: PdbREPL` and `api_frame: FrameType` which should be used to set the current frame when the REPL engages. - always hide the hook frames. - always accept a `tb: TracebackType` to `_post_mortem()`. \|_ copy and re-impl what was the delegation to `pdbp.xpm()`/`pdbp.post_mortem()` and instead call the underlying `Pdb.interaction()` ourselves with a `caller_frame` and tb instance. 
- adjust the public `.pause()` impl: - accept optional `hide_tb` and `api_frame` inputs. - mask opening a cancel-scope for now (can cause `trio` stack corruption, see notes) and thus don't use the `shield` input other than to eventually passthrough to `_post_mortem()`? \|_ thus drop `task_status` support for now as well. \|_ pretty sure correct soln is a debug-nursery around `._invoke()`. - since no longer using `extra_frames_up_when_async` inside `debug_func()`s ensure all public apis pass a `api_frame`. - re-impl our `tractor.post_mortem()` to directly call into `._pause()` instead of binding in via `partial` and mk it take similar input as `.pause()`. - drop `Lock.release()` from `_maybe_enter_pm()`, expose and pass expected frame and tb. - use necessary changes from all the above within `maybe_wait_for_debugger()` and `acquire_debug_lock()`. Lel, sorry thought that would be shorter.. There's still a lot more re-org to do particularly with `DebugStatus` encapsulation but it's coming in follow up.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	332ce97650	Allow `Stop` passthrough from `PldRx.recv_msg_w_pld()` Since we need to allow it (at the least) inside `drain_until_final_msg()` for handling stream-phase termination races where we don't want to have to handle a raised error from something like `Context.result()`. Expose the passthrough option via a `passthrough_non_pld_msgs: bool` kwarg. Add comprehensive comment to `current_pldrx()`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	d3e13658ab	Add a "current IPC `Context`" `ContextVar` Expose it from `._state.current_ipc_ctx()` and set it inside `._rpc._invoke()` for child and inside `Portal.open_context()` for parent. Still need to write a few more tests (particularly demonstrating usage throughout multiple nested nurseries on each side) but this suffices as a proto for testing with some debugger request-from-subactor stuff. Other, - use new `.devx.pformat.add_div()` for ctxc messages. - add a block to always traceback dump on corrupted cs stacks. - better handle non-RAEs exception output-formatting in context termination summary log message. - use a summary for `start_status` for msg logging in RPC loop.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	d680e31e4f	Mk `drain_to_final_msg()` never raise from `Error` Since we usually want them raised from some (internal) call to `Context.maybe_raise()` and NOT directly from the drainage call, make it possible via a new `raise_error: bool` to both `PldRx.recv_msg_w_pld()` and `.dec_msg()`. In support, - rename `return_msg` -> `result_msg` since we expect to return `Error`s. - do a `result_msg` assign and `break` in the `case Error()`. - add `**dec_msg_kwargs` passthrough for other `.dec_msg()` calling methods. Other, - drop/aggregate todo-notes around the main loop's `ctx._pld_rx.recv_msg_w_pld()` call. - add (configurable) frame hiding to most payload receive meths.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	048c60f112	"Icons" in `._entry`'s subactor `.info()` messages Add a little `>` or `X` supervision icon indicating the spawning or termination of each sub-actor respectively.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	219d5c1745	Move pformatters into new `.devx.pformat` Since `._code` is prolly gonna get renamed (to something "frame & stack tools" related) and to give a bit better organization. Also adds a new `add_div()` helper, factored out of ctxc message creation in `._rpc._invoke()`, for adding a little "header line" divider under a given `message: str` with a little math to center it.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	467764d45e	Change to `RemoteActorError.pformat()` For more sane manual calls as needed in logging purposes. Obvi remap the dunder methods to it. Other: - drop `hide_tb: bool` from `unpack_error()`, shouldn't need it since frame won't ever be part of any tb raised from returned error. - add a `is_invalid_payload: bool` to `_raise_from_unexpected_msg()` to be used from `PldRx` where we don't need to decode the IPC msg, just the payload; make the error message reflect this case. - drop commented `._portal._unwrap_msg()` since we've replaced it with `PldRx`'s delegation to newer `._raise_from_unexpected_msg()`. - hide the `Portal.result()` frame by default, again.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	998c0f0bd5	Add todo for rigorous struct-type spec of `SpawnSpec` fields	2025-03-21 15:25:42 -04:00
Tyler Goodlet	ceaafc064e	Type annot the proc from `trio.lowlevel.open_process()`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	7b6881cf0a	Fix attr name error, use public `MsgDec.dec`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	2cdd5b5b8f	Reorg frames pformatters, add `Context.repr_state` A better spot for the pretty-formatting of frame text (and thus tracebacks) is in the new `.devx._code` module: - move from `._exceptions` -> `.devx._code.pformat_boxed_tb()`. - add new `pformat_caller_frame()` factored out the use case in `._exceptions._mk_msg_type_err()` where we dump a stack trace for bad `.send()` side IPC msgs. Add some new pretty-format methods to `Context`: - explicitly implement `.pformat()` and allow an `extra_fields: dict` which can be used to inject additional fields (maybe eventually by default) such as is now used inside `._maybe_cancel_and_set_remote_error()` when reporting the internal `._scope` state in cancel logging. - add a new `.repr_state -> str` which provides a single string status depending on the internal state of the IPC ctx in terms of the shuttle protocol's "phase"; use it from `.pformat()` for the `\|_state:`. - set `.started(complain_no_parity=False)` now since we presume decoding with `.pld: Raw` now with the new `PldRx` design. - use new `msgops.current_pldrx()` in `mk_context()`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	1f4c780b98	Mk `process_messages()` return last msg; summary logging Not sure it's that useful (yet) but in theory would allow avoiding certain log level usage around transient RPC requests for discovery methods (like `.register_actor()` and friends); can't hurt to be able to introspect that last message for other future cases I'd imagine as well. Adjust the calling code in `._runtime` to match; other spots are using the `trio.Nursery.start()` schedule style and are fine as is. Improve a bunch more log messages throughout a few mods mostly by going to a "summary" single-emission style where possible/appropriate: - in `._runtime` more "single summary" status style log emissions: \|_mk `Actor.load_modules()` render a single mod loaded summary. \|_use a summary `con_status: str` for `Actor._stream_handler()` conn setup and an equiv (`con_teardown_status`) for connection teardowns. \|_similar thing in `Actor.wait_for_actor()`. - generally more usage of `.msg.pretty_struct` apis throughout `._runtime`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	f9de439b87	First draft payload-spec limit API Add new task-scope oriented `PldRx.pld_spec` management API similar to `.msg._codec.limit_msg_spec()`, but obvi built to process and filter `MsgType.pld` values. New API related changes include: - new per-task singleton getter `msg._ops.current_pldrx()` which delivers the current (global) payload receiver via a new `_ctxvar_PldRx: ContextVar` configured with a default `_def_any_pldec: MsgDec[Any]` decoder. - a `PldRx.limit_plds()` which sets the decoder (`.type` underneath) for the specific payload rx instance. - `.msg._ops.limit_plds()` which obtains the current task-scoped `PldRx` and applies the pld spec via a new `PldRx.limit_plds()`. - rename `PldRx._msgdec` -> `._pldec`. - add `.pld_dec` as pub attr for -^ Unrelated adjustments: - use `.msg.pretty_struct.pformat()` where handy. - always pass `expect_msg: MsgType`. - add a `case Stop()` to `PldRx.dec_msg()` which will `log.warning()` when a stop is received but no stream was open on this receiving side since we rarely want that to raise since it's prolly just a runtime race or mistake in user code. Other:	2025-03-21 15:25:42 -04:00
Tyler Goodlet	49443d3a7e	Make `.msg.types.Msg.pld: Raw` only, since `PldRx`..	2025-03-21 15:25:42 -04:00
Tyler Goodlet	b78732781f	More bitty (runtime) logging tweaks	2025-03-21 15:25:42 -04:00
Tyler Goodlet	bf08066031	Use new `Msg[Co]Dec` repr meths in `._exceptions` Particularly when logging around `MsgTypeError`s. Other: - make `_raise_from_unexpected_msg()`'s `expect_msg` a non-default value arg, must always be passed by caller. - drop `'canceller'` from `_body_fields` ow it shows up twice for ctxc. - use `.msg.pretty_struct.pformat()`. - parameterize `RemoteActorError.reprol()` (repr-one-line method) to show `RemoteActorError[<self.boxed_type_str>]( ..` to make obvi the boxed remote error type. - re-impl `.boxed_type_str` as `str`-casting the `.boxed_type` value which is guaranteed to render non-`None`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	3b38fa8673	Add more useful `MsgDec.__repr__()` Basically exact same as that for `MsgCodec` with the `.spec` displayed via a better (maybe multi-line) `.spec_str: str` generated from a common new set of helper mod funcs factored out msg-codec meths: - `mk_msgspec_table()` to gen a `MsgType` name -> msg table. - `pformat_msgspec()` to `str`-ify said table values nicely. Also add a new `MsgCodec.msg_spec_str: str` prop which delegates to the above for the same.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	7910e1297b	Mk `.msg.pretty_struct.Struct.pformat()` a mod func More along the lines of `msgspec.struct` and also far more useful internally for pprinting `MsgTypes`. Of course add method aliases.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	0efc4c1b87	Use `Context.[peer_]side` in ctxc messages	2025-03-21 15:25:42 -04:00
Tyler Goodlet	83e3a75c10	Add `Context.peer_side: str` property, mk static-meth private.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	3fb99f2ba5	Flip back `StartAck` timeout to `inf`..	2025-03-21 15:25:42 -04:00
Tyler Goodlet	94d8bef2d6	Another `._rpc` mod passthrough - tweaking logging to include more `MsgType` dumps on IPC faults. - removing some commented cruft. - comment formatting / cleanups / add-ons. - more type annots. - fill out some TODO content.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	e46046a746	Try out `msgspec` encode-buffer optimization As per the reco: https://jcristharif.com/msgspec/perf-tips.html#reusing-an-output-buffe BUT, seems to cause this error in `pikerd`.. `BufferError: Existing exports of data: object cannot be re-sized` Soo no idea? Maybe there's a tweak needed that we can glean from tests/examples in the `msgspec` repo? Disabling for now.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	875081e7a2	Set `Context._stream` in `Portal.open_stream_from()`..	2025-03-21 15:25:42 -04:00
Tyler Goodlet	6819cf908a	Use `Context._stream` in `_raise_from_unexpected_msg()` Instead of expecting it to be passed in (as it was prior), when determining if a `Stop` msg is a valid end-of-channel signal use the `ctx._stream: MsgStream\|None` attr which must be set by any stream opening API; either of: - `Context.open_stream()` - `Portal.open_stream_from()` Adjust the case block logic to match with fallthrough from any EoC to a closed error if necessary. Change the `_type: str` to match the failing IPC-prim name in the tail case we raise a `MessagingError`. Other: - move `.sender: tuple` uid attr up to `RemoteActorError` since `Error` optionally defines it as a field and for boxed `StreamOverrun`s (an ignore case we check for in the runtime during cancellation) we want it readable from the boxing rae. - drop still unused `InternalActorError`.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	9e5bdd26d7	First draft "payload receiver" in a new `.msg._ops` As per much tinkering, re-designs and preceding rubber-ducking via many "commit msg novelas", finally this adds the (hopefully) final missing layer for typed msg safety: `tractor.msg._ops.PldRx` (or `PayloadReceiver`? haven't decided how verbose to go..) Design justification summary: ------ - ------ - need a way to be as-close-as-possible to the `tractor`-application such that when `MsgType.pld: PayloadT` validation takes place, it is straightforward and obvious how user code can decide to handle any resulting `MsgTypeError`. - there should be a common and optional-yet-modular way to modify how data delivered via IPC (possibly embedded as user defined, type-constrained `.pld: msgspec.Struct`s) can be handled and processed during fault conditions and/or IPC "msg attacks". - support for nested type constraints within a `MsgType.pld` field should be simple to define, implement and understand at runtime. - a layer between the app-level IPC primitive APIs (`Context`/`MsgStream`) and application-task code (consumer code of those APIs) should be easily customized and prove-to-be-as-such through demonstrably rigorous internal (sub-sys) use! -> eg. via seamless runtime RPC eps support like `Actor.cancel()` -> by correctly implementing our `.devx._debug.Lock` REPL TTY mgmt dialog prot, via a dead simple payload-as-ctl-msg-spec. There are some fairly detailed doc strings included so I won't duplicate that content, the majority of the work here is actually somewhat of a factoring of many similar blocks that are doing more or less the same `msg = await Context._rx_chan.receive()` with boilerplate for `Error`/`Stop` handling via `_raise_from_no_key_in_msg()`.
The new `PldRx` basically provides a shim layer for this common "receive msg, decode its payload, yield it up to the consuming app task" by pairing the RPC feeder mem-chan with a msg-payload decoder and expecting IPC API internals to use one API instead of re-implementing the same pattern all over the place XD `PldRx` breakdown ------ - ------ - for now only expects a `._msgdec: MsgDec` which allows for override-able `MsgType.pld` validation and most obviously used in the impl of `.dec_msg()`, the decode message method. - provides multiple mem-chan receive options including: \|_ `.recv_pld()` which does the e2e operation of receiving a payload item. \|_ a sync `.recv_pld_nowait()` version. \|_ a `.recv_msg_w_pld()` which optionally allows retrieving both the shuttling `MsgType` as well as its `.pld` body for use cases where info on both is important (eg. draining a `MsgStream`). Dirty internal changeover/implementation deatz: ------ - ------ - obvi move over all the IPC "primitives" that previously had the duplicate recv-n-yield logic: - `MsgStream.receive[_nowait]()` delegating instead to the equivalent `PldRx.recv_pld[_nowait]()`. - add `Context._pld_rx: PldRx`, created and passed in by `mk_context()`; use it for the `.started()` -> `first: Started` retrieval inside `open_context_from_portal()`. - all the relevant `Portal` invocation methods: `.result()`, `.run_from_ns()`, `.run()`; also allows for dropping `_unwrap_msg()` and `.Portal_return_once()` outright Bo - rename `Context.ctx._recv_chan` -> `._rx_chan`. - add detailed `Context._scope` info for logging whether or not it's cancelled inside `_maybe_cancel_and_set_remote_error()`. - move `._context._drain_to_final_msg()` -> `._ops.drain_to_final_msg()` since it's really not necessarily ctx specific per se, and it does kinda fit with "msg operations" more abstractly ;)	2025-03-21 15:25:42 -04:00
Tyler Goodlet	5d4681df4b	Add a `MsgDec` for receive-only decoding In prep for a "payload receiver" abstraction that will wrap `MsgType.pld`-IO delivery from `Context` and `MsgStream`, adds a small `msgspec.msgpack.Decoder` shim which delegates an API similar to `MsgCodec` and is offered via a `.msg._codec.mk_dec()` factory. Detalles: - move over the TODOs/comments from `.msg.types.Start` to `MsgDec.spec` since it's probably the ideal spot to start thinking about it from a consumer code PoV. - move codec reversion assert and log emit into `finally:` block. - flip default `.types._tractor_codec = mk_codec_ipc_pld(ipc_pld_spec=Raw)` in prep for always doing payload-delayed decodes. - make `MsgCodec._dec` private with public property getter. - change `CancelAck` to NOT derive from `Return` so it's mutex in `match/case:` handling.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	baee808654	Move `MsgTypeError` maker func to `._exceptions` Since it's going to be used from the IPC primitive APIs (`Context`/`MsgStream`) for similarly handling payload type spec validation errors and bc it's really not well situated in the IPC module XD Summary of (impl) tweaks: - obvi move `_mk_msg_type_err()` and import and use it in `._ipc`; ends up avoiding a lot of ad-hoc imports we had from `._exceptions` anyway! - mask out "new codec" runtime log emission from `MsgpackTCPStream`. - allow passing a (coming in next commit) `codec: MsgDec` (message decoder) which supports the same required `.pld_spec_str: str` attr. - for send side logging use existing `MsgCodec.pformat_msg_spec()`. - rename `_raise_from_no_key_in_msg()` to the now more appropriate `_raise_from_unexpected_msg()`, but leaving alias for now.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	2ed43373c5	Drop more `dict`-msg cruft from `._exceptions`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	d982daa886	Mark `.pld` msgs as also taking `msgspec.Raw`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	97fc2a6628	Go back to `ContextVar` for codec mgmt Turns out we do want per-task inheritance particularly if there's to be per `Context` dynamic mutation of the spec; we don't want mutation in some task to affect any parent/global setting. Turns out since we use a common "feeder task" in the rpc loop, we need to offer a per `Context` payload decoder sys anyway in order to enable per-task controls for inter-actor multi-task-ctx scenarios.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	5bf27aca2c	Proto in new `Context` refinements As per some newly added features and APIs: - pass `portal: Portal` to `Actor.start_remote_task()` from `open_context_from_portal()` marking `Portal.open_context()` as always being the "parent" task side. - add caller tracing via `.devx._code.CallerInfo/.find_caller_info()` called in `mk_context()` and (for now) a `__runtimeframe__: int = 2` inside `open_context_from_portal()` such that any enter-er of `Portal.open_context()` will be reported. - pass in a new `._caller_info` attr which is used in 2 new meths: - `.repr_caller: str` for showing the name of the app-code-func. - `.repr_api: str` for showing the API ep, which for now we just hardcode to `Portal.open_context()` since ow its gonna show the mod func name `open_context_from_portal()`. - use those new props ^ in the `._deliver_msg()` flow body log msg content for much clearer msg-flow tracing Bo - add `Context._cancel_on_msgerr: bool` to toggle whether a delivered `MsgTypeError` should trigger a `._scope.cancel()` call. - also (temporarily) add separate `.cancel()` emissions for both cases as i work through hacking out the maybe `MsgType.pld: Raw` support.	2025-03-21 15:25:42 -04:00
Tyler Goodlet	85c9a8e628	Tweak `current_actor()` failure msg	2025-03-21 15:25:42 -04:00
Tyler Goodlet	69b509d09e	Add some `bytes` annots	2025-03-21 15:25:42 -04:00
Tyler Goodlet	41499c6d9e	TOSQUASH `77a15eb` use `DebugStatus` in `._rpc`	2025-03-21 15:25:42 -04:00
Tyler Goodlet	be0ded2a22	Annotate nursery and portal methods for `CallerInfo` scanning	2025-03-21 15:25:41 -04:00
Tyler Goodlet	7d71fce558	`NamespacePath._mk_fqnp()` handle `__mod__` for methods Need to use `__self__.__mod__` in the method case i guess..	2025-03-21 15:25:41 -04:00
Tyler Goodlet	cbb9bbcbca	Use `DebugStatus` around subactor lock requests Breaks out all the (sub)actor local conc primitives from `Lock` (which is now only used in and by the root actor) such that there's an explicit distinction between a task that's "consuming" the `Lock` (remotely) vs. the root-side service tasks which do the actual acquire on behalf of the requesters. `DebugStatus` changeover deats: ------ - ------ - move all the actor-local vars over `DebugStatus` including: - move `_trio_handler` and `_orig_sigint_handler` - `local_task_in_debug` now `repl_task` - `_debugger_request_cs` now `req_cs` - `local_pdb_complete` now `repl_release` - drop all ^ fields from `Lock.repr()` obvi.. - move over the `.[un]shield_sigint()` and `.is_main_trio_thread()` methods. - add some new attrs/meths: - `DebugStatus.repl` for the currently running `Pdb` in-actor singleton. - `.repr()` for pprint of state (like `Lock`). - Note: that even when a root-actor task is in REPL, the `DebugStatus` is still used for certain actor-local state mgmt, such as SIGINT handler shielding. - obvi change all lock-requester code bits to now use a `DebugStatus` in their local actor-state instead of `Lock`, i.e. change usage from `Lock` in `._runtime` and `._root`. - use new `Lock.get_locking_task_cs()` API in when checking for sub-in-debug from `._runtime.Actor._stream_handler()`. Unrelated to topic-at-hand tweaks: ------ - ------ - drop the commented bits about hiding `@[a]cm` stack frames from `_debug.pause()` and simplify to only one block with the `shield` passthrough since we already solved the issue with cancel-scopes using `@pdbp.hideframe` B) - this includes all the extra logging about the extra frame for the user (good thing i put in that wasted effort back then eh..) - put the `try/except BaseException` with `log.exception()` around the whole of `._pause()` to ensure we don't miss in-func errors which can cause hangs.. 
- allow passing in `portal: Portal` to `Actor.start_remote_task()` such that `Portal` task spawning methods are always denoted correctly in terms of `Context.side`. - lotsa logging tweaks, decreasing a bit of noise from `.runtime()`s.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	ef3a7fbaa8	The src error to `_raise_from_no_key_in_msg()` is always an attr-error now!	2025-03-21 15:25:41 -04:00
Tyler Goodlet	14583307ee	First draft, sub-msg-spec for debugger `Lock` sys Since it's totes possible to have a spec applied that won't permit `str`s, might as well formalize a small msg set for subactors to request the tree-wide TTY `Lock`. BTW, I'm prolly not going into every single change here in this first WIP since there's still a variety of broken stuff mostly to do with races on the codec apply being done in a `trio.lowlevel.RunVar`; it should be re-done with a `ContextVar` such that each task does NOT mutate the global setting.. New msg set and usage is simply: - `LockStatus` which is the response msg delivered from `lock_tty_for_child()` - `LockRelease` a one-off request msg from the subactor to drop the `Lock` from a `MsgStream.send()`. - use these msgs throughout the root and sub sides of the locking ctx funcs: `lock_tty_for_child()` & `wait_for_parent_stdin_hijack()` The codec is now applied in both the root and sub `Lock` request tasks: - for root inside `lock_tty_for_child()` before the `.started()`. - for subs, inside `wait_for_parent_stdin_hijack()` since we only want to affect the codec for the locking task. - (hence the need for ctx-var as mentioned above but currently this can cause races which will break against other app tasks competing for the codec setting). - add an `apply_debug_codec()` helper for use in both cases. - add more detailed logging to both the root and sub side of `Lock` requesting funcs including requiring that the sub-side task "uid" (a `tuple[str, int]` = `(trio.Task.name, id(trio.Task))`) be provided (more on this later). A main issue discovered while proto-testing all this was the ability of a sub to "double lock" (leading to self-deadlock) via an error in `wait_for_parent_stdin_hijack()` which, for ex., can happen in debug mode via crash handling of a `MsgTypeError` received from the root during a codec applied msg-spec race!
Originally I was attempting to solve this by making the SIGINT override handler more resilient but this case is somewhat impossible to detect by an external root task other than checking for duplicate ownership via the new `subactor_task_uid`. => SO NOW, we always stick the current task uid in the `Lock._blocked: set` and raise an rte on a double request by the same remote task. Included is a variety of small refinements: - finally figured out how to mark a variety of `.__exit__()` frames with `pdbp.hideframe()` to actually hide them B) - add cls methods around managing `Lock._locking_task_cs` from root only. - re-org all the `Lock` attrs into those only used in root vs. subactors and proto-prep a new `DebugStatus` actor-singleton to be used in subs. - add a `Lock.repr()` to contextually print the current conc primitives. - rename our `Pdb`-subtype to `PdbREPL`. - rigor out the SIGINT handler a bit, originally to try and hack-solve the double-lock issue mentioned above, but now just with better logging and logic for most (all?) possible hang cases that should be hang-recoverable after enough ctrl-c mashing by the user.. well hopefully: - using `Lock.repr()` for both root and sub cases. - lots more `log.warn()`s and handler reversions on stale lock or cs detection. - factor `._pause()` impl a little better moving the actual repl entry to a new `_enter_repl_sync()` (originally for easier wrapping in the sub case with `apply_codec()`).	2025-03-21 15:25:41 -04:00
Tyler Goodlet	59966e5650	Tweak a couple more log message fmts	2025-03-21 15:25:41 -04:00
Tyler Goodlet	ca43f15aa0	More msg-spec tests tidying - Drop `test_msg_spec_xor_pld_spec()` since we no longer support `ipc_msg_spec` arg to `mk_codec()`. - Expect `MsgTypeError`s around `.open_context()` calls when `add_codec_hooks == False`. - toss in some `.pause()` points in the subactor ctx body whilst hacking out a `.pld` protocol for debug mode TTY locking.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	36bf58887d	Pass a `use_greenback: bool` runtime var to subs Such that the top level `maybe_enable_greenback` from `open_root_actor()` can toggle the entire actor tree's usage. Read the rtv in `._rpc` tasks and only enable if set. Also, rigor up the `._rpc.process_messages()` loop to handle `Error()` and `case _:` separately such that we now raise an explicit rte for unknown / invalid msgs. Use "parent" / "child" for side descriptions in loop comments and put a fat comment before the `StartAck` in `_invoke()`.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	7ca746e96e	Use `_raise_from_no_key_in_msg(allow_msgs)` Instead of `allow_msg_keys` since we've fully flipped over to struct-types for msgs in the runtime. - drop the loop from `MsgStream.receive_nowait()` since `Yield/Return.pld` getting will handle both (instead of a loop of `dict`-key reads).	2025-03-21 15:25:41 -04:00
Tyler Goodlet	956ff11863	Add `MsgTypeError.expected_msg_type` Which matches with renaming `.payload_msg` -> `.expected_msg` which is the value we attempt to construct from a vanilla-msgpack decode-to-`dict` and then construct manually into a `MsgType` using `.msg.types.from_dict_msg()`. Add a todo to use new `use_pretty` flag which currently conflicts with `._exceptions.pformat_boxed_type()` prefix formatting..	2025-03-21 15:25:41 -04:00
Tyler Goodlet	515d5faa0a	Add `from_dict_msg(user_pretty: bool)` flag Allows for optionally (and dynamically) constructing the "expected" `MsgType` from a `dict` into a `pretty_struct.Struct`, mostly for logging usage.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	2995a6afb7	IPC ctx refinements around `MsgTypeError` awareness Add a bit of special handling for msg-type-errors with a dedicated log-msg detailing which `.side: str` is the sender/causer and avoiding a `._scope.cancel()` call in such cases since the local task might be written to handle and tolerate the badly (typed) IPC msg. As part of ^, change the ctx task-pair "side" semantics from "caller" -> "callee" to be "parent" -> "child" which better matches the cross-process SC-linked-task supervision hierarchy, and `trio.Nursery.parent_task`; in `trio` the task that opens a nursery is also named the "parent". Impl deats / fixes around the `.side` semantics: - ensure that `._portal: Portal` is set ASAP after `Actor.start_remote_task()` such that if the `Started` transaction fails, the parent-vs.-child sides are still denoted correctly (since `._portal` being set is the predicate for that). - add a helper func `Context.peer_side(side: str) -> str:` which inverts from "child" to "parent" and vice versa, useful for logging info. Other tweaks: - make `_drain_to_final_msg()` return a tuple of a maybe-`Return` and the list of other `pre_result_drained: list[MsgType]` such that we don't ever have to warn about the return msg getting captured as a pre-"result" msg. - Add some strictness flags to `.started()` which allow for toggling whether to error or warn log about mismatching roundtripped `Started` msgs prior to IPC transit.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	9381d21281	Extend recv-side `MsgTypeError` default message Display the new `MsgCodec.pld_spec_str` and format the incorrect field value to be placed entirely (txt block wise) right of the "type annot" part of the line: Iow if you had a bad `dict` value where something else should be it'd look something like this: <Started( \|_pld: NamespacePath = {'cid': '3e0ca00c-7d32-4d2a-a0c2-ac2e12453871', 'locked': True, 'msg_type': 'LockStatus', 'subactor_uid': ['sub', 'af7ccb69-1dab-491f-84f7-2ec42c32d137']}	2025-03-21 15:25:41 -04:00
Tyler Goodlet	9ea5aa1cde	TOSQUASH `322e015d` Fix `mk_codec()` input arg	2025-03-21 15:25:41 -04:00
Tyler Goodlet	304590abaa	Tweak some `pformat_boxed_tb()` indent inputs - add some `tb_str: str` indent-prefix args for diff indent levels for the body vs. the surrounding "ascii box". - ^-use it-^ from `RemoteActorError.__repr__()` obvi. - use new `msg.types.from_dict_msg()` in impl of `MsgTypeError.payload_msg`, handy for showing what the message "would have looked like in `Struct` form" had it not failed its type constraints.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	797f7f6d63	Add custom `MsgCodec.__repr__()` Sure makes console grokability a lot better by showing only the customizable fields. Further, clean up `mk_codec()` a bunch by removing the `ipc_msg_spec` param since we don't plan to support another msg-set (for now) which allows cleaning out a buncha logic that was mostly just a source of bugs.. Also, - add temporary `log.info()` around codec application. - throw in some sanity `assert`s to `limit_msg_spec()`. - add but mask out the `extend_msg_spec()` idea since it seems `msgspec` won't allow `Decoder.type` extensions when using a custom `dec_hook()` for some extension type.. (not sure what approach to take here yet).	2025-03-21 15:25:41 -04:00
Tyler Goodlet	d4d1dca812	Expose `tractor.msg.PayloadT` from subpkg	2025-03-21 15:25:41 -04:00
Tyler Goodlet	213e7dbb67	Add msg-from-dict constructor helper Handy for re-constructing a struct-`MsgType` from a `dict` decoded from wire-bytes wherein the msg failed to decode normally due to a field type error but you'd still like to show the "potential" msg in struct form, say inside a `MsgTypeError`'s meta data. Supporting deats: - add a `.msg.types.from_dict_msg()` to implement it (the helper). - also a `.msg.types._msg_table: dict[str, MsgType]` for supporting this func ^ as well as providing just a general `MsgType`-by-`str`-name lookup. Unrelated: - Drop commented idea for still supporting `dict`-msg set via `enc/dec_hook()`s that would translate to/from `MsgType`s, but that would require a duplicate impl in the runtime.. so eff that XD	2025-03-21 15:25:41 -04:00
Tyler Goodlet	162feec6e9	Relay `MsgTypeError`s upward in RPC loop via `._deliver_ctx_payload()`	2025-03-21 15:25:41 -04:00
Tyler Goodlet	7bb6a53581	Start tidying up `._context`, use `pack_from_raise()` Mostly removing commented (and replaced) code blocks lingering from the ctxc semantics work and new typed-msg-spec `MsgType`s handling AND use the new `._exceptions.pack_from_raise()` helper to construct `StreamOverrun` msgs. Deaterz: - clean out the drain loop now that it's implemented to handle our struct msg types including the `dict`-msg bits left in as fallback-reminders, any notes/todos better summarized at the top of their blocks, remove any `_final_result_is_set()` related duplicate/legacy tidbits. - use a `case Error()` block in drain loop with fallthrough to `_:` always resulting in an rte raise. - move "XXX" notes into the doc-string for `._deliver_msg()` as a "rules" section. - use `match:` syntax for logging the `result_or_err: MsgType` outcome from the final `.result()` call inside `open_context_from_portal()`. - generally speaking use `MsgType` type annotations throughout!	2025-03-21 15:25:41 -04:00
Tyler Goodlet	6628fa00d9	Refine `MsgTypeError` handling to relay-up-on-`.recv()` Such that `Channel.recv()` + `MsgpackTCPStream.recv()` originating msg-type-errors are not raised at the IPC transport layer but instead relayed up the runtime stack for eventual handling by user-app code via the `Context`/`MsgStream` layer APIs. This design choice leads to a substantial amount of flexibility and modularity, and avoids `MsgTypeError` handling policies from being coupled to a particular backend IPC transport layer: - receive-side msg-type errors, as can be raised and handled in the `.open_stream()` "nasty" phase of a ctx, whilst being packed at the `MsgCodec`/transport layer (keeping the underlying src decode error coupled to the specific transport + interchange lib) and then relayed upward to app code for custom handling like a normal `Error` msg. - the policy options for handling such cases could be implemented as `@acm` wrappers around `.open_context()`/`.open_stream()` blocks (and their respective delivered primitives) OR just plain old async generators around `MsgStream.receive()` such that both built-in policy handling and custom user-app solutions can be swapped without touching any `tractor` internals or providing specialized "registry APIs". -> eg. the ignore and relay-invalid-msg-to-sender approach can be more easily implemented as embedded `try: except MsgTypeError:` blocks around `MsgStream.receive()` possibly applied as either of an injected wrapper type around a stream or an async gen that `async for`s from the stream. - any performance based AOT-lang extensions used to implement a policy for handling recv-side errors space can avoid knowledge of the lower level IPC `Channel` (and-downward) primitives. - `Context` consuming code can choose to let all msg-type-errs bubble and handle them manually (like any other remote `Error` shuttled exception).
- we can keep (as before) send-side msg type checks, which can be raised locally and cause offending senders to error and adjust before the streaming phase of an IPC ctx. Impl (related) deats: - obvi make `MsgpackTCPStream.recv()` yield up any `MsgTypeError` constructed by `_mk_msg_type_err()` such that the exception will eventually be relayed up to `._rpc.process_messages()` and from there delivered to the corresponding ctx-task. - in support of ^, make `Channel.recv()` detect said mtes and use the new `pack_from_raise()` to inject the far end `Actor.uid` for the `Error.src_uid`. - keep raising the send side equivalent (when strict enabled) errors inline immediately with no upward `Error` packing or relay. - improve `_mk_msg_type_err()` cases handling with far more detailed `MsgTypeError` "message" contents pertaining to `msgspec` specific failure-fixing-tips and type-spec mismatch info: * use `.from_decode()` constructor in recv-side case to inject the non-spec decoded `msg_dict: dict` and use the new `MsgCodec.pld_spec_str: str` when clarifying the type discrepancy with the offending field. * on send-side, if we detect that an unsupported field type was described in the original `src_type_error`, AND there is no `msgpack.Encoder.enc_hook()` set, that the real issue is likely that the user needs to extend the codec to support the non-std/custom type with a hook and link to `msgspec` docs. * if one of a `src_type/validation_error` is provided, set that error as the `.__cause__` in the new mte.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	7a050e5edb	Expose `MsgType` and extend `MsgCodec` API a bit Make a new `MsgType: TypeAlias` for the union of all msg types such that it can be used in annots throughout the code base; just make `.msg.__msg_spec__` delegate to it. Add some new codec methods: - `pld_spec_str`: for the `str`-casted value of the payload spec, generally useful in logging content. - `msg_spec_items()`: to render a `dict` of msg types to their `str()`-casted values with support for singling out a specific `MsgType`, type by input `msg` instance. - `pformat_msg_spec()`: for rendering the (partial) `.msg_spec` as a formatted `str` useful in logging. Oh right, add a `Error._msg_dict: dict` in support of the previous commit (for `MsgTypeError` packing as RAEs) such that our error msg type can house a non-type-spec decoded wire-bytes for error reporting/analysis purposes.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	6e72f2ef13	Unify `MsgTypeError` as a `RemoteActorError` subtype Since in the receive-side error case the source of the exception is the sender side (normally causing a local `TypeError` at decode time), might as well bundle the error in remote-capture-style using boxing semantics around the causing local type error raised from the `msgspec.msgpack.Decoder.decode()` and with a traceback packed from `msgspec`-specific knowledge of any field-type spec matching failure. Deats on new `MsgTypeError` interface: - includes a `.msg_dict` to get access to any `Decoder.type`-applied load of the original (underlying and offending) IPC msg into a `dict` form using a vanilla decoder which is normally packed into the instance as a `._msg_dict`. - a public getter to the "supposed offending msg" via `.payload_msg` which attempts to take the above `.msg_dict` and load it manually into the corresponding `.msg.types.MsgType` struct. - a constructor `.from_decode()` to make it simple to build out error instances from a failed decode scope where the aforementioned `msgdict: dict` from the vanilla decode can be provided directly. - ALSO, we now pack into `MsgTypeError` directly just like ctxc in `unpack_error()` This also completes the while-standing todo for `RemoteActorError` to contain a ref to the underlying `Error` msg as `._ipc_msg` with public `@property` access that `defstruct()`-creates a pretty struct version via `.ipc_msg`. Internal tweaks for this include: - `._ipc_msg` is the internal literal `Error`-msg instance if provided with `.ipc_msg` the dynamic wrapper as mentioned above. - `.__init__()` now can still take variable `**extra_msgdata` (similar to the `dict`-msgdata as before) to maintain support for subtypes which are constructed manually (not only by `pack_error()`) and insert their own attrs which get placed in a `._extra_msgdata: dict` if no `ipc_msg: Error` is provided as input. 
- the `.msgdata` is now a merge of any `._extra_msgdata` and a `dict`-casted form of any `._ipc_msg`. - adjust all previous `.msgdata` field lookups to try equivalent field reads on `._ipc_msg: Error`. - drop default single ws indent from `.tb_str` and do a failover lookup to `.msgdata` when `._ipc_msg is None` for the manually constructed subtype-instance case. - add a new class attr `.extra_body_fields: list[str]` to allow subtypes to declare attrs they want shown in the `.__repr__()` output, eg. `ContextCancelled.canceller`, `StreamOverrun.sender` and `MsgTypeError.payload_msg`. - ^-rework defaults pertaining to-^ with rename from `_msgdata_keys` -> `_ipcmsg_keys` with latter now just loading directly from the `Error` fields def and `_body_fields: list[str]` just taking that value and removing the not-so-useful-in-REPL or already shown (i.e. `.tb_str: str`) field names. - add a new mod level `.pack_from_raise()` helper for auto-boxing RAE subtypes constructed manually into `Error`s which is normally how `StreamOverrun` and `MsgTypeError` get created in the runtime. - in support of the above expose a `src_uid: tuple` override to `pack_error()` such that the runtime can provide any remote actor id when packing a locally-created yet remotely-caused RAE subtype. - adjust all typing to expect `Error`s over `dict`-msgs. Adjust some tests to match these changes: - context and inter-peer-cancel tests to make their `.msgdata` related checks against the new `.ipc_msg` as well as `.tb_str` directly. - toss in an extra sleep to `sleep_a_bit_then_cancel_peer()` to keep the 'canceller' ctx child task cancelled by its parent in the 'root' for the rte-raised-during-ctxc-handling case (apparently now it's returning too fast, cool?).	2025-03-21 15:25:41 -04:00
Tyler Goodlet	28a8d15071	Rename `Actor._push_result()` -> `._deliver_ctx_payload()` Better describes the internal RPC impl/latest-architecture with the msgs delivered being those which either define a `.pld: PayloadT` that gets passed up to user code, or the error-msg subset that similarly is raised in a ctx-linked task.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	c9d2993338	Caps-msging test tweaks to get correct failures These are likely temporary changes but still needed to actually see the desired/correct failures (of which 5 of 6 tests are supposed to fail rn) mostly to do with `Start` and `Return` msgs which are invalid under each test's applied msg-spec. Tweak set here: - bit more `print()`s in root and sub for grokin test flow. - never use `pytest.fail()` in subactor.. should know this by now XD - comment out some bits that can't ever pass rn and make the underlying expected failures harder to grok: - the sub's child-side-of-ctx task doing sends should only fail for certain msg types like `Started` + `Return`, `Yield`s are processed receiver/parent side. - don't expect `sent` list to match predicate set for the same reason as last bullet. The outstanding msg-type-semantic validation questions are: - how to handle `.open_context()` with an input `kwargs` set that doesn't adhere to the currently applied msg-spec? - should the initial `@acm` entry fail before sending to the child side? - where should received `MsgTypeError`s be raised, at the `MsgStream` `.receive()` or lower in the stack? - i'm thinking we should mk `MsgTypeError` derive from `RemoteActorError` and then have it be delivered as an error to the `Context`/`MsgStream` for per-ctx-task handling; would lead to more flexible/modular policy overrides in user code outside any defaults we provide.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	a13160d920	Finally drop masked `chan.send(None)` related code blocks	2025-03-21 15:25:41 -04:00
Tyler Goodlet	e9f1d8e8be	Detail out EoC-by-self log msg	2025-03-21 15:25:41 -04:00
Tyler Goodlet	6c672a67e2	Use `object()` when checking for error field value Since the field value could be `None` or some other type with truthy-ness evaluating to `False`..	2025-03-21 15:25:41 -04:00
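A minimal sketch of the `object()`-sentinel check from the commit above. The `dict`-msg shape and the `get_error_field()` helper are assumptions for illustration, not the runtime's actual code.

```python
# Unique sentinel: no real field value can ever be this object, so
# "missing" stays distinguishable from falsy values like `None` or `0`.
_MISSING = object()


def get_error_field(
    msg: dict,
    name: str = 'error',
) -> tuple[bool, object]:
    '''
    Return `(found, value)` for a field lookup on a hypothetical
    `dict`-msg, where `found` is `True` even for falsy values.

    '''
    value = msg.get(name, _MISSING)
    if value is _MISSING:
        return False, None
    return True, value
```

A plain truthiness check (`if msg.get('error'):`) would treat `{'error': None}` the same as `{}`; the identity comparison against the sentinel avoids that.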
Tyler Goodlet	344d8ebc0c	Flatten out RPC loop with `match:`/`case:` Mainly expanding out the runtime endpoints for cancellation to separate cases and flattening them with the main RPC-request-invoke block, moving the non-cancel runtime case (where we call `getattr(actor, funcname)`) inside the main `Start` case (for now) which branches on `ns=="self"`. Also, add a new IPC msg `class CancelAck(Return):` which is always included in the default msg-spec such that runtime cancellation (and eventually all) endpoints return that msg (instead of a `Return`) and thus sidestep any currently applied `MsgCodec` such that the results (`bool`s for most cancel methods) are never violating the current type limit(s) on `Msg.pld`. To support this expose a new variable `return_msg: Return|CancelAck` param from `_invoke()`/`_invoke_non_context()` and set it to `CancelAck` in the appropriate endpoint case-blocks of the msg loop. Clean out all the lingering legacy `chan.send(<dict-msg>)` commented codez from the invoker funcs, with more cleaning likely to come B)	2025-03-21 15:25:41 -04:00
Tyler Goodlet	78b08e2a91	Drop `None`-sentinel cancels RPC loop mechanism Pretty sure we haven't needed it for a while, it was always generally hazardous in terms of IPC msg types, AND it's definitely incompatible with a dynamically applied typed msg spec: you can't just expect a `None` to be willy nilly handled all the time XD For now I'm masking out all the code and leaving very detailed surrounding notes but am not removing it quite yet in case for some strange reason it is needed by some edge case (though I haven't found one according to the test suite). Backstory: ------ - ------ Originally (i'm pretty sure anyway) it was added as a super naive "remote cancellation" mechanism (back before there were specific `Actor` methods for such things) that was mostly (only?) used before IPC `Channel` closures to "more gracefully cancel" the connection's parented RPC tasks. Since we now have explicit runtime-RPC endpoints for conducting remote cancellation of both tasks and full actors, it should really be removed anyway, because: - a `None`-msg sentinel is inconsistent with other RPC endpoint handling input patterns which (even prior to typed msging) had specific msg-value triggers. - the IPC endpoint's (block) implementation should use `Actor.cancel_rpc_tasks(parent_chan=chan)` instead of a manual loop through an `Actor._rpc_tasks.copy()`.. Deats: - mask the `Channel.send(None)` calls from both the `Actor._stream_handler()` tail as well as from the `._portal.open_portal()` was connected block. - mask the msg loop endpoint block and toss in lotsa notes. Unrelated tweaks: - drop `Actor._debug_mode`; unused. - make `Actor.cancel_server()` return a `bool`. - use `.msg.pretty_struct.Struct.pformat()` to show any msg that is ignored (bc invalid) in `._push_result()`.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	4e769e45e4	Factor `MsgpackTCPStream` msg-type checks Add both the `.send()` and `.recv()` handling blocks to a common `_raise_msg_type_err()` which includes detailed error msg formatting: - the `.recv()` side case does introspection of the `Msg` fields and attempting to report the exact (field type related) issue - `.send()` side does some boxed-error style tb formatting like `RemoteActorError`. - add a `strict_types: bool` to `.send()` to allow for just warning on bad inputs versus raising, but always raise from any `Encoder` type error.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	dbb5e7dc78	Expose `MsgTypeError` from pkg	2025-03-21 15:25:41 -04:00
Tyler Goodlet	abc9e68f33	Make `Context.started()` a type checked IPC send As detailed in the surrounding notes, it's pretty advantageous to always have the child context task ensure the first msg it relays back is msg-type checked against the current spec and thus `MsgCodec`. Implement the check via a simple codec-roundtrip of the `Started` msg such that the `.pld` payload is always validated before transit. This ensures the child will fail early and notify the parent before any streaming takes place (i.e. the "nasty" dialog protocol phase). The main motivation here is to avoid inter-actor task syncing bugs that are hard(er) to recover from and/or such as if an invalid typed msg is sent to the parent, who then ignores it (depending on config), and then the child thinks the parent is in some presumed state while the parent is still thinking a first msg has yet to arrive. Doing the stringent check on the sender side (i.e. the child is sending the "first" application msg via `.started()`) avoids/sidesteps dealing with such syncing/coordinated-state problems by keeping the entire IPC dialog in a "cheap" or "control" style transaction up until a stream is opened. Iow, the parent task's `.open_context()` block entry can't occur until the child side is definitely (as much as is possible with IPC msg type checking) in a correct state spec wise. During any streaming phase in the dialog the msg-type-checking is NOT done for performance (the "nasty" protocol phase) and instead any type errors are relayed back from the receiving side. I'm still unsure whether to take the same approach on the `Return` msg, since at that point erroring early doesn't benefit the parent task if/when a msg-type error occurs? Definitely more to ponder and tinker out here.. 
Impl notes: - a gotcha with the roundtrip-codec-ed msg is that it often won't match the input `value` bc in the `msgpack` case many native python sequence/collection types will map to a common array type due to the surjection that `msgpack`'s type-sys imposes. - so we can't assert that `started == rt_started` but it may be useful to at least report the diff of the type-reduced payload so that the caller can at least be notified how the input `value` might be better type-casted prior to call, for ex. pre-casting to `list`s. - added a `._strict_started: bool` that could provide the stringent checking if desired in the future. - on any validation error raise our `MsgTypeError` from it. - ALSO change over the lingering `.send_yield()` deprecated meth body to use a `Yield()`.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	1544849bbf	Factor boxed-err formatting into new `pformat_boxed_tb()` helper for use elsewhere	2025-03-21 15:25:41 -04:00
Tyler Goodlet	fc6419251b	Add buncha notes on `Start` field for "params" Such that the current `kwargs: dict` field can eventually be strictly msg-typed (eventually directly from a `@context` def) using modern typed python's hippest syntactical approach B) Also proto a new `CancelAck(Return)` subtype msg for supporting msg-spec agnostic `Actor.cancel_xx()` method calls in the runtime such that a user can't break cancellation (and thus SC) by dynamically setting a codec that doesn't allow `bool` results (as an eg. in this case). Note that the msg isn't used yet in `._rpc` but that's a comin!	2025-03-21 15:25:41 -04:00
Tyler Goodlet	f1dd6474bf	Extend codec test to for msg-spec parameterizing Set a diff `Msg.pld` spec per test and then send multiple types to a child actor making sure the child can only send certain types over a stream and fails with validation or decode errors ow. The test is also param-ed both with and without hooks demonstrating how a custom type, `NamespacePath`, needs them for effective use. The subactor IPC context child is passed a `expect_ipc_send: dict` which relays the values along with their expected `.send()`-ability. Deats on technical refinements: ------ - ------ - added a `iter_maybe_sends()` send-value-as-msg-auditor and predicate generator (literally) so as to be able to pre-determine if given the current codec and `send_values` which values are expected to be IPC transmittable. - as per ^, the diff value-msgs are first round-tripped inside a `Started` msg using the configured codec in the parent/root actor before bothering with using IPC primitives + a subactor; this is how the `expect_ipc_send` table is generated initially. - for serializing the specs (`Union[Type]`s as required by `msgspec`), added a pair of codec hooks: `enc/dec_type_union()` (that ideally we move into a `.msg` submod eventually) which code the type-values as a `list[str]` of names. - the `dec_` hook had to be modified to NOT raise an error when an invalid/unhandled value arrives, this is because we do NOT want the RPC msg handling loop to raise on the `async for msg in chan:` and instead prefer to ignore and warn (for now, but eventually respond with error msg - see notes in hook body) these msgs when sent during a streaming phase; `Context.started()` will however error on a bad input for the current msg-spec since it is part of the "cheap" dialog (again see notes in `._context`) wherein the `Started` msg is always roundtripped prior to `Channel.send()` to guarantee the child adheres to its own spec. - tossed in lotsa `print()`s for console groking of the run progress. 
Further notes on typed-msging breaking cancellation: ------ - ------ - turns out since the runtime's cancellation implementation, being done with `Actor.cancel()` methods and friends will actually break when a stringent spec is applied (eg. a single type-spec) since the return values from said methods are generally `bool`s.. - this means we do indeed need special handling of "runtime RPC method invocations" since ideally a user's msg-spec choices do not break core functionality on them XD => The obvi solution is to add a/some special sub-`Msg` types for such cases, possibly just a `RuntimeReturn(Return)` type that will always include a `.pld: bool` for these cancel methods such that their results are always handled without msg type errors. More to come on a (hopefully) elegant solution to that last bit!	2025-03-21 15:25:41 -04:00
Tyler Goodlet	5a79a17dbb	Use `._testing.break_ipc()` in final advanced fault test child ctx	2025-03-21 15:25:41 -04:00
Tyler Goodlet	13ecb151db	Start a new `._testing.fault_simulation` Since I needed the `break_ipc()` helper from the `examples/advanced_faults/ipc_failure_during_stream.py` used in the `test_advanced_faults` suite, might as well move it into a pkg-wide importable module. Also changed the default break method to be `socket_close` which just calls `Stream.socket.close()` underneath in `trio`. Also tweak that example to not keep sending after the stream has been broken since with new `trio` that will raise `ClosedResourceError` and in the wrapping test we generally speaking want to see a hang and then cancel via simulated user sent SIGINT/ctl-c.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	335997966c	Flip default codec to our `Msg`-spec Yes, this is "the switch" and will likely cause the test suite to bail until a few more fixes come in. Tweaked a couple `.msg` pkg exports: - remove `__spec__` (used by modules) and change it to `__msg_types__: list[Msg]` as well as add a new `__msg_spec__: TypeAlias`, being the default `Any` paramed spec. - tweak the naming of `msg.types` lists of runtime vs payload msgs to: `._runtime_msgs` and `._payload_msgs`. - just build `__msg_types__` out of the above 2 lists.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	e72bc5c208	TOSQUASH `f2ce4a3`, timeout bump	2025-03-21 15:25:41 -04:00
Tyler Goodlet	7908c9575e	Woops, only pack `Error(cid=cid)` if input is not `None`	2025-03-21 15:25:41 -04:00
Tyler Goodlet	8d8a47ef7b	WIP porting runtime to use `Msg`-spec	2025-03-21 15:25:41 -04:00
Tyler Goodlet	afabef166e	Add timeouts around some context test bodies Since with my in-index runtime-port to our native msg-spec it seems these ones are hanging B( - `test_one_end_stream_not_opened()` - `test_maybe_allow_overruns_stream()` Tossing in some `trio.fail_after()`s seems to at least gnab them as failures B)	2025-03-21 15:25:41 -04:00
Tyler Goodlet	b5bdd20eb5	Get `test_codec_hooks_mod` working with `Msg`s Though the runtime hasn't been changed over in this patch (it was in the local index at the time however), the test does now demonstrate that using a `Started` the correctly typed `.pld` will codec correctly when passed manually to `MsgCodec.encode/decode()`. Despite not having the runtime ported to the new shuttle msg set (meaning the mentioned test will fail without the runtime port patch), I was able to get this first original test working that limits payload packets as a `Msg.pld: NamespacePath`; as long as we spec `enc/dec_hook()`s the `Msg.pld` will be processed correctly as per: https://jcristharif.com/msgspec/extending.html#mapping-to-from-native-types in both the `Any` and `NamespacePath|None` spec cases. ^- turns out in this case -^ that the codec hooks only get invoked on the unknown-fields NOT the entire `Struct`-msg. A further gotcha was merging a `|None` into the `pld_spec` since this test spawns a subactor and opens a context via `send_back_nsp()` and that func has no explicit `return` - so of course it delivers a `Return(pld=None)` which will fail if we only spec `NamespacePath`.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	405c2a27e6	Get msg spec type limiting working with a `RunVar` Since `contextvars.ContextVar` seems to reset to the default in every new task, switching to using `trio.lowlevel.RunVar` kinda gets close to what we'd like where a child scope can override what's in the rent but ideally without modifying the rent's. I tried `tricycle.TreeVar` as well but it also seems to reset across (embedded) nurseries in our runtime; need to try it again bc apparently that's not how it's supposed to work? NOTE that for now i'm keeping the `.msg.types._ctxvar_MsgCodec` set to the `msgspec` default (`Any` types) so that the test suite will still pass until the runtime is ported to the new msg-spec + codec. Surrounding and in support of all this the `Msg`-set impl deats changed a bit as well as various stuff in `.msg` sub-mods: - drop the `.pld` struct types for `Error`, `Start`, `StartAck` since we don't really need the `.pld` payload field in those cases since they're runtime control msgs for starting RPC tasks and handling remote errors; we can just put the fields directly on each msg since the user will never want/need to override the `.pld` field type. - add a couple new runtime msgs and include them in `msg.__spec__` and make them NOT inherit from `Msg` since they are runtime-specific and thus have no need for `.pld` type constraints: - `Aid` the actor-id identity handshake msg. - `SpawnSpec`: the spawn data passed from a parent actor down to a child in `Actor._from_parent()` for which we need a shuttle protocol msg, so might as well make it a pedantic one ;) - fix some `Actor.uid` field types that were type-borked on `Error` - add notes about how we need built-in `debug_mode` msgs in order to avoid msg-type errors when using the TTY lock machinery and a different `.pld` spec than the default `Any` is in use.. 
-> since `devx._debug.lock_tty_for_child()` and its client side `wait_for_parent_stdin_hijack()` use `Context.started('Locked')` and `MsgStream.send('pdb_unlock')` string values as their `.pld` contents we'd need to either always do a `ipc_pld_spec | str` or pre-define some dedicated `Msg` types which get `Union`-ed in for this? - break out `msg.pretty_struct.Struct._sin_props()` into a helper func `iter_fields()` since the impl doesn't require a struct instance. - as mentioned above since `ContextVar` didn't work as anticipated I next tried `tricycle.TreeVar` but that too didn't seem to keep the `apply_codec()` setting intact across `Portal.open_context()`/`Context.open_stream()` (it kept reverting to the default `.pld: Any` setting) so I finalized on a `trio.lowlevel.RunVar` for now despite it basically being a `global`.. -> will probably come back to test this with `TreeVar` and some hot tips i picked up from @mikenerone in the `trio` gitter, which i put in comments surrounding proto-code.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	8d716f2113	Be mega pedantic with msg-spec building Turns out the generics based payload speccing API, as in https://jcristharif.com/msgspec/supported-types.html#generic-types, DOES WORK properly as long as we don't rely on inheritance from `Msg`, a parent `Generic`.. So let's get real pedantic in the `mk_msg_spec()` internals as well as verification in the test suite! Fixes in `.msg.types`: - implement (as part of tinker testing) multiple spec union building methods via a `spec_build_method: str` to `mk_msg_spec()` and leave a buncha notes around what did and didn't work: - 'indexed_generics' is the only method THAT WORKS and the one that you'd expect being closest to the `msgspec` docs (link above). - 'defstruct' using dynamically defined msgs => doesn't work! - 'types_new_class' using dynamically defined msgs but with `types.new_class()` => ALSO doesn't work.. - explicitly separate the `.pld` type-constrainable by user code msg set into `types._payload_spec_msgs` putting the others in a `types._runtime_spec_msgs` and the full set defined as `.__spec__` (moving it out of the pkg-mod and back to `.types` as well). - for the `_payload_spec_msgs` msgs manually make them inherit `Generic[PayloadT]` and (redundantly) define a `.pld: PayloadT` field. - make `IpcCtxSpec.functype` an in line `Literal`. - toss in some TODO notes about choosing a better `Msg.cid` type. Fixes/tweaks around `.msg._codec`: - rename `MsgCodec.ipc/pld_msg_spec` -> `.msg/pld_spec` - make `._enc/._dec` non optional fields - wow, ^facepalm^ , make sure `._ipc.MsgpackTCPStream.__init__()` uses `mk_codec()` since `MsgCodec` can't be (easily) constructed directly. Get more detailed in testing: - inside the `chk_pld_type()` helper ensure `roundtrip` is always set to some value, `None` by default but a bool depending on legit outcome. - drop input `generic`; no longer used. - drop the masked `typedef` loop from `Msg.__subclasses__()`. 
- add an `expect_roundtrip: bool` and use it to jump into debugger when any expectation doesn't match the outcome. - use new `MsgCodec` field names (as per first section above). - ensure the encoded msg matches the decoded one from both the ad-hoc decoder and codec loaded values. - ensure the pld checking is only applied to msgs in the `types._payload_spec_msgs` set by `typef.__name__` filtering since `mk_msg_spec()` now returns the full `.types.Msg` set.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	c79c2d7ffd	Tweak msging tests to match codec api changes Mostly adjusting input args/logic to various spec/codec signatures and new runtime semantics: - `test_msg_spec_xor_pld_spec()` to verify that a shuttle prot spec and payload spec are necessarily mutex and that `mk_codec()` enforces it. - switch to `ipc_msg_spec` input in `mk_custom_codec()` helper. - drop buncha commented cruft from `test_limit_msgspec()` including no longer needed type union instance checks in dunder attributes.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	e0d7ed48e8	Drop `MsgCodec.decoder()/.encoder()` design Instead just instantiate `msgpack.Encoder/Decoder` instances inside `mk_codec()` and assign them directly as `._enc/._dec` fields. Explicitly take in named-args to both and proxy to the coder/decoder instantiation calls directly. Shuffling some codec internals: - rename `mk_codec()` inputs as `ipc_msg_spec` and `ipc_pld_spec`, make them mutex such that a payload type spec can't be passed if the built-in msg-spec isn't used. => expose `MsgCodec.ipc_pld_spec` directly from `._dec.type` => presume input `ipc_msg_spec` is `Any` by default when no `ipc_pld_spec` is passed since we have no way atm to enable a similar type-restricted-payload feature without a wrapping "shuttle protocol" ;) - move all the payload-sub-decoders stuff prototyped in GH#311 (inside `.types`) to `._codec` as commented-for-later-maybe `MsgCodec` methods including: - `.mk_pld_subdec()` for registering - `.enc/dec_payload()` for sub-codec field loading. - also comment out `._codec.mk_tagged_union_dec()` as the orig tag-to-decoder table factory, now mostly superseded by `.types.mk_msg_spec()` which takes the generic parameterizing approach instead. - change naming to `types.mk_msg_spec(payload_type_union)` input, making it more explicit that it expects a `Union[Type]`. Oh right, and start exposing all the `.types.Msg` subtypes in the `.msg` subpkg in prep for usage throughout the runtime B)	2025-03-21 15:25:41 -04:00
Tyler Goodlet	9e16cfe8fd	Change to multi-line-static-`dict` style msgs Re-arranging such that element-orders are line-arranged to our new IPC `.msg.types.Msg` fields spec in prep for replacing the current `dict`-as-msg impls with the `msgspec.Struct` native versions!	2025-03-21 15:25:41 -04:00
Tyler Goodlet	6cd74a5dba	Tweak msg-spec test suite mod name	2025-03-21 15:25:41 -04:00
Tyler Goodlet	fe9406be9b	Init def of "SC shuttle prot" with "msg-spec-limiting" As per the long outstanding GH issue this starts our rigorous journey into an attempt at a type-safe, cross-actor SC, IPC protocol Bo boop -> https://github.com/goodboy/tractor/issues/36 The idea is to "formally" define our SC "shuttle (dialog) protocol" by specifying a new `.msg.types.Msg` subtype-set which can fully encapsulate all IPC msg schemas needed in order to accomplish cross-process SC! The msg set deviated a little in terms of (type) names from the existing `dict`-msgs currently used in the runtime impl but, I think the name changes are much better in terms of explicitly representing the internal semantics of the actor runtime machinery/subsystems and the IPC-msg-dialog required for SC enforced RPC. ------ - ------ In cursory, the new formal msgs-spec includes the following msg-subtypes of a new top-level `Msg` boxing type (that holds the base field schema for all msgs): - `Start` to request RPC task scheduling by passing a `FuncSpec` payload (to replace the currently used `{'cmd': ... }` dict msg impl) - `StartAck` to allow the RPC task callee-side to report a `IpcCtxSpec` payload immediately back to the caller (currently responded naively via a `{'functype': ... }` msg) - `Started` to deliver the first value from `Context.started()` (instead of the existing `{'started': ... }`) - `Yield` to shuttle `MsgStream.send()`-ed values (instead of our `{'yield': ... }`) - `Stop` to terminate a `Context.open_stream()` session/block (over `{'stop': True }`) - `Return` to deliver the final value from the `Actor.start_remote_task()` (which is a `{'return': ... 
}`) - `Error` to box `RemoteActorError` exceptions via a `.pld: ErrorData` payload, planned to replace/extend the current `RemoteActorError.msgdata` mechanism internal to `._exceptions.pack/unpack_error()` The new `tractor.msg.types` includes all the above msg defs as well as an API for rendering a "payload type specification" using a `payload_type_spec: Union[Type]` that can be passed to `msgspec.msgpack.Decoder(type=payload_type_spec)`. This ensures that (for a subset of the above msg set) `Msg.pld: PayloadT` data is type-parameterized using `msgspec`'s new `Generic[PayloadT]` field support and thus enables providing for an API where IPC `Context` dialogs can strictly define the allowed payload-datatype-set via type union! Iow, this is the foundation for supporting `Channel`/`Context`/`MsgStream` IPC primitives which are type checked/safe as desired in GH issue: - https://github.com/goodboy/tractor/issues/365 Misc notes on current impl(s) status: ------ - ------ - add a `.msg.types.mk_msg_spec()` which uses the new `msgspec` support for `class MyStruct(Struct, Generic[T])` parameterize-able fields and delivers our boxing SC-msg-(sub)set with the desired `payload_types` applied to `.pld`: - https://jcristharif.com/msgspec/supported-types.html#generic-types - as a note this impl seems to need to use `types.new_class()` dynamic subtype generation, though i don't really get why still.. but without that the `msgspec.msgpack.Decoder` doesn't seem to reject `.pld` limited `Msg` subtypes as demonstrated in the new test. - around this ^ add a `.msg._codec.limit_msg_spec()` cm which exposes this payload type limiting API such that it can be applied per task via a `MsgCodec` in app code. - the orig approach in https://github.com/goodboy/tractor/pull/311 was the idea of making payload fields `.pld: Raw` wherein we could have per-field/sub-msg decoders dynamically loaded depending on the particular application-layer schema in use. 
I don't want to lose the idea of this since I think it might be useful for an idea I have about capability-based-fields(-sharing, maybe using field-subset encryption?), and as such i've kept the (ostensibly) working impls in TODO-comments in `.msg._codec` wherein maybe we can add a `MsgCodec._payload_decs: dict` table for this later on. |_ also left in the `.msg.types.enc/decmsg()` impls but renamed as `enc/dec_payload()` (but reworked to not rely on the lifo codec stack tables; now removed) such that we can prolly move them to `MsgCodec` methods in the future. - add an unused `._codec.mk_tagged_union_dec()` helper which was originally factored out of the #311 proto-code but didn't end up working as desired with the new parameterized generic fields approach (now in `msg.types.mk_msg_spec()`) Testing/deps work: ------ - ------ - new `test_limit_msgspec()` which ensures all the `.types` content is correct but without using the wrapping APIs in `._codec`; i.e. using an in-line `Decoder` instead of a `MsgCodec`. - pin us to `msgspec>=0.18.5` which has the needed generic-types support (which took me way too long yester to figure out when implementing all this XD)!	2025-03-21 15:25:41 -04:00
Tyler Goodlet	b589bef1b6	Move the pretty-`Struct` stuff to a `.pretty_struct` Leave all the proto native struct-msg stuff in `.types` since i'm thinking it's the right name for the mod that will hold all the built-in SCIPP msgspecs longer run. Obvi the naive codec stack stuff needs to be cleaned out/up and anything useful moved into `._codec` ;)	2025-03-21 15:25:41 -04:00
Tyler Goodlet	79c71bfbaf	Merge original content from PR #311 into `.msg.types` for now	2025-03-21 15:25:41 -04:00
Tyler Goodlet	68f170fde1	Re-think, `msgspec`-multi-typed msg dialogs The greasy details are strewn throughout a `msgspec` issue: https://github.com/jcrist/msgspec/issues/140 and specifically this code was mostly written as part of POC example in this comment: https://github.com/jcrist/msgspec/issues/140#issuecomment-1177850792 This work obviously pertains to our desire and prep for typed messaging and capabilities aware msg-oriented-protocols in #196. I added a "wants to have" method to `Context` showing how I think we could offer a pretty neat msg-type-set-as-capability-for-protocol system. XXX NOTE XXX: this commit was rewritten during a rebase from a very old version as per the prior commit.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	10b52ba98a	WIP tagged union message type API XXX NOTE XXX: this is a heavily modified commit from the original (`ec226463`) which was super out of date when rebased onto the current branch. I went through a manual conflict rework and removed all the legacy segments as well as rename-moved this original mod `tractor.msg.py` -> `tractor.msg/_old_msg.py`. Further the `NamespacePath` type def was discarded from this mod since it was from a super old version which was already moved to a `.msg.ptr` submod. As per original questions and discussion with `msgspec` author: - https://github.com/jcrist/msgspec/issues/25 - https://github.com/jcrist/msgspec/issues/140 this prototypes a new (but very naive) `msgspec.Struct` codec implementation which will be more filled out in the next commit.	2025-03-21 15:25:41 -04:00
Tyler Goodlet	65192e80c1	Proto `MsgCodec`, an interchange fmt modify API Fitting in line with the issues outstanding: - #36: (msg)spec-ing out our SCIPP (structured-con-inter-proc-prot). (https://github.com/goodboy/tractor/issues/36) - #196: adding strictly typed IPC msg dialog schemas, more or less better described as "dialog/transaction scoped message specs" using `msgspec`'s tagged unions and custom codecs. (https://github.com/goodboy/tractor/issues/196) - #365: using modern static type-annots to drive capability based messaging and RPC. (statically https://github.com/goodboy/tractor/issues/365) This is a first draft of a new API for dynamically overriding IPC msg codecs for a given interchange lib from any task in the runtime. Right now we obviously only support `msgspec` but ideally this API holds general enough to be used for other backends eventually (like `capnproto`, and apache arrow). Impl is in a new `tractor.msg._codec` with: - a new `MsgCodec` type for encapsing `msgspec.msgpack.Encoder/Decoder` pairs and configuring any custom enc/dec_hooks or typed decoding. - factory `mk_codec()` for creating new codecs ad-hoc from a task. - `contextvars` support for a new `trio.Task` scoped `_ctxvar_MsgCodec: ContextVar[MsgCodec]` named 'msgspec_codec'. - `apply_codec()` for temporarily modifying the above per task as needed around `.open_context()` / `.open_stream()` operation. A new test (suite) in `test_caps_msging.py`: - verify a parent and its child can enable the same custom codec (in this case to transmit `NamespacePath`s) with tons of pedantic ctx-vars checks. - ToDo: still need to implement #36 msg types in order to be able to get decodes working (as in `MsgStream.receive()` will deliver an already created `NamespacePath` obj) since currently all msgs come packed in `dict`-msg wrapper packets.. -> use the proto from PR #35 to get nested `msgspec.Raw` processing up and running Bo	2025-03-21 15:25:41 -04:00
Tyler Goodlet	4e71b57bf5	Prepare to offer (dynamic) `.msg.Codec` overrides By simply allowing an input `codec: tuple` of funcs for now to the `MsgpackTCPStream` transport but, ideally wrapping this in a `Codec` type with an API for dynamic extension of the interchange lib's msg processing settings. Right now we're tied to `msgspec.msgpack` for this transport but with the right design this can likely extend to other libs in the future. Relates to starting feature work toward #36, #196, #365.	2025-03-21 15:25:41 -04:00
