Compare commits

..

114 Commits

Author SHA1 Message Date
Tyler Goodlet 8d12ece8d6 Mask top level import of `.hilevel`
Since it isn't required until the landing of the new service-manager
stuff in #12; was an oversight
from commit `0607a31dddeba032a2cf7d9fe605edd9d7bb4846`.
2025-03-25 12:59:08 -04:00
Tyler Goodlet 8908e1283b Add `.runtime()`-emit to `._invoke()` to report final result msg in the child 2025-03-25 12:59:08 -04:00
Tyler Goodlet 51f72801d2 Add `MsgStream._stop_msg` use new `PldRx` API
In particular ensuring we use `ctx._pld_rx.recv_msg_nowait()` from
`.receive_nowait()` (which is called from `.aclose()`) such that we
ALWAYS (can) set the surrounding `Context._result/._outcome_msg` attrs
on reception of a final `Return`!!

This fixes a final stream-teardown-race-condition-bug where prior we
normally didn't set the `Context._result/._outcome_msg` in such cases.
This is **precisely because**  `.receive_nowait()` only returns the
`pld` and when called from `.aclose()` this value is discarded, meaning
so is its boxing `Return` despite consuming it from the underlying
`._rx_chan`..

Longer term this should be solved differently by ensuring such races
cases are handled at a higher scope like inside `Context._deliver_msg()`
or the `Portal.open_context()` enter/exit blocks? Add a detailed warning
note and todos for all this around the special case block!
2025-03-25 12:59:08 -04:00
Tyler Goodlet 01b5955cce Add `Context._outcome_msg` use new `PldRx` API
Such that any `Return` is always capture for each ctx instance and set
in `._deliver_msg()` normally; ensures we can at least introspect for it
when missing (like in a recently discovered stream teardown race bug).
Yes this augments the already existing `._result` which is dedicated for
the `._outcome_msg.pld` in the non-error case; we might want to see if
there's a nicer way to directly proxy ref to that without getting the
pre-pld-decoded `Raw` form with `msgspec`?

Also use the new `ctx._pld_rx.recv_msg()` and drop assigning
`pld_rx._ctx`.
2025-03-25 12:59:08 -04:00
Tyler Goodlet b0ab77a99d Slight `PldRx` rework to simplify
Namely renaming and tweaking the `MsgType` receiving methods,
- `.recv_msg()` from what was `.recv_msg_w_pld()` which both receives
  the IPC msg from the underlying `._rx_chan` and then decodes its
  payload with `.decode_pld()`; it now also log reports on the different
  "stage of SC dialog protocol" msg types via a `match/case`.
- a new `.recv_msg_nowait()` sync equivalent of ^ (*was*
  `.recv_pld_nowait()`) who's use was the source of a recently
  discovered bug where any final `Return.pld` is being
  consumed-n-discarded by by `MsgStream.aclose()` depending on
  ctx/stream teardown race conditions..

Also,
- remove all the "instance persistent" ipc-ctx attrs, specifically the
  optional `_ipc`, `_ctx` and the `.wraps_ipc()` cm, since none of them
  were ever really needed/used; all methods which require
  a `Context/MsgStream` are explicitly always passed.
- update a buncha typing namely to use the more generic-styled
  `PayloadT` over `Any` and obviously `MsgType[PayloadT]`.
2025-03-25 12:59:08 -04:00
Tyler Goodlet f713a7a859 Rename ext-types with `msgspec` suite module 2025-03-25 12:59:08 -04:00
Tyler Goodlet 85f1a66b41 Complete rename to parent->child IPC ctx peers
Now changed in all comments docs **and** test-code content such that we
aren't using the "caller"->"callee" semantics anymore.
2025-03-25 12:59:08 -04:00
Tyler Goodlet fde6d84b9d Fix msg-draining on `parent_never_opened_stream`!
Repairs a bug in `drain_to_final_msg()` where in the `Yield()` case
block we weren't guarding against the `ctx._stream is None` edge case
which should be treated a `continue`-draining (not a `break` or
attr-error!!) situation since the peer task maybe be continuing to send
`Yield` but has not yet sent an outcome msg (one of
`Return/Error/ContextCancelled`) to terminate the loop. Ensure we
explicitly warn about this case as well as `.cancel()` emit on a taskc.

Thanks again to @guille for discovering this!

Also add temporary `.info()`s around rxed `Return` msgs as part of
trying to debug a different bug discovered while updating the
context-semantics test suite (in a prior commit).
2025-03-25 12:59:08 -04:00
Tyler Goodlet 37b8d77d98 Extend ctx semantics suite for streaming edge cases!
Muchas grax to @guilledk for finding the first issue which kicked of
this further scrutiny of the `tractor.Context` and `MsgStream` semantics
test suite with a strange edge case where,
- if the parent opened and immediately closed a stream while the remote
  child task started and continued (without terminating) to send msgs
  the parent's `open_context().__aexit__()` would **not block** on the
  child to complete!
=> this was seemingly due to a bug discovered inside the
  `.msg._ops.drain_to_final_msg()` stream handling case logic where we
  are NOT checking if `Context._stream` is non-`None`!

As such this,
- extends the `test_caller_closes_ctx_after_callee_opens_stream` (now
  renamed, see below) to include cases for all combinations of the child
  and parent sending before receiving on the stream as well as all
  placements of `Context.cancel()` in the parent before, around and after
  the stream open.
- uses the new `expect_ctxc()` for expecting the taskc (`trio.Task`
  cancelled)` cases.
- also extends the `test_callee_closes_ctx_after_stream_open` (also
  renamed) to include the case where the parent sends a msg before it
  receives.
=> this case has unveiled yet-another-bug where somehow the underlying
  `MsgStream._rx_chan: trio.ReceiveMemoryChannel` is allowing the
  child's `Return[None]` msg be consumed and NOT in a place where it is
  correctly set as `Context._result` resulting in the parent hanging
  forever inside `._ops.drain_to_final_msg()`..

Alongside,
- start renaming using the new "remote-task-peer-side" semantics
  throughout the test module: "caller" -> "parent", "callee" -> "child".
2025-03-25 12:59:08 -04:00
Tyler Goodlet ef7a585570 Deliver a `MaybeBoxedError` from `.expect_ctxc()`
Just like we do from the `.devx._debug.open_crash_handler()`, this
allows checking various attrs on the raised `ContextCancelled` much like
`with pytest.raises() as excinfo:`.
2025-03-25 12:59:08 -04:00
Tyler Goodlet 4d9f6e733a Support `ctx: UnionType` annots for `@tractor.context` eps 2025-03-25 12:59:08 -04:00
Tyler Goodlet e57ac63f56 Avoid attr-err when `._ipc_msg==None`
Seems this can happen in particular when we raise a `MessageTypeError`
on the sender side of a `Context`, since there isn't any msg relayed
from the other side (though i'm wondering if MTE should derive from RAE
then considering this case?).

Means `RemoteActorError.boxed_type = None` in such cases instead of
raising an attr-error for the `None.boxed_type_str`.
2025-03-25 12:59:08 -04:00
Tyler Goodlet 32ef2764b0 Facepalm, fix logic misstep on child side
Namely that `add_hooks: bool` should be the same as on the rent side..
Also, just drop the now unused `iter_maybe_sends`.

This makes the suite entire greeeeen btw, including the new sub-suite
which i hadn't runt before Bo
2025-03-25 12:59:08 -04:00
Tyler Goodlet b6e490295b Rework IPC-using `test_caps_basesd_msging` tests
Namely renaming and massively simplifying it to a new
`test_ext_types_over_ipc` which avoids all the wacky "parent dictates
what sender should be able to send beforehand"..

Instead keep it simple and just always try to send the same small set of
types over the wire with expect-logic to handle each case,

- use the new `dec_hook`/`ext_types` args to `mk_[co]dec()` routines for
  pld-spec ipc transport.
- always try to stream a small set of types from the child with logic to
  handle the cases expected to error.

Other,
- draft a `test_pld_limiting_usage` to check runtime raising of bad API
  usage; haven't run it yet tho.
- move `test_custom_extension_types` to top of mod so that the
  `enc/dec_nsp()` hooks can be reffed from test parametrizations.
- comment out (and maybe remove) the old routines for
  `iter_maybe_sends`, `test_limit_msgspec`, `chk_pld_type`.

XXX TODO, turns out the 2 failing cases from this suite have exposed an
an actual bug with `MsgTypeError` unpacking where the `ipc_msg=` input
is being set to `None` ?? -> see the comment at the bottom of
`._exceptions._mk_recv_mte()` which seems to describe the likely
culprit?
2025-03-25 12:59:08 -04:00
Tyler Goodlet 08e15f441f Raise RTE from `limit_plds()` on no `curr_ctx`
Since it should only be used from within a `Portal.open_context()`
scope, make sure the caller knows that!

Also don't hide the frame in tb if the immediate function errors..
2025-03-25 12:59:08 -04:00
Tyler Goodlet 8cdf3d819f Offer a `mods: list` to `dec_type_union()`; drop importing this-mod 2025-03-25 12:59:08 -04:00
Tyler Goodlet 9b89f87ea1 Tweak type-error messages for when `ext_types` is missing 2025-03-25 12:59:08 -04:00
Tyler Goodlet 29c3895527 Move `Union` serializers to new `msg.` mod
Namely moving `enc/dec_type_union()` from the test mod to a new
`tractor.msg._exts` for general use outside the test suite.
2025-03-25 12:59:08 -04:00
Tyler Goodlet c302b74008 Finally get type-extended `msgspec` fields workinn
By using our new `PldRx` design we can,
- pass through the pld-spec & a `dec_hook()` to our `MsgDec` which is
  used to configure the underlying `.dec: msgspec.msgpack.Decoder`
- pass through a `enc_hook()` to `mk_codec()` and use it to conf the
  equiv `MsgCodec.enc` such that sent msg-plds are converted prior
  to transport.

The trick ended up being just to always union the `mk_dec()`
extension-types spec with the normaly with the `msgspec.Raw` pld-spec
such that the `dec_hook()` is only invoked for payload types tagged
by the encoder/sender side B)

A variety of impl tweaks to make it all happen as well as various
cleanups in the `.msg._codec` mod include,

- `mk_dec()` no defaul `spec` arg, better doc string, accept the new
  `ext_types` arg, doing the union of that with `msgspec.Raw`.
- proto-ed a now unused `mk_boxed_ext_struct()` which will likely get
  removed since it ended up that our `PayloadMsg` structs already cover
  the ext-type-hook requirement that the decoder is passed
  a `.type=msgspec.Struct` of some sort in order for `.dec_hook` to be
  used.
- add a `unpack_spec_types()` util fn for getting the `set[Type]` from
  from a `Union[Type]` annotation instance.
- mk the default `mk_codec(pc_pld_spec = Raw,)` since the `PldRx` design
  was already passing/overriding it and it doesn't make much sense to
  use `Any` anymore for the same reason; it will cause various `Context`
  apis to now break.
  |_ also accept a `enc_hook()` and `ext_types` which are used to maybe
     config the `.msgpack.Encoder`
- generally tweak a bunch of comments-as-docs and todos namely the ones
  that are completed after the pld-rx design was implemented.

Also,
- mask the non-functioning `'defstruct'` approach `inside
  `.msg.types.mk_msg_spec()` to prep for its removal.

Adjust the test suite (rn called `test_caps_based_msging`),
- add a new suite `test_custom_extension_types` and move and
  use the `enc/dec_nsp()` hooks to the mod level for its use.
- prolly planning to drop the `test_limit_msgspec` suite since it's
  mostly replaced by the `test_pldrx_limiting` mod's version?
- originally was tweaking a bunch in `test_codec_hooks_mod` but likely
  it will get mostly rewritten to be simpler and simply verify that
  ext-typed fields can be used over IPC `Context`s between actors (as
  originally intended for this sub-suite).
2025-03-25 12:59:08 -04:00
Tyler Goodlet e42bc33bd6 Move bp to-match-comments on same line for py3.13
In the `examples/debugging/restore_builtin_breakpoint.py` i had put the
pattern-comment lines on the line following the `breakpoint()` bc it
seems that's where `pdb` would always "stop" and print the line to
console? So the test would only pass by actually ensuring that in the
`pexpect` capture..

Now on 3.13 it seems that the `pdb` line halting must have been fixed;
it now renders to console the same `breakpoint()` line?
Anyway it works as you'd expect now but **only** on 3.13 so after this
change we might have to adjust the tests to `pytest.xfail()` on earlier
versions.
2025-03-25 12:54:12 -04:00
Tyler Goodlet bb916fd815 Drop explicit `tabcompleter` dep, `pdpp` already sub-depends on it? 2025-03-25 12:54:03 -04:00
Tyler Goodlet 27e4fc6660 Bump up to `pytest>=8.3.5` to match "GH actions"
Ensure it's only for the `--dev` optional deps.
2025-03-25 12:54:03 -04:00
Tyler Goodlet 5feac62d3f Bump to `msgspec>=0.19.0` for py 3.13 support! 2025-03-25 12:54:03 -04:00
Tyler Goodlet 631fcc0471 Bind another `_bexc` for debuggin 2025-03-25 12:54:03 -04:00
Tyler Goodlet 6e43fe1dd0 Unpack errors from `pdb.bdb`
Like any `bdb.BdbQuit` that might be relayed from a remote context after
a REPl exit with the `quit` cmd. This fixes various issues while
debugging where it may not be clear to the parent task that the child
was terminated with a purposefully unrecoverable error.
2025-03-25 12:54:03 -04:00
Tyler Goodlet 187af24bcc Show frames when decode is handed bad input 2025-03-25 12:54:03 -04:00
Tyler Goodlet 64fbad708e Another loosie in the trioisms suite 2025-03-25 12:54:03 -04:00
Tyler Goodlet 6bd4903d01 Match `maybe_open_crash_handler()` to non-maybe version
Such that it will deliver a `BoxedMaybeException` to the caller
regardless whether `pdb` is set, and proxy through all `**kwargs`.
2025-03-25 12:54:03 -04:00
Tyler Goodlet f6c098d608 Use `collapse_eg()` in broadcaster suite
Around the test embedded `trio.open_nursery()` calls as expected. Also
tidy up the various nursery var names.
2025-03-25 12:54:03 -04:00
Tyler Goodlet 7920e2980b Draft some eg collapsing helpers
Inside a new `.trionics._beg` and exposed from the subpkg ns in
anticipation of the `strict_exception_groups=False` being removed by
`trio` in py 3.15.

Notes,
- mk an embedded single-exc "extractor" using a `BaseExceptionGroup.exceptions` length
  check, when 1 return the lone child.
- use the above in a new `@acm`, async bc it's most likely to be composed in an
  `async with` tuple-style sequence block, called `collapse_eg()` which
  acts a one line "absorber" for when the above mentioned flag is no
  logner supported by `trio.open_nursery()`.

All untested atm fwiw.. but soon to be used in our test suite(s) likely!
2025-03-25 12:54:03 -04:00
Tyler Goodlet e92c3e63ae Fix docs tests with yet another loosie-goosie
So the KBI propagates up to the actor nursery scope and also avoid
running any `examples/multihost/` subdir scripts.
2025-03-25 12:54:03 -04:00
Tyler Goodlet 364ae6f6c8 Another couple loose-ifies for discovery and advanced fault suites 2025-03-25 12:54:03 -04:00
Tyler Goodlet 052b36f1e7 Add (masked) meta-debug-fixture for determining if `debug_mode` is set in harness.. 2025-03-25 12:54:03 -04:00
Tyler Goodlet 261c48e126 Various test tweaks related to 3.13 egs
Including changes like,
- loose eg flagging in various test emedded `trio.open_nursery()`s.
- changes to eg handling (like using `except*`).
- added `debug_mode` integration to tests that needed some REPLin
  in order to figure out appropriate updates.
2025-03-25 12:54:03 -04:00
Tyler Goodlet c793f177f6 Go to loose egs in `Actor` root & service nurseries (for now..) 2025-03-25 12:54:03 -04:00
Tyler Goodlet deb84e0b2c Fix `roundtripped` ref error in `validate_payload_msg()` 2025-03-25 12:54:03 -04:00
Tyler Goodlet 3fa86c82fd Hide `open_nursery()` frame by def 2025-03-25 12:54:03 -04:00
Tyler Goodlet 1407ea26d3 Moar sclang log fmting tweaks 2025-03-25 12:54:03 -04:00
Tyler Goodlet 2903431540 Expose `._state.debug_mode()` predicate at top level 2025-03-25 12:54:03 -04:00
Tyler Goodlet 1da0cba380 Another loose-egs flag in `test_child_manages_service_nursery` 2025-03-25 12:54:03 -04:00
Tyler Goodlet 36eb30daa3 Handle egs on failed `request_root_stdio_lock()`
Namely when the subactor fails to lock the root, in which case we
try to be very verbose about how/what failed in logging as well
as ensure we cancel the employed IPC ctx.

Implement the outer `BaseException` handler to handle both styles,
- match on an eg (or the prior std cancel excs) only raising a lone
  sub-exc from for former.
- always `as _req_err:` and assign to a new func-global `req_err`
  to enable the above matching.

Other,
- raise `DebugStateError` on `status.subactor_uid != actor_uid`.
- fix a `_repl_fail_report` ref error due to making silly assumptions
  about the `_repl_fail_msg` global; now copy from global as default.
- various log-fmt and logic expression styling tweaks.
- ignore `trio.Cancelled` by default in `open_crash_handler()`.
2025-03-25 12:54:03 -04:00
Tyler Goodlet 5bbd9b4e54 A couple more loose-egs flag flips
Namely inside,
- `ActorNursery.open_portal()` which uses
  `.trionics.maybe_open_nursery()` and is now adjusted to
  pass-through `**kwargs` for at least this flag.
- inside the `.trionics.gather_contexts()`.
2025-03-25 12:54:03 -04:00
Tyler Goodlet 89db16c693 Disable tb colors in `._testing.mk_cmd()`
Unset the appropriate cpython osenv var such that our `pexpect` script
runs in the test suite can maintain original matching logic.
2025-03-25 12:54:03 -04:00
Tyler Goodlet 8056a9cf9f Log format tweaks for sclang reprs
A space here, a newline there..
2025-03-25 12:54:03 -04:00
Tyler Goodlet c8a3a6fb2b Expose `hide_tb: bool` from `.open_nursery()`
Such that it gets passed through to `.open_root_actor()` in the
`implicit_runtime==True` case - useful for debugging cases where
`.devx._debug` APIs might be used to avoid REPL clobbering in subactors.
2025-03-25 12:54:03 -04:00
Tyler Goodlet 897fa3d9f2 Flip to `strict_exception_groups=False` in core tns
Since it'll likely need a bit of detailing to get the test suite running
identically with strict egs (exception groups), i've opted to just flip
the switch on a few core nursery scopes for now until as such a time
i can focus enough to port the matching internals.. Xp
2025-03-25 12:54:03 -04:00
Tyler Goodlet 639d6a981c Clean up some imports in `._clustering` 2025-03-25 12:54:03 -04:00
Tyler Goodlet 08941c22a6 Bump various (dev) deps and prefer sys python
Since it turns out there's a few gotchas moving to python 3.13,
- we need to pin to new(er) `trio` which now flips to strict exception
  groups (something to be handled in a follow up patch).
- since we're now using `uv` we should (at least for now) prefer the
  system `python` (over astral's distis) since they compile for
  `libedit` in terms of what the (new) `readline.backend: str` will read
  as; this will break our tab-completion and vi-mode settings in
  the `pdbp` REPL without a user configuring a `~/.editrc`
  appropriately.
- go back to using latest `pdbp` (not a local dev version) since it
  should work fine presuming the previous bullet is addressed.

Lock bumps,
- for now use latest `trio==0.29.0` (which i gotta feeling might have
  broken some existing attempts at strict-eg handling i've tried..)
- update to latest `xonsh`, `pdbp` and its dep `tabcompleter`

Other cleaning,
- put back in various deps "comments" from `poetry` content.
- drop the `xonsh-vox` and `xontrib-vox` dev deps; no `vox` support with
  `uv` rn anyway..
2025-03-25 12:54:03 -04:00
Tyler Goodlet 010d75248e Comment-tag pause points in `asycnio_bp.py`
Thought i already did this but, obvi needed these to make the expect
matches pass in our test.
2025-03-25 12:45:04 -04:00
Tyler Goodlet 47ec7e7a49 Add equiv of `AsyncioCancelled` for aio side
Such that a `TrioCancelled` is raised in the aio task via
`.set_exception()` to explicitly indicate and allow that task to handle
a taskc request from the parent `trio.Task`.
2025-03-25 12:45:04 -04:00
Tyler Goodlet a66caa2397 Drop `asyncio`-canc error from `._exceptions` 2025-03-25 12:45:04 -04:00
Tyler Goodlet b1018a13fe Continue supporting py3.11+
Apparently the only thing needing a guard was use of
`asyncio.Queue.shutdown()` and the paired `QueueShutDown` exception?

Cool.
2025-03-24 21:44:46 -04:00
Tyler Goodlet 90287b9875 Fix an `aio_err` ref bug 2025-03-24 21:44:09 -04:00
Tyler Goodlet 15f99c313e Mask ctlc borked REPL tests
Namely the `tractor.pause_from_sync()` examples using both bg threads
and `asyncio` which seem to go into bad states where SIGINT is ignored..

Deats,
- add `maybe_expect_timeout()` cm to ensure the EOF hangs get
  `.xfail()`ed instead.
- @pytest.mark.ctlcs_bish` `test_pause_from_sync` and don't expect the
  greenback prompt msg.
- also mark `test_sync_pause_from_aio_task`.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 39027cd330 Repair/update `stackscope` test
Seems that on 3.13 it's not showing our script code in the output now?
Gotta get an example for @oremanj to see what's up but really it'd be
nice to just custom format stuff above `trio`'s runtime by def..

Anyway, update the `.devx._stackscope`,
- log formatting to be a little more "sclangy" lookin.
- change the per-actor "delimiter" lines style.
- report the `signal.getsignal(SIGINT)` which i needed in the
  `sync_bp.py` with ctl-c causing a hang..
- mask the `_tree_dumped` duplicator log report as well as the "dumped
  fine" one.
- add an example `pkill --signal SIGUSR1` cmdline.

Tweak the test to cope with,
- not showing our script lines now.. which i've commented in the
  `assert_before()` patts..
- to expect the newly formatted delimiter (ascii) lines to separate the
  root vs. hanger sub-actor sections.
2025-03-24 15:37:12 -04:00
Tyler Goodlet d7dc51a429 Add a mark to `pytest.xfail()` questionably conc py stuff (ur mam `.xfail()`s bish!) 2025-03-24 15:37:12 -04:00
Tyler Goodlet 7720564afb Be extra sure to re-raise EoCs from translator
That is whenever `trio.EndOfChannel` is raised (presumably from the
`._to_trio.receive()` call inside `LinkedTaskChannel.receive()`) we need
to be extra certain that we let it bubble upward transparently DESPITE
special exc-as-signal handling that is normally suppressed from the aio
side; REPEAT we want to ALWAYS bubble any `trio_err ==
trio.EndOfChannel` in the `finally:` handler of `translate_aio_errors()`
despite `chan._trio_to_raise == AsyncioTaskExited` such that the
caller's iterable machinery will operate as normal when the inter-task
stream is stopped (again, presumably by the aio side task terminating
the inter-task stream).

Main impl deats for this,
- in the EoC handler block ensure we assign both `chan._trio_err` and
  the local `trio_err` as well as continue to re-raise.
- add a case to the match block in the `finally:` handler which FOR SURE
  re-raises any `type(trio_err) is EndOfChannel`!

Additionally fix a bad bug,
- a ref bug where we were NOT using the
  `except BaseException as _trio_err` to assign to `chan._trio_err` (by
  accident was missing the leading `_`..)

Unrelated impl tweak,
- move all `maybe_raise_aio_side_err()` content back to inline with its
  parent func - makes it easier to use `tractor.pause()` mostly Bp
- go back to trying to use `aio_task.set_exception(aio_taskc)` for now
  even though i'm pretty sure we're going to move to a try-fute-first
  style helper for this in the future.

Adjust some tests to match/mk-them-green,
- break from `aio_echo_server()` recv loop on
  `to_asyncio.TrioTaskExited` much like how you'd expect to (implicitly
  with a `for`) with a `trio.EndOfChannel`.
- toss in a masked `value is None` pause point i needed for debugging
  inf looping caused by not re-raising EoCs per the main patch
  description.
- add a debug-mode sized delay to root-infected test.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 24c309671d More `debug_mode` test support, better nursery var names 2025-03-24 15:37:12 -04:00
Tyler Goodlet 680501aa10 Add per-side graceful-exit/cancel excs-as-signals
Such that any combination of task terminations/exits can be explicitly
handled and "dual side independent" crash cases re-raised in egs.

The main error-or-exit impl changes include,

- use of new per-side "signaling exceptions":
  - TrioTaskExited|TrioCancelled for signalling aio.
  - AsyncioTaskExited|AsyncioCancelled for signalling trio.

- NOT overloading the `LinkedTaskChannel._trio/aio_err` fields for
  err-as-signal relay and instead add a new pair of
  `._trio/aio_to_raise` maybe-exc-attrs which allow each side's
  task to specify what it would want the other side to raise to signal
  its/a termination outcome:
  - `._trio_to_raise: AsyncioTaskExited|AsyncioCancelled` to signal,
    |_ the aio task having returned while the trio side was still reading
       from the `asyncio.Queue` or is just not `.done()`.
    |_ the aio task being self or trio-request cancelled where
       a `asyncio.CancelledError` is raised and caught but NOT relayed
       as is back to trio; instead signal a "more explicit" exc type.
  - `._aio_to_raise: TrioTaskExited|TrioCancelled` to signal,
    |_ the trio task having returned while the aio side was still reading
       from the mem chan and indicating that the trio side might not
       care any more about future streamed values (like the
       `Stop/EndOfChannel` equivs for ipc `Context`s).
    |_ when the trio task canceld we do
        a `asyncio.Future.set_exception(TrioTaskExited())` to indicate
        to the aio side verbosely that it should cancel due to the trio
        parent.
  - `_aio/trio_err` are now left to only capturing the **actual**
    per-side task excs for introspection / other side's handling logic.

- supporting "graceful exits" depending on API in use from
  `translate_aio_errors()` such that if either side exits but the other
  side isn't expect to consume the final `return`ed value, we just exit
  silently, which required:
  - adding a `suppress_graceful_exits: bool` flag.
  - adjusting the `maybe_raise_aio_side_err()` logic to use that flag
    and suppress only on certain combos of `._trio_to_raise/._trio_err`.
  - prefer to raise `._trio_to_raise` when the aio-side is the src and
    vice versa.

- filling out pedantic logging for cancellation cases indicating which
  side is the cause.

- add a `LinkedTaskChannel._aio_result` modelled after our
  `Context._result` a a similar `.wait_for_result()` interface which
  allows maybe accessing the aio task's final return value if desired
  when using the `open_channel_from()` API.

- rename `cancel_trio()` done handler -> `signal_trio_when_done()`

Also some fairly major test suite updates,
- add a `delay: int` producing fixture which delivers a much larger
  timeout whenever `debug_mode` is set so that the REPL can be used
  without a surrounding cancel firing.
- add a new `test_aio_exits_early_relays_AsyncioTaskExited` including
  a paired `exit_early: bool` flag to `push_from_aio_task()`.
- adjust `test_trio_closes_early_causes_aio_checkpoint_raise` to expect
  a `to_asyncio.TrioTaskExited`.
2025-03-24 15:37:12 -04:00
Tyler Goodlet cc7ad719d4 Another `is` fix.. 2025-03-24 15:37:12 -04:00
Tyler Goodlet 5149b75f25 Unset `$PYTHON_COLORS` for test debugger suite..
Since obvi all our `pexpect` patterns aren't going to match with
a heck-ton of terminal color escape sequences in the output XD
2025-03-24 15:37:12 -04:00
Tyler Goodlet eb3337a593 Tweak some test asserts to better `is` style 2025-03-24 15:37:12 -04:00
Tyler Goodlet b23b55f219 Save an MIA `breakpoint()`-restore test from prior!?
It appears that during the reorg commit
a356233b47 this was intended to be moved
(presumably where i have here) to `test_tooling` but was somehow just
never pasted over XD

Good thing this was caught while going through the remaining TODO
bullets in #2 !!

Also includes fixed relative `.conftest` imports!
2025-03-24 15:37:12 -04:00
Tyler Goodlet ccb717ecc7 Draft test-doc for "out-of-band" `asyncio.Task`..
Since there's no way to activate `greenback`'s portal in such cases, we
should at least have a test verifying our very loud error about the
inability to support this usage..
2025-03-24 15:37:12 -04:00
Tyler Goodlet e0be3397d1 Raise "independent" task errors in an eg
The (rare) condition is heavily detailed in new comments in
the `cancel_trio()` callback but, more or less the idea here is to be
extra pedantic in raising an `Exceptiongroup` of errors from each task
(both `asyncio` and `trio`) whenever the 2 tasks raise "independently"
- in the sense that it's not obviously one side's task causing an error
(or cancellation) in the other. In this case we set the error for each
side on the `LinkedTaskChannel` (via new attrs described later).

As a synopsis, most of this work was refined out of supporting
`infected_aio=True` mode in the **root actor** and in particular as part
of getting that to work inside the `modden` daemon which at the time of
writing was still using the `i3ipc` lib and thus `asyncio`.

Impl deats,
- extend the `LinkedTaskChannel` field/API set (and type it),
  - `._trio_task: trio.Task` for test/user introspection.
- also "stage" some ideas for a more refined interface,
  - `.started()` to deliver the value yielded to the `trio.Task` parent.
   |_ also includes some todos for how to implement this design
      underneath.
  - `._aio_first: Any|None = None` to hold that value ^.
  - `.wait_aio_complete()` for syncing to the asyncio task.
- some detailed logging around "asyncio cancelled trio" case.
- Move `AsyncioCancelled` in this module.

Styling changes,
- generally more explicit var naming.
- some todos for getting modern and fancy with typing..

NB, Let it be known this commit msg was written on a friday with the
help of various "mr. white" solns.
2025-03-24 15:37:12 -04:00
Tyler Goodlet af60417177 Add a `tests/test_root_infect_asyncio`
Might as well break apart the specific test set since there are some
(minor) subtleties and the orig test mod is already getting pretty big
XD

Includes both the new "independent"-event-loops test as well as the std
usage base case suite.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 526e5b91d9 Impl a proto "unmasker" `@acm` alongside our test
Such that the suite verifies the wip `maybe_raise_from_masking_exc()`
will raise from a `trio.Cancelled.__context__` since I can't think of
any reason a `Cancelled` should ever be raised in-place of
a non-`Cancelled` XD

Not sure what should be raised instead (or maybe just a `log.warning()`
emitted?) but this starts a draft for refinement at the least. Use the
new `@pytest.mark.parametrize` explicit tuple-of-params form with an
`pytest.param + `.mark.xfail()` for the default behaviour case.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 6012628223 Add a "raise-from-`finally:`" example test
Since i wasted 2 days just to find an example of this inside an `@acm`,
figured I better reproduce for the purposes of maybe implementing
a warning sys (inside our wip proto `open_taskman()`) when a nursery
detects a single `Cancelled` in an eg where the `.__context__` is set to
some non-cancel error (which likely means a cancel-causing source
exception was suppressed by accident).

Left in a buncha commented code using `maybe_open_nursery()` which
i thought might be part of the issue but didn't end up being required;
will likely remove on a follow up refinement.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 391d3faafd Yield a boxed-maybe-error from `open_crash_handler()`
Along the lines of something like `pytest.raises()` where the handled
exception can be inspected from the `pdbp` REPL using its `.value` field
B)

This is super handy in particular for understanding
`BaseException[Group]`s without manually adding surrounding handler code
to assign the `except[*] Exception as exc_var:` particularly when trying
to understand multi-cancelled eg trees.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 34c1c1713d Add an inter-leaved-task error test
Trying to replicate cases where errors are raised in both `trio` and
`asyncio` tasks independently (at least in `.to_asyncio` API terms) with
a new `test_trio_prestarted_task_bubbles` that generates 3 cases inside
a `@acm` calls stack composing a `trio.Nursery` with
a `to_asyncio.open_channel_from()` call where a set of `trio` tasks are
started in a loop using `.start()` with various exc raising sequences,
- the aio task raising *before* the last `trio` task spawns.
- the aio task raising just after the last trio task spawns, but before
  it starts.
- after the last trio task `.start()` call returns control to the
  parent - but (for now) did not error.

TODO, still more cases to discover as i'm still fighting a `modden` bug
of this sort atm..

Other,
- tweak some other tests to have timeouts since some recent hangs were
  found..
- started mucking with py3.13 and thus adjustments for strict egs in
  some tests; full patchset to test suite likely coming soon!
2025-03-24 15:37:12 -04:00
Tyler Goodlet 5a9a3a457c Hm, `asyncio.Task._fut_waiter.set_exception()`?
Since we can't use it to `Task.set_exception()` (since that task method never
seems to work.. XD) and setting the private/internal always seems to do
the desired raising in the task? I realize it's an internal `asyncio`
runtime field but i'd rather take the risk of it breaking then having to
rely on our own equivalent hack..

Also, it seems like the case where the task's associated (and internal)
future-waiter field is null, we won't run into the (same?) prior hanging
issues (maybe since there's nothing for `asyncio` internals to use to
wait XD ??) when `Task.cancel()` is used..??

Main deats,
- add and `Future.set_exception()` a new signal-exception
  `class TrioTaskExited(AsyncioCancelled):` whenever the trio-task exits
  gracefully and the asyncio-side task is still doing blocking work (of
  some sort) which *seem to* be predicated by a check that
  `._fut_waiter is not None`.
- always call `asyncio.Queue.shutdown()` for the same^ as well as
  whenever we decide to call `Task.cancel()`; in that case the shutdown
  relays correctly?

Some further refinements,
- only warn about `Task.cancel()` usage when actually used ;)
- more local scope vars setting in the exit phase of
  `translate_aio_errors()`.
- also in ^ use explicit caught-exc var names for each error-type.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 3e1258f840 Much more limited `asyncio.Task.cancel()` use
Since it can not only cause the guest-mode run to abandon but also in
some edge cases prevent `trio`-errors from propagating (at least on
py3.12-13?) as discovered as part of supporting this mode officially
in the *root actor*.

As such try to avoid that method as much as possible instead opting to
pass the `trio`-side error via the iter-task channel ref.

Deats,
- add a `LinkedTaskChannel._trio_err: BaseException|None` which gets set
  whenver the `trio.Task` error is caught; ONLY set `AsyncioCancelled`
  when the `trio` task was for sure the cause, whether itself cancelled
  or errored.
- always check for this error when exiting the `asyncio` side (even when
  terminated via a call to `asyncio.Task.cancel()` or during any other
  `CancelledError` handling such that the `asyncio`-task can expect to
  handle `AsyncioCancelled` due to the above^^ cases.
- never `cs.cancel()` the `trio` side unless that cancel scope has not
  yet been `.cancel_called` whatsoever; it's a noop anyway.
- only raise any exc from `asyncio.Task.result()` when `chan._aio_err`
  does not already match it since the existence of the pre-existing
  `task_err` means `asyncio` prolly intends (or has already) raised and
  interrupted the task elsewhere.

Various supporting tweaks,
- don't bother maybe-init-ing `greenback` from the actor entrypoint
  since we already need to (and do) bestow the portals to each `asyncio`
  task spawned using the `run_task()`/`open_channel_from()` API; further
  the init-ing should be done already by client code that enables
  infected mode (even in the root actor).
 |_we should prolly also codify it from any
   `run_daemon(infected_aio=True, debug_mode=True)` usage we offer.
- pass all the `_<field>`s to `Linked TaskChannel` explicitly in named
  kwarg style.
- better sclang-style log reports throughout, particularly on teardowns.
- generally more/better comments and docs around (not well understood)
  edge cases.
- prep to just inline `maybe_raise_aio_side_err()` closure..
2025-03-24 15:37:12 -04:00
Tyler Goodlet 669c09c977 Expose `debug_filter` from `open_root_actor()` also
Such that actor-runtime graceful cancel handling can be used throughout
any process tree.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 1a126effec Drop extra nl from boxed error fmt 2025-03-24 15:37:12 -04:00
Tyler Goodlet 6982c53386 Raise explicitly on missing `greenback` portal
When `.pause_from_sync()` is called from an `asyncio.Task` which was
never bestowed a portal we want to be mega pedantic about it; indicate
that the task was NOT spawned from our `.to_asyncio` API and likely by
some out-of-our-control code (normally using
`asyncio.ensure_future()/.create_task()`). Though `greenback` already
errors on such usage, it's not always clear why no portal exists;
explaining the situation of a 3rd-party-bg-spawned-task should avoid
dev confusion for most cases.

Impl deats,
- distinguish between an actor in infected mode versus the actual caller
  of `.pause_from_sync()` being an `asyncio.Task` with more explicit
  `asyncio_task` and `is_infected_aio` vars.
- ONLY in the case of being both an infected-mode-actor AND detecting
  that the caller is an `asyncio.Task`, check `greenback.has_portal()`
  such that when not bestowed we presume the aforementioned
  3rd-party-bg-task case above and raise a new explicit RTE with
  a detailed explanatory message.
- add some masked draft code for handling the speical case of a root
  actor `asyncio.Task` caller which could (in theory) not actually
  require gb portal use since the `Lock` can be acquired directly
  without IPC.
 |_this will likely require factoring of various pause machinery funcs
   into a `_pause_from_root_task()` to mk the impl sane XD

Other,
- expose a new `debug_filter: Callable` which can be provided by the
  caller of `_maybe_enter_pm()` to predicate whether to enter the
  debugger REPL based on the caught `BaseException|BaseExceptionGroup`;
  this is handy for customizing the meaning of "graceful cancellations"
  so as to avoid crash handling on expected egs of more then
  `trioCancelled`.
|_ make the default as it was implemented: `not is_multi_cancelled(err)`
- pass-through a new `ignore: set[BaseException]` as
  `open_crash_handler(ignore_nested=ignore)` to allow for the same
  silent-cancellation-egs-swallowing as desired from outside the actor
  runtime.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 7d3d1e1afb Accept err-type override in `is_multi_cancelled()`
Such that equivalents of `trio.Cancelled` from other runtimes such as
`asyncio.CancelledError` and `subprocess.CalledProcessError` (with
a `.returncode == -2`) can be gracefully ignored as needed by the
caller.

For example this is handy if you want to avoid debug-mode REPL entry on
an exception-group full of only some subset of exception types since you
expect certain tasks to raise such errors after having been cancelled by
a request from some parent supervision sys (some "higher up"
`trio.CancelScope`, a remote triggered `ContextCancelled` or just from
and OS SIGINT).

Impl deats,
- offer a new `ignore_nested: set[BaseException]` param which by
  default we add `trio.Cancelled` to when no other types are provided.
- use `ExceptionGroup.subgroup(tuple(ignore_nested)` to filter to egs of
  the "ignored sub-errors set" and return any such match (instead of
  `True`).
- detail a comment on exclusion case.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 1222ef1e74 Support passing pre-conf-ed `Logger`
Such that we can hook into 3rd-party-libs more easily to monkey them and
use our (prettier/hipper) console logging with something like (an
example from the client project `modden`),

```python
    connection_mod = i3ipc.connection
    tractor_style_i3ipc_logger: logging.LoggingAdapter = tractor.log.get_console_log(
        _root_name=connection_mod.__name__,
        logger=i3ipc.connection_mod.logger,
        level='info',
    )
    # monkey the instance-ref in 3rd-party module
    connection_mod.logger = our_logger
```

Impl deats,
- expose as `get_console_log(logger: logging.Logger)` and add default
  failover logic.
- toss in more typing, also for mod-global instance.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 1cc87bda50 Support and test infected-`asyncio`-mode for root
Such that you can use,

```python

    tractor.to_asyncio.run_as_asyncio_guest(
        trio_main=_trio_main,
    )
```

to boostrap the root actor (and thus main parent process) to embed
the actor-rumtime into an `asyncio` loop. Prove it all works with an
subactor-free version of the aio echo-server test suite B)
2025-03-24 15:37:12 -04:00
Tyler Goodlet aa2a7f050f TOSQUASH: 9002f60 howtorelease.md file 2025-03-24 15:37:12 -04:00
Tyler Goodlet 636b29e440 Draft a (pretty)`Struct.fields_diff()`
For comparing a `msgspec.Struct` against an input `dict` presumably to
be used as input for struct instantiation. The main diff with
`.__sub__()` is that non-existing fields on either are reported
(loudly).
2025-03-24 15:37:12 -04:00
Tyler Goodlet ef6fdbd09b Spitballing how to expose custom `msgspec` type hooks
Such that maybe we can eventually offer a nicer higher-level API which
implements much of the boilerplate required by `msgspec` (like
type-matched branching to serialization logic) via a type-table
interface or something?

Not sure if the idea is that useful so leaving it all as TODOs for now
obviously.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 5dd38643a7 Add `notes_to_self/howtorelease.md` reminder doc 2025-03-24 15:37:12 -04:00
Tyler Goodlet 6e06a04e14 Add TODO for a runtime-vars passing mechanism 2025-03-24 15:37:12 -04:00
Tyler Goodlet c3e68e4133 Change masked `.pause()` line 2025-03-24 15:37:12 -04:00
Tyler Goodlet 8722c6a1f7 Type the inter-loop chans 2025-03-24 15:37:12 -04:00
Tyler Goodlet 9f6b9e133d Add TODO for a tb frame "filterer" sys.. 2025-03-24 15:37:12 -04:00
Tyler Goodlet 52de75f1d4 Set `RemoteActorError.pformat(boxer_header=self.relay_uid)` by def 2025-03-24 15:37:12 -04:00
Tyler Goodlet 0546b7c684 Support custom `boxer_header: str` provided by `pformat_boxed_tb()` caller 2025-03-24 15:37:12 -04:00
Tyler Goodlet 6a57f28619 Expose a `_ctlc_ignore_header: str` for use in `sigint_shield()` 2025-03-24 15:37:12 -04:00
Tyler Goodlet 73a82fe422 Change `tractor.breakpoint()` to new `.pause()` in test suite 2025-03-24 15:37:12 -04:00
Tyler Goodlet 61ca5b1f61 Wrap `asyncio_bp.py` ex into test suite
Ensuring we can at least use `breakpoint()` from an infected actor's
`asyncio.Task` spawned via a `.to_asyncio` API.

Also includes a little `tests/devx/` reorging,
- start splitting out non-`tractor.pause()` tests into a new
  `test_pause_from_non_trio.py` for all the `.pause_from_sync()`
  use in bg-threaded or `asyncio` applications.
- factor harness commonalities to the `devx/conftest` (namely
  the `do_ctlc()` masher).
- mv `test_pause_from_sync` to the new non`-trio` mod.

NOTE, the `ctlc=True` is still failing for
`test_pause_from_asyncio_task` which is a user-happiness bug but not
anything fundamentally broken - just need to handle the `asyncio` case
in `.devx._debug.sigint_shield()`!
2025-03-24 15:37:12 -04:00
Tyler Goodlet 70416347c1 Add `breakpoint()` hook restoration example + test 2025-03-24 15:37:12 -04:00
Tyler Goodlet a7c06271a0 Rename `n: trio.Nursery` -> `tn` (task nursery) 2025-03-24 15:37:12 -04:00
Tyler Goodlet 9ec7913562 Messy-teardown `DebugStatus` related fixes
Mostly fixing edge cases with `asyncio` and/or bg threads where the
`.repl_release: trio.Event` needs to be used from the main `trio`
thread OW confusing-but-valid teardown tracebacks can show under various
races.

Also improve,
- log reporting for such internal bugs to make them more obvious on
  console via `log.exception()`.
- only restore the SIGINT handler when runtime is (still) active.
- reporting when `tractor.pause(shield=True)` should be used and
  unhiding the internal frames from the tb in that case.
- for `pause_from_sync()` some deep fixes..
 |_add a `allow_no_runtime: bool = False` flag to allow
   **not** requiring the actor runtime to be active.
 |_fix the `greenback` case-branch to only trigger on `not
   is_trio_thread`.
 |_add a scope-global `repl_owner: Task|Thread|None = None` to
   avoid ref errors..
2025-03-24 15:37:12 -04:00
Tyler Goodlet 9b07e7bdeb More `.pause_from_sync()` in bg-threads "polish"
Various `try`/`except` blocks around external APIs that raise when not
running inside an `tractor` and/or some async framework (mostly to avoid
too-late/benign error tbs on certain classes of actor tree teardown):
- for the `log.pdb()` prompts emitted before REPL console entry.
- inside `DebugStatus.is_main_trio_thread()`'s call to `sniffio`.
- in `_post_mortem()` by catching `NoRuntime` when called from a thread
  still active after the `.open_root_actor()` has already exited.

Also,
- create a dedicated `DebugStateError` for raising instead of `assert`s
  when we have actual debug-request inconsistencies (as seem to be most
  likely with bg thread usage of `breakpoint()`).
- show the `open_crash_handler()` frame on `bdb.BdbQuit` (for now?)
2025-03-24 15:37:12 -04:00
Tyler Goodlet 3a53921535 Hide `[maybe]_open_crash_handler()` frame by default 2025-03-24 15:37:12 -04:00
Tyler Goodlet 8114f0d327 Use our `._post_mortem` from `open_crash_handler()`
Since it seems that `pdbp.xpm()` can sometimes lose the up-stack
traceback info/frames? Not sure why but ours seems to work just fine
from a `asyncio`-handler in `modden`'s use of `i3ipc` B)

Also call `DebugStatus.shield_sigint()` from `pause_from_sync()` in the
infected-`asyncio` case to get the same shielding behaviour as in all
other usage!
2025-03-24 15:37:12 -04:00
Tyler Goodlet 0bc4a18ce6 Drop `asyncio_bp` loglevel setting by default 2025-03-24 15:37:12 -04:00
Tyler Goodlet db843b361d First draft, `asyncio`-task, sync-pausing Bo
Mostly due to magic from @oremanj where we slap in a little bit of
`.from_asyncio`-type stuff to run a `trio`-task from `asyncio.Task`
code!

I'm not gonna go into tooo too much detail but basically the primary
thing needed was a way to (blocking-ly) invoke a `trio.lowlevel.Task`
from an `asyncio` one (which we now have with a new
`run_trio_task_in_future()` thanks to draft code from the aforementioned
jefe) which we now invoke from a dedicated aio case-branch inside
`.devx._debug.pause_from_sync()`. Further include a case inside
`DebugStatus.release()` to handle using the same func to set the
`repl_release: trio.Event` from the aio side when releasing the REPL on
exit cmds.

Prolly more refinements to come ;{o
2025-03-24 15:37:12 -04:00
Tyler Goodlet f45d672b54 Fix multi-daemon debug test `break` signal..
It was expecting `AssertionError` as a proceed-in-test signal (by
breaking from a continue loop), but `in_prompt_msg(raise_on_err=True)`
was changed to raise `ValueError`; so instead just use as a predicate
for the `break`.

Also rework `in_prompt_msg()` to accept the `child: BaseSpawn` as input
instead of `before: str` remove the casting boilerplate, and adjust all
usage to match.
2025-03-24 15:37:12 -04:00
Tyler Goodlet cd4df52608 Use "sclang"-style syntax in `to_asyncio` task logging
Just like we've started doing throughout the rest of the actor runtime
for reporting (and where "sclang" = "structured conc (s)lang", our
little supervision-focused operations syntax i've been playing with in
log msg content).

Further tweaks:
- report the `trio_done_fute` alongside the `main_outcome` value.
- add a todo list for supporting `greenback` for pause points.
2025-03-24 15:37:12 -04:00
Tyler Goodlet aa06452c48 Pass `infect_asyncio` setting via runtime-vars
The reason for this "duplication" with the `--asyncio` CLI flag (passed
to the child during spawn) is 2-fold:
- allows verifying inside `Actor._from_parent()` that the `trio` runtime was
  started via `.start_guest_run()` as well as if the
  `Actor._infected_aio` spawn-entrypoint value has been set (by the
  `._entry.<spawn-backend>_main()` whenever `--asyncio` is passed)
  such that any mismatch can be signaled via an `InternalError`.
- enables checking the `._state._runtime_vars['_is_infected_aio']` value
  directly (say from a non-actor/`trio`-thread) instead of calling
  `._state.current_actor(err_on_no_runtime=False)` in certain edge
  cases.

Impl/testing deats:
- add `._state._runtime_vars['_is_infected_aio'] = False` default.
- raise `InternalError` on any `--asyncio`-flag-passed vs.
  `_runtime_vars`-value-relayed-from-parent inside
  `Actor._from_parent()` and include a `Runner.is_guest` assert for good
  measure B)
- set and relay `infect_asyncio: bool` via runtime-vars to child in
  `ActorNursery.start_actor()`.
- verify `actor.is_infected_aio()`, `actor._infected_aio` and
  `_state._runtime_vars['_is_infected_aio']` are all set in test suite's
  `asyncio_actor()` endpoint.
2025-03-24 15:37:12 -04:00
Tyler Goodlet fae0ec9edf Officially test proto-ed `stackscope` integration
By re-purposing our `pexpect`-based console matching with a new
`debugging/shield_hang_in_sub.py` example, this tests a few "hanging
actor" conditions more formally:

- that despite a hanging actor's task we can dump
  a `stackscope.extract()` tree on relay of `SIGUSR1`.
- the actor tree will terminate despite a shielded forever-sleep by our
  "T-800" zombie reaper machinery activating and hard killing the
  underlying subprocess.

Some test deats:
- simulates the expect actions of a real user by manually using
  `os.kill()` to send both signals to the actor-tree program.
- `pexpect`-matches against `log.devx()` emissions under normal
  `debug_mode == True` usage.
- ensure we get the actual "T-800 deployed" `log.error()` msg and
  that the actor tree eventually terminates!

Surrounding (re-org/impl/test-suite) changes:
- allow disabling usage via a `maybe_enable_greenback: bool` to
  `open_root_actor()` but enable by def.
- pretty up the actual `.devx()` content from `.devx._stackscope`
  including be extra pedantic about the conc-primitives for each signal
  event.
- try to avoid double handles of `SIGUSR1` even though it seems the
  original (what i thought was a) problem was actually just double
  logging in the handler..
  |_ avoid double applying the handler func via `signal.signal()`,
  |_ use a global to avoid double handle func calls and,
  |_ a `threading.RLock` around handling.
- move common fixtures and helper routines from `test_debugger` to
  `tests/devx/conftest.py` and import them for use in both test mods.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 506ddb72e1 Start a new `tests/devx/` tooling-subsuite-pkg 2025-03-24 15:37:12 -04:00
Tyler Goodlet d007e965f0 Move `mk_cmd()` to `._testing`
Since we're going to need it more generally for `.devx` sub-sys tooling
tests.

Also, up the sync-pause ctl-c delay another 10ms..
2025-03-24 15:37:12 -04:00
Tyler Goodlet 9817fa5201 Get multi-threaded sync-pausing fully workin!
The final issue was making sure we do the same thing on ctl-c/SIGINT
from the user. That is, if there's already a bg-thread in REPL, we
`log.pdb()` about SIGINT shielding and re-draw the prompt; the same UX
as normal actor-runtime-task behaviour.

Reasons this wasn't workin.. and the fix:
- `.pause_from_sync()` was overriding the local `repl` var with `None`
  delivered by (transitive) calls to `_pause(debug_func=None)`.. so
  remove all that and only assign it OAOO prior to thread-type case
  branching.
- always call `DebugStatus.shield_sigint()` as needed from all requesting
  threads/tasks:
  - in `_pause_from_bg_root_thread()` BEFORE calling `._pause()` AND BEFORE
    yielding back to the bg-thread via `.started(out)` to ensure we're
    definitely overriding the handler in the `trio`-main-thread task
    before unblocking the requesting bg-thread.
  - from any requesting bg-thread in the root actor such that both its
    main-`trio`-thread scheduled task (as per above bullet) AND it are
    SIGINT shielded.
  - always call `.shield_sigint()` BEFORE any `greenback._await()` case
    don't entirely grok why yet, but it works)?
  - for `greenback._await()` case always set `bg_task` to the current one..
- tweaks to the `SIGINT` handler, now renamed `sigint_shield()` so as
  not to name-collide with the methods when editor-searching:
  - always try to `repr()` the REPL thread/task "owner" as well as the
    active `PdbREPL` instance.
  - add `.devx()` notes around the prompt flushing deats and comments
    for any root-actor-bg-thread edge cases.

Related/supporting refinements:
- add `get_lock()`/`get_debug_req()` factory funcs since the plan is to
  eventually implement both as `@singleton` instances per actor.
- fix `acquire_debug_lock()`'s call-sig-bug for scheduling
  `request_root_stdio_lock()`..
- in `._pause()` only call `mk_pdb()` when `debug_func != None`.
- add some todo/warning notes around the `cls.repl = None` in
  `DebugStatus.release()`

`test_pause_from_sync()` tweaks:
- don't use a `attach_patts.copy()`, since we always `break` on match.
- do `pytest.fail()` on that ^ loop's fallthrough..
- pass `do_ctlc(child, patt=attach_key)` such that we always match the
  the current thread's name with the ctl-c triggered `.pdb()` emission.
- oh yeah, return the last `before: str` from `do_ctlc()`.
- in the script, flip `abandon_on_cancel=True` since when `False` it
  seems to cause `trio.run()` to hang on exit from the last bg-thread
  case?!?
2025-03-24 15:37:12 -04:00
Tyler Goodlet eef6e6779c Another tweak to REPL entry `.pdb()` headers 2025-03-24 15:37:12 -04:00
Tyler Goodlet 82a6e5bec0 More failed REPL-lock-request refinements
In `lock_stdio_for_peer()` better internal-error handling/reporting:
- only `Lock._blocked.remove(ctx.cid)` if that same cid was added on
  entry to avoid needless key-errors.
- drop all `Lock.release(force: bool)` usage remnants.
- if `req_ctx.cancel()` fails mention it with `ctx_err.add_note()`.
- add more explicit internal-failed-request log messaging via a new
  `fail_reason: str`.
- use and use new `x)<=\n|_` annots in any failure logging.

Other cleanups/niceties:
- drop `force: bool` flag entirely from the `Lock.release()`.
- use more supervisor-op-annots in `.pdb()` logging
  with both `_pause/crash_msg: str` instead of double '|' lines when
  `.pdb()`-reported from `._set_trace()`/`._post_mortem()`.
2025-03-24 15:37:12 -04:00
Tyler Goodlet b0eb1b7dd6 Todo a test for sync-pausing from non-main-root-tasks 2025-03-24 15:37:12 -04:00
Tyler Goodlet b1f8741575 Use `delay=0` in pump loop..
Turns out it does work XD

Prior presumption was from before I had the fute poll-loop so makes
sense we needed more then one sched-tick's worth of context switch vs.
now we can just keep looping-n-pumping as fast possible until the
guest-run's main task completes.

Also,
- minimize the preface commentary (as per todo) now that we have tests
  codifying all the edge cases :finger_crossed:
- parameter-ize the pump-loop-cycle delay and default it to 0.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 6e66020121 Solve our abandonment issues..
To make the recent set of tests pass this (hopefully) finally solves all
`asyncio` embedded `trio` guest-run abandonment by ensuring we "pump the
event loop" until the guest-run future is fully complete.

Accomplished via simple poll loop of the form `while not
trio_done_fut.done(): await asyncio.sleep(.1)` in the `aio_main()`
task's exception teardown sequence. The loop does a naive 10ms
"pump-via-sleep & poll" for the `trio` side to complete before finally
exiting (and presumably raising) from the SIGINT cancellation.

Other related cleanups and refinements:
- use `asyncio.Task.result()` inside `cancel_trio()` since it also
  inline-raises any exception outcome and we can also log-report the
  result in non-error cases.
- comment out buncha not-sure-we-need-it stuff in `cancel_trio()`.
- remove the botched `AsyncioCancelled(CancelledError):` idea obvi XD
- comment `greenback` init for now in `aio_main()` since (pretty sure)
  we don't ever want to actually REPL in that specific func-as-task?
- always capture any `fute_err: BaseException` from the `main_outcome:
  Outcome` delivered by the `trio` side guest-run task.
- add and raise a new super noisy `AsyncioRuntimeTranslationError`
  whenever we detect that the guest-run `trio_done_fut` has not
  completed before task exit; should avoid abandonment issues ever
  happening again without knowing!
2025-03-24 15:37:12 -04:00
Tyler Goodlet f090bf32f2 Demo-abandonment on shielded `trio`-side work
Finally this reproduces the issue as it (originally?) exhibited inside
`piker` where the `Actor.lifetime_stack` wasn't closed in cases where
during `infected_aio`-actor cancellation/shutdown `trio` side tasks
which are doing shielded (teardown) work are NOT being watched/waited on
from the `aio_main()` task-closure inside `run_as_asyncio_guest()`!

This is then the root cause of the guest-run being abandoned since if
our `aio_main()` task-closure doesn't know it should allow the run to
finish, it's going to call `loop.close()` eventually resulting in the
`GeneratorExit` thrown into `trio._core._run.unrolled_run()`..

So, this extends the `test_sigint_closes_lifetime_stack()` suite to
include cases for such shielded `trio`-task ops:
- add a new `trio_side_is_shielded: bool` which will toggle whether to
  add a shielded 0.5s `trio.sleep()` loop to `manage_file()` which
  should outlive the `asyncio` event-loop shutdown sequence and result
  in an abandoned guest-run and thus a leaked file.
- parametrize the existing suite with this case resulting in a total 16
  test set B)

This patch demonstrates the problem with our `aio_main()` task-closure
impl via the now 4 failing tests, a fix is coming in a follow up commit!
2025-03-24 15:37:12 -04:00
Tyler Goodlet 39155f9633 Lel, revert `AsyncioCancelled` inherit, module..
Turns out it somehow breaks our `to_asyncio` error relay since obvi
`asyncio`'s runtime seems to specially handle it (prolly via
`isinstance()` ?) and it caused our
`test_aio_cancelled_from_aio_causes_trio_cancelled()` to hang..
Further, obvi `unpack_error()` won't be able to find the type def if not
kept inside `._exceptions`..

So given all that, revert the change/move as well as:
- tweak the aio-from-aio cancel test to timeout.
- do `trio.sleep()` conc with any bg aio task by moving out nursery
  block.
- add a `send_sigint_to: str` parameter to
  `test_sigint_closes_lifetime_stack()` such that we test the SIGINT
  being relayed to just the parent or the child.
2025-03-24 15:37:12 -04:00
Tyler Goodlet 60036cfb72 Hack `asyncio` to not abandon a guest-mode run?
Took me a while to figure out what the heck was going on but, turns out
`asyncio` changed their SIGINT handling in 3.11 as per:

https://docs.python.org/3/library/asyncio-runner.html#handling-keyboard-interruption

I'm not entirely sure if it's the 3.11 changes or possibly wtv further
updates were made in 3.12  but more or less due to the way
our current main task was written the `trio` guest-run was getting
abandoned on SIGINTs sent from the OS to the infected child proc..

Note that much of the bug and soln cases are layed out in very detailed
comment-notes both in the new test and `run_as_asyncio_guest()`, right
above the final "fix" lines.

Add new `test_infected_aio.test_sigint_closes_lifetime_stack()` test suite
which reliably triggers all abandonment issues with multiple cases
of different parent behaviour post-sending-SIGINT-to-child:
 1. briefly sleep then raise a KBI in the parent which was originally
    demonstrating the file leak not being cleaned up by `Actor.lifetime_stack.close()`
    and simulates a ctl-c from the console (relayed in tandem by
    the OS to the parent and child processes).
 2. do `Context.wait_for_result()` on the child context which would
    hang and timeout since the actor runtime would never complete and
    thus never relay a `ContextCancelled`.
 3. both with and without running a `asyncio` task in the `manage_file`
    child actor; originally it seemed that with an aio task scheduled in
    the child actor the guest-run abandonment always was the "loud" case
    where there seemed to be some actor teardown but with tbs from
    python failing to gracefully exit the `trio` runtime..

The (seemingly working) "fix" required 2 lines of code to be run inside
a `asyncio.CancelledError` handler around the call to `await trio_done_fut`:
- `Actor.cancel_soon()` which schedules the actor runtime to cancel on
  the next `trio` runner cycle and results in a "self cancellation" of
  the actor.
- "pumping the `asyncio` event loop" with a non-0 `.sleep(0.1)` XD
 |_ seems that a "shielded" pump with some actual `delay: float >= 0`
   did the trick to get `asyncio` to allow the `trio` runner/loop to
   fully complete its guest-run without abandonment.

Other supporting changes:
- move `._exceptions.AsyncioCancelled`, our renamed
  `asyncio.CancelledError` error-sub-type-wrapper, to `.to_asyncio` and make
  it derive from `CancelledError` so as to be sure when raised by our
  `asyncio` x-> `trio` exception relay machinery that `asyncio` is
  getting the specific type it expects during cancellation.
- do "summary status" style logging in `run_as_asyncio_guest()` wherein
  we compile the eventual `startup_msg: str` emitted just before waiting
  on the `trio_done_fut`.
- shield-wait with `out: Outcome = await asyncio.shield(trio_done_fut)`
  even though it seems to do nothing in the SIGINT handling case..(I
  presume it might help avoid abandonment in a `asyncio.Task.cancel()`
  case maybe?)
2025-03-24 15:37:12 -04:00

Diff Content Not Available