Compare commits

822 Commits

Author SHA1 Message Date
Tyler Goodlet fde681fa19 Merge pull request 'Extension types support via msgspec.Encoder/Decoder hooks' () from ext_type_plds into main
Reviewed(-ish)-on: 
2025-03-27 17:43:43 -04:00
Tyler Goodlet efcf81bcad Add `.runtime()`-emit to `._invoke()` to report final result msg in the child 2025-03-27 15:58:03 -04:00
Tyler Goodlet 3988ea69f5 Add `MsgStream._stop_msg` use new `PldRx` API
In particular ensuring we use `ctx._pld_rx.recv_msg_nowait()` from
`.receive_nowait()` (which is called from `.aclose()`) such that we
ALWAYS (can) set the surrounding `Context._result/._outcome_msg` attrs
on reception of a final `Return`!!

This fixes a final stream-teardown-race-condition-bug where prior we
normally didn't set the `Context._result/._outcome_msg` in such cases.
This is **precisely because**  `.receive_nowait()` only returns the
`pld` and when called from `.aclose()` this value is discarded, meaning
so is its boxing `Return` despite consuming it from the underlying
`._rx_chan`..

Longer term this should be solved differently by ensuring such race
cases are handled at a higher scope like inside `Context._deliver_msg()`
or the `Portal.open_context()` enter/exit blocks? Add a detailed warning
note and todos for all this around the special case block!
2025-03-27 15:58:03 -04:00
Tyler Goodlet 8bd4490cad Add `Context._outcome_msg` use new `PldRx` API
Such that any `Return` is always captured for each ctx instance and set
in `._deliver_msg()` normally; ensures we can at least introspect for it
when missing (like in a recently discovered stream teardown race bug).
Yes this augments the already existing `._result` which is dedicated for
the `._outcome_msg.pld` in the non-error case; we might want to see if
there's a nicer way to directly proxy ref to that without getting the
pre-pld-decoded `Raw` form with `msgspec`?

Also use the new `ctx._pld_rx.recv_msg()` and drop assigning
`pld_rx._ctx`.
2025-03-27 15:58:03 -04:00
Tyler Goodlet 622f840dfd Slight `PldRx` rework to simplify
Namely renaming and tweaking the `MsgType` receiving methods,
- `.recv_msg()` from what was `.recv_msg_w_pld()` which both receives
  the IPC msg from the underlying `._rx_chan` and then decodes its
  payload with `.decode_pld()`; it now also log reports on the different
  "stage of SC dialog protocol" msg types via a `match/case`.
- a new `.recv_msg_nowait()` sync equivalent of ^ (*was*
  `.recv_pld_nowait()`) whose use was the source of a recently
  discovered bug where any final `Return.pld` is being
  consumed-n-discarded by `MsgStream.aclose()` depending on
  ctx/stream teardown race conditions..

Also,
- remove all the "instance persistent" ipc-ctx attrs, specifically the
  optional `_ipc`, `_ctx` and the `.wraps_ipc()` cm, since none of them
  were ever really needed/used; all methods which require
  a `Context/MsgStream` are explicitly always passed.
- update a buncha typing namely to use the more generic-styled
  `PayloadT` over `Any` and obviously `MsgType[PayloadT]`.
2025-03-27 15:58:03 -04:00
Tyler Goodlet 8ba315e60c Rename ext-types with `msgspec` suite module 2025-03-27 15:58:03 -04:00
Tyler Goodlet 80f20b35b1 Complete rename to parent->child IPC ctx peers
Now changed in all comments docs **and** test-code content such that we
aren't using the "caller"->"callee" semantics anymore.
2025-03-27 15:58:02 -04:00
Tyler Goodlet 9ec37dd13f Fix msg-draining on `parent_never_opened_stream`!
Repairs a bug in `drain_to_final_msg()` where in the `Yield()` case
block we weren't guarding against the `ctx._stream is None` edge case
which should be treated as a `continue`-draining (not a `break` or
attr-error!!) situation since the peer task may be continuing to send
`Yield` but has not yet sent an outcome msg (one of
`Return/Error/ContextCancelled`) to terminate the loop. Ensure we
explicitly warn about this case as well as `.cancel()` emit on a taskc.

Thanks again to @guille for discovering this!

Also add temporary `.info()`s around rxed `Return` msgs as part of
trying to debug a different bug discovered while updating the
context-semantics test suite (in a prior commit).
2025-03-27 15:58:02 -04:00
Tyler Goodlet 9be76b1dda Extend ctx semantics suite for streaming edge cases!
Muchas grax to @guilledk for finding the first issue which kicked off
this further scrutiny of the `tractor.Context` and `MsgStream` semantics
test suite with a strange edge case where,
- if the parent opened and immediately closed a stream while the remote
  child task started and continued (without terminating) to send msgs
  the parent's `open_context().__aexit__()` would **not block** on the
  child to complete!
=> this was seemingly due to a bug discovered inside the
  `.msg._ops.drain_to_final_msg()` stream handling case logic where we
  are NOT checking if `Context._stream` is non-`None`!

As such this,
- extends the `test_caller_closes_ctx_after_callee_opens_stream` (now
  renamed, see below) to include cases for all combinations of the child
  and parent sending before receiving on the stream as well as all
  placements of `Context.cancel()` in the parent before, around and after
  the stream open.
- uses the new `expect_ctxc()` for expecting the taskc (`trio.Task`
  cancelled) cases.
- also extends the `test_callee_closes_ctx_after_stream_open` (also
  renamed) to include the case where the parent sends a msg before it
  receives.
=> this case has unveiled yet-another-bug where somehow the underlying
  `MsgStream._rx_chan: trio.ReceiveMemoryChannel` is allowing the
  child's `Return[None]` msg to be consumed and NOT in a place where it is
  correctly set as `Context._result` resulting in the parent hanging
  forever inside `._ops.drain_to_final_msg()`..

Alongside,
- start renaming using the new "remote-task-peer-side" semantics
  throughout the test module: "caller" -> "parent", "callee" -> "child".
2025-03-27 15:58:02 -04:00
Tyler Goodlet 31f88b59f4 Deliver a `MaybeBoxedError` from `.expect_ctxc()`
Just like we do from the `.devx._debug.open_crash_handler()`, this
allows checking various attrs on the raised `ContextCancelled` much like
`with pytest.raises() as excinfo:`.
2025-03-27 15:58:02 -04:00
Tyler Goodlet 155d581fa2 Avoid attr-err when `._ipc_msg==None`
Seems this can happen in particular when we raise a `MessageTypeError`
on the sender side of a `Context`, since there isn't any msg relayed
from the other side (though i'm wondering if MTE should derive from RAE
then considering this case?).

Means `RemoteActorError.boxed_type = None` in such cases instead of
raising an attr-error for the `None.boxed_type_str`.
2025-03-27 15:58:02 -04:00
Tyler Goodlet a810f6c8f6 Facepalm, fix logic misstep on child side
Namely that `add_hooks: bool` should be the same as on the parent side..
Also, just drop the now unused `iter_maybe_sends`.

This makes the entire suite greeeeen btw, including the new sub-suite
which i hadn't run before Bo
2025-03-27 15:58:02 -04:00
Tyler Goodlet 83b9dc3c62 Rework IPC-using `test_caps_based_msging` tests
Namely renaming and massively simplifying it to a new
`test_ext_types_over_ipc` which avoids all the wacky "parent dictates
what sender should be able to send beforehand"..

Instead keep it simple and just always try to send the same small set of
types over the wire with expect-logic to handle each case,

- use the new `dec_hook`/`ext_types` args to `mk_[co]dec()` routines for
  pld-spec ipc transport.
- always try to stream a small set of types from the child with logic to
  handle the cases expected to error.

Other,
- draft a `test_pld_limiting_usage` to check runtime raising of bad API
  usage; haven't run it yet tho.
- move `test_custom_extension_types` to top of mod so that the
  `enc/dec_nsp()` hooks can be reffed from test parametrizations.
- comment out (and maybe remove) the old routines for
  `iter_maybe_sends`, `test_limit_msgspec`, `chk_pld_type`.

XXX TODO, turns out the 2 failing cases from this suite have exposed an
actual bug with `MsgTypeError` unpacking where the `ipc_msg=` input
is being set to `None` ?? -> see the comment at the bottom of
`._exceptions._mk_recv_mte()` which seems to describe the likely
culprit?
2025-03-27 15:58:02 -04:00
Tyler Goodlet f152a20025 Raise RTE from `limit_plds()` on no `curr_ctx`
Since it should only be used from within a `Portal.open_context()`
scope, make sure the caller knows that!

Also don't hide the frame in tb if the immediate function errors..
2025-03-27 15:58:02 -04:00
Tyler Goodlet 1ea8254ae3 Offer a `mods: list` to `dec_type_union()`; drop importing this-mod 2025-03-27 15:58:02 -04:00
Tyler Goodlet 8ed890f892 Tweak type-error messages for when `ext_types` is missing 2025-03-27 15:58:02 -04:00
Tyler Goodlet d4e6f2b8dc Move `Union` serializers to new `msg.` mod
Namely moving `enc/dec_type_union()` from the test mod to a new
`tractor.msg._exts` for general use outside the test suite.
2025-03-27 15:58:02 -04:00
Tyler Goodlet 64fe767647 Finally get type-extended `msgspec` fields workinn
By using our new `PldRx` design we can,
- pass through the pld-spec & a `dec_hook()` to our `MsgDec` which is
  used to configure the underlying `.dec: msgspec.msgpack.Decoder`
- pass through a `enc_hook()` to `mk_codec()` and use it to conf the
  equiv `MsgCodec.enc` such that sent msg-plds are converted prior
  to transport.

The trick ended up being just to always union the `mk_dec()`
extension-types spec with the normal `msgspec.Raw` pld-spec
such that the `dec_hook()` is only invoked for payload types tagged
by the encoder/sender side B)

A variety of impl tweaks to make it all happen as well as various
cleanups in the `.msg._codec` mod include,

- `mk_dec()` no default `spec` arg, better doc string, accept the new
  `ext_types` arg, doing the union of that with `msgspec.Raw`.
- proto-ed a now unused `mk_boxed_ext_struct()` which will likely get
  removed since it ended up that our `PayloadMsg` structs already cover
  the ext-type-hook requirement that the decoder is passed
  a `.type=msgspec.Struct` of some sort in order for `.dec_hook` to be
  used.
- add a `unpack_spec_types()` util fn for getting the `set[Type]` from
  a `Union[Type]` annotation instance.
- mk the default `mk_codec(pc_pld_spec = Raw,)` since the `PldRx` design
  was already passing/overriding it and it doesn't make much sense to
  use `Any` anymore for the same reason; it will cause various `Context`
  apis to now break.
  |_ also accept a `enc_hook()` and `ext_types` which are used to maybe
     config the `.msgpack.Encoder`
- generally tweak a bunch of comments-as-docs and todos namely the ones
  that are completed after the pld-rx design was implemented.

Also,
- mask the non-functioning `'defstruct'` approach inside
  `.msg.types.mk_msg_spec()` to prep for its removal.

Adjust the test suite (rn called `test_caps_based_msging`),
- add a new suite `test_custom_extension_types` and move and
  use the `enc/dec_nsp()` hooks to the mod level for its use.
- prolly planning to drop the `test_limit_msgspec` suite since it's
  mostly replaced by the `test_pldrx_limiting` mod's version?
- originally was tweaking a bunch in `test_codec_hooks_mod` but likely
  it will get mostly rewritten to be simpler and simply verify that
  ext-typed fields can be used over IPC `Context`s between actors (as
  originally intended for this sub-suite).
2025-03-27 15:58:02 -04:00
Tyler Goodlet aca015f1c2 Mask top level import of `.hilevel`
Since it isn't required until the landing of the new service-manager
stuff in ; was an oversight
from commit `0607a31dddeba032a2cf7d9fe605edd9d7bb4846`.
2025-03-27 15:57:44 -04:00
Tyler Goodlet 818cd8535f Support `ctx: UnionType` annots for `@tractor.context` eps 2025-03-27 15:56:39 -04:00
goodboy 1e86722357 Merge pull request 'Python 3.13 support' () from py313_support into main
Reviewed-on: 
2025-03-27 19:50:43 +00:00
Tyler Goodlet eda48c8021 Move bp to-match-comments on same line for py3.13
In the `examples/debugging/restore_builtin_breakpoint.py` i had put the
pattern-comment lines on the line following the `breakpoint()` bc it
seems that's where `pdb` would always "stop" and print the line to
console? So the test would only pass by actually ensuring that in the
`pexpect` capture..

Now on 3.13 it seems that the `pdb` line halting must have been fixed;
it now renders to console the same `breakpoint()` line?
Anyway it works as you'd expect now but **only** on 3.13 so after this
change we might have to adjust the tests to `pytest.xfail()` on earlier
versions.
2025-03-27 13:38:47 -04:00
Tyler Goodlet ceda1e466d Drop explicit `tabcompleter` dep, `pdbp` already sub-depends on it? 2025-03-27 13:38:47 -04:00
Tyler Goodlet d14d29ae8c Bump up to `pytest>=8.3.5` to match "GH actions"
Ensure it's only for the `--dev` optional deps.
2025-03-27 13:38:47 -04:00
Tyler Goodlet f068782e74 Bump to `msgspec>=0.19.0` for py 3.13 support! 2025-03-27 13:38:47 -04:00
Tyler Goodlet 84b04639f8 Bind another `_bexc` for debuggin 2025-03-27 13:38:47 -04:00
Tyler Goodlet 4aa7e8c022 Unpack errors from `pdb.bdb`
Like any `bdb.BdbQuit` that might be relayed from a remote context after
a REPL exit with the `quit` cmd. This fixes various issues while
debugging where it may not be clear to the parent task that the child
was terminated with a purposefully unrecoverable error.
2025-03-27 13:38:47 -04:00
Tyler Goodlet b46a886449 Show frames when decode is handed bad input 2025-03-27 13:38:47 -04:00
Tyler Goodlet a26f817ed1 Another loosie in the trioisms suite 2025-03-27 13:38:47 -04:00
Tyler Goodlet 2d18e6a4be Match `maybe_open_crash_handler()` to non-maybe version
Such that it will deliver a `BoxedMaybeException` to the caller
regardless whether `pdb` is set, and proxy through all `**kwargs`.
2025-03-27 13:38:47 -04:00
Tyler Goodlet e815dcd3c8 Use `collapse_eg()` in broadcaster suite
Around the test embedded `trio.open_nursery()` calls as expected. Also
tidy up the various nursery var names.
2025-03-27 13:38:47 -04:00
Tyler Goodlet 0d7b3f1ac5 Draft some eg collapsing helpers
Inside a new `.trionics._beg` and exposed from the subpkg ns in
anticipation of the `strict_exception_groups=False` being removed by
`trio` in py 3.15.

Notes,
- mk an embedded single-exc "extractor" using a `BaseExceptionGroup.exceptions` length
  check, when 1 return the lone child.
- use the above in a new `@acm`, async bc it's most likely to be composed in an
  `async with` tuple-style sequence block, called `collapse_eg()` which
  acts as a one line "absorber" for when the above mentioned flag is no
  longer supported by `trio.open_nursery()`.

All untested atm fwiw.. but soon to be used in our test suite(s) likely!
2025-03-27 13:38:47 -04:00
Tyler Goodlet 3ad558230a Fix docs tests with yet another loosie-goosie
So the KBI propagates up to the actor nursery scope and also avoid
running any `examples/multihost/` subdir scripts.
2025-03-27 13:38:47 -04:00
Tyler Goodlet 22f405a707 Another couple loose-ifies for discovery and advanced fault suites 2025-03-27 13:38:47 -04:00
Tyler Goodlet e5bcefb575 Add (masked) meta-debug-fixture for determining if `debug_mode` is set in harness.. 2025-03-27 13:38:47 -04:00
Tyler Goodlet 8f7c022afe Various test tweaks related to 3.13 egs
Including changes like,
- loose eg flagging in various test embedded `trio.open_nursery()`s.
- changes to eg handling (like using `except*`).
- added `debug_mode` integration to tests that needed some REPLin
  in order to figure out appropriate updates.
2025-03-27 13:38:47 -04:00
Tyler Goodlet c453623b9b Go to loose egs in `Actor` root & service nurseries (for now..) 2025-03-27 13:38:47 -04:00
Tyler Goodlet 6e68f51617 Fix `roundtripped` ref error in `validate_payload_msg()` 2025-03-27 13:38:47 -04:00
Tyler Goodlet fdf934d02d Hide `open_nursery()` frame by def 2025-03-27 13:38:47 -04:00
Tyler Goodlet 13572151aa Moar sclang log fmting tweaks 2025-03-27 13:38:47 -04:00
Tyler Goodlet 87342696a1 Expose `._state.debug_mode()` predicate at top level 2025-03-27 13:38:47 -04:00
Tyler Goodlet 8f774f52b1 Another loose-egs flag in `test_child_manages_service_nursery` 2025-03-27 13:38:47 -04:00
Tyler Goodlet 8b4ed31d3b Handle egs on failed `request_root_stdio_lock()`
Namely when the subactor fails to lock the root, in which case we
try to be very verbose about how/what failed in logging as well
as ensure we cancel the employed IPC ctx.

Implement the outer `BaseException` handler to handle both styles,
- match on an eg (or the prior std cancel excs) only raising a lone
  sub-exc from the former.
- always `as _req_err:` and assign to a new func-global `req_err`
  to enable the above matching.

Other,
- raise `DebugStateError` on `status.subactor_uid != actor_uid`.
- fix a `_repl_fail_report` ref error due to making silly assumptions
  about the `_repl_fail_msg` global; now copy from global as default.
- various log-fmt and logic expression styling tweaks.
- ignore `trio.Cancelled` by default in `open_crash_handler()`.
2025-03-27 13:38:47 -04:00
Tyler Goodlet eb18168a4e A couple more loose-egs flag flips
Namely inside,
- `ActorNursery.open_portal()` which uses
  `.trionics.maybe_open_nursery()` and is now adjusted to
  pass-through `**kwargs` for at least this flag.
- inside the `.trionics.gather_contexts()`.
2025-03-27 13:38:47 -04:00
Tyler Goodlet 6b2809b82e Disable tb colors in `._testing.mk_cmd()`
Unset the appropriate cpython osenv var such that our `pexpect` script
runs in the test suite can maintain original matching logic.
2025-03-27 13:38:47 -04:00
Tyler Goodlet aa80b55567 Log format tweaks for sclang reprs
A space here, a newline there..
2025-03-27 13:38:47 -04:00
Tyler Goodlet 4186541724 Expose `hide_tb: bool` from `.open_nursery()`
Such that it gets passed through to `.open_root_actor()` in the
`implicit_runtime==True` case - useful for debugging cases where
`.devx._debug` APIs might be used to avoid REPL clobbering in subactors.
2025-03-27 13:38:47 -04:00
Tyler Goodlet f0deda1fda Flip to `strict_exception_groups=False` in core tns
Since it'll likely need a bit of detailing to get the test suite running
identically with strict egs (exception groups), i've opted to just flip
the switch on a few core nursery scopes for now until such a time as
i can focus enough to port the matching internals.. Xp
2025-03-27 13:38:47 -04:00
Tyler Goodlet 8f369b5132 Clean up some imports in `._clustering` 2025-03-27 13:38:47 -04:00
Tyler Goodlet aa3432f2a4 Bump various (dev) deps and prefer sys python
Since it turns out there's a few gotchas moving to python 3.13,
- we need to pin to new(er) `trio` which now flips to strict exception
  groups (something to be handled in a follow up patch).
- since we're now using `uv` we should (at least for now) prefer the
  system `python` (over astral's distis) since they compile for
  `libedit` in terms of what the (new) `readline.backend: str` will read
  as; this will break our tab-completion and vi-mode settings in
  the `pdbp` REPL without a user configuring a `~/.editrc`
  appropriately.
- go back to using latest `pdbp` (not a local dev version) since it
  should work fine presuming the previous bullet is addressed.

Lock bumps,
- for now use latest `trio==0.29.0` (which i gotta feeling might have
  broken some existing attempts at strict-eg handling i've tried..)
- update to latest `xonsh`, `pdbp` and its dep `tabcompleter`

Other cleaning,
- put back in various deps "comments" from `poetry` content.
- drop the `xonsh-vox` and `xontrib-vox` dev deps; no `vox` support with
  `uv` rn anyway..
2025-03-27 13:38:47 -04:00
goodboy 222b90940c Merge pull request 'Prevent `asyncio` from abandoning guest-runs, `.pause_from_sync()` support via `.to_asyncio`' () from aio_abandons into main
Reviewed-ish-on: 
2025-03-27 17:37:57 +00:00
Tyler Goodlet c91373148a Comment-tag pause points in `asyncio_bp.py`
Thought i already did this but, obvi needed these to make the expect
matches pass in our test.
2025-03-27 13:24:25 -04:00
Tyler Goodlet f1af87007e Add equiv of `AsyncioCancelled` for aio side
Such that a `TrioCancelled` is raised in the aio task via
`.set_exception()` to explicitly indicate and allow that task to handle
a taskc request from the parent `trio.Task`.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 13adaa110a Drop `asyncio`-canc error from `._exceptions` 2025-03-27 13:24:25 -04:00
Tyler Goodlet 9e10064bda Continue supporting py3.11+
Apparently the only thing needing a guard was use of
`asyncio.Queue.shutdown()` and the paired `QueueShutDown` exception?

Cool.
2025-03-27 13:24:25 -04:00
Tyler Goodlet bde355dcd5 Fix an `aio_err` ref bug 2025-03-27 13:24:25 -04:00
Tyler Goodlet b021772a1e Mask ctlc borked REPL tests
Namely the `tractor.pause_from_sync()` examples using both bg threads
and `asyncio` which seem to go into bad states where SIGINT is ignored..

Deats,
- add `maybe_expect_timeout()` cm to ensure the EOF hangs get
  `.xfail()`ed instead.
- `@pytest.mark.ctlcs_bish` `test_pause_from_sync` and don't expect the
  greenback prompt msg.
- also mark `test_sync_pause_from_aio_task`.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 03406e020c Repair/update `stackscope` test
Seems that on 3.13 it's not showing our script code in the output now?
Gotta get an example for @oremanj to see what's up but really it'd be
nice to just custom format stuff above `trio`'s runtime by def..

Anyway, update the `.devx._stackscope`,
- log formatting to be a little more "sclangy" lookin.
- change the per-actor "delimiter" lines style.
- report the `signal.getsignal(SIGINT)` which i needed in the
  `sync_bp.py` with ctl-c causing a hang..
- mask the `_tree_dumped` duplicator log report as well as the "dumped
  fine" one.
- add an example `pkill --signal SIGUSR1` cmdline.

Tweak the test to cope with,
- not showing our script lines now.. which i've commented in the
  `assert_before()` patts..
- to expect the newly formatted delimiter (ascii) lines to separate the
  root vs. hanger sub-actor sections.
2025-03-27 13:24:25 -04:00
Tyler Goodlet b0acc9ffe8 Add a mark to `pytest.xfail()` questionably conc py stuff (ur mam `.xfail()`s bish!) 2025-03-27 13:24:25 -04:00
Tyler Goodlet fc325a621b Be extra sure to re-raise EoCs from translator
That is whenever `trio.EndOfChannel` is raised (presumably from the
`._to_trio.receive()` call inside `LinkedTaskChannel.receive()`) we need
to be extra certain that we let it bubble upward transparently DESPITE
special exc-as-signal handling that is normally suppressed from the aio
side; REPEAT we want to ALWAYS bubble any `trio_err ==
trio.EndOfChannel` in the `finally:` handler of `translate_aio_errors()`
despite `chan._trio_to_raise == AsyncioTaskExited` such that the
caller's iterable machinery will operate as normal when the inter-task
stream is stopped (again, presumably by the aio side task terminating
the inter-task stream).

Main impl deats for this,
- in the EoC handler block ensure we assign both `chan._trio_err` and
  the local `trio_err` as well as continue to re-raise.
- add a case to the match block in the `finally:` handler which FOR SURE
  re-raises any `type(trio_err) is EndOfChannel`!

Additionally fix a bad bug,
- a ref bug where we were NOT using the
  `except BaseException as _trio_err` to assign to `chan._trio_err` (by
  accident was missing the leading `_`..)

Unrelated impl tweak,
- move all `maybe_raise_aio_side_err()` content back to inline with its
  parent func - makes it easier to use `tractor.pause()` mostly Bp
- go back to trying to use `aio_task.set_exception(aio_taskc)` for now
  even though i'm pretty sure we're going to move to a try-fute-first
  style helper for this in the future.

Adjust some tests to match/mk-them-green,
- break from `aio_echo_server()` recv loop on
  `to_asyncio.TrioTaskExited` much like how you'd expect to (implicitly
  with a `for`) with a `trio.EndOfChannel`.
- toss in a masked `value is None` pause point i needed for debugging
  inf looping caused by not re-raising EoCs per the main patch
  description.
- add a debug-mode sized delay to root-infected test.
2025-03-27 13:24:25 -04:00
Tyler Goodlet d5ba9be3a9 More `debug_mode` test support, better nursery var names 2025-03-27 13:24:25 -04:00
Tyler Goodlet 639186aa37 Add per-side graceful-exit/cancel excs-as-signals
Such that any combination of task terminations/exits can be explicitly
handled and "dual side independent" crash cases re-raised in egs.

The main error-or-exit impl changes include,

- use of new per-side "signaling exceptions":
  - TrioTaskExited|TrioCancelled for signalling aio.
  - AsyncioTaskExited|AsyncioCancelled for signalling trio.

- NOT overloading the `LinkedTaskChannel._trio/aio_err` fields for
  err-as-signal relay and instead add a new pair of
  `._trio/aio_to_raise` maybe-exc-attrs which allow each side's
  task to specify what it would want the other side to raise to signal
  its/a termination outcome:
  - `._trio_to_raise: AsyncioTaskExited|AsyncioCancelled` to signal,
    |_ the aio task having returned while the trio side was still reading
       from the `asyncio.Queue` or is just not `.done()`.
    |_ the aio task being self or trio-request cancelled where
       a `asyncio.CancelledError` is raised and caught but NOT relayed
       as is back to trio; instead signal a "more explicit" exc type.
  - `._aio_to_raise: TrioTaskExited|TrioCancelled` to signal,
    |_ the trio task having returned while the aio side was still reading
       from the mem chan and indicating that the trio side might not
       care any more about future streamed values (like the
       `Stop/EndOfChannel` equivs for ipc `Context`s).
    |_ when the trio task is cancelled we do
        a `asyncio.Future.set_exception(TrioTaskExited())` to indicate
        to the aio side verbosely that it should cancel due to the trio
        parent.
  - `_aio/trio_err` are now left to only capturing the **actual**
    per-side task excs for introspection / other side's handling logic.

- supporting "graceful exits" depending on API in use from
  `translate_aio_errors()` such that if either side exits but the other
  side isn't expected to consume the final `return`ed value, we just exit
  silently, which required:
  - adding a `suppress_graceful_exits: bool` flag.
  - adjusting the `maybe_raise_aio_side_err()` logic to use that flag
    and suppress only on certain combos of `._trio_to_raise/._trio_err`.
  - prefer to raise `._trio_to_raise` when the aio-side is the src and
    vice versa.

- filling out pedantic logging for cancellation cases indicating which
  side is the cause.

- add a `LinkedTaskChannel._aio_result` modelled after our
  `Context._result` and a similar `.wait_for_result()` interface which
  allows maybe accessing the aio task's final return value if desired
  when using the `open_channel_from()` API.

- rename `cancel_trio()` done handler -> `signal_trio_when_done()`

Also some fairly major test suite updates,
- add a `delay: int` producing fixture which delivers a much larger
  timeout whenever `debug_mode` is set so that the REPL can be used
  without a surrounding cancel firing.
- add a new `test_aio_exits_early_relays_AsyncioTaskExited` including
  a paired `exit_early: bool` flag to `push_from_aio_task()`.
- adjust `test_trio_closes_early_causes_aio_checkpoint_raise` to expect
  a `to_asyncio.TrioTaskExited`.
2025-03-27 13:24:25 -04:00
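The exception-as-signal relay via `asyncio.Future.set_exception()` can be illustrated with plain `asyncio`. `TrioTaskExited` below is a locally defined stand-in and the "trio side" is simulated by the `main()` coroutine, so this only sketches the mechanism and not the `LinkedTaskChannel` machinery.

```python
import asyncio


class TrioTaskExited(Exception):
    '''Stand-in signal exc: "the other side is done with this stream".'''


async def aio_worker(fut: asyncio.Future) -> str:
    try:
        # Block until the other side resolves (or errors) the future.
        await fut
    except TrioTaskExited:
        # The signal arrives as a normal, catchable exception.
        return 'aio side saw TrioTaskExited'
    return 'finished normally'


async def main() -> str:
    fut = asyncio.get_running_loop().create_future()
    task = asyncio.create_task(aio_worker(fut))

    # Let the worker run up to its `await fut`.
    await asyncio.sleep(0)

    # The "other side" signals its exit by setting an exception.
    fut.set_exception(TrioTaskExited('trio side exited'))
    return await task
```

Because the signal is an explicit exception type, the receiving task can tell a graceful exit apart from a cancel or a crash and react accordingly.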
Tyler Goodlet 182218a776 Another `is` fix.. 2025-03-27 13:24:25 -04:00
Tyler Goodlet 6de17a3949 Unset `$PYTHON_COLORS` for test debugger suite..
Since obvi all our `pexpect` patterns aren't going to match with
a heck-ton of terminal color escape sequences in the output XD
2025-03-27 13:24:25 -04:00
Tyler Goodlet 41a3297b9f Tweak some test asserts to better `is` style 2025-03-27 13:24:25 -04:00
Tyler Goodlet 255db4b127 Save an MIA `breakpoint()`-restore test from prior!?
It appears that during the reorg commit
a356233b47 this was intended to be moved
(presumably where i have here) to `test_tooling` but was somehow just
never pasted over XD

Good thing this was caught while going through the remaining TODO
bullets in  !!

Also includes fixed relative `.conftest` imports!
2025-03-27 13:24:25 -04:00
Tyler Goodlet 66a7d660f6 Draft test-doc for "out-of-band" `asyncio.Task`..
Since there's no way to activate `greenback`'s portal in such cases, we
should at least have a test verifying our very loud error about the
inability to support this usage..
2025-03-27 13:24:25 -04:00
Tyler Goodlet f199cac5e8 Raise "independent" task errors in an eg
The (rare) condition is heavily detailed in new comments in
the `cancel_trio()` callback but, more or less the idea here is to be
extra pedantic in raising an `ExceptionGroup` of errors from each task
(both `asyncio` and `trio`) whenever the 2 tasks raise "independently"
- in the sense that it's not obviously one side's task causing an error
(or cancellation) in the other. In this case we set the error for each
side on the `LinkedTaskChannel` (via new attrs described later).

As a synopsis, most of this work was refined out of supporting
`infected_aio=True` mode in the **root actor** and in particular as part
of getting that to work inside the `modden` daemon which at the time of
writing was still using the `i3ipc` lib and thus `asyncio`.

Impl deats,
- extend the `LinkedTaskChannel` field/API set (and type it),
  - `._trio_task: trio.Task` for test/user introspection.
- also "stage" some ideas for a more refined interface,
  - `.started()` to deliver the value yielded to the `trio.Task` parent.
   |_ also includes some todos for how to implement this design
      underneath.
  - `._aio_first: Any|None = None` to hold that value ^.
  - `.wait_aio_complete()` for syncing to the asyncio task.
- some detailed logging around "asyncio cancelled trio" case.
- Move `AsyncioCancelled` into this module.

Styling changes,
- generally more explicit var naming.
- some todos for getting modern and fancy with typing..

NB, Let it be known this commit msg was written on a friday with the
help of various "mr. white" solns.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 9b393338ca Add a `tests/test_root_infect_asyncio`
Might as well break apart the specific test set since there are some
(minor) subtleties and the orig test mod is already getting pretty big
XD

Includes both the new "independent"-event-loops test as well as the std
usage base case suite.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 4edf36a895 Impl a proto "unmasker" `@acm` alongside our test
Such that the suite verifies the wip `maybe_raise_from_masking_exc()`
will raise from a `trio.Cancelled.__context__` since I can't think of
any reason a `Cancelled` should ever be raised in-place of
a non-`Cancelled` XD

Not sure what should be raised instead (or maybe just a `log.warning()`
emitted?) but this starts a draft for refinement at the least. Use the
new `@pytest.mark.parametrize` explicit tuple-of-params form with a
`pytest.param` + `.mark.xfail()` for the default behaviour case.
2025-03-27 13:24:25 -04:00
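A stdlib-only sketch of the unmasking idea, assuming a stand-in `FakeCancelled` type in place of `trio.Cancelled` and a hypothetical `unmask_sketch()` manager. It re-raises whatever non-cancel error sits in the masking exception's `.__context__`.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeCancelled(BaseException):
    '''Stand-in for `trio.Cancelled` so no third-party dep is needed.'''


@asynccontextmanager
async def unmask_sketch(mask_type: type = FakeCancelled):
    try:
        yield
    except mask_type as masker:
        masked = masker.__context__
        # Only unmask when a *different* kind of error was hidden.
        if (
            masked is not None
            and not isinstance(masked, mask_type)
        ):
            raise masked
        raise


async def body() -> None:
    try:
        raise ValueError('the real problem')
    finally:
        # Raising during cleanup masks the in-flight error, which
        # then only survives as this exception's `.__context__`.
        raise FakeCancelled('raised during cleanup')


async def demo() -> str:
    try:
        async with unmask_sketch():
            await body()
    except ValueError as err:
        return str(err)
    return 'still masked'
```

Without the wrapper the caller would only ever see `FakeCancelled` and have to dig through the context chain by hand.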
Tyler Goodlet bfd1864180 Add a "raise-from-`finally:`" example test
Since i wasted 2 days just to find an example of this inside an `@acm`,
figured I better reproduce for the purposes of maybe implementing
a warning sys (inside our wip proto `open_taskman()`) when a nursery
detects a single `Cancelled` in an eg where the `.__context__` is set to
some non-cancel error (which likely means a cancel-causing source
exception was suppressed by accident).

Left in a buncha commented code using `maybe_open_nursery()` which
i thought might be part of the issue but didn't end up being required;
will likely remove on a follow up refinement.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 3345962253 Yield a boxed-maybe-error from `open_crash_handler()`
Along the lines of something like `pytest.raises()` where the handled
exception can be inspected from the `pdbp` REPL using its `.value` field
B)

This is super handy in particular for understanding
`BaseException[Group]`s without manually adding surrounding handler code
to assign the `except[*] Exception as exc_var:` particularly when trying
to understand multi-cancelled eg trees.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 3c8b1aa888 Add an inter-leaved-task error test
Trying to replicate cases where errors are raised in both `trio` and
`asyncio` tasks independently (at least in `.to_asyncio` API terms) with
a new `test_trio_prestarted_task_bubbles` that generates 3 cases inside
a `@acm` calls stack composing a `trio.Nursery` with
a `to_asyncio.open_channel_from()` call where a set of `trio` tasks are
started in a loop using `.start()` with various exc raising sequences,
- the aio task raising *before* the last `trio` task spawns.
- the aio task raising just after the last trio task spawns, but before
  it starts.
- after the last trio task `.start()` call returns control to the
  parent - but (for now) did not error.

TODO, still more cases to discover as i'm still fighting a `modden` bug
of this sort atm..

Other,
- tweak some other tests to have timeouts since some recent hangs were
  found..
- started mucking with py3.13 and thus adjustments for strict egs in
  some tests; full patchset to test suite likely coming soon!
2025-03-27 13:24:25 -04:00
Tyler Goodlet d4f1a02f43 Hm, `asyncio.Task._fut_waiter.set_exception()`?
Since we can't use it to `Task.set_exception()` (since that task method never
seems to work.. XD) and setting the private/internal always seems to do
the desired raising in the task? I realize it's an internal `asyncio`
runtime field but i'd rather take the risk of it breaking than having to
rely on our own equivalent hack..

Also, it seems like the case where the task's associated (and internal)
future-waiter field is null, we won't run into the (same?) prior hanging
issues (maybe since there's nothing for `asyncio` internals to use to
wait XD ??) when `Task.cancel()` is used..??

Main deats,
- add and `Future.set_exception()` a new signal-exception
  `class TrioTaskExited(AsyncioCancelled):` whenever the trio-task exits
  gracefully and the asyncio-side task is still doing blocking work (of
  some sort) which *seems to* be predicated by a check that
  `._fut_waiter is not None`.
- always call `asyncio.Queue.shutdown()` for the same^ as well as
  whenever we decide to call `Task.cancel()`; in that case the shutdown
  relays correctly?

Some further refinements,
- only warn about `Task.cancel()` usage when actually used ;)
- more local scope vars setting in the exit phase of
  `translate_aio_errors()`.
- also in ^ use explicit caught-exc var names for each error-type.
2025-03-27 13:24:25 -04:00
Tyler Goodlet c5291b7f33 Much more limited `asyncio.Task.cancel()` use
Since it can not only cause the guest-mode run to abandon but also in
some edge cases prevent `trio`-errors from propagating (at least on
py3.12-13?) as discovered as part of supporting this mode officially
in the *root actor*.

As such try to avoid that method as much as possible instead opting to
pass the `trio`-side error via the iter-task channel ref.

Deats,
- add a `LinkedTaskChannel._trio_err: BaseException|None` which gets set
  whenever the `trio.Task` error is caught; ONLY set `AsyncioCancelled`
  when the `trio` task was for sure the cause, whether itself cancelled
  or errored.
- always check for this error when exiting the `asyncio` side (even when
  terminated via a call to `asyncio.Task.cancel()` or during any other
  `CancelledError` handling) such that the `asyncio`-task can expect to
  handle `AsyncioCancelled` due to the above^^ cases.
- never `cs.cancel()` the `trio` side unless that cancel scope has not
  yet been `.cancel_called` whatsoever; it's a noop anyway.
- only raise any exc from `asyncio.Task.result()` when `chan._aio_err`
  does not already match it since the existence of the pre-existing
  `task_err` means `asyncio` prolly intends (or has already) raised and
  interrupted the task elsewhere.

Various supporting tweaks,
- don't bother maybe-init-ing `greenback` from the actor entrypoint
  since we already need to (and do) bestow the portals to each `asyncio`
  task spawned using the `run_task()`/`open_channel_from()` API; further
  the init-ing should be done already by client code that enables
  infected mode (even in the root actor).
 |_we should prolly also codify it from any
   `run_daemon(infected_aio=True, debug_mode=True)` usage we offer.
- pass all the `_<field>`s to `LinkedTaskChannel` explicitly in named
  kwarg style.
- better sclang-style log reports throughout, particularly on teardowns.
- generally more/better comments and docs around (not well understood)
  edge cases.
- prep to just inline `maybe_raise_aio_side_err()` closure..
2025-03-27 13:24:25 -04:00
Tyler Goodlet 8f0ca44b79 Expose `debug_filter` from `open_root_actor()` also
Such that actor-runtime graceful cancel handling can be used throughout
any process tree.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 2fd9c0044b Drop extra nl from boxed error fmt 2025-03-27 13:24:25 -04:00
Tyler Goodlet 79f4197d26 Raise explicitly on missing `greenback` portal
When `.pause_from_sync()` is called from an `asyncio.Task` which was
never bestowed a portal we want to be mega pedantic about it; indicate
that the task was NOT spawned from our `.to_asyncio` API and likely by
some out-of-our-control code (normally using
`asyncio.ensure_future()/.create_task()`). Though `greenback` already
errors on such usage, it's not always clear why no portal exists;
explaining the situation of a 3rd-party-bg-spawned-task should avoid
dev confusion for most cases.

Impl deats,
- distinguish between an actor in infected mode versus the actual caller
  of `.pause_from_sync()` being an `asyncio.Task` with more explicit
  `asyncio_task` and `is_infected_aio` vars.
- ONLY in the case of being both an infected-mode-actor AND detecting
  that the caller is an `asyncio.Task`, check `greenback.has_portal()`
  such that when not bestowed we presume the aforementioned
  3rd-party-bg-task case above and raise a new explicit RTE with
  a detailed explanatory message.
- add some masked draft code for handling the special case of a root
  actor `asyncio.Task` caller which could (in theory) not actually
  require gb portal use since the `Lock` can be acquired directly
  without IPC.
 |_this will likely require factoring of various pause machinery funcs
   into a `_pause_from_root_task()` to mk the impl sane XD

Other,
- expose a new `debug_filter: Callable` which can be provided by the
  caller of `_maybe_enter_pm()` to predicate whether to enter the
  debugger REPL based on the caught `BaseException|BaseExceptionGroup`;
  this is handy for customizing the meaning of "graceful cancellations"
  so as to avoid crash handling on expected egs of more than
  `trio.Cancelled`.
|_ make the default as it was implemented: `not is_multi_cancelled(err)`
- pass-through a new `ignore: set[BaseException]` as
  `open_crash_handler(ignore_nested=ignore)` to allow for the same
  silent-cancellation-egs-swallowing as desired from outside the actor
  runtime.
2025-03-27 13:24:25 -04:00
Tyler Goodlet b71d96fdee Accept err-type override in `is_multi_cancelled()`
Such that equivalents of `trio.Cancelled` from other runtimes such as
`asyncio.CancelledError` and `subprocess.CalledProcessError` (with
a `.returncode == -2`) can be gracefully ignored as needed by the
caller.

For example this is handy if you want to avoid debug-mode REPL entry on
an exception-group full of only some subset of exception types since you
expect certain tasks to raise such errors after having been cancelled by
a request from some parent supervision sys (some "higher up"
`trio.CancelScope`, a remote triggered `ContextCancelled` or just from
an OS SIGINT).

Impl deats,
- offer a new `ignore_nested: set[BaseException]` param which by
  default we add `trio.Cancelled` to when no other types are provided.
- use `ExceptionGroup.subgroup(tuple(ignore_nested))` to filter to egs of
  the "ignored sub-errors set" and return any such match (instead of
  `True`).
- detail a comment on exclusion case.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 4a8e1f56ae Support passing pre-conf-ed `Logger`
Such that we can hook into 3rd-party-libs more easily to monkey them and
use our (prettier/hipper) console logging with something like (an
example from the client project `modden`),

```python
    connection_mod = i3ipc.connection
    tractor_style_i3ipc_logger: logging.LoggerAdapter = tractor.log.get_console_log(
        _root_name=connection_mod.__name__,
        logger=connection_mod.logger,
        level='info',
    )
    # monkey the instance-ref in 3rd-party module
    connection_mod.logger = tractor_style_i3ipc_logger
```

Impl deats,
- expose as `get_console_log(logger: logging.Logger)` and add default
  failover logic.
- toss in more typing, also for mod-global instance.
2025-03-27 13:24:25 -04:00
Tyler Goodlet a283d8c05a Support and test infected-`asyncio`-mode for root
Such that you can use,

```python

    tractor.to_asyncio.run_as_asyncio_guest(
        trio_main=_trio_main,
    )
```

to bootstrap the root actor (and thus main parent process) to embed
the actor-runtime into an `asyncio` loop. Prove it all works with a
subactor-free version of the aio echo-server test suite B)
2025-03-27 13:24:25 -04:00
Tyler Goodlet c2bbb7e259 TOSQUASH: 9002f60 howtorelease.md file 2025-03-27 13:24:25 -04:00
Tyler Goodlet 2764d82c1a Draft a (pretty)`Struct.fields_diff()`
For comparing a `msgspec.Struct` against an input `dict` presumably to
be used as input for struct instantiation. The main diff with
`.__sub__()` is that non-existing fields on either are reported
(loudly).
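
A tiny sketch of the comparison being described, assuming a stdlib
`dataclass` as a stand-in for `msgspec.Struct`; `fields_diff_sketch()`
and the result keys are hypothetical names, not the drafted API.

```python
from dataclasses import dataclass, fields


@dataclass
class Point:
    x: int
    y: int = 0


def fields_diff_sketch(
    struct_type: type,
    other: dict,
) -> dict[str, set[str]]:
    '''
    Report field names which exist on only one side.

    '''
    struct_names = {f.name for f in fields(struct_type)}
    other_names = set(other)
    return {
        # declared on the struct but absent from the input
        'missing_from_input': struct_names - other_names,
        # provided in the input but unknown to the struct
        'unknown_to_struct': other_names - struct_names,
    }


diff = fields_diff_sketch(Point, {'x': 1, 'z': 3})
print(diff)
```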
2025-03-27 13:24:25 -04:00
Tyler Goodlet 824801d2ba Spitballing how to expose custom `msgspec` type hooks
Such that maybe we can eventually offer a nicer higher-level API which
implements much of the boilerplate required by `msgspec` (like
type-matched branching to serialization logic) via a type-table
interface or something?

Not sure if the idea is that useful so leaving it all as TODOs for now
obviously.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 0fe6f63012 Add `notes_to_self/howtorelease.md` reminder doc 2025-03-27 13:24:25 -04:00
Tyler Goodlet 8d190bb505 Add TODO for a runtime-vars passing mechanism 2025-03-27 13:24:25 -04:00
Tyler Goodlet 514fb1a4ac Change masked `.pause()` line 2025-03-27 13:24:25 -04:00
Tyler Goodlet 684253ab11 Type the inter-loop chans 2025-03-27 13:24:25 -04:00
Tyler Goodlet 9af2a4e739 Add TODO for a tb frame "filterer" sys.. 2025-03-27 13:24:25 -04:00
Tyler Goodlet 141a842d3d Set `RemoteActorError.pformat(boxer_header=self.relay_uid)` by def 2025-03-27 13:24:25 -04:00
Tyler Goodlet 61c5613943 Support custom `boxer_header: str` provided by `pformat_boxed_tb()` caller 2025-03-27 13:24:25 -04:00
Tyler Goodlet 5b29dd5d2b Expose a `_ctlc_ignore_header: str` for use in `sigint_shield()` 2025-03-27 13:24:25 -04:00
Tyler Goodlet a58c1cad91 Change `tractor.breakpoint()` to new `.pause()` in test suite 2025-03-27 13:24:25 -04:00
Tyler Goodlet e1d96099fc Wrap `asyncio_bp.py` ex into test suite
Ensuring we can at least use `breakpoint()` from an infected actor's
`asyncio.Task` spawned via a `.to_asyncio` API.

Also includes a little `tests/devx/` reorging,
- start splitting out non-`tractor.pause()` tests into a new
  `test_pause_from_non_trio.py` for all the `.pause_from_sync()`
  use in bg-threaded or `asyncio` applications.
- factor harness commonalities to the `devx/conftest` (namely
  the `do_ctlc()` masher).
- mv `test_pause_from_sync` to the new non`-trio` mod.

NOTE, the `ctlc=True` is still failing for
`test_pause_from_asyncio_task` which is a user-happiness bug but not
anything fundamentally broken - just need to handle the `asyncio` case
in `.devx._debug.sigint_shield()`!
2025-03-27 13:24:25 -04:00
Tyler Goodlet ccd60b0c6e Add `breakpoint()` hook restoration example + test 2025-03-27 13:24:25 -04:00
Tyler Goodlet c1c93e08a2 Rename `n: trio.Nursery` -> `tn` (task nursery) 2025-03-27 13:24:25 -04:00
Tyler Goodlet bb60a6d623 Messy-teardown `DebugStatus` related fixes
Mostly fixing edge cases with `asyncio` and/or bg threads where the
`.repl_release: trio.Event` needs to be used from the main `trio`
thread OW confusing-but-valid teardown tracebacks can show under various
races.

Also improve,
- log reporting for such internal bugs to make them more obvious on
  console via `log.exception()`.
- only restore the SIGINT handler when runtime is (still) active.
- reporting when `tractor.pause(shield=True)` should be used and
  unhiding the internal frames from the tb in that case.
- for `pause_from_sync()` some deep fixes..
 |_add an `allow_no_runtime: bool = False` flag to allow
   **not** requiring the actor runtime to be active.
 |_fix the `greenback` case-branch to only trigger on `not
   is_trio_thread`.
 |_add a scope-global `repl_owner: Task|Thread|None = None` to
   avoid ref errors..
2025-03-27 13:24:25 -04:00
Tyler Goodlet 6ef06be6d0 More `.pause_from_sync()` in bg-threads "polish"
Various `try`/`except` blocks around external APIs that raise when not
running inside a `tractor` runtime and/or some async framework (mostly to avoid
too-late/benign error tbs on certain classes of actor tree teardown):
- for the `log.pdb()` prompts emitted before REPL console entry.
- inside `DebugStatus.is_main_trio_thread()`'s call to `sniffio`.
- in `_post_mortem()` by catching `NoRuntime` when called from a thread
  still active after the `.open_root_actor()` has already exited.

Also,
- create a dedicated `DebugStateError` for raising instead of `assert`s
  when we have actual debug-request inconsistencies (as seem to be most
  likely with bg thread usage of `breakpoint()`).
- show the `open_crash_handler()` frame on `bdb.BdbQuit` (for now?)
2025-03-27 13:24:25 -04:00
Tyler Goodlet f8222356ce Hide `[maybe]_open_crash_handler()` frame by default 2025-03-27 13:24:25 -04:00
Tyler Goodlet 4b9d638be9 Use our `._post_mortem` from `open_crash_handler()`
Since it seems that `pdbp.xpm()` can sometimes lose the up-stack
traceback info/frames? Not sure why but ours seems to work just fine
from an `asyncio`-handler in `modden`'s use of `i3ipc` B)

Also call `DebugStatus.shield_sigint()` from `pause_from_sync()` in the
infected-`asyncio` case to get the same shielding behaviour as in all
other usage!
2025-03-27 13:24:25 -04:00
Tyler Goodlet 35ebc087dd Drop `asyncio_bp` loglevel setting by default 2025-03-27 13:24:25 -04:00
Tyler Goodlet 6b18fcd437 First draft, `asyncio`-task, sync-pausing Bo
Mostly due to magic from @oremanj where we slap in a little bit of
`.from_asyncio`-type stuff to run a `trio`-task from `asyncio.Task`
code!

I'm not gonna go into tooo too much detail but basically the primary
thing needed was a way to (blocking-ly) invoke a `trio.lowlevel.Task`
from an `asyncio` one (which we now have with a new
`run_trio_task_in_future()` thanks to draft code from the aforementioned
jefe) which we now invoke from a dedicated aio case-branch inside
`.devx._debug.pause_from_sync()`. Further include a case inside
`DebugStatus.release()` to handle using the same func to set the
`repl_release: trio.Event` from the aio side when releasing the REPL on
exit cmds.

Prolly more refinements to come ;{o
2025-03-27 13:24:25 -04:00
Tyler Goodlet 00d1c8ea29 Fix multi-daemon debug test `break` signal..
It was expecting `AssertionError` as a proceed-in-test signal (by
breaking from a continue loop), but `in_prompt_msg(raise_on_err=True)`
was changed to raise `ValueError`; so instead just use as a predicate
for the `break`.

Also rework `in_prompt_msg()` to accept the `child: BaseSpawn` as input
instead of `before: str`, remove the casting boilerplate and adjust all
usage to match.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 8da7a1ca36 Use "sclang"-style syntax in `to_asyncio` task logging
Just like we've started doing throughout the rest of the actor runtime
for reporting (and where "sclang" = "structured conc (s)lang", our
little supervision-focused operations syntax i've been playing with in
log msg content).

Further tweaks:
- report the `trio_done_fute` alongside the `main_outcome` value.
- add a todo list for supporting `greenback` for pause points.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 5cdfee3bcf Pass `infect_asyncio` setting via runtime-vars
The reason for this "duplication" with the `--asyncio` CLI flag (passed
to the child during spawn) is 2-fold:
- allows verifying inside `Actor._from_parent()` that the `trio` runtime was
  started via `.start_guest_run()` as well as if the
  `Actor._infected_aio` spawn-entrypoint value has been set (by the
  `._entry.<spawn-backend>_main()` whenever `--asyncio` is passed)
  such that any mismatch can be signaled via an `InternalError`.
- enables checking the `._state._runtime_vars['_is_infected_aio']` value
  directly (say from a non-actor/`trio`-thread) instead of calling
  `._state.current_actor(err_on_no_runtime=False)` in certain edge
  cases.

Impl/testing deats:
- add `._state._runtime_vars['_is_infected_aio'] = False` default.
- raise `InternalError` on any `--asyncio`-flag-passed vs.
  `_runtime_vars`-value-relayed-from-parent inside
  `Actor._from_parent()` and include a `Runner.is_guest` assert for good
  measure B)
- set and relay `infect_asyncio: bool` via runtime-vars to child in
  `ActorNursery.start_actor()`.
- verify `actor.is_infected_aio()`, `actor._infected_aio` and
  `_state._runtime_vars['_is_infected_aio']` are all set in test suite's
  `asyncio_actor()` endpoint.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 64d506970a Officially test proto-ed `stackscope` integration
By re-purposing our `pexpect`-based console matching with a new
`debugging/shield_hang_in_sub.py` example, this tests a few "hanging
actor" conditions more formally:

- that despite a hanging actor's task we can dump
  a `stackscope.extract()` tree on relay of `SIGUSR1`.
- the actor tree will terminate despite a shielded forever-sleep by our
  "T-800" zombie reaper machinery activating and hard killing the
  underlying subprocess.

Some test deats:
- simulates the expected actions of a real user by manually using
  `os.kill()` to send both signals to the actor-tree program.
- `pexpect`-matches against `log.devx()` emissions under normal
  `debug_mode == True` usage.
- ensure we get the actual "T-800 deployed" `log.error()` msg and
  that the actor tree eventually terminates!

Surrounding (re-org/impl/test-suite) changes:
- allow disabling usage via a `maybe_enable_greenback: bool` to
  `open_root_actor()` but enable by def.
- pretty up the actual `.devx()` content from `.devx._stackscope`
  including being extra pedantic about the conc-primitives for each signal
  event.
- try to avoid double handles of `SIGUSR1` even though it seems the
  original (what i thought was a) problem was actually just double
  logging in the handler..
  |_ avoid double applying the handler func via `signal.signal()`,
  |_ use a global to avoid double handle func calls and,
  |_ a `threading.RLock` around handling.
- move common fixtures and helper routines from `test_debugger` to
  `tests/devx/conftest.py` and import them for use in both test mods.
2025-03-27 13:24:25 -04:00
Tyler Goodlet de7b114303 Start a new `tests/devx/` tooling-subsuite-pkg 2025-03-27 13:24:25 -04:00
Tyler Goodlet f195c5ec47 Move `mk_cmd()` to `._testing`
Since we're going to need it more generally for `.devx` sub-sys tooling
tests.

Also, up the sync-pause ctl-c delay another 10ms..
2025-03-27 13:24:25 -04:00
Tyler Goodlet 92713af63e Get multi-threaded sync-pausing fully workin!
The final issue was making sure we do the same thing on ctl-c/SIGINT
from the user. That is, if there's already a bg-thread in REPL, we
`log.pdb()` about SIGINT shielding and re-draw the prompt; the same UX
as normal actor-runtime-task behaviour.

Reasons this wasn't workin.. and the fix:
- `.pause_from_sync()` was overriding the local `repl` var with `None`
  delivered by (transitive) calls to `_pause(debug_func=None)`.. so
  remove all that and only assign it OAOO prior to thread-type case
  branching.
- always call `DebugStatus.shield_sigint()` as needed from all requesting
  threads/tasks:
  - in `_pause_from_bg_root_thread()` BEFORE calling `._pause()` AND BEFORE
    yielding back to the bg-thread via `.started(out)` to ensure we're
    definitely overriding the handler in the `trio`-main-thread task
    before unblocking the requesting bg-thread.
  - from any requesting bg-thread in the root actor such that both its
    main-`trio`-thread scheduled task (as per above bullet) AND it are
    SIGINT shielded.
  - always call `.shield_sigint()` BEFORE any `greenback._await()` case
    (don't entirely grok why yet, but it works)?
  - for `greenback._await()` case always set `bg_task` to the current one..
- tweaks to the `SIGINT` handler, now renamed `sigint_shield()` so as
  not to name-collide with the methods when editor-searching:
  - always try to `repr()` the REPL thread/task "owner" as well as the
    active `PdbREPL` instance.
  - add `.devx()` notes around the prompt flushing deats and comments
    for any root-actor-bg-thread edge cases.

Related/supporting refinements:
- add `get_lock()`/`get_debug_req()` factory funcs since the plan is to
  eventually implement both as `@singleton` instances per actor.
- fix `acquire_debug_lock()`'s call-sig-bug for scheduling
  `request_root_stdio_lock()`..
- in `._pause()` only call `mk_pdb()` when `debug_func != None`.
- add some todo/warning notes around the `cls.repl = None` in
  `DebugStatus.release()`

`test_pause_from_sync()` tweaks:
- don't use a `attach_patts.copy()`, since we always `break` on match.
- do `pytest.fail()` on that ^ loop's fallthrough..
- pass `do_ctlc(child, patt=attach_key)` such that we always match the
  current thread's name with the ctl-c triggered `.pdb()` emission.
- oh yeah, return the last `before: str` from `do_ctlc()`.
- in the script, flip `abandon_on_cancel=True` since when `False` it
  seems to cause `trio.run()` to hang on exit from the last bg-thread
  case?!?
2025-03-27 13:24:25 -04:00
Tyler Goodlet 4a08d586cd Another tweak to REPL entry `.pdb()` headers 2025-03-27 13:24:25 -04:00
Tyler Goodlet 607e1dcf45 More failed REPL-lock-request refinements
In `lock_stdio_for_peer()` better internal-error handling/reporting:
- only `Lock._blocked.remove(ctx.cid)` if that same cid was added on
  entry to avoid needless key-errors.
- drop all `Lock.release(force: bool)` usage remnants.
- if `req_ctx.cancel()` fails mention it with `ctx_err.add_note()`.
- add more explicit internal-failed-request log messaging via a new
  `fail_reason: str`.
- use new `x)<=\n|_` annots in any failure logging.

Other cleanups/niceties:
- drop `force: bool` flag entirely from the `Lock.release()`.
- use more supervisor-op-annots in `.pdb()` logging
  with both `_pause/crash_msg: str` instead of double '|' lines when
  `.pdb()`-reported from `._set_trace()`/`._post_mortem()`.
2025-03-27 13:24:25 -04:00
Tyler Goodlet b057a1681c Todo a test for sync-pausing from non-main-root-tasks 2025-03-27 13:24:25 -04:00
Tyler Goodlet 82bee3c55b Use `delay=0` in pump loop..
Turns out it does work XD

Prior presumption was from before I had the fute poll-loop so makes
sense we needed more than one sched-tick's worth of context switch vs.
now we can just keep looping-n-pumping as fast as possible until the
guest-run's main task completes.

Also,
- minimize the preface commentary (as per todo) now that we have tests
  codifying all the edge cases :finger_crossed:
- parameter-ize the pump-loop-cycle delay and default it to 0.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 4afab9ca47 Solve our abandonment issues..
To make the recent set of tests pass this (hopefully) finally solves all
`asyncio` embedded `trio` guest-run abandonment by ensuring we "pump the
event loop" until the guest-run future is fully complete.

Accomplished via simple poll loop of the form `while not
trio_done_fut.done(): await asyncio.sleep(.1)` in the `aio_main()`
task's exception teardown sequence. The loop does a naive 100ms
"pump-via-sleep & poll" for the `trio` side to complete before finally
exiting (and presumably raising) from the SIGINT cancellation.

Other related cleanups and refinements:
- use `asyncio.Task.result()` inside `cancel_trio()` since it also
  inline-raises any exception outcome and we can also log-report the
  result in non-error cases.
- comment out buncha not-sure-we-need-it stuff in `cancel_trio()`.
- remove the botched `AsyncioCancelled(CancelledError):` idea obvi XD
- comment `greenback` init for now in `aio_main()` since (pretty sure)
  we don't ever want to actually REPL in that specific func-as-task?
- always capture any `fute_err: BaseException` from the `main_outcome:
  Outcome` delivered by the `trio` side guest-run task.
- add and raise a new super noisy `AsyncioRuntimeTranslationError`
  whenever we detect that the guest-run `trio_done_fut` has not
  completed before task exit; should avoid abandonment issues ever
  happening again without knowing!
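
The poll loop quoted above can be tried in isolation with plain
`asyncio`. In this sketch a bare `asyncio.Future` stands in for the
guest-run's `trio_done_fut`, and `pump_until_done()` plus the cycle
counter are hypothetical additions just for demonstration.

```python
import asyncio


async def pump_until_done(
    done_fut: asyncio.Future,
    delay: float = 0.1,
) -> int:
    '''
    Keep yielding to the event loop until `done_fut` completes.

    Returns the number of poll cycles, purely for demonstration.

    '''
    cycles: int = 0
    while not done_fut.done():
        await asyncio.sleep(delay)
        cycles += 1
    return cycles


async def main() -> tuple[int, str]:
    loop = asyncio.get_running_loop()
    # stand-in for the guest-run's `trio_done_fut`
    done_fut: asyncio.Future = loop.create_future()
    # complete it "from elsewhere" a bit later, as the `trio` side
    # would once its (possibly shielded) teardown finishes.
    loop.call_later(0.05, done_fut.set_result, 'trio-side-done')
    cycles = await pump_until_done(done_fut, delay=0.01)
    return cycles, done_fut.result()


cycles, result = asyncio.run(main())
print(cycles, result)
```

Each `asyncio.sleep()` hands control back to the loop, which is what
gives the other side a chance to run and complete the future.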
2025-03-27 13:24:25 -04:00
Tyler Goodlet 53409f2942 Demo-abandonment on shielded `trio`-side work
Finally this reproduces the issue as it (originally?) exhibited inside
`piker` where the `Actor.lifetime_stack` wasn't closed in cases where
during `infected_aio`-actor cancellation/shutdown `trio` side tasks
which are doing shielded (teardown) work are NOT being watched/waited on
from the `aio_main()` task-closure inside `run_as_asyncio_guest()`!

This is then the root cause of the guest-run being abandoned since if
our `aio_main()` task-closure doesn't know it should allow the run to
finish, it's going to call `loop.close()` eventually resulting in the
`GeneratorExit` thrown into `trio._core._run.unrolled_run()`..

So, this extends the `test_sigint_closes_lifetime_stack()` suite to
include cases for such shielded `trio`-task ops:
- add a new `trio_side_is_shielded: bool` which will toggle whether to
  add a shielded 0.5s `trio.sleep()` loop to `manage_file()` which
  should outlive the `asyncio` event-loop shutdown sequence and result
  in an abandoned guest-run and thus a leaked file.
- parametrize the existing suite with this case resulting in a total 16
  test set B)

This patch demonstrates the problem with our `aio_main()` task-closure
impl via the now 4 failing tests, a fix is coming in a follow up commit!
2025-03-27 13:24:25 -04:00
Tyler Goodlet 7f00921be1 Lel, revert `AsyncioCancelled` inherit, module..
Turns out it somehow breaks our `to_asyncio` error relay since obvi
`asyncio`'s runtime seems to specially handle it (prolly via
`isinstance()` ?) and it caused our
`test_aio_cancelled_from_aio_causes_trio_cancelled()` to hang..
Further, obvi `unpack_error()` won't be able to find the type def if not
kept inside `._exceptions`..

So given all that, revert the change/move as well as:
- tweak the aio-from-aio cancel test to timeout.
- do `trio.sleep()` conc with any bg aio task by moving out nursery
  block.
- add a `send_sigint_to: str` parameter to
  `test_sigint_closes_lifetime_stack()` such that we test the SIGINT
  being relayed to just the parent or the child.
2025-03-27 13:24:25 -04:00
Tyler Goodlet a9b3336318 Hack `asyncio` to not abandon a guest-mode run?
Took me a while to figure out what the heck was going on but, turns out
`asyncio` changed their SIGINT handling in 3.11 as per:

https://docs.python.org/3/library/asyncio-runner.html#handling-keyboard-interruption

I'm not entirely sure if it's the 3.11 changes or possibly wtv further
updates were made in 3.12  but more or less due to the way
our current main task was written the `trio` guest-run was getting
abandoned on SIGINTs sent from the OS to the infected child proc..

Note that much of the bug and soln cases are laid out in very detailed
comment-notes both in the new test and `run_as_asyncio_guest()`, right
above the final "fix" lines.

Add new `test_infected_aio.test_sigint_closes_lifetime_stack()` test suite
which reliably triggers all abandonment issues with multiple cases
of different parent behaviour post-sending-SIGINT-to-child:
 1. briefly sleep then raise a KBI in the parent which was originally
    demonstrating the file leak not being cleaned up by `Actor.lifetime_stack.close()`
    and simulates a ctl-c from the console (relayed in tandem by
    the OS to the parent and child processes).
 2. do `Context.wait_for_result()` on the child context which would
    hang and timeout since the actor runtime would never complete and
    thus never relay a `ContextCancelled`.
 3. both with and without running an `asyncio` task in the `manage_file`
    child actor; originally it seemed that with an aio task scheduled in
    the child actor the guest-run abandonment always was the "loud" case
    where there seemed to be some actor teardown but with tbs from
    python failing to gracefully exit the `trio` runtime..

The (seemingly working) "fix" required 2 lines of code to be run inside
an `asyncio.CancelledError` handler around the call to `await trio_done_fut`:
- `Actor.cancel_soon()` which schedules the actor runtime to cancel on
  the next `trio` runner cycle and results in a "self cancellation" of
  the actor.
- "pumping the `asyncio` event loop" with a non-0 `.sleep(0.1)` XD
 |_ seems that a "shielded" pump with some actual `delay: float >= 0`
   did the trick to get `asyncio` to allow the `trio` runner/loop to
   fully complete its guest-run without abandonment.

Other supporting changes:
- move `._exceptions.AsyncioCancelled`, our renamed
  `asyncio.CancelledError` error-sub-type-wrapper, to `.to_asyncio` and make
  it derive from `CancelledError` so as to be sure when raised by our
  `asyncio` x-> `trio` exception relay machinery that `asyncio` is
  getting the specific type it expects during cancellation.
- do "summary status" style logging in `run_as_asyncio_guest()` wherein
  we compile the eventual `startup_msg: str` emitted just before waiting
  on the `trio_done_fut`.
- shield-wait with `out: Outcome = await asyncio.shield(trio_done_fut)`
  even though it seems to do nothing in the SIGINT handling case..(I
  presume it might help avoid abandonment in an `asyncio.Task.cancel()`
  case maybe?)
2025-03-27 13:24:25 -04:00
goodboy 978691c668 Merge pull request 'Rework low-level-runtime to enforce a `msgspec`-defined, SC-supervision-protocol for IPC `Context`s' () from runtime_to_msgspec into main
Reviewed-kinda-on: 
2025-03-27 02:14:16 +00:00
Tyler Goodlet 4b92e14c92 Denoise duplicate chan logging for now 2025-03-24 14:04:52 -04:00
Tyler Goodlet dbff7e6cd0 Report any external-rent-task-canceller during msg-drain
As in whenever `Context.cancel()` is not (runtime internally) called
(i.e. `._cancel_called` is not set), we can attempt to detect the parent
`trio` nursery/cancel-scope that is the source. Emit the report with
a `.cancel()` level and attempt to repr in "sclang" form as well as
unhide the stack frame for debug/traceback-in.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 125876185d Add `indent: str` support to `Context.pformat()` using `textwrap` 2025-03-24 14:04:52 -04:00
Tyler Goodlet 5ea324da5e Add `tb_hide: bool` ctl flag to `_open_and_supervise_one_cancels_all_nursery()` 2025-03-24 14:04:52 -04:00
Tyler Goodlet d1b4d4be52 Adjusts advanced fault tests to match new `TransportClosed` semantics 2025-03-24 14:04:52 -04:00
Tyler Goodlet 32f7742e53 Finally implement peer-lookup optimization..
There's been a todo for soo long for this XD

Since all `Actor`'s store a set of `._peers` we can try a lookup on that
table as a shortcut before pinging the registry Bo

Impl deats:
- add a new `._discovery.get_peer_by_name()` routine which attempts the
  `._peers` lookup by combining a copy of that `dict` + an entry added
  for `Actor._parent_chan` (since all subs have a parent and often the
  desired contact is just that connection).
- change `.find_actor()` (for the `only_first == True` case),
  `.query_actor()` and `.wait_for_actor()` to call the new helper and
  deliver appropriate outputs if possible.

Other,
- deprecate `get_arbiter()` def and all usage in tests and examples.
- drop lingering use of `arbiter_sockaddr` arg to various routines.
- tweak the `Actor` doc str as well as some code fmting and a tweak to
  the `._stream_handler()`'s initial `con_status: str` logging value
  since the way it was could never be reached.. oh and `.warning()` on
  any new connections which already have a `_pre_chan: Channel` entry in
  `._peers` so we can start minimizing IPC duplications.
2025-03-24 14:04:52 -04:00
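A minimal sketch of the peer-lookup shortcut described in the commit above, assuming simplified stand-ins: the `Channel` and `Actor` dataclasses, their field names and the helper's signature are hypothetical and are not the runtime's real types. It only illustrates scanning a copy of the peer table (plus the parent channel) before falling back to a registry query.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Channel:
    # hypothetical stand-in for an IPC channel to a peer actor
    uid: tuple[str, str]  # (actor name, unique id)


@dataclass
class Actor:
    # hypothetical stand-in; maps peer uid -> open channels
    peers: dict[tuple[str, str], list[Channel]] = field(default_factory=dict)
    parent_chan: Channel | None = None


def get_peer_by_name(actor: Actor, name: str) -> list[Channel] | None:
    # work on a copy so the actor's live table is never mutated
    to_scan = dict(actor.peers)
    if actor.parent_chan is not None:
        # the parent is often the desired contact, so include it
        to_scan.setdefault(actor.parent_chan.uid, [actor.parent_chan])
    for (peer_name, _), chans in to_scan.items():
        if peer_name == name and chans:
            return chans
    return None  # no local hit: the caller falls back to a registry query
```

Returning `None` on a miss keeps the registry round-trip as the slow path, so callers such as a `find_actor()`-style routine only pay for it when no local connection exists.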
Tyler Goodlet 46066c02e4 More-n-more scops annots in logging 2025-03-24 14:04:52 -04:00
Tyler Goodlet bac84a5e23 Quieter `Stop` handling on ctx result capture
In the `drain_to_final_msg()` impl, since a stream terminating
gracefully requires this msg, there's really no reason to `log.cancel()`
about it; go `.runtime()` level instead since we're trying to de-noise
under "normal operation".

Also,
- passthrough `hide_tb` to taskc-handler's `ctx.maybe_raise()` call.
- raise `MessagingError` for the `MsgType` unmatched `case _:`.
- detail the doc string motivation a little more.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 950a2ec30f Use `._entry` proto-ed "lifetime ops" in logging
As per a WIP scribbled out TODO in `._entry.nest_from_op()`, change
a bunch of "supervisor/lifetime mgmt ops" related log messages to
contain some supervisor-annotation "headers" in an effort to give
a terser "visual indication" of how some execution/scope/storage
primitive entity (like an actor/task/ctx/connection) is being operated
on (like, opening/started/closed/cancelled/erroring) from a "supervisor
action" POV.

Also tweak a bunch more emissions to lower levels to reduce noise around
normal inter-actor operations like process and IPC ctx supervision.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 50e02295a9 Reraise RAEs in `MsgStream.receive()`; truncate tbs
To avoid showing lowlevel details of exception handling around the
underlying call to `return await self._ctx._pld_rx.recv_pld(ipc=self)`,
any time a `RemoteActorError` is unpacked (and raised locally) we re-raise
it directly from the captured `src_err` so as to present to
the user/app caller-code an exception raised directly from the `.receive()`
frame. This simplifies traceback call-stacks for any `log.exception()`
or `pdb`-REPL output filtering out the lower `PldRx` frames by default.
2025-03-24 14:04:52 -04:00
Tyler Goodlet cb998a2b2f Add `Portal.chan` property, to wrap `._chan` attr 2025-03-24 14:04:52 -04:00
Tyler Goodlet 71e8d466ae More formal `TransportClosed` reporting/raising
Since it was all ad-hoc defined inside
`._ipc.MsgpackTCPStream._iter_pkts()` more or less, this starts
formalizing a way for particular transport backends to indicate whether
a disconnect condition should be re-raised in the RPC msg loop and if
not what log level to report it at (if any).

Based on our lone transport currently we try to suppress any logging
noise from ephemeral connections expected during normal actor
interaction and discovery subsys ops:
- any short lived discovery related TCP connects are only logged as
  `.transport()` level.
- both `.error()` and raise on any underlying `trio.ClosedResourceError`
  cause since that normally means some task touched transport layer
  internals that it shouldn't have.
- do a `.warning()` on anything else unexpected.

Impl deats:
- extend the `._exceptions.TransportClosed` to accept an input log
  level, raise-on-report toggle and custom reporting & raising via a new
  `.report_n_maybe_raise()` method.
- construct the TCs with inputs per case in (the newly named) `._iter_pkts()`.
- call ^ this method from the `TransportClosed` handler block inside the
  RPC msg loop thus delegating reporting levels and/or raising to the
  backend's per-case TC instantiating.

Related `._ipc` changes:
- mask out all the `MsgpackTCPStream._codec` debug helper stuff and drop
  any lingering cruft from the initial proto-ing of msg-codecs.
- rename some attrs/methods:
  |_`MsgpackTCPStream._iter_packets()` -> `._iter_pkts()` and
    `._agen` -> `_aiter_pkts`.
  |_`Channel._aiter_recv()` -> `._aiter_msgs()` and
    `._agen` -> `_aiter_msgs`.
- add `hide_tb: bool` support to `Channel.send()` and only show the
  frame on non-MTEs.
2025-03-24 14:04:52 -04:00
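A rough sketch of the reporting idea in the commit above, under stated assumptions: the constructor parameters (`loglevel`, `raise_on_report`) are hypothetical names, and the stdlib `logging` levels stand in for the project's custom levels such as `.transport()`. The point is only that the transport backend picks the log level and raise behaviour when it builds the exception, and the handler just delegates.

```python
import logging

log = logging.getLogger('transport_sketch')


class TransportClosed(Exception):
    # hypothetical signature; the real class may differ
    def __init__(
        self,
        message: str,
        loglevel: str = 'warning',
        raise_on_report: bool = False,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.loglevel = loglevel
        self.raise_on_report = raise_on_report

    def report_n_maybe_raise(self) -> None:
        # emit at the level chosen by the transport backend
        getattr(log, self.loglevel)(self.message)
        if self.raise_on_report:
            raise self


# the handler block in a msg loop then only needs to delegate:
try:
    raise TransportClosed('peer hung up', loglevel='debug')
except TransportClosed as tc:
    tc.report_n_maybe_raise()  # only logs, since raise_on_report is False
```

Keeping the policy on the exception instance means a new transport backend can choose its own levels without touching the msg loop's handler.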
Tyler Goodlet 6cd19c408e Refine some `.trionics` docs and logging
- allow passing and report the lib name (`trio` or `tractor`) from
  `maybe_open_nursery()`.
- use `.runtime()` level when reporting `_Cache`-hits in
  `maybe_open_context()`.
- tidy up some doc strings.
2025-03-24 14:04:52 -04:00
Tyler Goodlet a796fb7103 Woops, set `.cancel()` level in custom levels table.. 2025-03-24 14:04:52 -04:00
Tyler Goodlet 0332604044 (Re)type annot some tests
- For the (still not finished) `test_caps_based_msging`, switch to
  using the new `PayloadMsg`.
- add `testdir` fixture type.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 90bd757b48 Use `msgspec.Struct.__repr__()` failover impl
In case the struct doesn't import a field type (which will cause the
`.pformat()` to raise) just report the issue and try to fall back to the
original `repr()` version.
2025-03-24 14:04:52 -04:00
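The failover pattern from the commit above can be sketched without `msgspec` at all. This is an illustration only: `PrettyReprMixin` is a hypothetical plain-Python class, and it falls back to `object.__repr__()` where the commit falls back to the struct's original `repr()`.

```python
import logging

log = logging.getLogger('pretty_struct_sketch')


class PrettyReprMixin:
    def pformat(self) -> str:
        # hypothetical pretty-printer; a subclass override may raise
        fields = ', '.join(f'{k}={v!r}' for k, v in vars(self).items())
        return f'{type(self).__name__}(\n  {fields}\n)'

    def __repr__(self) -> str:
        try:
            return self.pformat()
        except Exception:
            # report the issue, then fall back to the plain default repr
            log.exception('pformat() failed, falling back to default repr')
            return object.__repr__(self)
```

Because `repr()` is often called from logging and error paths, a failure inside it tends to mask the original problem; the fallback keeps those paths usable.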
Tyler Goodlet 0263599cef Don't use pretty struct stuff in `._invoke`
It's too fragile to put inside core RPC machinery since
`msgspec.Struct` defs can fail if a field type can't be
looked up at creation time (like can easily happen if you
conditionally import using `if TYPE_CHECKING:`)

Also,
- rename `cs` to `rpc_ctx_cs: CancelScope` since it's literally
  the wrapping RPC `Context._scope`.
- report self cancellation via `explain: str` and add tail case for
  "unknown cause".
- put a ?TODO? around what to do about KBIs if a context is opened
  from an `infected_aio`-actor task.
- similar to our nursery and portal add TODO list for moving all
  `_invoke_non_context()` content out the RPC core and instead implement
  them as `.hilevel` endpoint helpers (maybe as decorators?) which
  underneath define `@context`-funcs.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 96960982ff Update `._entry` actor status log
Log-report the different types of actor exit conditions including cancel
via KBI, error or normal return with varying levels depending on case.

Also, start proto-ing out this weird ascii-syntax idea for describing
conc system states and implement the first bit in a `nest_from_op()`
log-message fmter that joins and indents an obj `repr()` with
a tree-like `'>)\n|_'` header.
2025-03-24 14:04:52 -04:00
Tyler Goodlet c7f153c266 Update `MsgTypeError` content matching to latest 2025-03-24 14:04:52 -04:00
Tyler Goodlet 8ff682440d Further formalize `greenback` integration
Since we more or less require it for `tractor.pause_from_sync()` this
refines enable toggles and their relay down the actor tree as well as
more explicit logging around init and activation.

Tweaks summary:
- `.info()` report the module if discovered during root boot.
- use a `._state._runtime_vars['use_greenback']: bool` activation flag
  inside `Actor._from_parent()` to determine if the sub should try to
  use it and set to `False` if mod-loading fails / not installed.
- expose `maybe_init_greenback()` from `.devx` subpkg.
- comment out RTE in `._pause()` for now since we already have it in
  `.pause_from_sync()`.
- always `.exception()` on `maybe_init_greenback()` import errors to
  clarify the underlying failure deats.
- always explicitly report if `._state._runtime_vars['use_greenback']`
  was NOT set when `.pause_from_sync()` is called.

Other `._runtime.async_main()` adjustments:
- combine the "internal error call ur parents" message and the failed
  registry contact status into one new `err_report: str`.
- drop the final exception handler's call to
  `Actor.lifetime_stack.close()` since we're already doing it in the
  `finally:` block and the earlier call has no currently known benefit.
- only report on the `.lifetime_stack()` callbacks if any are detected
  as registered.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 7db5bbffc5 Always reset `._state._ctxvar_Context` to prior
Not sure how I forgot this but, obviously it's correct context-var
semantics to revert the current IPC `Context` (set in the latest
`.open_context()` block) such that any prior instance is reset..

This ensures the sanity `assert`s pass inside
`.msg._ops.maybe_limit_plds()` and just in general ensures for any task
that the last opened `Context` is the one returned from
`current_ipc_ctx()`.
2025-03-24 14:04:52 -04:00
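The reset semantics described above follow the standard `contextvars` token pattern. A small sketch, assuming a synchronous context manager for brevity (the real `.open_context()` is async) and a hypothetical variable name:

```python
from contextlib import contextmanager
from contextvars import ContextVar

# hypothetical stand-in for the runtime's "current IPC context" variable
_ctxvar_context: ContextVar = ContextVar('ipc_context', default=None)


def current_ipc_ctx():
    return _ctxvar_context.get()


@contextmanager
def open_context(ctx):
    token = _ctxvar_context.set(ctx)
    try:
        yield ctx
    finally:
        # restore whatever was current before this block was entered
        _ctxvar_context.reset(token)
```

With nested blocks, leaving the inner one makes the outer value current again, rather than leaving the inner value in place.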
Tyler Goodlet 59fa9dc452 Prep for legacy RPC API factor-n-remove
This change is adding commentary about the upcoming API removal and
simplification of nursery + portal internals; no actual code changes are
included.

The plan to (re)move the old RPC methods:
- `ActorNursery.run_in_actor()`
- `Portal.run()`
- `Portal.run_from_ns()`

and any related impl internals out of each conc-primitive and instead
into something like a `.hilevel.rpc` set of APIs which then are all
implemented using the newer and more lowlevel `Context`/`MsgStream`
primitives instead Bo

Further,
- formally deprecate the `Portal.result()` meth for
  `.wait_for_result()`.
- only `log.info()` about runtime shutdown in the implicit root case.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 6b1558b675 Add a `Context.portal`, more cancel tooling
Might as well add a public maybe-getter for use on the "parent" side
since it can be handy to check out-of-band cancellation conditions (like
from `Portal.cancel_actor()`).

Buncha bitty tweaks for more easily debugging cancel conditions:
- add a `@.cancel_called.setter` for hooking into `.cancel_called = True`
  being set in hard to decipher "who cancelled us" scenarios.
- use a new `self_ctxc: bool` var in `.cancel()` to capture the output
  state from `._is_self_cancelled(remote_error)` at call time so it can
  be compared against the measured value at crash-time (when REPL-ing it
  can often have already changed due to runtime teardown sequencing vs.
  the crash handler hook entry).
- proxy `hide_tb` to `.drain_to_final_msg()` from `.wait_for_result()`.
- use `remote_error.sender` attr directly instead of through
  `RAE.msgdata: dict` lookup.
- change var name `our_uid` -> `peer_uid`; it's not "ours"..

Other various docs/comment updates:
- extend the main class doc to include some other name ideas.
- change over all remaining `.result()` refs to `.wait_for_result()`.
- doc more details on how we want `.outcome` to eventually signature.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 548fbe725b Flip `infected_asyncio` status msg to `.runtime()` 2025-03-24 14:04:52 -04:00
Tyler Goodlet f64447148e Avoid actor-nursery-exit warns on registrees
Since a local-actor-nursery-parented subactor might also use the root as
its registry, we need to avoid warning when short lived IPC `Channel`
connections establish and then disconnect (quickly, bc apparently
the subactor isn't re-using an already cached parent-peer<->child conn
as you'd expect for efficiency..) since such cases are currently considered
normal operation of our super shoddy/naive "discovery sys" XD

As such, (un)guard the whole local-actor-nursery OR channel-draining
waiting blocks with the additional `or Actor._cancel_called` branch
since really we should also be waiting on the parent nurse to exit (at
least, for sure and always) when the local `Actor` indeed has been
"globally" cancelled-called. Further add separate timeout warnings for
channel-draining vs. local-actor-nursery-exit waiting since they are
technically orthogonal cases (at least, afaik).

Also,
- adjust the `Actor._stream_handler()` connection status log-emit to
  `.runtime()`, especially to reduce noise around the aforementioned
  ephemeral registree connection-requests.
- if we do wait on a local actor-nurse to exit, report its `._children`
  table (which should help figure out going forward how useful the
  warning is, if at all).
2025-03-24 14:04:52 -04:00
Tyler Goodlet b0f0971ad4 Change `_Cache` reuse emit to `.runtime()` 2025-03-24 14:04:52 -04:00
Tyler Goodlet 3b056fd761 Expand `PayloadMsg` doc-str 2025-03-24 14:04:52 -04:00
Tyler Goodlet 3246b3a3bc Break `_mk_msg_type_err()` into recv/send side funcs
Name them `_mk_send_mte()`/`_mk_recv_mte()` and change the runtime to
call each appropriately depending on location/usage.

Also add some dynamic call-frame "unhide" blocks such that when we
expect raised MTE from the above calls but we get a different
unexpected error from the runtime, we ensure the call stack downward is
shown in tbs/pdb.
|_ ideally in the longer run we come up with a fancier dynamic sys for
   this, prolly something in `.devx._frame_stack`?
2025-03-24 14:04:52 -04:00
Tyler Goodlet 3613c37a6f Don't pass `ipc_msg` for send side MTEs
Just pass `_bad_msg` such that it gets injected to `.msgdata` since
with a send-side `MsgTypeError` we don't have a remote `._ipc_msg:
Error` per se to include.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 00dbf55fd3 Add note about using `@acm` as decorator in 3.10 2025-03-24 14:04:52 -04:00
Tyler Goodlet 89c2137fc9 Update pld-rx limiting test(s) to use deco input
The tests only use one input spec (conveniently) so there's not much to
change in the logic,
- only pass the `maybe_msg_spec` to the child-side decorator and obvi
  drop the surrounding `msgops.limit_plds()` block in the child.
- tweak a few `MsgDec` asserts, mostly dropping the
  `msg._ops._def_any_spec` state checks since the child-side won't have
  any pre pld-spec state given the runtime now applies the `pld_spec`
  before running the task's func body.
  - also allowed dropping the `finally:` which did a similar check
    outside the `.limit_plds()` block.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 535fd06f73 Proxy through `dec_hook` in `.limit_plds()` APIs 2025-03-24 14:04:52 -04:00
Tyler Goodlet 097101f8d3 Port debug request ep to use `@context(pld_spec)`
Namely passing the `.__pld_spec__` directly to the
`lock_stdio_for_peer()` decorator B)

Also, allows dropping `apply_debug_pldec()` (which was a todo) and
removing a `lock_stdio_for_peer()` indent level.
2025-03-24 14:04:52 -04:00
Tyler Goodlet b8d37060ec Offer a `@context(pld_spec=<TypeAlias>)` API
Instead of the WIP/prototyped `Portal.open_context()` offering
a `pld_spec` input arg, this changes to a proper decorator API for
specifying the "payload spec" on `@context` endpoints.

The impl change details actually cover 2-birds:
- monkey patch decorated functions with a new
  `._tractor_context_meta: dict[str, Any]` and insert any provided input
  `@context` kwargs: `_pld_spec`, `enc_hook`, `dec_hook`.
- use `inspect.get_annotations()` to scan for a `func` arg
  type-annotated with `tractor.Context` and use the name of that arg as
  the RPC task-side injected `Context`, thus injecting the needed arg
  by type instead of by name (a longstanding TODO); raise a type-error
  when not found.
- pull the `pld_spec` from the `._tractor_context_meta` attr both in the
  `.open_context()` parent-side and child-side `._invoke()`-cation of
  the RPC task and use the `msg._ops.maybe_limit_plds()` API to apply it
  internally in the runtime for each case.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 0ffb4f0db1 Log tbs from non-RAE `._invoke()`-RPC-task errors
`RemoteActorError`s show this by default in their `.__repr__()`, and we
obvi capture and embed the src traceback in an `Error` msg prior to
transit, but for logging it's also handy to see the tb of any set
`Context._remote_error` on console especially when trying to decipher
remote error details at their origin actor. Also improve the log message
description using `ctx.repr_state` and show any `ctx.outcome`.
2025-03-24 14:04:52 -04:00
Tyler Goodlet c10c34d717 Add `@context(pld_spec=<TypeAlias>)` TODO list
Longer run we don't want `tractor` app devs having to call
`msg._ops.limit_plds()` from every child endpoint.. so this starts
a list of decorator API ideas and obviously ties in with an ideal final
API design that will come with py3.13 and typed funcs. Obviously this
is directly fueled by,

- https://github.com/goodboy/tractor/issues/365

Other,
- type with direct `trio.lowlevel.Task` import.
- use `log.exception()` to show tbs for all error-terminations in
  `.open_context()` (for now) and always explicitly mention the `.side`.
2025-03-24 14:04:52 -04:00
Tyler Goodlet ad28f0c9b3 Use `_debug._sync_pause_from_builtin()` as `breakpoint()` override 2025-03-24 14:04:52 -04:00
Tyler Goodlet f83e06d371 Use new `._debug._repl_fail_msg` inside `test_pause_from_sync` 2025-03-24 14:04:52 -04:00
Tyler Goodlet 6a5d33b7ed Make big TODO: for `devx._debug` refinements
Hopefully would make grok-ing this fairly sophisticated sub-sys possible
for any up-and-coming `tractor` hacker

XD

A lot of internal API and re-org ideas I discovered/realized as part of
finishing the `__pld_spec__` and multi-threaded support. Particularly
better isolation between root-actor vs subactor task APIs and generally
less globally-state-ful stuff like `DebugStatus` and `Lock` method APIs
would likely make a lot of the hard to follow edge cases more clear?
2025-03-24 14:04:52 -04:00
Tyler Goodlet 31cc33c66c First proto: multi-threaded synced `pdb`-REPLs
Functionally working for multi-threaded (via cpython threads spawned
from `trio.to_thread.run_sync()`) alongside subactors, tested (for
now) only with threads started inside the root actor (which seemed to
have the most issues in terms of the impl and special cases..) using the
new `tractor.pause_from_sync()` API!

Main implementation changes to `.pause_from_sync()`
------ - ------
- from the root actor, we need to ensure bg thread case is handled
  *specially* since no IPC is used to request the TTY stdio mutex and
  `Lock` (API) usage is conducted entirely from a local task or thread;
  dedicated `Lock` usage for the root-actor already is branched inside
  `._pause()` and needs similar handling from a root bg-thread:
 |_for the special case of a root bg thread we need to
   `trio`-main-thread schedule a bg task inside a new
   `_pause_from_bg_root_thread()`. The new task needs to implement most
   of what is handled inside `._pause()` manually, mostly because in
   this root-actor-bg-thread case we have 2 constraints:
   1. to enter `PdbREPL.interaction()` **from the bg thread** directly,
   2. the task that `Lock._debug_lock.acquire()`s has to be the same
      that calls `.release()` (a `trio.StrictFIFOLock` constraint)
 |_impl deats of this `_pause_from_bg_root_thread()` include:
   - (for now) calling `._pause()` to acquire the `Lock._debug_lock`.
   - setting its own `DebugStatus.repl_release`.
   - calling `DebugStatus.shield_sigint()` to ensure the root's
     main thread uses the right handler when the bg one is REPL-ing.
   - wait manually on the `.repl_release()` to be set by the thread's
     dedicated `PdbREPL` exit.
   - manually calling `Lock.release()` from the **same task** that
     acquired it.
- expect calls to `._pause()` to deliver a `tuple[Task, PdbREPL]` such
  that we always get a handle both to any newly created REPL instance
  and (maybe) the scheduled bg task within which it runs.
- add a single `message: str` style to `log.devx()` based on branching
  style for logging.
- ensure both `DebugStatus.repl` and `.repl_task` are set **just
  before** calling `._set_trace()` to ensure the correct `Task|Thread`
  is set when the REPL is finally entered from sync code.
- add a wrapping caller `_sync_pause_from_builtin()` which passes in the
  new `called_from_builtin=True` to indicate `breakpoint()` caller
  usage, obvi pass in `api_frame`.

Changes to `._pause()` in support of ^
------ - ------
- `TaskStatus.started()` and return the `tuple[Task, PdbREPL]` to
  callers / starters.
- only call `DebugStatus.shield_sigint()` when no `repl` passed bc some
  callers (like bg threads) may need to apply it at some specific point
  themselves.
- tweak some asserts for the `debug_func == None` / non-`trio`-thread
  case.
- add a mod-level `_repl_fail_msg: str` to be used when there's an
  internal `._pause()` failure for testing, easier to pexpect match.
- more comprehensive logging for the root-actor branched case to
  (attempt to) indicate any of the 3 cases:
  - remote ctx from subactor has the `Lock`,
  - already existing root task or thread has it or,
  - some kinda stale `.locked()` situation where the root has the lock
    but we don't know why.
- for root usage, revert to always `await Lock._debug_lock.acquire()`-ing
  despite `called_from_sync` since `.pause_from_sync()` was reworked to
  instead handle the special bg thread case in the new
  `_pause_from_bg_root_thread()` task.
- always do `return _enter_repl_sync(debug_func)`.
- try to report any `repl_task: Task|Thread` set by the caller
  (particularly for the bg thread cases) as being the thread or task
  `._pause()` was called "on behalf of"

Changes to `DebugStatus`/`Lock` in support of ^
------ - ------
- only call `Lock.release()` from `DebugStatus.set_[quit/continue]()`
  when called from the main `trio` thread and always call
  `DebugStatus.release()` **after** to ensure `.repl_released()` is set
  **after** `._debug_lock.release()`.
- only call `.repl_release.set()` from `trio` thread otherwise use
  `.from_thread.run()`.
- much more refinements in `Lock.release()` for threading cases:
  - return `bool` to indicate whether lock was released by caller.
  - mask (in prep to drop) `_pause()` usage of
    `Lock.release(force=True)` since forcing a release can't ever avoid
    the RTE from `trio`.. same task **must** acquire/release.
  - don't allow usage from non-`trio`-main-threads, ever; there's no
    point since the same-task-needs-to-manage-`FIFOLock` constraint.
  - much more detailed logging using `message`-building-style for all
    caller (edge) cases.
   |_ use a `we_released: bool` to determine failed-to-release edge
      cases which can happen if called from bg threads, ensure we
      `log.exception()` on any incorrect usage resulting in release
      failure.
   |_ complain loudly if the release fails and some other task/thread
      still holds the lock.
   |_ be explicit about "who" (which task or thread) the release is "on
      behalf of" by reading `DebugStatus.repl_task` since the caller
      isn't the REPL operator in many sync cases.
  - more or less drop `force` support, as mentioned above.
  - ensure we unset `._owned_by_root` if the caller is a root task.

Other misc
------ - ------
- rename `lock_tty_for_child()` -> `lock_stdio_for_peer()`.
- rejig `Lock.repr()` to show lock and event stats.
- stage `Lock.stats` and `.owner` methods in prep for doing a singleton
  instance and `@property`s.
2025-03-24 14:04:52 -04:00
Tyler Goodlet ad44d59f3d Drop thread logging to make `log.pdb()` patts match in test 2025-03-24 14:04:52 -04:00
Tyler Goodlet 2f1a97e73e Catch `.pause_from_sync()` in root bg thread bugs!
Originally discovered while using `tractor.pause_from_sync()`
from the `i3ipc` client running in a bg-thread that uses `asyncio`
inside `modden`.

Turns out we definitely aren't correctly handling `.pause_from_sync()`
from the root actor when called from a `trio.to_thread.run_sync()`
bg thread:
- root-actor bg threads which can't `Lock._debug_lock.acquire()` since
  they aren't in `trio.Task`s.
- even if scheduled via `.to_thread.run_sync(_debug._pause)` the
  acquirer won't be the task/thread which calls `Lock.release()` from
  `PdbREPL` hooks; this results in a RTE raised by `trio`..
- multiple threads will step on each other's stdio since cpython's GIL
  seems to ctx switch threads on every input from the user to the REPL
  loop..

Reproduce via reworking our example and test so that they catch and fail
for all edge cases:
- rework the `/examples/debugging/sync_bp.py` example to demonstrate the
  above issues, namely the stdio clobbering in the REPL when multiple
  threads and/or a subactor try to debug simultaneously.
  |_ run one thread using a task nursery to ensure it runs conc with the
     nursery's parent task.
  |_ ensure the bg threads run conc a subactor usage of
     `.pause_from_sync()`.
  |_ gravely detail all the special cases inside a TODO comment.
  |_ add some control flags to `sync_pause()` helper and don't use
     `breakpoint()` by default.
- extend and adjust `test_debugger.test_pause_from_sync` to match (and
  thus currently fail) by ensuring exclusive `PdbREPL` attachment when
  the 2 bg root-actor threads are concurrently interacting alongside the
  subactor:
  |_ should only see one of the `_pause_msg` logs at a time for either
     one of the threads or the subactor.
  |_ ensure each attaches (in no particular order) before expecting the
     script to exit.

Impl adjustments to `.devx._debug`:
- drop `Lock.repl`, no longer used.
- add `Lock._owned_by_root: bool` for the `.ctx_in_debug == None`
  root-actor-task active case.
- always `log.exception()` for any `._debug_lock.release()` ownership
  RTE emitted by `trio`, like we used to..
- add special `Lock.release()` log message for the stale lock but
  `._owned_by_root == True` case; oh yeah and actually
  `log.devx(message)`..
- rename `Lock.acquire()` -> `.acquire_for_ctx()` since it's only ever
  used from subactor IPC usage; well that and for local root-task
  usage we should prolly add a `.acquire_from_root_task()`?
- buncha `._pause()` impl improvements:
 |_ type `._pause()`'s `debug_func` as a `partial` as well.
 |_ offer `called_from_sync: bool` and `called_from_bg_thread: bool`
    for the special case handling when called from `.pause_from_sync()`
 |_ only set `DebugStatus.repl/repl_task` when `debug_func != None`
   (OW ensure the `.repl_task` is not the current one).
 |_ handle error logging even when `debug_func is None`..
 |_ lotsa detailed commentary around root-actor-bg-thread special cases.
- when `._set_trace(hide_tb=False)` do `pdbp.set_trace(frame=currentframe())`
  so the `._debug` internal frames are always included.
- by default always hide tracebacks for `.pause[_from_sync]()` internals.
- improve `.pause_from_sync()` to avoid root-bg-thread crashes:
 |_ pass new `called_from_xxx_` flags and ensure `DebugStatus.repl_task`
    is actually set to the `threading.current_thread()` when needed.
 |_ manually call `Lock._debug_lock.acquire_nowait()` for the non-bg
    thread case.
 |_ TODO: still need to implement the bg-thread case using a bg
    `trio.Task`-in-thread with a `trio.Event` set by thread REPL exit.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 18b4618b5f Move `Context.open_stream()` impl to `._streaming`
Exactly like how it's organized for `Portal.open_context()`, put the
main streaming API `@acm` with the `MsgStream` code and bind the method
to the new module func.

Other,
- rename `Context.result()` -> `.wait_for_result()` to better match the
  blocking semantics and rebind `.result()` as deprecated.
- add doc-str for `Context.maybe_raise()`.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 54386900e0 Use `Context` repr APIs for RPC outcome logs
Delegate to the new `.repr_state: str` and adjust log level based on
error vs. cancel vs. result.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 028bc3256f Drop sub-decoder proto-cruft from `.msg._codec`
It ended up getting necessarily implemented as the `PldRx` though at
a different layer and won't be needed as part of `MsgCodec` most likely,
though this original idea did provide the source of inspiration for how
things work now!

Also move the commented TODO proto for a codec hook factory from
`.types` to `._codec` where it prolly better fits and update some msg
related todo/questions.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 4bc7569981 Woops, set `post_mortem=False` by default again! 2025-03-24 14:04:52 -04:00
Tyler Goodlet 15a47dc4f7 Finally, officially support shielded REPL-ing!
It's been a long time prepped and now finally implemented!

Offer a `shield: bool` argument from our async `._debug` APIs:
- `await tractor.pause(shield=True)`,
- `await tractor.post_mortem(shield=True)`

^-These-^ can now be used inside cancelled `trio.CancelScope`s,
something very handy when introspecting complex (distributed) system
tear/shut-downs particularly under remote error or (inter-peer)
cancellation conditions B)

Thanks to previous prepping in a prior attempt and various patches from
the rigorous rework of `.devx._debug` internals around typed msg specs,
there ain't much that was needed!

Impl deats
- obvi passthrough `shield` from the public API endpoints (was already
  done from a prior attempt).
- put ad-hoc internal `with trio.CancelScope(shield=shield):` around all
  checkpoints inside `._pause()` for both the root-process and subactor
  case branches.

Add a fairly rigorous example, `examples/debugging/shielded_pause.py`
with a wrapping `pexpect` test, `test_debugger.test_shield_pause()` and
ensure it covers as many cases as i can think of offhand:

- multiple `.pause()` entries in a loop despite parent scope
  cancellation in a subactor RPC task which itself spawns a sub-task.
- a `trio.Nursery.parent_task` which raises, is handled and
  tries to enter an unshielded `.post_mortem()`, which of course
  internally raises `Cancelled` in a `._pause()` checkpoint, so we catch
  the `Cancelled` again and then debug the debugger's internal
  cancellation with specific checks for the particular raising
  checkpoint-LOC.
- do ^- the latter -^ for both subactor and root cases to ensure we
  can debug `._pause()` itself when it tries to REPL engage from
  a cancelled task scope Bo
2025-03-24 14:04:52 -04:00
Tyler Goodlet d98f06314d Rename `PldRx.dec_msg()` -> `.decode_pld()`
Keep the old alias, but i think it's better form to use longer names for
internal public APIs and this name better reflects the functionality:
decoding and returning a `PayloadMsg.pld` field.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 5bab7648e2 Add a `tractor.post_mortem()` API test + example
Since turns out we didn't have a single example using that API Bo

The test granular-ly checks all use cases:
- `.post_mortem()` manual calls in both subactor and root.
- ensuring built-in RPC crash handling activates after each manual one
  from ^.
- drafted some call-stack frame checking that i commented out for now
  since we need to first do ANSI escape code removal due to the
  colorization that `pdbp` does by default.
  |_ added a TODO with SO link on `assert_before()`.

Also todo-staged a shielded-pause test to match with the already
existing-but-needs-refinement example B)
2025-03-24 14:04:52 -04:00
Tyler Goodlet d099466d21 Change `reraise` to `post_mortem: bool` in `maybe_expect_raises()` 2025-03-24 14:04:52 -04:00
Tyler Goodlet 1c00668d20 Always `.exception()` in `try_ship_error_to_remote()` on internal error 2025-03-24 14:04:52 -04:00
Tyler Goodlet d51c19fe3d Pass `boxed_type` from `_mk_msg_type_err()`
Such that we're boxing the interchanged lib's specific error
(`msgspec.ValidationError` in this case) type much like how
a `ContextCancelled[trio.Cancelled]` is composed; allows for seamless
multi-backend-codec support later as well B)

Pass `ctx.maybe_raise(from_src_exc=src_err)` where needed in a couple
spots; as `None` in the send-side `Started` MTE case to avoid showing
the `._scope1.cancel_called` result in the traceback from the
`.open_context()` child-sync phase.
2025-03-24 14:04:52 -04:00
Tyler Goodlet b9ae41a161 Add `from_src_exc: BaseException` to maybe raisers
That is as a control to `Context._maybe_raise_remote_err()` such that
if set to anything other than the default (`False` value), we do
`raise remote_error from from_src_exc` such that caller can choose to
suppress or override the `.__cause__` tb.

Also tidy up an old masked TODO regarding calling `.maybe_raise()`
after the caller exits from the `yield` in `.open_context()`..
2025-03-24 14:04:52 -04:00
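A small sketch of the `from_src_exc` control described above, assuming a free function in place of the `Context` method and a placeholder `RemoteActorError` class. `False` acts as a "not provided" sentinel so that `None` can be passed deliberately to suppress the chained `.__cause__` traceback.

```python
from __future__ import annotations


class RemoteActorError(Exception):
    """Hypothetical stand-in for the boxed remote error type."""


def maybe_raise(
    remote_error: BaseException | None,
    from_src_exc: BaseException | None | bool = False,
) -> None:
    if remote_error is None:
        return
    if from_src_exc is not False:
        # `None` suppresses the implicit chain; an exception overrides it
        raise remote_error from from_src_exc
    raise remote_error
```

Passing an exception sets it as `.__cause__`, passing `None` is equivalent to `raise ... from None`, and omitting the argument leaves Python's default implicit chaining in place.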
Tyler Goodlet 2e522d003f Better RAE `.pformat()`-ing for send-side MTEs
Send-side `MsgTypeError`s actually shouldn't have any "boxed" traceback
per se since they're raised in the transmitting actor's local task env
and we (normally) don't want the ascii decoration added around the
error's `._message: str`, that is not until the exc is `pack_error()`-ed
before transit. As such, the presentation of an embedded traceback (and
its ascii box) gets bypassed when only a `._message: str` is set (as we
now do for pld-spec failures in `_mk_msg_type_err()`).

Further this tweaks the `.pformat()` output to include the `._message`
part to look like `<RemoteActorError( <._message> ) ..` instead of
jamming it implicitly to the end of the embedded `.tb_str` (as was done
implicitly by `unpack_error()`) and also adds better handling for the
`with_type_header == False` case including forcing that case when we
detect that the currently handled exc is the RAE in `.pformat()`.
Toss in a lengthier doc-str explaining it all.

Surrounding/supporting changes,
- better `unpack_error()` message which just briefly reports the remote
  task's error type.
- add public `.message: str` prop.
- always set a `._extra_msgdata: dict` since some MTE props rely on it.
- handle `.boxed_type == None` for `.boxed_type_str`.
- maybe pack any detected input or `exc.message` in `pack_error()`.
- comment cruft cleanup in `_mk_msg_type_err()`.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 56a46b1bf0 Add `Error.message: str`
Allows passing a custom error msg other than the traceback-str over the
wire. Make `.tb_str` optional (in the blank `''` sense) since it's
treated that way thus far in `._exceptions.pack_error()`.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 830df00152 Fix missing newline in task-cancel log-message 2025-03-24 14:04:52 -04:00
Tyler Goodlet 4b3c6b7e39 Don't need to pack an `Error` with send-side MTEs 2025-03-24 14:04:51 -04:00
Tyler Goodlet 4b843d6219 Ensure only a boxed traceback for MTE on parent side 2025-03-24 14:04:51 -04:00
Tyler Goodlet fa2893cc87 Ensure ctx error-state matches the MTE scenario
Namely checking that `Context._remote_error` is set to the raised MTE
in the invalid started and return value cases since prior to the recent
underlying changes to the `Context.result()` impl, it would not match.

Further,
- do asserts for non-MTE raising cases in both the parent and child.
- add todos for testing ctx-outcomes for per-side-validation policies
  i anticipate supporting and implied msg-dialog race cases therein.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 4d935dcfb0 Raise remote errors rxed during `Context` child-sync
More specifically, if `.open_context()` is cancelled when awaiting the
first `Context.started()` during the child task sync phase, check to see
if it was due to `._scope.cancel_called` and raise any remote error via
`.maybe_raise()` instead of the `trio.Cancelled` like in every other
remote-error handling case. Ensure we set `._scope[_nursery]` only after
the `Started` has arrived and audited.
2025-03-24 14:04:51 -04:00
Tyler Goodlet b3387aca61 Don't (noisily) log about runtime cancel RPC tasks
Since in the case of the `Actor._cancel_task()` related runtime eps we
actually don't EVER register them in `Actor._rpc_tasks`.. logging about
them is just needless noise, though maybe we should track them in a diff
table; something like a `._runtime_rpc_tasks`?

Drop the cancel-request-for-stale-RPC-task (`KeyError` case in
`Actor._cancel_task()`) log-emit level down to `.runtime()`; it's
generally not useful info other than for granular race condition eval
when hacking the runtime.
2025-03-24 14:04:51 -04:00
Tyler Goodlet a0091b77d8 Raise send-side MTEs inline in `PldRx.dec_msg()`
So when `is_started_send_side is True` we raise the newly created
`MsgTypeError` (MTE) directly instead of doing all the `Error`-msg pack
and unpack to raise stuff via `_raise_from_unexpected_msg()` since the
raise should happen send side anyway and so doesn't emulate any remote
fault like in a bad `Return` or `Started` without send-side pld-spec
validation.

Oh, and proxy-through the `hide_tb: bool` input from `.drain_to_final_msg()`
to `.recv_msg_w_pld()`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 7bde00d711 Set remote errors in `_raise_from_unexpected_msg()`
By calling `Context._maybe_cancel_and_set_remote_error(exc)` on any
unpacked `Error` msg; provides for `Context.maybe_error` consistency to
match all other error delivery cases.
2025-03-24 14:04:51 -04:00
Tyler Goodlet b992ff73da Factor `.started()` validation into `.msg._ops`
Filling out the helper `validate_payload_msg()` staged in a prior commit
and adjusting all imports to match.

Also add a `raise_mte: bool` flag for potential usage where the caller
wants to handle the MTE instance themselves.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 9dc7602f21 Fix `test_basic_payload_spec` bad msg matching
Expecting `Started` or `Return` with respective bad `.pld` values
depending on which type of failure the test is parametrized for.

This makes the suite run green it seems B)
2025-03-24 14:04:51 -04:00
Tyler Goodlet eaddde94c5 Drop `msg.types.Msg` for new replacement types
The `TypeAlias` for the msg type-group is now `MsgType` and any user
touching shuttle messages can now be typed as `PayloadMsg`.

Relatedly, add MTE specific `Error._bad_msg[_as_dict]` fields which are
handy for introspection of remote decode failures.
2025-03-24 14:04:51 -04:00
Tyler Goodlet a520951928 Parameterize the `return_msg_type` in `._invoke()`
Since we also handle a runtime-specific `CancelAck`, allow the
caller-scheduler to pass in the expected return-type msg per the RPC msg
endpoint loop.
2025-03-24 14:04:51 -04:00
Tyler Goodlet cbd47d800e Add `MsgTypeError` "bad msg" capture
Such that if caught by user code and/or the runtime we can introspect
the original msg which caused the type error. Previously this was kinda
half-baked with a `.msg_dict` which was delivered from an `Any`-decode
of the shuttle msg in `_mk_msg_type_err()` but now this more explicitly
refines the API and supports both `PayloadMsg`-instance or the msg-dict
style injection:
- allow passing either of `bad_msg: PayloadMsg|None` or
  `bad_msg_as_dict: dict|None` to `MsgTypeError.from_decode()`.
- expose public props for both ^ whilst dropping prior `.msgdict`.
- rework `.from_decode()` to explicitly accept `**extra_msgdata: dict`
  |_ only overriding it from any `bad_msg_as_dict` if the keys are found in
    `_ipcmsg_keys`, **except** for `_bad_msg` when `bad_msg` is passed.
  |_ drop `.ipc_msg` passthrough.
  |_ drop `msgdict` input.
- adjust `.cid` to only pull from the `.bad_msg` if set.

Related fixes/adjustments:
- `pack_from_raise()` should pull `boxed_type_str` from
  `boxed_type.__name__`, not the `type()` of it.. also add a
  `hide_tb: bool` flag.
- don't include `_msg_dict` and `_bad_msg` in the `_body_fields` set.
- allow more granular boxed traceback-str controls:
  |_ allow passing a `tb_str: str` explicitly in which case we use it
    verbatim and presume caller knows what they're doing.
  |_ when not provided, use the more explicit
    `traceback.format_exception(exc)` since the error instance is
    a required input (we still fall back to the old `.format_exc()` call
    if for some reason the caller passes `None`; but that should be
    a bug right?).
  |_ if a `tb: TracebackType` and a `tb_str` is passed, concat them.
- in `RemoteActorError.pformat()` don't indent the `._message` part used
  for the `body` when `with_type_header == False`.
- update `_mk_msg_type_err()` to use `bad_msg`/`bad_msg_as_dict`
  appropriately and drop passing `ipc_msg`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet aefdc9c094 More correct/explicit `.started()` send-side validation
In the sense that we handle it as a special case that is exposed
through to `PldRx.dec_msg()` with a new `is_started_send_side: bool`.

(Non-ideal) `Context.started()` impl deats:
- only do send-side pld-spec validation when a new `validate_pld_spec`
  is set (by default it's not).
- call `self.pld_rx.dec_msg(is_started_send_side=True)` to validate the
  payload field from the just codec-ed `Started` msg's `msg_bytes` by
  passing the `roundtripped` msg (with its `.pld: Raw`) directly.
- add a `hide_tb: bool` param and proxy it to the `.dec_msg()` call.

(Non-ideal) `PldRx.dec_msg()` impl deats:
- for now we're packing the MTE inside an `Error` via a manual call to
  `pack_error()` and then setting that as the `msg` passed to
  `_raise_from_unexpected_msg()` (though really we should just raise
  inline?).
- manually set the `MsgTypeError._ipc_msg` to the above..

Other,
- more comprehensive `Context` type doc string.
- various `hide_tb: bool` kwarg additions through `._ops.PldRx` meths.
- proto a `.msg._ops.validate_payload_msg()` helper planned to get the
  logic from this version of `.started()`'s send-side validation so as
  to be useful more generally elsewhere.. (like for raising back
  `Return` values on the child side?).

Warning: this commit may have been made out of order from required
changes to `._exceptions` which will come in a follow up!
2025-03-24 14:04:51 -04:00
Tyler Goodlet 07ba69f697 Add basic payload-spec test suite
Starts with some very basic cases:
- verify both subactor-as-child-ctx-task send side validation (failures)
  as well as relay and raise on root-parent-side-task.
- wrap failure expectation cases that bubble out of `@acm`s with
  a `maybe_expect_raises()` equiv wrapper with an embedded timeout.
- add `Return` cases including invalid by `str` and valid by a `None`.

Still ToDo:
- commit impl changes to make the bulk of this suite pass.
- adjust how `MsgTypeError`s format the local (`.started()`) send side
  `.tb_str` such that we don't do a "boxed" error prior to
  `pack_error()` being called normally prior to `Error` transit.
2025-03-24 14:04:51 -04:00
Tyler Goodlet cbfabac813 Even smarter `RemoteActorError.pformat()`-ing
Related to the prior patch, re the new `with_type_header: bool`:
- in the `with_type_header == True` use case make sure we keep the first
  `._message: str` line non-indented since it'll show just after the
  header-line's type path with ':'.
- when `False` drop the `)>` `repr()`-instance style as well so that we
  just get the ascii boxed traceback as though it's the error
  message-`str` not the `repr()` of the error obj.

Other,
- hide `pack_from_raise()` call frame since it'll show in debug mode
  crash handling..
- mk `MsgTypeError.from_decode()` explicitly accept and proxy an
  optional `ipc_msg` and change `msgdict` to also be optional, only
  reading out the `**extra_msgdata` when provided.
- expose a `_mk_msg_type_err(src_err_msg: Error|None = None,)` for
  callers who wish to inject a `._ipc_msg: Msgtype` to the MTE.
  |_ add a note how we can't use it due to a causality-dilemma when pld
     validating `Started` on the send side..
2025-03-24 14:04:51 -04:00
Tyler Goodlet 24c9c5397f Add debug check-n-wait inside `._spawn.soft_kill()`
And IFF the `await wait_func(proc)` is cancelled such that we avoid
clobbering some subactor that might be REPL-ing even though its parent
actor is in the midst of (gracefully) cancelling it.
2025-03-24 14:04:51 -04:00
Tyler Goodlet e92972a5f4 Mk `MsgDec.spec_str` have a more compact ` 2025-03-24 14:04:51 -04:00
Tyler Goodlet da03deddf1 Call `.devx._debug.hide_runtime_frames()` by default
From both `open_root_actor()` and `._entry._trio_main()`.

Other `breakpoint()`-from-sync-func fixes:
- properly disable the default hook using `"0"` XD
- offer a `hide_tb: bool` from `open_root_actor()`.
- disable hiding the `._trio_main()` frame, bc pretty sure it doesn't
  help anyone (either way) when REPL-ing/tb-ing from a subactor..?
2025-03-24 14:04:51 -04:00
Tyler Goodlet 50ed461996 Port `Actor._stream_handler()` to use `.has_outcome`, fix indent bug.. 2025-03-24 14:04:51 -04:00
Tyler Goodlet 92ac95ce24 Update debugger tests to expect new pformatting
Mostly the result of the `RemoteActorError.pformat()` and our
new `_pause/crash_msg: str`s which include the `trio.Task.__repr__()`
in the `log.pdb()` message.

Obvi use the `in_prompt_msg()` to accomplish where not used prior.

ToDo later:
-[ ] still some outstanding questions on how detailed inceptions
   should look, eg. in `test_multi_nested_subactors_error_through_nurseries()`
  |_maybe we should be more pedantic at checking `.src_uid` vs.
    `.relay_uid` fields?
-[ ] staged a placeholder test for verifying correct call-stack frame on
   crash handler REPL entry.
-[ ] also need a test to verify that you can't pause from an already paused actor task
   such as can happen if you try to step through runtime code that has
   a recurrent entry to `._debug.pause()`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet defe34dec2 Move runtime frame hiding into helper func
Call it `hide_runtime_frames()` and stick all the lines from the top of
the `._debug` mod in there along with a little `log.devx()` emission on
what gets hidden by default ;)

Other,
- fix ref-error where internal-error handler might trigger despite the
  debug `req_ctx` not yet having init-ed, such that we don't try to
  cancel or log about it when it never was fully created/initialized..
- fix assignment typo inside `_set_trace()` for `task`.. lel
2025-03-24 14:04:51 -04:00
Tyler Goodlet 9c11b2b04d Better context aware `RemoteActorError.pformat()`
Such that when displaying with `.__str__()` we do not show the type
header (style) since normally python's raising machinery already prints
the type path like `'tractor._exceptions.RemoteActorError:'`, so doing
it 2x is a bit ugly ;p

In support,
- include `.relay_uid` in `RemoteActorError.extra_body_fields`.
- offer a `with_type_header: bool` to `.pformat()` and only put the
  opening type path and closing `')>'` tail line when `True`.
- add `.is_inception() -> bool:` for an easy way to determine if the
  error is multi-hop relayed.
- only repr the `'|_relay_uid=<uid>'` field when an error is an inception.
- tweak the invalid-payload case in `_mk_msg_type_err()` to explicitly
  state in the `message` how the `any_pld` value does not match the `MsgDec.pld_spec`
  by decoding the invalid `.pld` with an any-dec.
- allow `_mk_msg_type_err(**mte_kwargs)` passthrough.
- pass `boxed_type=cls` inside `MsgTypeError.from_decode()`.
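
A toy sketch of the `with_type_header` split described above, where
`str()` omits the header (Python's raising machinery already prints the
type path) and `repr()` keeps it. `BoxedError` and its fields are
hypothetical, not the real `RemoteActorError`:

```python
class BoxedError(Exception):
    """Hypothetical stand-in for an error with a custom formatter."""

    def __init__(self, message, tb_str=''):
        super().__init__(message)
        self.message = message
        self.tb_str = tb_str

    def pformat(self, with_type_header=True):
        body = self.message
        if self.tb_str:
            body += '\n' + self.tb_str
        if not with_type_header:
            # The traceback printer already shows the type path,
            # so emit only the body.
            return body
        # `repr()`-style: opening type path plus closing `)>` tail.
        return f'<{type(self).__name__}(\n{body}\n)>'

    # `repr()` gets the full header form via the default argument.
    __repr__ = pformat

    def __str__(self):
        return self.pformat(with_type_header=False)
```

With this split an uncaught `BoxedError` renders its type name once
instead of twice.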
2025-03-24 14:04:51 -04:00
Tyler Goodlet e1857413a3 Resolve remaining debug-request race causing hangs
More or less by pedantically separating and managing root and subactor
request syncing events to always be managed by the locking IPC context
task-funcs:
- for the root's "child"-side, `lock_tty_for_child()` directly creates
  and sets a new `Lock.req_handler_finished` inside a `finally:`
- for the sub's "parent"-side, `request_root_stdio_lock()` does the same
  with a new `DebugStatus.req_finished` event and separates it from
  the `.repl_release` event (which indicates a "c" or "q" from user and
  thus exit of the REPL session) as well as sets a new `.req_task:
  trio.Task` to explicitly distinguish from the app-user-task that
  enters the REPL vs. the paired bg task used to request the global
  root's stdio mutex alongside it.
- apply the `__pld_spec__` on "child"-side of the ctx using the new
  `Portal.open_context(pld_spec)` parameter support; drops use of any
  `ContextVar` malarky used prior for `PldRx` mgmt.
- removing `Lock.no_remote_has_tty` since it was a nebulous name and
  from the prior "everything is in a `Lock`" design..

------ - ------

More rigorous impl to handle various edge cases in `._pause()`:
- rejig `_enter_repl_sync()` to wrap the `debug_func == None` case
  inside maybe-internal-error handler blocks.
- better logic for recurrent vs. multi-task contention for REPL entry in
  subactors, by guarding using `DebugStatus.req_task` and by now waiting
  on the new `DebugStatus.req_finished` for the multi-task contention
  case.
- even better internal error handling and reporting for when this code
  is hacked on and possibly broken ;p

------ - ------

Updates to `.pause_from_sync()` support:
- add optional `actor`, `task` kwargs to `_set_trace()` to allow
  compat with the new explicit `debug_func` calling in `._pause()` and
  pass a `threading.Thread` for `task` in the `.to_thread()` usage case.
- add an `except` block that tries to show the frame on any internal
  error.

------ - ------

Relatedly includes a buncha cleanups/simplifications somewhat in
prep for some coming refinements (around `DebugStatus`):
- use all the new attrs mentioned above as needed in the SIGINT shielder.
- wait on `Lock.req_handler_finished` in `maybe_wait_for_debugger()`.
- dropping a ton of masked legacy code left in during the recent reworks.
- better comments, like on the use of `Context._scope` for shielding on
  the "child"-side to avoid the need to manage yet another cs.
- add/change-to lotsa `log.devx()` level emissions for those infos which
  are handy while hacking on the debugger but not ideal/necessary to be
  user visible.
- obvi add lotsa follow up todo notes!
2025-03-24 14:04:51 -04:00
Tyler Goodlet 8b85b023f5 Show runtime nursery frames on internal errors
Much like other recent changes attempt to detect runtime-bug-causing
crashes and only show the runtime-endpoint frame when present.

Adds an `ActorNursery._scope_error: BaseException|None` attr to aid with
detection. Also toss in some todo notes for removing and replacing the
`.run_in_actor()` method API.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 117d6177e8 Set `_ctxvar_Context` for child-side RPC tasks
Just inside `._invoke()` after the `ctx: Context` is retrieved.

Also try our best to *not hide* internal frames when a non-user-code
crash happens, normally either due to a runtime RPC EP bug or
a transport failure.
2025-03-24 14:04:51 -04:00
Tyler Goodlet da770f70d6 Add error suppress flag to `current_ipc_ctx()` 2025-03-24 14:04:51 -04:00
Tyler Goodlet cc6b2d4057 Shield channel closing in `_connect_chan()` 2025-03-24 14:04:51 -04:00
Tyler Goodlet 0d4d461c41 Adjust `Portal` usage of `Context.pld_rx`
Pass the new `ipc` arg and try to show api frames when an unexpected
internal error is detected.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 7dc9808480 Expose `tractor.current_ipc_ctx()` at pkg level 2025-03-24 14:04:51 -04:00
Tyler Goodlet c67a04f978 Allocate a `PldRx` per `Context`, new pld-spec API
Since the state mgmt becomes quite messy with multiple sub-tasks inside
an IPC ctx, AND bc generally speaking the payload-type-spec should map
1-to-1 with the `Context`, it doesn't make a lot of sense to be using
`ContextVar`s to modify the `Context.pld_rx: PldRx` instance.

Instead, always allocate a full instance inside `mk_context()` with the
default `.pld_rx: PldRx` set to use the `msg._ops._def_any_pldec: MsgDec`

In support, simplify the `.msg._ops` impl and APIs:
- drop `_ctxvar_PldRx`, `_def_pld_rx` and `current_pldrx()`.
- rename `PldRx._pldec` -> `._pld_dec`.
- rename the unused `PldRx.apply_to_ipc()` -> `.wraps_ipc()`.
- add a required `PldRx._ctx: Context` attr since it is needed
  internally in some meths and each pld-rx now maps to a specific ctx.
- modify all recv methods to accept a `ipc: Context|MsgStream` (instead
  of a `ctx` arg) since both have a ref to the same `._rx_chan` and there
  are only a couple spots (in `.dec_msg()`) where we need the `ctx`
  explicitly (which can now be easily accessed via a new `MsgStream.ctx`
  property, see below).
- always show the `.dec_msg()` frame in tbs if there's a reference error
  when calling `_raise_from_unexpected_msg()` in the fallthrough case.
- implement `limit_plds()` as light wrapper around getting the
  `current_ipc_ctx()` and mutating its `MsgDec` via
  `Context.pld_rx.limit_plds()`.
- add a `maybe_limit_plds()` which just provides an `@acm` equivalent of
  `limit_plds()` handy for composing in a `async with ():` style block
  (avoiding additional indent levels in the body of async funcs).

Obvi extend the `Context` and `MsgStream` interfaces as needed
to match the above:
- add a `Context.pld_rx` pub prop.
- new private refs to `Context._started_msg: Started` and
  a `._started_pld` (mostly for internal debugging / testing / logging)
  and set inside `.open_context()` immediately after the syncing phase.
- a `Context.has_outcome() -> bool:` predicate which can be used to more
  easily determine if the ctx errored or has a final result.
- pub props for `MsgStream.ctx: Context` and `.chan: Channel` providing
  full `ipc`-arg compat with the `PldRx` method signatures.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 7656326484 Make `request_root_stdio_lock()` post-mortem-able
Finally got this working so that if/when an internal bug is introduced
to this request task-func, we can actually REPL-debug the lock request
task itself B)

As in, if the subactor's lock request task internally errors we,
- ensure the task always terminates (by calling `DebugStatus.release()`)
  and explicitly reports (via a `log.exception()`) the internal error.
- capture the error instance and set as a new `DebugStatus.req_err` and
  always check for it on final teardown - in which case we also,
 - ensure it's reraised from a new `DebugRequestError`.
 - unhide the stack frames for `_pause()`, `_enter_repl_sync()` so that
   the dev can upward inspect the `_pause()` call stack sanely.

Supporting internal impl changes,
- add `DebugStatus.cancel()` and `.req_err`.
- don't ever cancel the request task from
  `PdbREPL.set_[continue/quit]()`; only do so when there's some internal
  error that would likely result in a hang and stale lock state with
  the root.
- only release the root's lock when the current task is also the owner
  (avoids bad release errors).
- also show internal `._pause()`-related frames on any `repl_err`.

Other temp-dev-tweaks,
- make pld-dec change log msgs info level again while solving this
  final context-vars race stuff..
- drop the debug pld-dec instance match asserts for now since
  the problem is already caught (and now debug-able B) by an attr-error
  on the decoded-as-`dict` started msg, and instead add in
  a `log.exception()` trace to see which task is triggering the case
  where the debug `MsgDec` isn't set correctly vs. when we think it's
  being applied.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 8bab8e8bde Always release debug request from `._post_mortem()`
Since obviously the thread is likely expected to halt and raise after
the REPL session exits; this was a regression from the prior impl. The
main reason for this is that otherwise the request task will never
unblock if the user steps through the crashed task using 'next' since
the `.do_next()` handler doesn't by default release the request since in
the `.pause()` case this would end the session too early.

Other,
- toss in draft `Pdb.user_exception()`, though doesn't seem to ever
  trigger?
- only release `Lock._debug_lock` when already locked.
2025-03-24 14:04:51 -04:00
Tyler Goodlet e3b1c13eba Rename `.msg.types.Msg` -> `PayloadMsg` 2025-03-24 14:04:51 -04:00
Tyler Goodlet b22ee84d26 Modernize streaming example script
- add typing,
- apply multi-line call style,
- use 'cancel' log level,
- enable debug mode.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 683288c8db Update tests for `PldRx` and `Context` changes
Mostly adjustments for the new pld-receiver semantics/shim-layer which
results more often in the direct delivery of `RemoteActorError`s from
IPC API primitives (like `Portal.result()`) instead of being embedded in
an `ExceptionGroup` bundled from an embedded nursery.

Tossed usage of the `debug_mode: bool` fixture to a couple problematic
tests while i was working on them.

Also includes detailed assertion updates to the inter-peer cancellation
suite in terms of,
- `Context.canceller` state correctly matching the true src actor when
  expecting a ctxc.
- any rxed `ContextCancelled` should instance match the `Context._local/remote_error`
  as should the `.msgdata` and `._ipc_msg`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet fded92115a Hide some API frames, port to new `._debug` apis
- start tossing in `__tracebackhide__`s to various eps which don't need
  to show in tbs or in the pdb REPL.
- port final `._maybe_enter_pm()` to pass a `api_frame`.
- start comment-marking up some API eps with `@api_frame`
  in prep for actually using the new frame-stack tracing.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 953976d588 Use `.recv_msg_w_pld()` for final `Portal.result()`
Woops, due to a `None` test against the `._final_result`, any actual
final `None` result would be received but not acked as such causing
a spawning test to hang. Fix it by instead receiving and assigning both
a `._final_result_msg: PayloadMsg` and `._final_result_pld`.
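
A small sketch of why testing the payload against `None` breaks and why
tracking the wrapping msg avoids it. `Return` and `ResultWaiter` here are
hypothetical toys, not the `Portal` implementation:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Return:
    """Hypothetical wrapper msg carrying a final payload."""
    pld: Any


class ResultWaiter:
    def __init__(self, inbox):
        self._inbox = list(inbox)
        self._final_result_msg = None
        self._final_result_pld = None
        self.recv_count = 0

    def _recv(self):
        self.recv_count += 1
        return self._inbox.pop(0)

    def result(self):
        # Test the wrapper msg, not the payload: a payload of
        # `None` is a perfectly valid final result.
        if self._final_result_msg is None:
            msg = self._recv()
            self._final_result_msg = msg
            self._final_result_pld = msg.pld
        return self._final_result_pld
```

Had `result()` checked `self._final_result_pld is None` instead, a second
call after a `None` result would try to receive again.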

NB: as mentioned in many recent comments surrounding this API layer,
really this whole `Portal`-has-final-result interface/semantics should
be entirely removed as should the `ActorNursery.run_in_actor()` API(s).
Instead it should all be replaced by a wrapping "high level" API
(`tractor.hilevel` ?) which combines a task nursery, `Portal.open_context()`
and underlying `Context` APIs + an `outcome.Outcome` to accomplish the
same "run a single task in a spawned actor and return its result"; aka
a "one-shot-task-actor".
2025-03-24 14:04:51 -04:00
Tyler Goodlet e07e7da0b5 Rename `.msg.types.Msg` -> `PayloadMsg` 2025-03-24 14:04:51 -04:00
Tyler Goodlet 429f8f4e13 Adjust `._runtime` to report `DebugStatus.req_ctx`
- inside the `Actor.cancel()`'s maybe-wait-on-debugger delay,
  report the full debug request status and its affiliated lock request
  IPC ctx.
- use the new `.req_ctx.chan.uid` to do the local nursery lookup during
  channel teardown handling.
- another couple log fmt tweaks.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 7010a39bd3 Rework and first draft of `.devx._frame_stack.py`
Proto-ing a little suite of call-stack-frame annotation-for-scanning
sub-systems for the purposes of both,
- the `.devx._debug`er and its
  traceback and frame introspection needs when entering the REPL,
- detailed trace-style logging such that we can explicitly report
  on "which and where" `tractor`'s APIs are used in the "app" code.

Deats:
- change mod name obvi from `._code` and adjust client mod imports.
- using `wrapt` (for perf) implement a `@api_frame` annot decorator
  which both stashes per-call-stack-frame instances of `CallerInfo` in
  a table and marks the function such that API endpoints can be easily
  found via runtime stack scanning despite any internal impl changes.
- add a global `_frame2callerinfo_cache: dict[FrameType, CallerInfo]`
  table for providing the per func-frame info caching.
- Re-implement `CallerInfo` to require less (types of) inputs:
  |_ `_api_func: Callable`, a ref to the (singleton) func def.
  |_ `_api_frame: FrameType` taken from the `@api_frame` marked `tractor`-API
     func's runtime call-stack, from which we can determine the
     app code's `.caller_frame`.
  |_`_caller_frames_up: int|None` allowing the specific `@api_frame` to
    determine "how many frames up" the application / calling code is.
  And, a better set of derived attrs:
  |_`caller_frame: FrameType` which finds and caches the API-eps calling
    frame.
- add a new attempt at "getting a method ref from its runtime frame"
  with `get_ns_and_func_from_frame()` using a heuristic that the
  `CodeType.co_qualname: str` should have a "." in it for methods.
  - main issue is still that the func-ref lookup will require searching
    for the method's instance type by name, and that name isn't
    guaranteed to be defined in any particular ns..
   |_rn we try to read it from the `FrameType.f_locals` but that is
     going to obvi fail any time the method is called in a module where
     its type is not also defined/imported.
  - returns both the ns and the func ref FYI.
2025-03-24 14:04:51 -04:00
Tyler Goodlet c03f6f917e Even moar bitty `Context` refinements
- set `._state._ctxvar_Context` just after `StartAck` inside
  `open_context_from_portal()` so that `current_ipc_ctx()` always
  works on the 'parent' side.
- always set `.canceller` to any `MsgTypeError.src_uid` and otherwise to
  any maybe-detected `.src_uid` (i.e. for RAEs).
- always set `.canceller` to us when we rx a ctxc which reports us as
  its canceller; this is a sanity check on definite "self cancellation".
- adjust `._is_self_cancelled()` logic to only be `True` when
  `._remote_error` is both a ctxc with a `.canceller` set to us AND
  when `Context.canceller` is also set to us (since the change above)
  as a little bit of extra rigor.
- fill-in/fix some `.repr_state` edge cases:
  - merge self-vs.-peer ctxc cases to one block and distinguish via
    nested `._is_self_cancelled()` check.
  - set 'errored' for all exception matched cases despite `.canceller`.
  - add pre-`Return` phase statuses:
   |_'pre-started' and 'syncing-to-child' depending on side and when
     `._stream` has not (yet) been set.
   |_'streaming' and 'streaming-finished' depending on side when
     `._stream` is set and whether it was stopped/closed.
- tweak drainage log-message to use "outcome" instead of "result".
- use new `.devx.pformat.pformat_cs()` inside `_maybe_cancel_and_set_remote_error()`
  but, IFF the log level is at least 'cancel'.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 888af6025b Move `_debug.pformat_cs()` into `devx.pformat` 2025-03-24 14:04:51 -04:00
Tyler Goodlet ee03b8a214 Big debugger rework, more tolerance for internal err-hangs
Since i was running into them (internal errors) during lock request
machinery dev and was getting all sorts of difficult to understand hangs
whenever i intro-ed a bug to either side of the ipc ctx; this all while
trying to get the msg-spec working for `Lock` requesting subactors..

Deats:
- hideframes for `@acm`s and `trio.Event.wait()`, `Lock.release()`.
- better detail out the `Lock.acquire/release()` impls
- drop `Lock.remote_task_in_debug`, use new `.ctx_in_debug`.
- add a `Lock.release(force: bool)`.
- move most of what was `_acquire_debug_lock_from_root_task()` and some
  of the `lock_tty_for_child().__a[enter/exit]()` logic into
  `Lock.[acquire/release]()`  including bunch more logging.
- move `lock_tty_for_child()` up in the module to below `Lock`, with
  some rework:
  - drop `subactor_uid: tuple` arg since we can just use the `ctx`..
  - add exception handler blocks for reporting internal (impl) errors
    and always force release the lock in such cases.
- extend `DebugStatus` (prolly will rename to `DebugRequest` btw):
  - add `.req_ctx: Context` for subactor side.
  - add `.req_finished: trio.Event` to sub to signal request task exit.
  - extend `.shield_sigint()` doc-str.
  - add `.release()` to encaps all the state mgmt previously strewn
    about inside `._pause()`..
- use new `DebugStatus.release()` to replace all the duplication:
  - inside `PdbREPL.set_[continue/quit]()`.
  - inside `._pause()` for the subactor branch on internal
    repl-invocation error cases,
  - in the `_enter_repl_sync()` closure on error,
- replace `apply_debug_codec()` -> `apply_debug_pldec()` in tandem with
  the new `PldRx` sub-sys  which handles the new `__pld_spec__`.
- add a new `pformat_cs()` helper, originally to help debug a cs stack
  corruption; going to move to `.devx.pformat` obvi.
- rename `wait_for_parent_stdin_hijack()` -> `request_root_stdio_lock()`
  with improvements:
  - better doc-str and add todos,
  - use `DebugStatus` more stringently to encaps all subactor req state.
  - error handling blocks for cancellation and straight up impl errors
    directly around the `.open_context()` block with the latter doing
    a `ctx.cancel()` to avoid hanging in the shielded `.req_cs` scope.
  - similar exc blocks for the func's overall body with explicit
    `log.exception()` reporting.
  - only set the new `DebugStatus.req_finished: trio.Event` in `finally`.
- rename `mk_mpdb()` -> `mk_pdb()` and don't call `.shield_sigint()`
  implicitly since the caller usage does matter for this.
- factor out `any_connected_locker_child()` from the SIGINT handler.
- rework SIGINT handler to better handle any stale-lock/hang cases:
  - use new `Lock.ctx_in_debug: Context` to detect subactor-in-debug
    and use it to cancel any lock request instead of the lower level
  - use `problem: str` summary approach to log emissions.
- rework `_pause()` given all of the above, stuff not yet mentioned:
  - don't take `shield: bool` input and proxy to `debug_func()` (for now).
  - drop `extra_frames_up_when_async: int` usage, expect
    `**debug_func_kwargs` to passthrough an `api_frame: FrameType` (more
    on this later).
  - lotsa asserts around the request ctx vs. task-in-debug ctx using new
    `current_ipc_ctx()`.
  - asserts around `DebugStatus` state.
- rework and simplify the `debug_func` hooks,
  `_set_trace()`/`_post_mortem()`:
  - make them accept a non-optional `repl: PdbREPL` and `api_frame:
    FrameType` which should be used to set the current frame when the
    REPL engages.
  - always hide the hook frames.
  - always accept a `tb: TracebackType` to `_post_mortem()`.
   |_ copy and re-impl what was the delegation to
     `pdbp.xpm()`/`pdbp.post_mortem()` and instead call the
     underlying `Pdb.interaction()` ourselves with a `caller_frame`
     and tb instance.
- adjust the public `.pause()` impl:
  - accept optional `hide_tb` and `api_frame` inputs.
  - mask opening a cancel-scope for now (can cause `trio` stack
    corruption, see notes) and thus don't use the `shield` input other
    than to eventually passthrough to `_post_mortem()`?
   |_ thus drop `task_status` support for now as well.
   |_ pretty sure correct soln is a debug-nursery around `._invoke()`.
- since no longer using `extra_frames_up_when_async` inside
  `debug_func()`s ensure all public apis pass an `api_frame`.
- re-impl our `tractor.post_mortem()` to directly call into `._pause()`
  instead of binding in via `partial` and mk it take similar input as
  `.pause()`.
- drop `Lock.release()` from `_maybe_enter_pm()`, expose and pass
  expected frame and tb.
- use necessary changes from all the above within
  `maybe_wait_for_debugger()` and `acquire_debug_lock()`.

Lel, sorry thought that would be shorter..
There's still a lot more re-org to do particularly with `DebugStatus`
encapsulation but it's coming in follow up.
2025-03-24 14:04:51 -04:00
Tyler Goodlet f17fd35ccb Allow `Stop` passthrough from `PldRx.recv_msg_w_pld()`
Since we need to allow it (at the least) inside
`drain_to_final_msg()` for handling stream-phase termination races
where we don't want to have to handle a raised error from something like
`Context.result()`. Expose the passthrough option via
a `passthrough_non_pld_msgs: bool` kwarg.

Add comprehensive comment to `current_pldrx()`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 51de6bd1bc Add a "current IPC `Context`" `ContextVar`
Expose it from `._state.current_ipc_ctx()` and set it inside
`._rpc._invoke()` for child and inside `Portal.open_context()` for
parent.

Still need to write a few more tests (particularly demonstrating usage
throughout multiple nested nurseries on each side) but this suffices as
a proto for testing with some debugger request-from-subactor stuff.

Other,
- use new `.devx.pformat.add_div()` for ctxc messages.
- add a block to always traceback dump on corrupted cs stacks.
- better handle non-RAEs exception output-formatting in context
  termination summary log message.
- use a summary for `start_status` for msg logging in RPC loop.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 70d974fc99 Mk `drain_to_final_msg()` never raise from `Error`
Since we usually want them raised from some (internal) call to
`Context.maybe_raise()` and NOT directly from the drainage call, make it
possible via a new `raise_error: bool` to both `PldRx.recv_msg_w_pld()`
and `.dec_msg()`.

In support,
- rename `return_msg` -> `result_msg` since we expect to return
  `Error`s.
- do a `result_msg` assign and `break` in the `case Error()`.
- add `**dec_msg_kwargs` passthrough for other `.dec_msg()` calling
  methods.

Other,
- drop/aggregate todo-notes around the main loop's
  `ctx._pld_rx.recv_msg_w_pld()` call.
- add (configurable) frame hiding to most payload receive meths.
2025-03-24 14:04:51 -04:00
Tyler Goodlet f992b9f2e8 "Icons" in `._entry`'s subactor `.info()` messages
Add a little `>` or `X` supervision icon indicating the spawning or
termination of each sub-actor respectively.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 15cf54fc45 Move pformatters into new `.devx.pformat`
Since `._code` is prolly gonna get renamed (to something "frame & stack
tools" related) and to give a bit better organization.

Also adds a new `add_div()` helper, factored out of ctxc message
creation in `._rpc._invoke()`, for adding a little "header line" divider
under a given `message: str` with a little math to center it.
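
As a rough idea of the "little math", here is a sketch of such a helper. The
signature and the default `div_str` are assumptions; the real
`.devx.pformat.add_div()` may differ:

```python
def add_div(
    message: str,
    div_str: str = '------ - ------',
) -> str:
    '''
    Return a divider line, left-padded so that it sits roughly
    centered under the longest line of `message`.

    '''
    longest = max(
        (len(line) for line in message.splitlines()),
        default=0,
    )
    # never pad negatively when the divider is wider than the msg
    pad = max((longest - len(div_str)) // 2, 0)
    return '\n' + ' ' * pad + div_str + '\n'


header = 'Actor runtime was cancelled by a peer'
print(header + add_div(header))
```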
2025-03-24 14:04:51 -04:00
Tyler Goodlet 77764aceef Change to `RemoteActorError.pformat()`
For more sane manual calls as needed for logging purposes. Obvi remap
the dunder methods to it.

Other:
- drop `hide_tb: bool` from `unpack_error()`, shouldn't need it since
  frame won't ever be part of any tb raised from returned error.
- add an `is_invalid_payload: bool` to `_raise_from_unexpected_msg()` to
  be used from `PldRx` where we don't need to decode the IPC
  msg, just the payload; make the error message reflect this case.
- drop commented `._portal._unwrap_msg()` since we've replaced it with
  `PldRx`'s delegation to newer `._raise_from_unexpected_msg()`.
- hide the `Portal.result()` frame by default, again.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 8347a78276 Add todo for rigorous struct-type spec of `SpawnSpec` fields 2025-03-24 14:04:51 -04:00
Tyler Goodlet 9f3a00c65e Type annot the proc from `trio.lowlevel.open_process()` 2025-03-24 14:04:51 -04:00
Tyler Goodlet 5d1a0da5e6 Fix attr name error, use public `MsgDec.dec` 2025-03-24 14:04:51 -04:00
Tyler Goodlet 45f499cf3a Reorg frames pformatters, add `Context.repr_state`
A better spot for the pretty-formatting of frame text (and thus tracebacks)
is in the new `.devx._code` module:
- move from `._exceptions` -> `.devx._code.pformat_boxed_tb()`.
- add new `pformat_caller_frame()` factored out of the use case in
  `._exceptions._mk_msg_type_err()` where we dump a stack trace
  for bad `.send()` side IPC msgs.

Add some new pretty-format methods to `Context`:
- explicitly implement `.pformat()` and allow an `extra_fields: dict`
  which can be used to inject additional fields (maybe eventually by
  default) such as is now used inside
  `._maybe_cancel_and_set_remote_error()` when reporting the internal
  `._scope` state in cancel logging.
- add a new `.repr_state -> str` which provides a single string status
  depending on the internal state of the IPC ctx in terms of the shuttle
  protocol's "phase"; use it from `.pformat()` for the `|_state:`.
- set `.started(complain_no_parity=False)` now since we presume decoding
  with `.pld: Raw` now with the new `PldRx` design.
- use new `msgops.current_pldrx()` in `mk_context()`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 74b6871bfd Mk `process_messages()` return last msg; summary logging
Not sure it's **that** useful (yet) but in theory would allow avoiding
certain log level usage around transient RPC requests for discovery methods
(like `.register_actor()` and friends); can't hurt to be able to
introspect that last message for other future cases I'd imagine as well.
Adjust the calling code in `._runtime` to match; other spots are using
the `trio.Nursery.start()` schedule style and are fine as is.

Improve a bunch more log messages throughout a few mods mostly by going
to a "summary" single-emission style where possible/appropriate:
- in `._runtime` more "single summary" status style log emissions:
 |_mk `Actor.load_modules()` render a single mod loaded summary.
 |_use a summary `con_status: str` for `Actor._stream_handler()` conn
   setup and an equiv (`con_teardown_status`) for connection teardowns.
 |_similar thing in `Actor.wait_for_actor()`.
- generally more usage of `.msg.pretty_struct` apis throughout `._runtime`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet a67975f8f5 First draft payload-spec limit API
Add new task-scope oriented `PldRx.pld_spec` management API similar to
`.msg._codec.limit_msg_spec()`, but obvi built to process and filter
`MsgType.pld` values.

New API related changes include:
- new per-task singleton getter `msg._ops.current_pldrx()` which
  delivers the current (global) payload receiver via a new
  `_ctxvar_PldRx: ContextVar` configured with a default
  `_def_any_pldec: MsgDec[Any]` decoder.
- a `PldRx.limit_plds()` which sets the decoder (`.type` underneath)
  for the specific payload rx instance.
- `.msg._ops.limit_plds()` which obtains the current task-scoped `PldRx`
  and applies the pld spec via a new `PldRx.limit_plds()`.
- rename `PldRx._msgdec` -> `._pldec`.
- add `.pld_dec` as pub attr for -^

Unrelated adjustments:
- use `.msg.pretty_struct.pformat()` where handy.
- always pass `expect_msg: MsgType`.
- add a `case Stop()` to `PldRx.dec_msg()` which will `log.warning()`
  when a stop is received but no stream was open on this receiving side
  since we rarely want that to raise since it's prolly just a runtime
  race or mistake in user code.

Other:
2025-03-24 14:04:51 -04:00
Tyler Goodlet 753724252d Make `.msg.types.Msg.pld: Raw` only, since `PldRx`.. 2025-03-24 14:04:51 -04:00
Tyler Goodlet 1d1cd9c51a More bitty (runtime) logging tweaks 2025-03-24 14:04:51 -04:00
Tyler Goodlet f32a9657c0 Use new `Msg[Co]Dec` repr meths in `._exceptions`
Particularly when logging around `MsgTypeError`s.

Other:
- make `_raise_from_unexpected_msg()`'s `expect_msg` a non-default value
  arg, must always be passed by caller.
- drop `'canceller'` from `_body_fields` ow it shows up twice for ctxc.
- use `.msg.pretty_struct.pformat()`.
- parameterize `RemoteActorError.reprol()` (repr-one-line method) to
  show `RemoteActorError[<self.boxed_type_str>]( ..` to make obvi
  the boxed remote error type.
- re-impl `.boxed_type_str` as `str`-casting the `.boxed_type` value
  which is guaranteed to render non-`None`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 799416661e Add more useful `MsgDec.__repr__()`
Basically exact same as that for `MsgCodec` with the `.spec` displayed
via a better (maybe multi-line) `.spec_str: str` generated from a common
new set of helper mod funcs factored out msg-codec meths:
- `mk_msgspec_table()` to gen a `MsgType` name -> msg table.
- `pformat_msgspec()` to `str`-ify said table values nicely.

Also add a new `MsgCodec.msg_spec_str: str` prop which delegates to the
above for the same.
2025-03-24 14:04:51 -04:00
Tyler Goodlet d83e0eb665 Mk `.msg.pretty_struct.Struct.pformat()` a mod func
More along the lines of `msgspec.structs` and also far more useful
internally for pprinting `MsgTypes`. Of course add method aliases.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 32eb2df5aa Use `Context.[peer_]side` in ctxc messages 2025-03-24 14:04:51 -04:00
Tyler Goodlet e17603402f Add `Context.peer_side: str` property, mk static-meth private. 2025-03-24 14:04:51 -04:00
Tyler Goodlet efb69f9bf9 Flip back `StartAck` timeout to `inf`.. 2025-03-24 14:04:51 -04:00
Tyler Goodlet 506575e4ca Another `._rpc` mod passthrough
- tweaking logging to include more `MsgType` dumps on IPC faults.
- removing some commented cruft.
- comment formatting / cleanups / add-ons.
- more type annots.
- fill out some TODO content.
2025-03-24 14:04:51 -04:00
Tyler Goodlet eb5db36013 Try out `msgspec` encode-buffer optimization
As per the reco:
https://jcristharif.com/msgspec/perf-tips.html#reusing-an-output-buffe

BUT, seems to cause this error in `pikerd`..

`BufferError: Existing exports of data: object cannot be re-sized`

Soo no idea? Maybe there's a tweak needed that we can glean from
tests/examples in the `msgspec` repo?

Disabling for now.
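
For reference, that exact error text can be reproduced in plain Python
whenever a `bytearray` must grow while something still holds an exported
view of it. Whether that is what happens in `pikerd` is not established
here; this only shows the generic mechanism:

```python
buf = bytearray(16)

# taking a memoryview "exports" the bytearray's underlying buffer
view = memoryview(buf)

try:
    # needs a resize while the export above is still alive
    buf.extend(b'more bytes')
except BufferError as err:
    print(f'BufferError: {err}')

# once the export is released the same buffer can be resized again
view.release()
buf.extend(b'more bytes')
assert len(buf) == 26
```

So one thing to check would be whether any consumer keeps a `memoryview`
(or similar export) of the reused buffer alive across encode calls.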
2025-03-24 14:04:51 -04:00
Tyler Goodlet f0155b4525 Set `Context._stream` in `Portal.open_stream_from()`.. 2025-03-24 14:04:51 -04:00
Tyler Goodlet 74d6ffabf2 Use `Context._stream` in `_raise_from_unexpected_msg()`
Instead of expecting it to be passed in (as it was prior), when
determining if a `Stop` msg is a valid end-of-channel signal use the
`ctx._stream: MsgStream|None` attr which **must** be set by any stream
opening API; either of:
- `Context.open_stream()`
- `Portal.open_stream_from()`

Adjust the case block logic to match with fallthrough from any EoC to
a closed error if necessary. Change the `_type: str` to match the
failing IPC-prim name in the tail case we raise a `MessagingError`.

Other:
- move `.sender: tuple` uid attr up to `RemoteActorError` since `Error`
  optionally defines it as a field and for boxed `StreamOverrun`s (an
  ignore case we check for in the runtime during cancellation) we want
  it readable from the boxing rae.
- drop still unused `InternalActorError`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet e4e04c516f First draft "payload receiver" in a new `.msg._ops`
As per much tinkering, re-designs and preceding rubber-ducking via many
"commit msg novelas", **finally** this adds the (hopefully) final
missing layer for typed msg safety: `tractor.msg._ops.PldRx`

(or `PayloadReceiver`? haven't decided how verbose to go..)

Design justification summary:
      ------ - ------
- need a way to be as-close-as-possible to the `tractor`-application
  such that when `MsgType.pld: PayloadT` validation takes place, it is
  straightforward and obvious how user code can decide to handle any
  resulting `MsgTypeError`.
- there should be a common and optional-yet-modular way to modify
  **how** data delivered via IPC (possibly embedded as user defined,
  type-constrained `.pld: msgspec.Struct`s) can be handled and processed
  during fault conditions and/or IPC "msg attacks".
- support for nested type constraints within a `MsgType.pld` field
  should be simple to define, implement and understand at runtime.
- a layer between the app-level IPC primitive APIs
  (`Context`/`MsgStream`) and application-task code (consumer code of
  those APIs) should be easily customized and prove-to-be-as-such
  through demonstrably rigorous internal (sub-sys) use!
  -> eg. via seamless runtime RPC eps support like `Actor.cancel()`
  -> by correctly implementing our `.devx._debug.Lock` REPL TTY mgmt
    dialog prot, via a dead simple payload-as-ctl-msg-spec.

There are some fairly detailed doc strings included so I won't duplicate
that content, the majority of the work here is actually somewhat of
a factoring of many similar blocks that are doing more or less the same
`msg = await Context._rx_chan.receive()` with boilerplate for
`Error`/`Stop` handling via `_raise_from_no_key_in_msg()`. The new
`PldRx` basically provides a shim layer for this common "receive msg,
decode its payload, yield it up to the consuming app task" by pairing
the RPC feeder mem-chan with a msg-payload decoder and expecting IPC API
internals to use **one** API instead of re-implementing the same pattern
all over the place XD

`PldRx` breakdown
 ------ - ------
- for now only expects a `._msgdec: MsgDec` which allows for
  override-able `MsgType.pld` validation and most obviously used in
  the impl of `.dec_msg()`, the decode message method.
- provides multiple mem-chan receive options including:
 |_ `.recv_pld()` which does the e2e operation of receiving a payload
    item.
 |_ a sync `.recv_pld_nowait()` version.
 |_ a `.recv_msg_w_pld()` which optionally allows retrieving both the
    shuttling `MsgType` as well as its `.pld` body for use cases where
    info on both is important (eg. draining a `MsgStream`).

Dirty internal changeover/implementation deatz:
             ------ - ------
- obvi move over all the IPC "primitives" that previously had the duplicate recv-n-yield
  logic:
 - `MsgStream.receive[_nowait]()` delegating instead to the equivalent
   `PldRx.recv_pld[_nowait]()`.
 - add `Context._pld_rx: PldRx`, created and passed in by
   `mk_context()`; use it for the `.started()` -> `first: Started`
   retrieval inside `open_context_from_portal()`.
 - all the relevant `Portal` invocation methods: `.result()`,
   `.run_from_ns()`, `.run()`; also allows for dropping `_unwrap_msg()`
   and `Portal._return_once()` outright Bo
- rename `Context.ctx._recv_chan` -> `._rx_chan`.
- add detailed `Context._scope` info for logging whether or not it's
  cancelled inside `_maybe_cancel_and_set_remote_error()`.
- move `._context._drain_to_final_msg()` -> `._ops.drain_to_final_msg()`
  since it's really not necessarily ctx specific per se, and it does
  kinda fit with "msg operations" more abstractly ;)
2025-03-24 14:04:51 -04:00
Tyler Goodlet fee20103c6 Add a `MsgDec` for receive-only decoding
In prep for a "payload receiver" abstraction that will wrap
`MsgType.pld`-IO delivery from `Context` and `MsgStream`, adds a small
`msgspec.msgpack.Decoder` shim which delegates an API similar to
`MsgCodec` and is offered via a `.msg._codec.mk_dec()` factory.

Detalles:
- move over the TODOs/comments from `.msg.types.Start` to
  `MsgDec.spec` since it's probably the ideal spot to start thinking
  about it from a consumer code PoV.
- move codec reversion assert and log emit into `finally:` block.
- flip default `.types._tractor_codec = mk_codec_ipc_pld(ipc_pld_spec=Raw)`
  in prep for always doing payload-delayed decodes.
- make `MsgCodec._dec` private with public property getter.
- change `CancelAck` to NOT derive from `Return` so it's mutex in
  `match/case:` handling.
2025-03-24 14:04:51 -04:00
Tyler Goodlet dfc92352b3 Move `MsgTypeError` maker func to `._exceptions`
Since it's going to be used from the IPC primitive APIs
(`Context`/`MsgStream`) for similarly handling payload type spec
validation errors and bc it's really not well situated in the IPC
module XD

Summary of (impl) tweaks:
- obvi move `_mk_msg_type_err()` and import and use it in `._ipc`; ends
  up avoiding a lot of ad-hoc imports we had from `._exceptions` anyway!
- mask out "new codec" runtime log emission from `MsgpackTCPStream`.
- allow passing a (coming in next commit) `codec: MsgDec` (message
  decoder) which supports the same required `.pld_spec_str: str` attr.
- for send side logging use existing `MsgCodec.pformat_msg_spec()`.
- rename `_raise_from_no_key_in_msg()` to the now more appropriate
  `_raise_from_unexpected_msg()`, but leaving alias for now.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 65e918298b Drop more `dict`-msg cruft from `._exceptions` 2025-03-24 14:04:51 -04:00
Tyler Goodlet cc9af5758d Mark `.pld` msgs as also taking `msgspec.Raw` 2025-03-24 14:04:51 -04:00
Tyler Goodlet ca1d7c28ea Go back to `ContextVar` for codec mgmt
Turns out we do want per-task inheritance particularly if there's to be
per `Context` dynamic mutation of the spec; we don't want mutation in
some task to affect any parent/global setting.

Turns out since we use a common "feeder task" in the rpc loop, we need to
offer a per `Context` payload decoder sys anyway in order to enable
per-task controls for inter-actor multi-task-ctx scenarios.
2025-03-24 14:04:51 -04:00
Tyler Goodlet cc69d86baf Proto in new `Context` refinements
As per some newly added features and APIs:

- pass `portal: Portal` to `Actor.start_remote_task()` from
  `open_context_from_portal()` marking `Portal.open_context()` as
  always being the "parent" task side.

- add caller tracing via `.devx._code.CallerInfo/.find_caller_info()`
  called in `mk_context()` and (for now) a `__runtimeframe__: int = 2`
  inside `open_context_from_portal()` such that any enter-er of
  `Portal.open_context()` will be reported.

- pass in a new `._caller_info` attr which is used in 2 new meths:
  - `.repr_caller: str` for showing the name of the app-code-func.
  - `.repr_api: str` for showing the API ep, which for now we just
    hardcode to `Portal.open_context()` since ow its gonna show the mod
    func name `open_context_from_portal()`.
  - use those new props ^ in the `._deliver_msg()` flow body log msg
    content for much clearer msg-flow tracing Bo

- add `Context._cancel_on_msgerr: bool` to toggle whether
  a delivered `MsgTypeError` should trigger a `._scope.cancel()` call.
  - also (temporarily) add separate `.cancel()` emissions for both cases
    as i work through hacking out the maybe `MsgType.pld: Raw` support.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 3c498c2eac Tweak `current_actor()` failure msg 2025-03-24 14:04:51 -04:00
Tyler Goodlet 958e91962b Add some `bytes` annots 2025-03-24 14:04:51 -04:00
Tyler Goodlet 34b26862ad TOSQUASH 77a15eb use `DebugStatus` in `._rpc` 2025-03-24 14:04:51 -04:00
Tyler Goodlet 2801ccf229 Annotate nursery and portal methods for `CallerInfo` scanning 2025-03-24 14:04:51 -04:00
Tyler Goodlet 94b735ed96 `NamespacePath._mk_fqnp()` handle `__mod__` for methods
Need to use `__self__.__mod__` in the method case i guess..
2025-03-24 14:04:51 -04:00
Tyler Goodlet dc31f0dac9 Use `DebugStatus` around subactor lock requests
Breaks out all the (sub)actor local conc primitives from `Lock` (which
is now only used in and by the root actor) such that there's an explicit
distinction between a task that's "consuming" the `Lock` (remotely) vs.
the root-side service tasks which do the actual acquire on behalf of the
requesters.

`DebugStatus` changeover deats:
------ - ------
- move all the actor-local vars over to `DebugStatus` including:
  - move `_trio_handler` and `_orig_sigint_handler`
  - `local_task_in_debug` now `repl_task`
  - `_debugger_request_cs` now `req_cs`
  - `local_pdb_complete` now `repl_release`
- drop all ^ fields from `Lock.repr()` obvi..
- move over the `.[un]shield_sigint()` and
  `.is_main_trio_thread()` methods.
- add some new attrs/meths:
  - `DebugStatus.repl` for the currently running `Pdb` in-actor
    singleton.
  - `.repr()` for pprint of state (like `Lock`).
- Note: that even when a root-actor task is in REPL, the `DebugStatus`
  is still used for certain actor-local state mgmt, such as SIGINT
  handler shielding.
- obvi change all lock-requester code bits to now use a `DebugStatus` in
  their local actor-state instead of `Lock`, i.e. change usage from
  `Lock` in `._runtime` and `._root`.
- use new `Lock.get_locking_task_cs()` API when checking for
  sub-in-debug from `._runtime.Actor._stream_handler()`.

Unrelated to topic-at-hand tweaks:
------ - ------
- drop the commented bits about hiding `@[a]cm` stack frames from
  `_debug.pause()` and simplify to only one block with the `shield`
  passthrough since we already solved the issue with cancel-scopes using
  `@pdbp.hideframe` B)
  - this includes all the extra logging about the extra frame for the
    user (good thing i put in that wasted effort back then eh..)
- put the `try/except BaseException` with `log.exception()` around the
  whole of `._pause()` to ensure we don't miss in-func errors which can
  cause hangs..
- allow passing in `portal: Portal` to
  `Actor.start_remote_task()` such that `Portal` task spawning methods
  are always denoted correctly in terms of `Context.side`.
- lotsa logging tweaks, decreasing a bit of noise from `.runtime()`s.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 846aff2724 The src error to `_raise_from_no_key_in_msg()` is always an attr-error now! 2025-03-24 14:04:51 -04:00
Tyler Goodlet 1d1c7cb3e8 First draft, sub-msg-spec for debugger `Lock` sys
Since it's totes possible to have a spec applied that won't permit
`str`s, might as well formalize a small msg set for subactors to request
the tree-wide TTY `Lock`.

BTW, I'm prolly not going into every single change here in this first
WIP since there's still a variety of broken stuff mostly to do with
races on the codec apply being done in a `trio.lowlevel.RunVar`; it
should be re-done with a `ContextVar` such that each task does NOT
mutate the global setting..

New msg set and usage is simply:
- `LockStatus` which is the response msg delivered from `lock_tty_for_child()`
- `LockRelease` a one-off request msg from the subactor to drop the
  `Lock` from a `MsgStream.send()`.
- use these msgs throughout the root and sub sides of the locking
  ctx funcs: `lock_tty_for_child()` & `wait_for_parent_stdin_hijack()`

The codec is now applied in both the root and sub `Lock` request tasks:
- for root inside `lock_tty_for_child()` before the `.started()`.
- for subs, inside `wait_for_parent_stdin_hijack()` since we only want
  to affect the codec *for the locking task*.
  - (hence the need for ctx-var as mentioned above but currently this
    can cause races which will break against other app tasks competing
    for the codec setting).
- add an `apply_debug_codec()` helper for use in both cases.
- add more detailed logging to both the root and sub side of `Lock`
  requesting funcs including requiring that the sub-side task "uid" (a
  `tuple[str, int]` = `(trio.Task.name, id(trio.Task))`) be provided (more
  on this later).

A main issue discovered while proto-testing all this was the ability of
a sub to "double lock" (leading to self-deadlock) via an error in
`wait_for_parent_stdin_hijack()` which, for ex., can happen in debug
mode via crash handling of a `MsgTypeError` received from the root
during a codec applied msg-spec race! Originally I was attempting to
solve this by making the SIGINT override handler more resilient but this
case is somewhat impossible to detect by an external root task other
than checking for duplicate ownership via the new `subactor_task_uid`.
=> SO NOW, we always stick the current task uid in the
   `Lock._blocked: set` and raise an rte on a double request by the same
   remote task.
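
The double-request guard boils down to something like the following sketch.
`LockSketch` and its method names are hypothetical; only the `_blocked: set`
of task uids and the rte on a repeat request come from the description
above:

```python
class LockSketch:
    '''Hypothetical, heavily reduced stand-in for the root-side `Lock`.'''

    # task uids, i.e. `(task_name, task_id)`, currently waiting on or
    # holding the tree-wide TTY lock
    _blocked: set[tuple[str, int]] = set()

    @classmethod
    def request(cls, task_uid: tuple[str, int]) -> None:
        if task_uid in cls._blocked:
            raise RuntimeError(
                f'Double lock request from the same task: {task_uid}'
            )
        cls._blocked.add(task_uid)

    @classmethod
    def release(cls, task_uid: tuple[str, int]) -> None:
        cls._blocked.discard(task_uid)


uid = ('request_task', 140234)
LockSketch.request(uid)
try:
    LockSketch.request(uid)  # same remote task asks again
except RuntimeError as rte:
    print(rte)
LockSketch.release(uid)
```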

Included is a variety of small refinements:
- finally figured out how to mark a variety of `.__exit__()` frames with
  `pdbp.hideframe()` to actually hide them B)
- add cls methods around managing `Lock._locking_task_cs` from root only.
- re-org all the `Lock` attrs into those only used in root vs. subactors
  and proto-prep a new `DebugStatus` actor-singleton to be used in subs.
- add a `Lock.repr()` to contextually print the current conc primitives.
- rename our `Pdb`-subtype to `PdbREPL`.
- rigor out the SIGINT handler a bit, originally to try and hack-solve
  the double-lock issue mentioned above, but now just with better
  logging and logic for most (all?) possible hang cases that should be
  hang-recoverable after enough ctrl-c mashing by the user.. well
  hopefully:
  - using `Lock.repr()` for both root and sub cases.
  - lots more `log.warn()`s and handler reversions on stale lock or cs
    detection.
- factor `._pause()` impl a little better moving the actual repl entry
  to a new `_enter_repl_sync()` (originally for easier wrapping in the
  sub case with `apply_codec()`).
2025-03-24 14:04:51 -04:00
Tyler Goodlet 8baaeb414f Tweak a couple more log message fmts 2025-03-24 14:04:51 -04:00
Tyler Goodlet 1c01608c72 More msg-spec tests tidying
- Drop `test_msg_spec_xor_pld_spec()` since we no longer support
  `ipc_msg_spec` arg to `mk_codec()`.
- Expect `MsgTypeError`s around `.open_context()` calls when
  `add_codec_hooks == False`.
- toss in some `.pause()` points in the subactor ctx body whilst hacking
  out a `.pld` protocol for debug mode TTY locking.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 88686e2271 Pass a `use_greenback: bool` runtime var to subs
Such that the top level `maybe_enable_greenback` from
`open_root_actor()` can toggle the entire actor tree's usage.
Read the rtv in `._rpc` tasks and only enable if set.

Also, rigor up the `._rpc.process_messages()` loop to handle `Error()`
and `case _:` separately such that we now raise an explicit rte for
unknown / invalid msgs. Use "parent" / "child" for side descriptions in
loop comments and put a fat comment before the `StartAck` in `_invoke()`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 203d0aceb4 Use `_raise_from_no_key_in_msg(allow_msgs)`
Instead of `allow_msg_keys` since we've fully flipped over to
struct-types for msgs in the runtime.

- drop the loop from `MsgStream.receive_nowait()` since
  `Yield/Return.pld` getting will handle both (instead of a loop of
  `dict`-key reads).
2025-03-24 14:04:51 -04:00
Tyler Goodlet 71693ac3dd Add `MsgTypeError.expected_msg_type`
Which matches with renaming `.payload_msg` -> `.expected_msg` which is
the value we attempt to construct from a vanilla-msgpack
decode-to-`dict` and then construct manually into a `MsgType` using
`.msg.types.from_dict_msg()`. Add a todo to use new `use_pretty` flag
which currently conflicts with `._exceptions.pformat_boxed_type()`
prefix formatting..
2025-03-24 14:04:51 -04:00
Tyler Goodlet 97b9d417d2 Add `from_dict_msg(use_pretty: bool)` flag
Allows for optionally (and dynamically) constructing the "expected"
`MsgType` from a `dict` into a `pretty_struct.Struct`, mostly for
logging usage.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 26a3ff6b37 IPC ctx refinements around `MsgTypeError` awareness
Add a bit of special handling for msg-type-errors with a dedicated
log-msg detailing which `.side: str` is the sender/causer and avoiding
a `._scope.cancel()` call in such cases since the local task might be
written to handle and tolerate the badly (typed) IPC msg.

As part of ^, change the ctx task-pair "side" semantics from "caller" ->
"callee" to be "parent" -> "child" which better matches the
cross-process SC-linked-task supervision hierarchy, and
`trio.Nursery.parent_task`; in `trio` the task that opens a nursery is
also named the "parent".

Impl deats / fixes around the `.side` semantics:
- ensure that `._portal: Portal` is set ASAP after
  `Actor.start_remote_task()` such that if the `Started` transaction
  fails, the parent-vs.-child sides are still denoted correctly (since
  `._portal` being set is the predicate for that).
- add a helper func `Context.peer_side(side: str) -> str:` which inverts
  from "child" to "parent" and vice versa, useful for logging info.

Other tweaks:
- make `_drain_to_final_msg()` return a tuple of a maybe-`Return` and
  the list of other `pre_result_drained: list[MsgType]` such that we
  don't ever have to warn about the return msg getting captured as
  a pre-"result" msg.
- Add some strictness flags to `.started()` which allow for toggling
  whether to error or warn log about mismatching roundtripped `Started`
  msgs prior to IPC transit.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 8690a88e50 Extend recv-side `MsgTypeError` default message
Display the new `MsgCodec.pld_spec_str` and format the incorrect field
value to be placed entirely (txt block wise) right of the "type annot"
part of the line:

Iow if you had a bad `dict` value where something else should be it'd
look something like this:

<Started(
 |_pld: NamespacePath = {'cid': '3e0ca00c-7d32-4d2a-a0c2-ac2e12453871',
                         'locked': True,
                         'msg_type': 'LockStatus',
                         'subactor_uid': ['sub', 'af7ccb69-1dab-491f-84f7-2ec42c32d137']}
2025-03-24 14:04:51 -04:00
Tyler Goodlet aa4a4be668 TOSQUASH 322e015d Fix `mk_codec()` input arg 2025-03-24 14:04:51 -04:00
Tyler Goodlet 9e2133e3be Tweak some `pformat_boxed_tb()` indent inputs
- add some `tb_str: str` indent-prefix args for diff indent levels for the
body vs. the surrounding "ascii box".
- ^-use it-^ from `RemoteActorError.__repr__()` obvi.
- use new `msg.types.from_dict_msg()` in impl of
  `MsgTypeError.payload_msg`, handy for showing what the message "would
  have looked like in `Struct` form" had it not failed its type
  constraints.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 1567dfc3e2 Add custom `MsgCodec.__repr__()`
Sure makes console grokability a lot better by showing only the
customizable fields.

Further, clean up `mk_codec()` a bunch by removing the `ipc_msg_spec`
param since we don't plan to support another msg-set (for now) which
allows cleaning out a buncha logic that was mostly just a source of
bugs..

Also,
- add temporary `log.info()` around codec application.
- throw in some sanity `assert`s to `limit_msg_spec()`.
- add but mask out the `extend_msg_spec()` idea since it seems `msgspec`
  won't allow `Decoder.type` extensions when using a custom `dec_hook()`
  for some extension type.. (not sure what approach to take here yet).
2025-03-24 14:04:51 -04:00
Tyler Goodlet d716d8b6b4 Expose `tractor.msg.PayloadT` from subpkg 2025-03-24 14:04:51 -04:00
Tyler Goodlet 0653a70f2b Add msg-from-dict constructor helper
Handy for re-constructing a struct-`MsgType` from a `dict` decoded from
wire-bytes wherein the msg failed to decode normally due to a field type
error but you'd still like to show the "potential" msg in struct form,
say inside a `MsgTypeError`'s meta data.

Supporting deats:
- add a `.msg.types.from_dict_msg()` to implement it (the helper).
- also a `.msg.types._msg_table: dict[str, MsgType]` for supporting this
  func ^ as well as providing just a general `MsgType`-by-`str`-name
  lookup.
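
A rough sketch of the name-table plus from-dict idea described above. It
uses stdlib `dataclasses` as a stand-in for the real `msgspec.Struct`
msg types, and it assumes the decoded wire-`dict` carries its type name
under a `'msg_type'` key; the field set and signature here are
illustrative, not the actual `.msg.types` implementation:

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class Started:
    cid: str
    pld: Any


@dataclass
class Return:
    cid: str
    pld: Any


# illustrative stand-in for a `MsgType`-by-`str`-name lookup table
_msg_table: dict[str, type] = {
    msgT.__name__: msgT for msgT in (Started, Return)
}


def from_dict_msg(dict_msg: dict, msgT: Optional[type] = None):
    '''
    Build a struct-style msg from a plain `dict`, skipping any keys
    the target type does not declare.

    '''
    data = dict(dict_msg)  # copy so the caller's dict is untouched
    # ASSUMPTION: the type name travels under the 'msg_type' key
    name = data.pop('msg_type', None)
    msgT = msgT or _msg_table[name]
    known = {f.name for f in fields(msgT)}
    return msgT(**{k: v for k, v in data.items() if k in known})
```

Note no field type validation happens here, which is the point: the
result is only the "potential" msg for display in error meta data.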

Unrelated:
- Drop commented idea for still supporting `dict`-msg set via
  `enc/dec_hook()`s that would translate to/from `MsgType`s, but that
  would require a duplicate impl in the runtime.. so eff that XD
2025-03-24 14:04:51 -04:00
Tyler Goodlet 0b28b54e11 Relay `MsgTypeError`s upward in RPC loop via `._deliver_ctx_payload()` 2025-03-24 14:04:51 -04:00
Tyler Goodlet 648695a325 Start tidying up `._context`, use `pack_from_raise()`
Mostly removing commented (and replaced) code blocks lingering from the
ctxc semantics work and new typed-msg-spec `MsgType`s handling AND use
the new `._exceptions.pack_from_raise()` helper to construct
`StreamOverrun` msgs.

Deaterz:
- clean out the drain loop now that it's implemented to handle our
  struct msg types including the `dict`-msg bits left in as
  fallback-reminders, any notes/todos better summarized at the top of
  their blocks, remove any `_final_result_is_set()` related duplicate/legacy
  tidbits.
- use a `case Error()` block in drain loop with fallthrough to `_:`
  always resulting in an rte raise.
- move "XXX" notes into the doc-string for `._deliver_msg()` as
  a "rules" section.
- use `match:` syntax for logging the `result_or_err: MsgType` outcome
  from the final `.result()` call inside `open_context_from_portal()`.
- generally speaking use `MsgType` type annotations throughout!
2025-03-24 14:04:51 -04:00
Tyler Goodlet 62bb11975f Refine `MsgTypeError` handling to relay-up-on-`.recv()`
Such that `Channel.recv()` + `MsgpackTCPStream.recv()` originating
msg-type-errors are not raised at the IPC transport layer but instead
relayed up the runtime stack for eventual handling by user-app code via
the `Context`/`MsgStream` layer APIs.

This design choice leads to a substantial amount of flexibility and
modularity, and avoids `MsgTypeError` handling policies from being
coupled to a particular backend IPC transport layer:
- receive-side msg-type errors, as can be raised and handled in the
  `.open_stream()` "nasty" phase of a ctx, whilst being packed at the
  `MsgCodec`/transport layer (keeping the underlying src decode error
  coupled to the specific transport + interchange lib) and then relayed
  upward to app code for custom handling like a normal `Error` msg.
- the policy options for handling such cases could be implemented as
  `@acm` wrappers around `.open_context()`/`.open_stream()` blocks (and
  their respective delivered primitives) OR just plain old async
  generators around `MsgStream.receive()` such that both built-in policy
  handling and custom user-app solutions can be swapped without touching
  any `tractor` internals or providing specialized "registry APIs".
  -> eg. the ignore and relay-invalid-msg-to-sender approach can be more
   easily implemented as embedded `try: except MsgTypeError:` blocks
   around `MsgStream.receive()` possibly applied as either of an
   injected wrapper type around a stream or an async gen that `async
   for`s from the stream.
- any performance-based AOT-lang extensions used to implement a policy
  for handling recv-side errors can avoid knowledge of the lower
  level IPC `Channel` (and-downward) primitives.
- `Context` consuming code can choose to let all msg-type-errs
  bubble and handle them manually (like any other remote `Error`
  shuttled exception).
- send-side msg type checks can (as before) still be raised
  locally, causing offending senders to error and adjust before the
  streaming phase of an IPC ctx.
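
The "async gen around `MsgStream.receive()`" policy idea from the list
above could look roughly like the following. Everything here is
a self-contained mock: `MsgTypeError`, `EndOfStream` and `FakeStream`
are local stand-ins (not the `tractor` types) and stdlib `asyncio` is
used only so the sketch runs without extra deps:

```python
import asyncio


class MsgTypeError(Exception):
    '''Local stand-in for `tractor.MsgTypeError` (not the real class).'''


class EndOfStream(Exception):
    '''Hypothetical end-of-stream signal for the fake stream below.'''


class FakeStream:
    '''Hypothetical stream whose `.receive()` raises queued exceptions.'''

    def __init__(self, items):
        self._items = list(items)

    async def receive(self):
        if not self._items:
            raise EndOfStream
        item = self._items.pop(0)
        if isinstance(item, Exception):
            raise item
        return item


async def ignore_invalid(stream, on_invalid=None):
    '''
    Yield valid msgs; pass each `MsgTypeError` to `on_invalid`
    (if provided) and keep receiving.

    '''
    while True:
        try:
            msg = await stream.receive()
        except MsgTypeError as mte:
            if on_invalid is not None:
                on_invalid(mte)
            continue
        except EndOfStream:
            return
        yield msg


async def demo():
    stream = FakeStream([1, MsgTypeError('bad pld'), 2])
    dropped = []
    got = [msg async for msg in ignore_invalid(stream, dropped.append)]
    return got, dropped
```

Swapping `on_invalid` for a callback that relays the bad msg back to
the sender would give the "relay-invalid-msg-to-sender" variant without
touching any lower layer.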

Impl (related) deats:
- obvi make `MsgpackTCPStream.recv()` yield up any `MsgTypeError`
  constructed by `_mk_msg_type_err()` such that the exception will
  eventually be relayed up to `._rpc.process_messages()` and from
  there delivered to the corresponding ctx-task.
- in support of ^, make `Channel.recv()` detect said mtes and use the
  new `pack_from_raise()` to inject the far end `Actor.uid` for the
  `Error.src_uid`.
- keep raising the send side equivalent (when strict enabled) errors
  inline immediately with no upward `Error` packing or relay.
- improve `_mk_msg_type_err()` cases handling with far more detailed
  `MsgTypeError` "message" contents pertaining to `msgspec` specific
  failure-fixing-tips and type-spec mismatch info:
  * use `.from_decode()` constructor in recv-side case to inject the
    non-spec decoded `msg_dict: dict` and use the new
    `MsgCodec.pld_spec_str: str` when clarifying the type discrepancy
    with the offending field.
  * on send-side, if we detect that an unsupported field type was
    described in the original `src_type_error`, AND there is no
    `msgpack.Encoder.enc_hook()` set, report that the real issue is
    likely that the user needs to extend the codec to support the
    non-std/custom type with a hook, and link to the `msgspec` docs.
  * if one of a `src_type/validation_error` is provided, set that
    error as the `.__cause__` in the new mte.
2025-03-24 14:04:51 -04:00
Tyler Goodlet ae42b91384 Expose `MsgType` and extend `MsgCodec` API a bit
Make a new `MsgType: TypeAlias` for the union of all msg types such that
it can be used in annots throughout the code base; just make
`.msg.__msg_spec__` delegate to it.

Add some new codec methods:
- `pld_spec_str`: for the `str`-casted value of the payload spec,
  generally useful in logging content.
- `msg_spec_items()`: to render a `dict` of msg types to their
  `str()`-casted values with support for singling out a specific
  `MsgType` by input `msg` instance.
- `pformat_msg_spec()`: for rendering the (partial) `.msg_spec` as
  a formatted `str` useful in logging.

Oh right, add a `Error._msg_dict: dict` in support of the previous
commit (for `MsgTypeError` packing as RAEs) such that our error msg type
can house a non-type-spec decoded wire-bytes for error
reporting/analysis purposes.
2025-03-24 14:04:51 -04:00
Tyler Goodlet dbebcc54cc Unify `MsgTypeError` as a `RemoteActorError` subtype
Since in the receive-side error case the source of the exception is the
sender side (normally causing a local `TypeError` at decode time), might
as well bundle the error in remote-capture-style using boxing semantics
around the causing local type error raised from the
`msgspec.msgpack.Decoder.decode()` and with a traceback packed from
`msgspec`-specific knowledge of any field-type spec matching failure.

Deats on new `MsgTypeError` interface:
- includes a `.msg_dict` to get access to any `Decoder.type`-applied
  load of the original (underlying and offending) IPC msg into
  a `dict` form using a vanilla decoder which is normally packed into
  the instance as a `._msg_dict`.
- a public getter to the "supposed offending msg" via `.payload_msg`
  which attempts to take the above `.msg_dict` and load it manually into
  the corresponding `.msg.types.MsgType` struct.
- a constructor `.from_decode()` to make it simple to build out error
  instances from a failed decode scope where the aforementioned
  `msg_dict: dict` from the vanilla decode can be provided directly.
- ALSO, we now pack into `MsgTypeError` directly just like ctxc in
  `unpack_error()`

This also completes the long-standing todo for `RemoteActorError` to
contain a ref to the underlying `Error` msg as `._ipc_msg` with public
`@property` access that `defstruct()`-creates a pretty struct version
via `.ipc_msg`.

Internal tweaks for this include:
- `._ipc_msg` is the internal literal `Error`-msg instance if provided
  with `.ipc_msg` the dynamic wrapper as mentioned above.
- `.__init__()` now can still take variable `**extra_msgdata` (similar
  to the `dict`-msgdata as before) to maintain support for subtypes
  which are constructed manually (not only by `pack_error()`) and insert
  their own attrs which get placed in a `._extra_msgdata: dict` if no
  `ipc_msg: Error` is provided as input.
- the `.msgdata` is now a merge of any `._extra_msgdata` and
  a `dict`-casted form of any `._ipc_msg`.
- adjust all previous `.msgdata` field lookups to try equivalent field
  reads on `._ipc_msg: Error`.
- drop default single ws indent from `.tb_str` and do a failover lookup
  to `.msgdata` when `._ipc_msg is None` for the manually constructed
  subtype-instance case.
- add a new class attr `.extra_body_fields: list[str]` to allow subtypes
  to declare attrs they want shown in the `.__repr__()` output, eg.
  `ContextCancelled.canceller`, `StreamOverrun.sender` and
  `MsgTypeError.payload_msg`.
- ^-rework defaults pertaining to-^ with rename from
  `_msgdata_keys` -> `_ipcmsg_keys` with latter now just loading directly
  from the `Error` fields def and `_body_fields: list[str]` just taking
  that value and removing the not-so-useful-in-REPL or already shown
  (i.e. `.tb_str: str`) field names.
- add a new mod level `.pack_from_raise()` helper for auto-boxing RAE
  subtypes constructed manually into `Error`s which is normally how
  `StreamOverrun` and `MsgTypeError` get created in the runtime.
- in support of the above expose a `src_uid: tuple` override to
  `pack_error()` such that the runtime can provide any remote actor id
  when packing a locally-created yet remotely-caused RAE subtype.
- adjust all typing to expect `Error`s over `dict`-msgs.
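
To make the `.msgdata` merge described above concrete, here is
a minimal sketch. `Error` is modelled with a stdlib dataclass and
`BoxedError` is a made-up name; the precedence (fields from `._ipc_msg`
winning over `._extra_msgdata`) is an assumption for illustration and
not confirmed by this patch:

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Error:
    # reduced, illustrative field set; not the real `Error` msg
    src_uid: tuple
    tb_str: str
    cid: Optional[str] = None


class BoxedError(Exception):
    '''Hypothetical stand-in for a `RemoteActorError`-like type.'''

    def __init__(
        self,
        message: str,
        ipc_msg: Optional[Error] = None,
        **extra_msgdata,
    ):
        super().__init__(message)
        self._ipc_msg = ipc_msg
        self._extra_msgdata = extra_msgdata

    @property
    def msgdata(self) -> dict:
        # start from manually provided attrs, then overlay the
        # `dict`-casted msg fields (ASSUMED precedence).
        msgdata = dict(self._extra_msgdata)
        if self._ipc_msg is not None:
            msgdata.update(asdict(self._ipc_msg))
        return msgdata
```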

Adjust some tests to match these changes:
- context and inter-peer-cancel tests to make their `.msgdata` related
  checks against the new `.ipc_msg` as well and `.tb_str` directly.
- toss in an extra sleep to `sleep_a_bit_then_cancel_peer()` to keep the
  'canceller' ctx child task cancelled by its parent in the 'root' for
  the rte-raised-during-ctxc-handling case (apparently now it's
  returning too fast, cool?).
2025-03-24 14:04:51 -04:00
Tyler Goodlet fb94ecd729 Rename `Actor._push_result()` -> `._deliver_ctx_payload()`
Better describes the internal RPC impl/latest-architecture with the msgs
delivered being those which either define a `.pld: PayloadT` that gets
passed up to user code, or the error-msg subset that similarly is raised
in a ctx-linked task.
2025-03-24 14:04:51 -04:00
Tyler Goodlet b3e3e0ff85 Caps-msging test tweaks to get correct failures
These are likely temporary changes but still needed to actually see the
desired/correct failures (of which 5 of 6 tests are supposed to fail rn)
mostly to do with `Start` and `Return` msgs which are invalid under each
test's applied msg-spec.

Tweak set here:
- bit more `print()`s in root and sub for grokin test flow.
- never use `pytest.fail()` in subactor.. should know this by now XD
- comment out some bits that can't ever pass rn and make the underlying
  expected failures harder to grok:
  - the sub's child-side-of-ctx task doing sends should only fail
    for certain msg types like `Started` + `Return`, `Yield`s are
    processed receiver/parent side.
  - don't expect `sent` list to match predicate set for the same reason
    as last bullet.

The outstanding msg-type-semantic validation questions are:
- how to handle `.open_context()` with an input `kwargs` set that
  doesn't adhere to the currently applied msg-spec?
  - should the initial `@acm` entry fail before sending to the child
    side?
- where should received `MsgTypeError`s be raised, at the `MsgStream`
  `.receive()` or lower in the stack?
  - i'm thinking we should mk `MsgTypeError` derive from
    `RemoteActorError` and then have it be delivered as an error to the
    `Context`/`MsgStream` for per-ctx-task handling; would lead to more
    flexible/modular policy overrides in user code outside any defaults
    we provide.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 8ac9ccf65d Finally drop masked `chan.send(None)` related code blocks 2025-03-24 14:04:51 -04:00
Tyler Goodlet 3bccdf6de4 Detail out EoC-by-self log msg 2025-03-24 14:04:51 -04:00
Tyler Goodlet 7686dd7a15 Use `object()` when checking for error field value
Since the field value could be `None` or some other type with
truthy-ness evaluating to `False`..
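
The pattern in question, sketched with a plain `dict` msg; the helper
name and the `'error'` key are just for illustration:

```python
from typing import Any

# unique sentinel: can never collide with a real field value
_MISSING = object()


def get_err_field(msg: dict, key: str = 'error') -> tuple[bool, Any]:
    '''
    Report whether `key` is present, even when its value is
    `None`, `0`, `''` or anything else that is falsy.

    '''
    value = msg.get(key, _MISSING)
    if value is _MISSING:
        return False, None
    return True, value
```

A plain `if msg.get('error'):` check would wrongly treat
`{'error': None}` the same as a msg with no error field at all.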
2025-03-24 14:04:51 -04:00
Tyler Goodlet 7b92d2b1cb Flatten out RPC loop with `match:`/`case:`
Mainly expanding out the runtime endpoints for cancellation to separate
cases and flattening them with the main RPC-request-invoke block, moving
the non-cancel runtime case (where we call `getattr(actor, funcname)`)
inside the main `Start` case (for now) which branches on `ns=="self"`.

Also, add a new IPC msg `class CancelAck(Return):` which is always
included in the default msg-spec such that runtime cancellation (and
eventually all) endpoints return that msg (instead of a `Return`) and
thus sidestep any currently applied `MsgCodec` such that the results
(`bool`s for most cancel methods) are never violating the current type
limit(s) on `Msg.pld`. To support this expose a new variable
`return_msg: Return|CancelAck` param from
`_invoke()`/`_invoke_non_context()` and set it to `CancelAck` in the
appropriate endpoint case-blocks of the msg loop.

Clean out all the lingering legacy `chan.send(<dict-msg>)` commented
codez from the invoker funcs, with more cleaning likely to come B)
2025-03-24 14:04:51 -04:00
Tyler Goodlet 939f198dd9 Drop `None`-sentinel cancels RPC loop mechanism
Pretty sure we haven't *needed it* for a while, it was always generally
hazardous in terms of IPC msg types, AND it's definitely incompatible
with a dynamically applied typed msg spec: you can't just expect
a `None` to be willy nilly handled all the time XD

For now I'm masking out all the code and leaving very detailed
surrounding notes but am not removing it quite yet in case, for some
strange reason, it is needed by some edge case (though I haven't found
one according to the test suite).

Backstory:
------ - ------
Originally (i'm pretty sure anyway) it was added as a super naive
"remote cancellation" mechanism (back before there were specific `Actor`
methods for such things) that was mostly (only?) used before IPC
`Channel` closures to "more gracefully cancel" the connection's parented
RPC tasks. Since we now have explicit runtime-RPC endpoints for
conducting remote cancellation of both tasks and full actors, it should
really be removed anyway, because:
- a `None`-msg sentinel is inconsistent with other RPC endpoint handling
  input patterns which (even prior to typed msging) had specific
  msg-value triggers.
- the IPC endpoint's (block) implementation should use
  `Actor.cancel_rpc_tasks(parent_chan=chan)` instead of a manual loop
  through a `Actor._rpc_tasks.copy()`..

Deats:
- mask the `Channel.send(None)` calls from both the `Actor._stream_handler()` tail
  as well as from the `._portal.open_portal()` was connected block.
- mask the msg loop endpoint block and toss in lotsa notes.

Unrelated tweaks:
- drop `Actor._debug_mode`; unused.
- make `Actor.cancel_server()` return a `bool`.
- use `.msg.pretty_struct.Struct.pformat()` to show any msg that is
  ignored (bc invalid) in `._push_result()`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet e87f688c8d Factor `MsgpackTCPStream` msg-type checks
Add both the `.send()` and `.recv()` handling blocks to a common
`_raise_msg_type_err()` which includes detailed error msg formatting:

- the `.recv()` side case does introspection of the `Msg` fields and
  attempts to report the exact (field type related) issue
- `.send()` side does some boxed-error style tb formatting like
  `RemoteActorError`.
- add a `strict_types: bool` to `.send()` to allow for just
  warning on bad inputs versus raising, but always raise from any
  `Encoder` type error.
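
The warn-versus-raise switch could be sketched like so. This is
a standalone mock: `check_send()` and its `allowed` tuple are invented
for the example, `MsgTypeError` is a local stand-in, and the "always
raise on an `Encoder` type error" half is not modelled here:

```python
import logging

log = logging.getLogger('send_sketch')


class MsgTypeError(TypeError):
    '''Local stand-in for `tractor.MsgTypeError` (not the real class).'''


def check_send(
    msg,
    allowed: tuple,
    strict_types: bool = True,
) -> bool:
    '''
    Return `True` when `msg` fits the `allowed` types; otherwise
    raise when strict, or warn and return `False` when not.

    '''
    if isinstance(msg, allowed):
        return True

    err = MsgTypeError(
        f'{type(msg).__name__!r} not in spec '
        f'{[t.__name__ for t in allowed]}'
    )
    if strict_types:
        raise err

    log.warning('Sending msg which fails the spec: %s', err)
    return False
```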
2025-03-24 14:04:51 -04:00
Tyler Goodlet ffbe471790 Expose `MsgTypeError` from pkg 2025-03-24 14:04:51 -04:00
Tyler Goodlet 0df557d2dd Make `Context.started()` a type checked IPC send
As detailed in the surrounding notes, it's pretty advantageous to always
have the child context task ensure the first msg it relays back is
msg-type checked against the current spec and thus `MsgCodec`. Implement
the check via a simple codec-roundtrip of the `Started` msg such that
the `.pld` payload is always validated before transit. This ensures the
child will fail early and notify the parent before any streaming takes
place (i.e. the "nasty" dialog protocol phase).

The main motivation here is to avoid inter-actor task syncing bugs that
are hard(er) to recover from, such as when an invalidly typed msg is
sent to the parent, who then ignores it (depending on config), and then
the child thinks the parent is in some presumed state while the parent
is still thinking a first msg has yet to arrive. Doing the stringent
check on the sender side (i.e. the child is sending the "first"
application msg via `.started()`) avoids/sidesteps dealing with such
syncing/coordinated-state problems by keeping the entire IPC dialog in
a "cheap" or "control" style transaction up until a stream is opened.

Iow, the parent task's `.open_context()` block entry can't occur until
the child side is definitely (as much as is possible with IPC msg type
checking) in a correct state spec wise. During any streaming phase in
the dialog the msg-type-checking is NOT done for performance (the
"nasty" protocol phase) and instead any type errors are relayed back
from the receiving side. I'm still unsure whether to take the same
approach on the `Return` msg, since at that point erroring early doesn't
benefit the parent task if/when a msg-type error occurs? Definitely more
to ponder and tinker out here..

Impl notes:
- a gotcha with the roundtrip-codec-ed msg is that it often won't match
  the input `value` bc in the `msgpack` case many native python
  sequence/collection types will map to a common array type due to the
  surjection that `msgpack`'s type-sys imposes.
  - so we can't assert that `started == rt_started` but it may be useful
    to at least report the diff of the type-reduced payload so that the
    caller can at least be notified how the input `value` might be
    better type-casted prior to call, for ex. pre-casting to `list`s.
- added a `._strict_started: bool` that could provide the stringent
  checking if desired in the future.
- on any validation error raise our `MsgTypeError` from it.
- ALSO change over the lingering `.send_yield()` deprecated meth body
  to use a `Yield()`.
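
The roundtrip gotcha from the impl notes above is easy to reproduce.
The sketch below uses stdlib `json` purely as a stand-in for the
`msgpack` codec (both have a single array type, so a `tuple` comes back
as a `list`); `check_started()` and its `strict` flag are hypothetical
names, loosely mirroring the `._strict_started` idea:

```python
import json


def roundtrip(value):
    # encode then decode, as a codec roundtrip would
    return json.loads(json.dumps(value))


def check_started(value, strict: bool = False):
    '''
    Roundtrip `value` and report whether it survived unchanged.

    When `strict` is set a mismatch raises instead of only being
    reported back to the caller.

    '''
    rt_value = roundtrip(value)
    matches = rt_value == value
    if strict and not matches:
        raise TypeError(
            f'Roundtripped value differs: {value!r} -> {rt_value!r}'
        )
    return rt_value, matches
```

So `check_started((1, 2))` reports a mismatch while
`check_started([1, 2])` does not, which is why pre-casting to `list`
before the call sidesteps the diff.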
2025-03-24 14:04:51 -04:00
Tyler Goodlet 99a2e13c91 Factor boxed-err formatting into new `pformat_boxed_tb()` helper for use elsewhere 2025-03-24 14:04:51 -04:00
Tyler Goodlet d33eb15884 Add buncha notes on `Start` field for "params"
Such that the current `kwargs: dict` field can eventually be strictly
msg-typed (eventually directly from a `@context` def) using modern typed
python's hippest syntactical approach B)

Also proto a new `CancelAck(Return)` subtype msg for supporting msg-spec
agnostic `Actor.cancel_xx()` method calls in the runtime such that
a user can't break cancellation (and thus SC) by dynamically setting
a codec that doesn't allow `bool` results (as an eg. in this case).
Note that the msg isn't used yet in `._rpc` but that's a comin!
2025-03-24 14:04:51 -04:00
Tyler Goodlet c2fc6293aa Extend codec test for msg-spec parameterizing
Set a diff `Msg.pld` spec per test and then send multiple types to
a child actor making sure the child can only send certain types over
a stream and fails with validation or decode errors ow. The test is also
param-ed both with and without hooks demonstrating how a custom type,
`NamespacePath`, needs them for effective use. The subactor IPC context
child is passed a `expect_ipc_send: dict` which relays the values along
with their expected `.send()`-ability.

Deats on technical refinements:
------ - ------
- added a `iter_maybe_sends()` send-value-as-msg-auditor and predicate
  generator (literally) so as to be able to pre-determine if given the
  current codec and `send_values` which values are expected to be IPC
  transmittable.
- as per ^, the diff value-msgs are first round-tripped inside
  a `Started` msg using the configured codec in the parent/root actor
  before bothering with using IPC primitives + a subactor; this is how
  the `expect_ipc_send` table is generated initially.
- for serializing the specs (`Union[Type]`s as required by `msgspec`),
  added a pair of codec hooks: `enc/dec_type_union()` (that ideally we
  move into a `.msg` submod eventually) which code the type-values as
  a `list[str]` of names.
  - the `dec_` hook had to be modified to NOT raise an error when an
    invalid/unhandled value arrives, this is because we do NOT want the
    RPC msg handling loop to raise on the `async for msg in chan:` and
    instead prefer to ignore and warn (for now, but eventually respond
    with error msg - see notes in hook body) these msgs when sent during
    a streaming phase; `Context.started()` will however error on a bad
    input for the current msg-spec since it is part of the "cheap"
    dialog (again see notes in `._context`) wherein the `Started` msg
    is always roundtripped prior to `Channel.send()` to guarantee
    the child adheres to its own spec.
- tossed in lotsa `print()`s for console groking of the run progress.
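
For reference, a stripped-down take on the `enc/dec_type_union()` pair
mentioned above, coding a type union as a `list[str]` of names. The
`_known` lookup table and the "return `None` on unknown names" behaviour
are assumptions made for this sketch (the real hooks live in the test
mod); it also ignores parameterized generics like `list[int]`:

```python
from typing import Union, get_args

# ASSUMED name -> type table; the real hooks may resolve names differently
_known: dict[str, type] = {
    t.__name__: t
    for t in (int, str, float, bytes, list, dict, type(None))
}


def enc_type_union(union) -> list[str]:
    # a bare type has no union args, so treat it as a 1-member union
    members = get_args(union) or (union,)
    return [t.__name__ for t in members]


def dec_type_union(names: list[str]):
    '''
    Rebuild the union from type names; unknown names are dropped
    and `None` is returned (instead of raising) when nothing is
    recognized.

    '''
    types = [_known[name] for name in names if name in _known]
    if not types:
        return None
    return Union[tuple(types)]
```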

Further notes on typed-msging breaking cancellation:
------ - ------
- turns out the runtime's cancellation implementation, being done
  with `Actor.cancel()` methods and friends, will actually break when
  a stringent spec is applied (eg. a single type-spec) since the return
  values from said methods are generally `bool`s..
- this means we do indeed need special handling of "runtime RPC method
  invocations" since ideally a user's msg-spec choices do not break core
  functionality on them XD
=> The obvi solution is to add a/some special sub-`Msg` types for such
  cases, possibly just a `RuntimeReturn(Return)` type that will always
  include a `.pld: bool` for these cancel methods such that their
  results are always handled without msg type errors.

More to come on a (hopefully) elegant solution to that last bit!
2025-03-24 14:04:51 -04:00
Tyler Goodlet 9de2fff273 Use `._testing.break_ipc()` in final advanced fault test child ctx 2025-03-24 14:04:51 -04:00
Tyler Goodlet 8f18c9febf Start a new `._testing.fault_simulation`
Since I needed the `break_ipc()` helper from the
`examples/advanced_faults/ipc_failure_during_stream.py` used in the
`test_advanced_faults` suite, might as well move it into a pkg-wide
importable module. Also changed the default break method to be
`socket_close` which just calls `Stream.socket.close()` underneath in
`trio`.

Also tweak that example to not keep sending after the stream has been
broken since with new `trio` that will raise `ClosedResourceError` and
in the wrapping test we generally speaking want to see a hang and then
cancel via simulated user sent SIGINT/ctl-c.
2025-03-24 14:04:51 -04:00
Tyler Goodlet ed72974ec4 Flip default codec to our `Msg`-spec
Yes, this is "the switch" and will likely cause the test suite to bail
until a few more fixes come in.

Tweaked a couple `.msg` pkg exports:
- remove `__spec__` (used by modules) and change it to `__msg_types__:
  list[Msg]` as well as add a new `__msg_spec__: TypeAlias`, being the
  default `Any` paramed spec.
- tweak the naming of `msg.types` lists of runtime vs payload msgs to:
  `._runtime_msgs` and `._payload_msgs`.
- just build `__msg_types__` out of the above 2 lists.
2025-03-24 14:04:51 -04:00
Tyler Goodlet e1f612996c TOSQUASH f2ce4a3, timeout bump 2025-03-24 14:04:51 -04:00
Tyler Goodlet fc83f4ecf0 Woops, only pack `Error(cid=cid)` if input is not `None` 2025-03-24 14:04:51 -04:00
Tyler Goodlet 09eed9d7e1 WIP porting runtime to use `Msg`-spec 2025-03-24 14:04:51 -04:00
Tyler Goodlet b56b3aa890 Add timeouts around some context test bodies
Since with my in-index runtime-port to our native msg-spec it seems
these ones are hanging B(

- `test_one_end_stream_not_opened()`
- `test_maybe_allow_overruns_stream()`

Tossing in some `trio.fail_after()`s seems to at least gnab them as
failures B)
2025-03-24 14:04:51 -04:00
Tyler Goodlet bc87c51ff1 Get `test_codec_hooks_mod` working with `Msg`s
Though the runtime hasn't been changed over in this patch (it was in the
local index at the time however), the test does now demonstrate that
using a `Started` the correctly typed `.pld` will codec correctly when
passed manually to `MsgCodec.encode/decode()`.

Despite not having the runtime ported to the new shuttle msg set
(meaning the mentioned test will fail without the runtime port patch),
I was able to get this first original test working that limits payload
packets as a `Msg.pld: NamespacePath`; as long as we spec
`enc/dec_hook()`s then the `Msg.pld` will be processed correctly as per:
https://jcristharif.com/msgspec/extending.html#mapping-to-from-native-types
in both the `Any` and `NamespacePath|None` spec cases.
^- turns out in this case -^ that the codec hooks only get invoked on
the unknown-fields NOT the entire `Struct`-msg.

A further gotcha was merging a `|None` into the `pld_spec` since this
test spawns a subactor and opens a context via `send_back_nsp()` and
that func has no explicit `return` - so of course it delivers
a `Return(pld=None)` which will fail if we only spec `NamespacePath`.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 8468bcca36 Get msg spec type limiting working with a `RunVar`
Since `contextvars.ContextVar` seems to reset to the default in every
new task, switching to using `trio.lowlevel.RunVar` kinda gets close to
what we'd like where a child scope can override what's in the rent but
ideally without modifying the rent's. I tried `tricycle.TreeVar` as well
but it also seems to reset across (embedded) nurseries in our runtime;
need to try it again bc apparently that's not how it's supposed to work?

NOTE that for now i'm keeping the `.msg.types._ctxvar_MsgCodec` set to
the `msgspec` default (`Any` types) so that the test suite will still
pass until the runtime is ported to the new msg-spec + codec.

Surrounding and in support of all this the `Msg`-set impl deats changed
a bit as well as various stuff in `.msg` sub-mods:

- drop the `.pld` struct types for `Error`, `Start`, `StartAck` since we
  don't really need the `.pld` payload field in those cases since
  they're runtime control msgs for starting RPC tasks and handling
  remote errors; we can just put the fields directly on each msg since
  the user will never want/need to override the `.pld` field type.

- add a couple new runtime msgs and include them in `msg.__spec__`
  and make them NOT inherit from `Msg` since they are runtime-specific
  and thus have no need for `.pld` type constraints:
  - `Aid` the actor-id identity handshake msg.
  - `SpawnSpec`: the spawn data passed from a parent actor down to
    a child in `Actor._from_parent()` for which we need a shuttle
    protocol msg, so might as well make it a pedantic one ;)

- fix some `Actor.uid` field types that were type-borked on `Error`

- add notes about how we need built-in `debug_mode` msgs in order to
  avoid msg-type errors when using the TTY lock machinery and
  a different `.pld` spec than the default `Any` is in use..
  -> since `devx._debug.lock_tty_for_child()` and its client side
  `wait_for_parent_stdin_hijack()` use `Context.started('Locked')`
  and `MsgStream.send('pdb_unlock')` string values as their `.pld`
  contents we'd need to either always do a `ipc_pld_spec | str` or
  pre-define some dedicated `Msg` types which get `Union`-ed in
  for this?

- break out `msg.pretty_struct.Struct._sin_props()` into a helper func
  `iter_fields()` since the impl doesn't require a struct instance.

- as mentioned above since `ContextVar` didn't work as anticipated
  I next tried `tricycle.TreeVar` but that too didn't seem to keep
  the `apply_codec()` setting intact across
  `Portal.open_context()`/`Context.open_stream()` (it kept reverting to
  the default `.pld: Any` setting) so I settled on
  a `trio.lowlevel.RunVar` for now despite it basically being
  a `global`..
  -> will probably come back to test this with `TreeVar` and some hot
  tips i picked up from @mikenerone in the `trio` gitter, which i put in
  comments surrounding proto-code.
2025-03-24 14:04:51 -04:00
Tyler Goodlet a38ac07af5 Be mega pedantic with msg-spec building
Turns out the generics based payload speccing API, as in
https://jcristharif.com/msgspec/supported-types.html#generic-types,
DOES WORK properly as long as we don't rely on inheritance from `Msg`
as a parent `Generic`..

So let's get real pedantic in the `mk_msg_spec()` internals as well as
verification in the test suite!

Fixes in `.msg.types`:
- implement (as part of tinker testing) multiple spec union building
  methods via a `spec_build_method: str` to `mk_msg_spec()` and leave a
  buncha notes around what did and didn't work:
  - 'indexed_generics' is the only method THAT WORKS and the one that
    you'd expect being closest to the `msgspec` docs (link above).
  - 'defstruct' using dynamically defined msgs => doesn't work!
  - 'types_new_class' using dynamically defined msgs but with
    `types.new_class()` => ALSO doesn't work..

- explicitly separate the `.pld` type-constrainable by user code msg
  set into `types._payload_spec_msgs` putting the others in
  a `types._runtime_spec_msgs` and the full set defined as `.__spec__`
  (moving it out of the pkg-mod and back to `.types` as well).

- for the `_payload_spec_msgs` msgs manually make them inherit `Generic[PayloadT]`
  and (redundantly) define a `.pld: PayloadT` field.

- make `IpcCtxSpec.functype` an in line `Literal`.

- toss in some TODO notes about choosing a better `Msg.cid` type.
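
The "indexed generics" shape described above (each payload msg
inheriting `Generic[PayloadT]` directly and declaring its own `.pld`)
looks roughly like this. Stdlib dataclasses stand in for
`msgspec.Struct` so no validation happens here, and
`index_payload_msgs()` is a hypothetical helper, not `mk_msg_spec()`
itself:

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, get_args, get_origin

PayloadT = TypeVar('PayloadT')


@dataclass
class Started(Generic[PayloadT]):
    cid: str
    pld: PayloadT  # declared on each msg, not inherited from a base


@dataclass
class Yield(Generic[PayloadT]):
    cid: str
    pld: PayloadT


def index_payload_msgs(pld_spec) -> list:
    '''
    Parameterize every payload msg type with `pld_spec`, giving
    the members from which a final spec union could be built.

    '''
    return [msgT[pld_spec] for msgT in (Started, Yield)]
```

Each indexed alias keeps a ref to both its origin class and the chosen
payload type, which is what a decoder would need in order to constrain
`.pld`.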

Fixes/tweaks around `.msg._codec`:
- rename `MsgCodec.ipc/pld_msg_spec` -> `.msg/pld_spec`
- make `._enc/._dec` non optional fields
- wow, ^facepalm^ , make sure `._ipc.MsgpackTCPStream.__init__()` uses
  `mk_codec()` since `MsgCodec` can't be (easily) constructed directly.

Get more detailed in testing:
- inside the `chk_pld_type()` helper ensure `roundtrip` is always set to
  some value, `None` by default but a bool depending on legit outcome.
  - drop input `generic`; no longer used.
  - drop the masked `typedef` loop from `Msg.__subclasses__()`.
  - add an `expect_roundtrip: bool` and use it to jump into the debugger
    when any expectation doesn't match the outcome.
- use new `MsgCodec` field names (as per first section above).
- ensure the encoded msg matches the decoded one from both the ad-hoc
  decoder and codec loaded values.
- ensure the pld checking is only applied to msgs in the
  `types._payload_spec_msgs` set by `typef.__name__` filtering
  since `mk_msg_spec()` now returns the full `.types.Msg` set.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 48606b6c77 Tweak msging tests to match codec api changes
Mostly adjusting input args/logic to various spec/codec signatures and
new runtime semantics:

- `test_msg_spec_xor_pld_spec()` to verify that a shuttle prot spec and
  payload spec are necessarily mutex and that `mk_codec()` enforces it.
- switch to `ipc_msg_spec` input in `mk_custom_codec()` helper.
- drop buncha commented cruft from `test_limit_msgspec()` including no
  longer needed type union instance checks in dunder attributes.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 4251ee4c51 Drop `MsgCodec.decoder()/.encoder()` design
Instead just instantiate `msgpack.Encoder/Decoder` instances inside
`mk_codec()` and assign them directly as `._enc/._dec` fields.
Explicitly take in named-args to both and proxy to the encoder/decoder
instantiation calls directly.

Shuffling some codec internals:
- rename `mk_codec()` inputs as `ipc_msg_spec` and `ipc_pld_spec`, make
  them mutex such that a payload type spec can't be passed if the
  built-in msg-spec isn't used.
  => expose `MsgCodec.ipc_pld_spec` directly from `._dec.type`
  => presume input `ipc_msg_spec` is `Any` by default when no
    `ipc_pld_spec` is passed since we have no way atm to enable
    a similar type-restricted-payload feature without a wrapping
    "shuttle protocol" ;)

- move all the payload-sub-decoders stuff prototyped in GH#311
  (inside `.types`) to `._codec` as commented-for-later-maybe `MsgCodec`
  methods including:
  - `.mk_pld_subdec()` for registering
  - `.enc/dec_payload()` for sub-codec field loading.

- also comment out `._codec.mk_tagged_union_dec()` as the orig
  tag-to-decoder table factory, now mostly superseded by
  `.types.mk_msg_spec()` which takes the generic parameterizing approach
  instead.

- change naming to `types.mk_msg_spec(payload_type_union)` input, making
  it more explicit that it expects a `Union[Type]`.

Oh right, and start exposing all the `.types.Msg` subtypes in the `.msg`
subpkg in prep for usage throughout the runtime B)
2025-03-24 14:04:50 -04:00
Tyler Goodlet 89bc5ab8c4 Change to multi-line-static-`dict` style msgs
Re-arranging such that element-orders are line-arranged to our new
IPC `.msg.types.Msg` fields spec in prep for replacing the current
`dict`-as-msg impls with the `msgspec.Struct` native versions!
2025-03-24 14:04:50 -04:00
Tyler Goodlet e1e87c95c5 Tweak msg-spec test suite mod name 2025-03-24 14:04:50 -04:00
Tyler Goodlet c5985169cc Init def of "SC shuttle prot" with "msg-spec-limiting"
As per the long outstanding GH issue this starts our rigorous journey
into an attempt at a type-safe, cross-actor SC, IPC protocol Bo

boop -> https://github.com/goodboy/tractor/issues/36

The idea is to "formally" define our SC "shuttle (dialog) protocol" by
specifying a new `.msg.types.Msg` subtype-set which can fully
encapsulate all IPC msg schemas needed in order to accomplish
cross-process SC!

The msg set deviated a little in terms of (type) names from the existing
`dict`-msgs currently used in the runtime impl but, I think the name
changes are much better in terms of explicitly representing the internal
semantics of the actor runtime machinery/subsystems and the
IPC-msg-dialog required for SC enforced RPC.

------ - ------

In brief, the new formal msgs-spec includes the following msg-subtypes
of a new top-level `Msg` boxing type (that holds the base field schema
for all msgs):

- `Start` to request RPC task scheduling by passing a `FuncSpec` payload
  (to replace the currently used `{'cmd': ... }` dict msg impl)

- `StartAck` to allow the RPC task callee-side to report an `IpcCtxSpec`
  payload immediately back to the caller (currently responded naively via
  a `{'functype': ... }` msg)

- `Started` to deliver the first value from `Context.started()`
  (instead of the existing `{'started': ... }`)

- `Yield` to shuttle `MsgStream.send()`-ed values (instead of
  our `{'yield': ... }`)

- `Stop` to terminate a `Context.open_stream()` session/block
  (over `{'stop': True }`)

- `Return` to deliver the final value from the `Actor.start_remote_task()`
  (which is a `{'return': ... }`)

- `Error` to box `RemoteActorError` exceptions via a `.pld: ErrorData`
  payload, planned to replace/extend the current `RemoteActorError.msgdata`
  mechanism internal to `._exceptions.pack/unpack_error()`

The new `tractor.msg.types` includes all the above msg defs as well as an API
for rendering a "payload type specification" using a
`payload_type_spec: Union[Type]` that can be passed to
`msgspec.msgpack.Decoder(type=payload_type_spec)`. This ensures that
(for a subset of the above msg set) `Msg.pld: PayloadT` data is
type-parameterized using `msgspec`'s new `Generic[PayloadT]` field
support and thus enables providing for an API where IPC `Context`
dialogs can strictly define the allowed payload-datatype-set via type
union!

Iow, this is the foundation for supporting `Channel`/`Context`/`MsgStream`
IPC primitives which are type checked/safe as desired in GH issue:
- https://github.com/goodboy/tractor/issues/365

Misc notes on current impl(s) status:
------ - ------
- add a `.msg.types.mk_msg_spec()` which uses the new `msgspec` support
  for `class MyStruct(Struct, Generic[T])` parameterize-able fields and
  delivers our boxing SC-msg-(sub)set with the desired `payload_types`
  applied to `.pld`:
  - https://jcristharif.com/msgspec/supported-types.html#generic-types
  - as a note this impl seems to need to use `types.new_class()` dynamic
    subtype generation, though i don't really get *why* still.. but
    without that the `msgspec.msgpack.Decoder` doesn't seem to reject
    `.pld` limited `Msg` subtypes as demonstrated in the new test.

- around this ^ add a `.msg._codec.limit_msg_spec()` cm which exposes
  this payload type limiting API such that it can be applied per task
  via a `MsgCodec` in app code.

- the orig approach in https://github.com/goodboy/tractor/pull/311 was
  the idea of making payload fields `.pld: Raw` wherein we could have
  per-field/sub-msg decoders dynamically loaded depending on the
  particular application-layer schema in use. I don't want to lose the
  idea of this since I think it might be useful for an idea I have about
  capability-based-fields(-sharing, maybe using field-subset
  encryption?), and as such i've kept the (ostensibly) working impls in
  TODO-comments in `.msg._codec` wherein maybe we can add
  a `MsgCodec._payload_decs: dict` table for this later on.
  |_ also left in the `.msg.types.enc/decmsg()` impls but renamed as
    `enc/dec_payload()` (but reworked to not rely on the lifo codec
    stack tables; now removed) such that we can prolly move them to
    `MsgCodec` methods in the future.

- add an unused `._codec.mk_tagged_union_dec()` helper which was
  originally factored out of the proto-code but didn't end up working
  as desired with the new parameterized generic fields approach (now
  in `msg.types.mk_msg_spec()`)

Testing/deps work:
------ - ------
- new `test_limit_msgspec()` which ensures all the `.types` content is
  correct but without using the wrapping APIs in `._codec`; i.e. using
  an in-line `Decoder` instead of a `MsgCodec`.

- pin us to `msgspec>=0.18.5` which has the needed generic-types support
  (which took me way too long yester to figure out when implementing all
  this XD)!
2025-03-24 14:04:49 -04:00
Tyler Goodlet e77333eb73 Move the pretty-`Struct` stuff to a `.pretty_struct`
Leave all the proto native struct-msg stuff in `.types` since i'm
thinking it's the right name for the mod that will hold all the built-in
SCIPP msgspecs longer run. Obvi the naive codec stack stuff needs to be
cleaned out/up and anything useful moved into `._codec` ;)
2025-03-24 14:04:33 -04:00
Tyler Goodlet ae434ae8a4 Merge original content from PR into `.msg.types` for now 2025-03-24 14:04:33 -04:00
Tyler Goodlet 8c23f83889 Re-think, `msgspec`-multi-typed msg dialogs
The greasy details are strewn throughout a `msgspec` issue:
https://github.com/jcrist/msgspec/issues/140

and specifically this code was mostly written as part of POC example in
this comment:
https://github.com/jcrist/msgspec/issues/140#issuecomment-1177850792

This work obviously pertains to our desire and prep for typed messaging
and capabilities aware msg-oriented-protocols in . I added a "wants
to have" method to `Context` showing how I think we could offer a pretty
neat msg-type-set-as-capability-for-protocol system.

XXX NOTE XXX: this commit was rewritten during a rebase from a very old
version as per the prior commit.
2025-03-24 14:04:33 -04:00
Tyler Goodlet b06754db3a WIP tagged union message type API
XXX NOTE XXX: this is a heavily modified commit from the original
(ec226463) which was super out of date when rebased onto the current
branch. I went through a manual conflict rework and removed all the
legacy segments as well as rename-moved this original mod
`tractor.msg.py` -> `tractor.msg/_old_msg.py`. Further the
`NamespacePath` type def was discarded from this mod since it was from
a super old version which was already moved to a `.msg.ptr` submod.

As per original questions and discussion with `msgspec` author:
- https://github.com/jcrist/msgspec/issues/25
- https://github.com/jcrist/msgspec/issues/140

this prototypes a new (but very naive) `msgspec.Struct` codec
implementation which will be more filled out in the next commit.
2025-03-24 14:04:33 -04:00
Tyler Goodlet 213e083dc6 Proto `MsgCodec`, an interchange fmt modify API
Fitting in line with the issues outstanding:
- : (msg)spec-ing out our SCIPP (structured-con-inter-proc-prot).
  (https://github.com/goodboy/tractor/issues/36)

- : adding strictly typed IPC msg dialog schemas, more or less
  better described as "dialog/transaction scoped message specs"
  using `msgspec`'s tagged unions and custom codecs.
  (https://github.com/goodboy/tractor/issues/196)

- : using modern static type-annots to drive capability based
  messaging and RPC.
  (statically https://github.com/goodboy/tractor/issues/365)

This is a first draft of a new API for dynamically overriding IPC msg
codecs for a given interchange lib from any task in the runtime. Right
now we obviously only support `msgspec` but ideally this API holds
general enough to be used for other backends eventually (like
`capnproto`, and apache arrow).

Impl is in a new `tractor.msg._codec` with:
- a new `MsgCodec` type for encapsing `msgspec.msgpack.Encoder/Decoder`
  pairs and configuring any custom enc/dec_hooks or typed decoding.
- factory `mk_codec()` for creating new codecs ad-hoc from a task.
- `contextvars` support for a new `trio.Task` scoped
  `_ctxvar_MsgCodec: ContextVar[MsgCodec]` named 'msgspec_codec'.
- `apply_codec()` for temporarily modifying the above per task
  as needed around `.open_context()` / `.open_stream()` operation.

A new test (suite) in `test_caps_msging.py`:
- verify a parent and its child can enable the same custom codec (in
  this case to transmit `NamespacePath`s) with tons of pedantic ctx-vars
  checks.
- ToDo: still need to implement  msg types in order to be able to get
  decodes working (as in `MsgStream.receive()` will deliver an already
  created `NamespacePath` obj) since currently all msgs come packed in `dict`-msg
  wrapper packets..
  -> use the proto from PR  to get nested `msgspec.Raw` processing up
  and running Bo
2025-03-24 14:04:33 -04:00
Tyler Goodlet 154ef67c8e Prepare to offer (dynamic) `.msg.Codec` overrides
By simply allowing an input `codec: tuple` of funcs for now to the
`MsgpackTCPStream` transport but, ideally wrapping this in a `Codec`
type with an API for dynamic extension of the interchange lib's msg
processing settings. Right now we're tied to `msgspec.msgpack` for this
transport but with the right design this can likely extend to other libs
in the future.

Relates to starting feature work toward , , .
2025-03-24 14:04:33 -04:00
goodboy 470d349ef1 Merge pull request 'Drop old `setup.py` and deps included in `trio>=0.24.0`' () from pkg_tidying into main
Reviewed-on: 
2025-03-24 17:43:43 +00:00
Tyler Goodlet 627c514614 Add rendezvous proto link 2025-03-24 13:30:12 -04:00
Tyler Goodlet 33fcc036bd Aggregate guest-mode link, fill out IPC stack feat
Such that we just link on `guest`_ and use in the feat line as well as
the code ex. Fill out the IPC-stack feature bullet and put the
`.trionics` one last. Also fix-n-finish the `uv` shell install section.
2025-03-24 13:25:54 -04:00
Tyler Goodlet 799306ec4c Tweak supervision-proto into line 2025-03-24 13:05:38 -04:00
Tyler Goodlet aace10ccfb Add in zmq protocol links to feats list 2025-03-24 12:54:12 -04:00
Tyler Goodlet 0272936fdc Update readme for `uv`, refine feats list
Also make all code example snippets a sub-section of a new `Example
codez` section in prep to reduce the amount of code in the readme which
instead will be simply linked to in the repo in the future.
2025-03-24 12:07:27 -04:00
Tyler Goodlet db31bbfee2 Drop `trio-typing` as dep
Hasn't been needed for a while since the type-annots have been exposed
from core since `trio>=0.24`. Allows us to drop a buncha sub-deps as
well like,
- `async-generator`
- `importlib-metadata`
- `mypy-extensions`
- `typing-extensions`
- `zipp`

Yah, don't really know why i listed all those but..
2025-03-23 00:33:44 -04:00
Tyler Goodlet 96738a094f Drop legacy `setup.py`, we use `uv` now dog
Also remove the old `requirements-test/docs.txt` files moving the docs
deps as a masked TODO to our `pyproject.toml`.
2025-03-23 00:31:16 -04:00
goodboy ba81e5106c Merge pull request 'Use `uv` for packaging' () from uv_migration_pre_msgspec_in_runtime into main
Landed-in: 
2025-03-21 19:21:19 +00:00
Tyler Goodlet d927ed82d8 Mask not-yet-existing `.devx.pformat` import 2025-03-21 00:18:05 -04:00
Tyler Goodlet 9324d82ff1 Handle cpython builds with `libedit` for `readline`
Since `uv`'s cpython distributions are built this way `pdbp`'s tab
completion was breaking (as was vi-mode). This adds a new
`.devx._enable_readline_feats()` import hook which checks for the
appropriate library and applies settings accordingly.
2025-03-21 00:18:05 -04:00
Tyler Goodlet 7f70e09c33 Add in some dev deps for @goodboy
Namely since i use `xonsh` for a main shell, this includes adding it as
well as related tooling. Obvi bump the `uv.lock`.

Some other stuff retained from `poetry` days,
- add usage-comments around various (optional) deps.
- add toml section separator lines.
- go with 2-space indent.
- add comment on `trio>0.27` needed for py3.13+
2025-03-21 00:18:05 -04:00
Tyler Goodlet a80829a702 Disable invalid line in `ruff` config? 2025-03-21 00:18:05 -04:00
Tyler Goodlet 3a7e3505b4 Add a `ruff.toml` with ignore set taken from old `pyproject.toml` content 2025-03-21 00:18:05 -04:00
Guillermo Rodriguez e27d63b75f Migrate to uv using "uvx migrate-to-uv", use msgspec from git due to python 3.13 compat 2025-03-21 00:18:05 -04:00
goodboy e8bd834b5b
Merge pull request from goodboy/pause_from_sync_w_greenback
Pause from sync (with `greenback`), `log.devx()`, hide `@acm` frames
2025-03-21 00:17:28 -04:00
Tyler Goodlet 863751b47b Add `enable_stack_on_sig: bool` for `stackscope` toggle 2025-03-20 23:22:45 -04:00
Tyler Goodlet 46c8dbef1f Bleh, make `log.devx()` level less than cancel but > `.runtime()` 2025-03-20 23:22:45 -04:00
Tyler Goodlet e7dbb52b34 Tweaks to debugger examples
Light stuff like comments, typing, and a couple API usage updates.
2025-03-20 23:22:45 -04:00
Tyler Goodlet d044629cce Woops, make `log.devx()` level less than `.error()` 2025-03-20 23:22:45 -04:00
Tyler Goodlet 8832cdfe0d Make `log.devx()` level below `.pdb()`
Kinda like a "runtime"-y level for `.pdb()` (which is more or less like
an `.info()` for our debugger subsys) which can be used to report
internals info for those hacking on `.devx` tools.

Also, inject only the *last* 6 digits of the `id(Task)` in
`pformat_task_uid()` output by default.
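
A minimal sketch of registering a custom level with the stdlib `logging` module, in the spirit of the `log.devx()` level above. The numeric value and the `devx()` helper here are illustrative assumptions only and do not reflect the values used by the actual log subsystem:

```python
import logging

# Illustrative value only: placed between DEBUG (10) and INFO (20).
DEVX = 15
logging.addLevelName(DEVX, 'DEVX')


def devx(logger: logging.Logger, msg: str, *args, **kwargs) -> None:
    '''Emit `msg` at the custom DEVX level (hypothetical helper).'''
    if logger.isEnabledFor(DEVX):
        logger.log(DEVX, msg, *args, **kwargs)


log = logging.getLogger('devx_sketch')
log.setLevel(DEVX)
devx(log, 'internals info for tooling hackers')
```

Where a custom level sits relative to its neighbors decides which loggers will see it, which is presumably why the level got shuffled around in the adjacent commits.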
2025-03-20 23:22:45 -04:00
Tyler Goodlet f6fc43d58d Include truncated `id(trio.Task)` for task info in log header 2025-03-20 23:22:45 -04:00
Tyler Goodlet cdc513f25d Add a `.log.at_least_level()` predicate 2025-03-20 23:22:45 -04:00
Tyler Goodlet 9eaee7a060 Woops, make `log.devx()` level 600 2025-03-20 23:22:45 -04:00
Tyler Goodlet 63c087f08d Use `log.devx()` for `stackscope` messages 2025-03-20 23:22:45 -04:00
Tyler Goodlet d5f80365b5 Add a `log.devx()` level 2025-03-20 23:22:45 -04:00
Tyler Goodlet d20f711fb0 Tweak `breakpoint()` usage error message 2025-03-20 23:22:45 -04:00
Tyler Goodlet 21509791e3 Start a `devx._code` mod
Starting with a little sub-sys for tracing caller frames by marking them
with a dunder var (`__runtimeframe__` by default) and then scanning for
that frame such that code that is *calling* our APIs can be reported
easily in logging / tracing output.

New APIs:
- `find_caller_info()` which does the scan and delivers a,
- `CallerInfo` which (attempts) to expose both the runtime frame-info
  and frame of the caller func along with `NamespacePath` properties.

Probably going to re-implement the dunder var bit as a decorator later
so we can bind in the literal func-object ref instead of trying to look
it up with `get_class_from_frame()`, since it's kinda hacky/non-general
and def doesn't work for closure funcs..
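
A minimal sketch of the dunder-var frame marking described above, using only the stdlib. `find_caller_frame()` is a hypothetical simplification, not the real `find_caller_info()`: it walks up the stack until it finds a frame whose locals contain the marker, then returns that frame's parent (the code that called the marked API func):

```python
import sys


def find_caller_frame(
    marker: str = '__runtimeframe__',
    max_depth: int = 32,
):
    '''
    Scan up the call stack for a frame marked with `marker` and
    return the frame which called it, or `None` if not found.

    '''
    frame = sys._getframe(1)
    depth = 0
    while frame is not None and depth < max_depth:
        if marker in frame.f_locals:
            return frame.f_back
        frame = frame.f_back
        depth += 1
    return None


def api_func():
    __runtimeframe__ = 1  # marks this frame as an "API" frame
    return _internal()


def _internal():
    caller = find_caller_frame()
    return caller.f_code.co_name if caller else None


def user_code():
    return api_func()
```

Here `user_code()` gets its own function name back since its frame is the one directly above the marked `api_func()` frame.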
2025-03-20 23:22:45 -04:00
Tyler Goodlet ce6974690b Relay `SIGUSR1` to subactors for `stackscope` tracing
Since obvi we don't want to just only see the trace in the root most of
the time ;)

Currently the sig keeps firing twice in the root though, and i'm not
sure why yet..
2025-03-20 23:22:45 -04:00
Tyler Goodlet 972325a28d Add default rtv for `use_greeback: bool = False` 2025-03-20 23:22:45 -04:00
Tyler Goodlet b4f890bd58 Flip to `.pause()` in subactor bp example 2025-03-20 23:22:45 -04:00
Tyler Goodlet e2fa5a4d05 Add `maybe_enable_greenback: bool` flag to `open_root_actor()` 2025-03-20 23:22:45 -04:00
Tyler Goodlet 2f4c019f39 Hide `._entry`/`._child` frames, tweak some more type annots 2025-03-20 23:22:45 -04:00
Tyler Goodlet 2b1dbcb541 TO-CHERRY: Error on `breakpoint()` without `debug_mode=True`?
Not sure if this is a good tactic (yet) but it at least covers us from
getting users confused by `breakpoint()` usage causing REPL clobbering.
Always set an explicit rte raising breakpoint hook such that the user
realizes they can't use `.pause_from_sync()` without enabling debug
mode.

** CHERRY-PICK into `pause_from_sync_w_greenback` branch! **
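
A minimal sketch of an "always raise unless debug mode is on" hook, using the stdlib `sys.breakpointhook` override point. The function names and error text are hypothetical; the real hook lives in the runtime and is not reproduced here:

```python
import sys


def _no_debug_mode_hook(*args, **kwargs):
    '''Replacement hook which refuses to enter a REPL.'''
    raise RuntimeError(
        'breakpoint() was called without debug mode enabled; '
        'enable it before pausing from sync code'
    )


def install_breakpoint_guard(debug_mode: bool):
    '''
    Install the raising hook when `debug_mode` is off.

    Returns the previously installed hook so callers can restore it.

    '''
    prev_hook = sys.breakpointhook
    if not debug_mode:
        sys.breakpointhook = _no_debug_mode_hook
    return prev_hook
```

Since the builtin `breakpoint()` just calls `sys.breakpointhook()`, swapping the hook is enough to turn an accidental REPL entry into a loud error.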
2025-03-20 23:22:45 -04:00
Tyler Goodlet 49ebdc2e6a Oof, fix walrus assign causes name-error edge case
Only warn log on a non-`trio` async lib when in the main thread to
avoid a name error when in the non-`asyncio` non-main-thread case.

=> To cherry into the `.pause_from_sync()` feature branch.
2025-03-20 23:22:45 -04:00
Tyler Goodlet daf37ed24c Provision for infected-`asyncio` debug mode support
It's **almost** there, we're just missing the final translation code to
get from an `asyncio` side task to be able to call
`.devx._debug.wait_for_parent_stdin_hijack()` to do root actor TTY
locking. Then we just need to ensure internals also do the right thing
with `greenback()` for equivalent sync `breakpoint()` style pause
points.

Since i'm deferring this until later, tossing in some xfail tests to
`test_infected_asyncio` with TODOs for the needed implementation as well
as eventual test org.

By "provision" it means we add:
- `greenback` init block to `_run_asyncio_task()` when debug mode is
  enabled (but which will currently rte when `asyncio` is detected)
  using `.bestow_portal()` around the `asyncio.Task`.
- a call to `_debug.maybe_init_greenback()` in the `run_as_asyncio_guest()`
  guest-mode entry point.
- as part of `._debug.Lock.is_main_trio_thread()` whenever the async-lib
  is not 'trio' error log the backend name (which is obvi `'asyncio'`
  in this use case).
2025-03-20 22:37:51 -04:00
Tyler Goodlet 0701874033 Drop extra newline from log msg 2025-03-20 22:37:51 -04:00
Tyler Goodlet 4621c8c1b9 Change all `| None` -> `|None` in `._runtime` 2025-03-20 22:37:51 -04:00
Tyler Goodlet a69f1a61a5 Add todo-notes for hiding `@acm` frames
In the particular case of the `Portal.open_context().__aexit__()` frame,
due to usage of `contextlib.asynccontextmanager`, we can't easily hook
into monkeypatching a `__tracebackhide__` set nor catch-n-reraise around
the block exit without defining our own `.__aexit__()` impl. Thus, it's
prolly most sane to do something with an override of
`contextlib._AsyncGeneratorContextManager` or the public exposed
`AsyncContextDecorator` (which uses the former internally right?).

Also fixup some old `._invoke` mod paths in comments and just show
`str(eoc)` in `.open_stream().__aexit__()` terminated-by-EoC log msg
since the `repr()` form won't pprint the IPC msg nicely..
2025-03-20 22:37:51 -04:00
Tyler Goodlet 0c9e1be883 Tweak main thread predicate to ensure `trio.run()`
Change the name to `Lock.is_main_trio_thread()` indicating that when
`True` the thread is both the main one **and** the one that called
`trio.run()`. Add a todo for just copying the
`trio._util.is_main_thread()` impl (since it's private / may change) and
some brief notes about potential usage of
`trio.from_thread.check_cancelled()` to detect non-`.to_thread` thread
spawns.
2025-03-20 22:37:51 -04:00
Tyler Goodlet 8731ab3134 Refine and test `tractor.pause_from_sync()`
Now supports use from any `trio` task, any sync thread started with
`trio.to_thread.run_sync()` AND also via `breakpoint()` builtin API!
The only bit missing now is support for `asyncio` tasks when in infected
mode.. Bo

`greenback` setup/API adjustments:
- move `._rpc.maybe_import_gb()` to -> `devx._debug` and factor out the cached
  import checking into a sync func whilst placing the async `.ensure_portal()`
  bootstrapping into a new async `maybe_init_greenback()`.
- use the new init-er func inside `open_root_actor()` with the output
  predicating whether we override the `breakpoint()` hook.

core `devx._debug` implementation deatz:
- make `mk_mpdb()` only return the `pdbp.Pdb` subtype instance since
  the sigint unshielding func is now accessible from the `Lock`
  singleton from anywhere.

- add non-main thread support (at least for `trio.to_thread` use cases)
  to our `Lock` with a new `.is_trio_thread()` predicate that delegates
  directly to `trio`'s internal version.

- do `Lock.is_trio_thread()` checks inside any methods which require
  special provisions when invoked from a non-main `trio` thread:
  - `.[un]shield_sigint()` methods since `signal.signal` usage is only
    allowed from cpython's main thread.
  - `.release()` since `trio.StrictFIFOLock` can only be called from
    a `trio` task.

- rework `.pause_from_sync()` itself to directly call `._set_trace()`
  and don't bother with `greenback._await()` when we're already calling
  it from a `.to_thread.run_sync()` thread, oh and try to use the
  thread/task name when setting `Lock.local_task_in_debug`.

- make it an RTE for now if you try to use `.pause_from_sync()` from any
  infected-`asyncio` task, but support is (hopefully) coming soon!

For testing we add a new `test_debugger.py::test_pause_from_sync()`
which includes a ctrl-c parametrization around the
`examples/debugging/sync_bp.py` script which includes all currently
supported/working usages:
- `tractor.pause_from_sync()`.
- via `breakpoint()` overload.
- from a `trio.to_thread.run_sync()` spawn.
2025-03-20 22:37:51 -04:00
Tyler Goodlet b38ff36e04 First draft workin minus non-main-thread usage! 2025-03-20 22:37:51 -04:00
goodboy 819889702f
Merge pull request from goodboy/remote_inceptions
Remote inceptions: improved `RemoteActorError` boxing of inter-actor exceptions
2025-03-20 22:37:00 -04:00
Tyler Goodlet a36ee01592 Add missing `consider_namespace_packages=False,` to `import_path()` 2025-03-20 20:58:56 -04:00
Tyler Goodlet dd9fe0b043 Add `tests/__init__.py` for `.conftest` imports
I must have had a local touched file but never committed or something?
Seems that new `pytest` requires a top level `tests` pkg in order for
relative `.conftest` imports to work.
2025-03-20 20:53:54 -04:00
Tyler Goodlet e10ab9741d Lul, don't overwrite 'tb_str' with src actor's..
This is what was breaking the nested debugger test (where it was failing
on the traceback content matching) and it makes sense.. XD
=> We always want to use the locally boxed `RemoteActorError`'s
traceback content NOT overwrite it with that from the src actor..

Also gets rid of setting the `'relay_uid'` since it's pulled from the
final element in the `'relay_path'` anyway.
2025-03-20 20:35:02 -04:00
Tyler Goodlet 91a970091f Extend inter-peer cancel tests for "inceptions"
Use new `RemoteActorError` fields in various assertions particularly
ensuring that an RTE relayed through the spawner from the little_bro
shows up at the client with the right number of entries in the
`.relay_path` and that the error is raised in the client as desired in
the original use case from `modden`'s remote spawn request API
(which was kinda the whole original motivation to finally get all this
multi-actor error relay stuff workin).

Case extensions:
- RTE relayed from little_bro through spawner to client when
  `raise_sub_spawn_error_after` is set; in this case test should raise
  the relayed and RAE boxed RTE right up to the `trio.run()`.
  -> ensure the `rae.src_uid`, `.relay_uid` are set correctly.
  -> ensure ctx cancels are not acked.
- use `expect_ctxc()` around root's `tell_little_bro()` usage.
- do `debug_mode` assertions when enabled by test harness in each actor
  layer.
- obvi use new `.src_type`/`.boxed_type` for final error propagation
  assertions.
2025-03-20 20:35:02 -04:00
Tyler Goodlet 5bf550b64a Adjust all `RemoteActorError.type` using tests
To instead use the new `.boxed_type` B)
2025-03-20 20:35:02 -04:00
Tyler Goodlet a3a3d0b8cb Fix `.boxed_type` facepalm, drop `.src_actor_uid`
The misname of `._boxed_type` as `._src_type` was only manifesting as
a really strange boxing error with a packed exception-group, not sure
how or why only that but it's fixed now XD

Start refining/cleaning out stuff for sure we don't need (based on
multiple local test runs):

- discard `.src_actor_uid` fully since test set has been moved over to
  `.src_uid`; this means also removing the `.msgdata` insertion from
  `pack_error()`; a patch to all internals is coming next obvi!

- don't pass `boxed_type` to `RemoteActorError.__init__()` from
  `unpack_error()` since it's now set directly via the
  `.msgdata["boxed_type_str"]`/`error_msg: dict` input, but in the case
  where **it is passed as an arg** (only for ctxc in `._rpc._invoke()`
  rn) make sure we only do the `.__init__()` insert when `boxed_type is
  not None`.
2025-03-20 20:35:02 -04:00
Tyler Goodlet c1e0328669 First try "relayed boxed errors", or "inceptions"
Since adding more complex inter-peer (actor) testing scenarios, we
definitely have an immediate need for `trio`'s style of "inceptions" but
for nesting `RemoteActorError`s as they're relayed through multiple
actor-IPC hops. So for example, a remote error relayed "through" some
proxy actor to another ends up packing a `RemoteActorError` into another
one such that there are 2 layers of RAEs with the first
containing/boxing an original src actor error (type).

In support of this extension to `RemoteActorError` we add:

- `get_err_type()` error type resolver helper (factored from the
  body of `unpack_error()`) to be used whenever rendering
  `.src_type`/`.boxed_type`.

- `.src_type_str: str` which is pulled from `.msgdata` and holds the
  above (eventually when unpacked) type as `str`.
- `._src_type: BaseException|None` for the original
  "source" actor's error as unpacked in any remote (actor's) env and
  exposed as a readonly property `.src_type`.

- `.boxed_type_str: str` the same as above but for the "last" boxed
  error's type; when the RAE is unpacked at its first hop this will
  be **the same as** `.src_type_str`.
- `._boxed_type: BaseException` which now similarly should be "rendered"
  from the above type-`str` field instead of being passed in as an error-type
  via `boxed_type` (though we still do for the ctxc case atm, see
  notes).
 |_ new sanity checks in `.__init__()` mostly as a reminder to handle
   that ^ ctxc case ^ more elegantly at some point..
 |_ obvi we discard the previous `suberror_type` input arg.

- fully remove the `.type`/`.type_str` properties instead expecting
  usage of `.boxed_/.src_` equivalents.
- start deprecation of `.src_actor_uid` and make it delegate to new
  `.src_uid`
- add `.relay_uid` property for the last relay/hop's actor uid.
- add `.relay_path: list[str]` which holds the per-hop updated sequence
  of relay actor uid's which consecutively did boxing of an RAE.
- only include `.src_uid` and `.relay_path` in reprol() output.
- factor field-to-str rendering into a new `_mk_fields_str()`
  and use it in `.__repr__()`/`.reprol()`.
- add an `.unwrap()` to (attempt to) render the src error.

- rework `pack_error()` to handle inceptions including,
  - packing the correct field-values for the new `boxed_type_str`, `relay_uid`,
    `src_uid`, `src_type_str`.
  - always updating the `relay_path` sequence with the uid of the
    current actor.

- adjust `unpack_error()` to match all these changes,
  - pulling `boxed_type_str` and passing any resolved `boxed_type` to
    `RemoteActorError.__init__()`.
  - use the new `Context.maybe_raise()` convenience method.

Adjust `._rpc` packing to `ContextCancelled(boxed_type=trio.Cancelled)`
and tweak some more log msg formats.
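
A minimal sketch of how the per-hop fields above could evolve as an error is relayed, using plain `dict`s instead of the real `RemoteActorError`/`pack_error()` machinery. `box_error()` is a hypothetical helper, and including the source actor as the first `relay_path` entry is an assumption based on "always updating the `relay_path` sequence with the uid of the current actor":

```python
from typing import Union


def box_error(
    err: Union[BaseException, dict],
    actor_uid: str,
) -> dict:
    '''
    Box a local exception, or re-box an already boxed error msg,
    as done by the actor `actor_uid` at the current hop.

    '''
    if isinstance(err, dict):
        # relayed case: keep the src info, extend the relay path
        return {
            'src_uid': err['src_uid'],
            'src_type_str': err['src_type_str'],
            'boxed_type_str': 'RemoteActorError',
            'relay_path': [*err['relay_path'], actor_uid],
        }

    # first hop: boxed and src types are the same
    type_name = type(err).__name__
    return {
        'src_uid': actor_uid,
        'src_type_str': type_name,
        'boxed_type_str': type_name,
        'relay_path': [actor_uid],
    }


first_hop = box_error(RuntimeError('boom'), 'little_bro')
second_hop = box_error(first_hop, 'spawner')
```

In this sketch the last relay's uid is always `relay_path[-1]`, mirroring the note that the relay uid can be pulled from the final element of the path.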
2025-03-20 20:35:02 -04:00
Tyler Goodlet cfb74e588d Get remaining suites passing..
..by ensuring `reg_addr` fixture value passthrough to subactor eps
2025-03-20 20:35:02 -04:00
goodboy 3d2b6613e8
Merge pull request from goodboy/multihomed
Multihomed transport (server) addrs 🕶️
2025-03-20 20:34:13 -04:00
Tyler Goodlet 2b124447c8 Unmask `pytest.ini` log-capture lines (again) 2025-03-20 19:50:31 -04:00
Tyler Goodlet 5ffdda762a More spaceless union type annots 2025-03-20 19:50:31 -04:00
Tyler Goodlet 9082efbe68 Add a `._state._runtime_vars['_registry_addrs']`
Such that it's set to whatever `Actor.reg_addrs: list[tuple]` is during
the actor's init-after-spawn guaranteeing each actor has at least the
registry infos from its parent. Ensure we read this if defined over
`_root._default_lo_addrs` in `._discovery` routines, namely
`.find_actor()` since it's the one API normally used without expecting
the runtime's `current_actor()` to be up.

Update the latest inter-peer cancellation test to use the `reg_addr`
fixture (and thus test this new runtime-vars value via `find_actor()`
usage) since it was failing if run *after* the infected `asyncio` suite
due to registry contact failure.
2025-03-20 19:50:31 -04:00
Tyler Goodlet 14f34c111a `_root`: drop unused `typing` import 2025-03-20 19:50:31 -04:00
Tyler Goodlet f947bdf80c Use `import <name> as <name>,` style over `__all__` in pkg mod 2025-03-20 19:50:31 -04:00
Tyler Goodlet dbd79d8beb Log chan-server-startup failures via `.exception()` 2025-03-20 19:50:31 -04:00
Tyler Goodlet 15a4a2a51e `.discovery.get_arbiter()`: add warning around this now deprecated usage 2025-03-20 19:50:31 -04:00
Tyler Goodlet ebf9909cc4 Add `open_root_actor(ensure_registry: bool)`
Allows forcing the opened actor to either obtain the passed registry
addrs or raise a runtime error.
2025-03-20 19:50:31 -04:00
Tyler Goodlet 2d541fdd9b Fix doc string "its" typo.. 2025-03-20 19:50:31 -04:00
Tyler Goodlet 5f0bfeae57 Test with `any(portals)` since `gather_contexts()` will return `list[None | tuple]` 2025-03-20 19:50:31 -04:00
Tyler Goodlet 8b0b4abb3c Change remaining internals to use `Actor.reg_addrs` 2025-03-20 19:50:31 -04:00
Tyler Goodlet 51bd38976f Expose per-actor registry addrs via `.reg_addrs`
Since it's handy to be able to debug the *writing* of this instance var
(particularly when checking state passed down to a child in
`Actor._from_parent()`), rename and wrap the underlying
`Actor._reg_addrs` as a settable `@property` and add validation to
the `.setter` for sanity - actor discovery is a critical functionality.

Other tweaks:
- fix `.cancel_soon()` to pass expected argument..
- update internal runtime error message to be simpler and link to GH issues.
- use new `Actor.reg_addrs` throughout core.
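
A minimal sketch of the settable-property-with-validation pattern described above. `ActorSketch` and the specific checks are assumptions for illustration; the real setter's validation rules are not shown in this log:

```python
class ActorSketch:
    '''Hypothetical stand-in showing a validated `.reg_addrs` property.'''

    def __init__(self, registry_addrs: list):
        self._reg_addrs: list = []
        # route through the setter so validation always runs
        self.reg_addrs = registry_addrs

    @property
    def reg_addrs(self) -> list:
        return self._reg_addrs

    @reg_addrs.setter
    def reg_addrs(self, addrs: list) -> None:
        if not addrs:
            raise ValueError('At least one registry addr is required')

        for addr in addrs:
            if not (
                isinstance(addr, tuple)
                and len(addr) == 2
                and isinstance(addr[0], str)
                and isinstance(addr[1], int)
            ):
                raise ValueError(f'Invalid registry addr: {addr!r}')

        self._reg_addrs = list(addrs)
```

Funneling every write through one setter also gives a single spot to drop a breakpoint when debugging who sets the value.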
2025-03-20 19:50:31 -04:00
Tyler Goodlet 4868bf225c Always dynamically re-read the `._root._default_lo_addrs` value in `find_actor()` 2025-03-20 19:50:31 -04:00
Tyler Goodlet f834b35aa9 Ensure `registry_addrs` is always set to something 2025-03-20 19:50:31 -04:00
Tyler Goodlet 6d671f69b8 Rename fixture `arb_addr` -> `reg_addr` and set the session value globally as `._root._default_lo_addrs` 2025-03-20 19:50:31 -04:00
Tyler Goodlet 94c89fd425 Facepalm, `wait_for_actor()` dun take an addr `list`.. 2025-03-20 19:50:31 -04:00
Tyler Goodlet 0246c824b9 ._root: set a `_default_lo_addrs` and apply it when not provided by caller 2025-03-20 19:50:31 -04:00
Tyler Goodlet 2e17b084b2 Always set default reg addr in `find_actor()` if not defined 2025-03-20 19:50:31 -04:00
Tyler Goodlet 61d82d47c2 Oof, default reg addrs needs to be in `list[tuple]` form.. 2025-03-20 19:50:31 -04:00
Tyler Goodlet 7246749137 Add post-mortem catch around failed transport addr binds to aid with runtime debugging 2025-03-20 19:50:31 -04:00
Tyler Goodlet 4db377c01d Rename to `parse_maddr()` and fill out doc strings 2025-03-20 19:50:31 -04:00
Tyler Goodlet ef4c4be0bb Add libp2p style "multi-address" parser from `piker`
Details are in the module docs; this is a first draft with lotsa room
for refinement and extension.
2025-03-20 19:50:31 -04:00
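For readers unfamiliar with libp2p-style multi-addresses, a very reduced parsing sketch might look like the following. The function name `parse_maddr_sketch` is hypothetical, and the sketch assumes every protocol segment carries exactly one value, which real multi-addresses do not guarantee.

```python
def parse_maddr_sketch(maddr: str) -> dict[str, str]:
    """Split '/proto/value/proto/value' into a {proto: value} mapping."""
    parts = maddr.strip('/').split('/')
    if len(parts) % 2:
        raise ValueError(f'unbalanced multi-address: {maddr!r}')
    # pair each protocol name with the value that follows it
    return dict(zip(parts[0::2], parts[1::2]))


layers = parse_maddr_sketch('/ip4/127.0.0.1/tcp/1616')
```

Here `layers` maps `'ip4'` to the host string and `'tcp'` to the port string, in the order they appear in the address.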
Tyler Goodlet 7ce4bc489e Init-support for "multi homed" transports
Since we'd like to eventually allow a diverse set of transport
(protocol) methods and stacks, and a multi-peer discovery system for
distributed actor-tree applications, this reworks all runtime internals
to support multi-homing for any given tree on a logical host. In other
words any actor can now bind its transport server (currently only
unsecured TCP + `msgspec`) to more than one address available in its
(linux) network namespace. Further, registry actors (now dubbed
"registrars" instead of "arbiters") can also similarly bind to multiple
network addresses and provide discovery services to remote actors via
multiple addresses which can now be provided at runtime startup.

Deats:
- adjust `._runtime` internals to use a `list[tuple[str, int]]` (and
  thus pluralized) socket address sequence where applicable for transport
  server socket binds, now exposed via `Actor.accept_addrs`:
  - `Actor.__init__()` now takes a `registry_addrs: list`.
  - `Actor.is_arbiter` -> `.is_registrar`.
  - `._arb_addr` -> `._reg_addrs: list[tuple]`.
  - always reg and de-reg from all registrars in `async_main()`.
  - only set the global runtime var `'_root_mailbox'` to the loopback
    address since normally all in-tree processes should have access to
    it, right?
  - `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]`
- make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]`
  and defaults when not passed.
- change `ActorNursery.start_..()` methods to take `bind_addrs: list` and
  pass down through the spawning layer(s) via the parent-seed-msg.
- generalize all `._discovery()` APIs to accept `registry_addrs`-like
  inputs and move all relevant subsystems to adopt the "registry" style
  naming instead of "arbiter":
  - make `find_actor()` support batched concurrent portal queries over
    all provided input addresses using `.trionics.gather_contexts()` Bo
  - syntax: move to using `async with <tuples>` 3.9+ style chained
    @acms.
  - a general modernization of the code to a python 3.9+ style.
  - start deprecation and change to "registry" naming / semantics:
    - `._discovery.get_arbiter()` -> `.get_registry()`
2025-03-20 19:50:31 -04:00
Tyler Goodlet dec2b1f0f5 Reapply "Port all tests to new `reg_addr` fixture name"
This reverts-the-revert of commit
bc13599e1f which was needed to land pre
`multihomed` feat branch history.
2025-03-20 19:50:31 -04:00
goodboy 3ccbfd7e54
Merge pull request from goodboy/devx_subpkg
Start `tractor.devx` sub-pkg
2025-03-20 19:48:42 -04:00
Tyler Goodlet 8d318a8ac5 Flip a last `MultiError` to a beg, add todo on `@stream` func 2025-03-20 15:07:27 -04:00
Tyler Goodlet d5eec6eb6c Re-revert back to `.devx` subpkg after rebase.. 2025-03-20 15:07:27 -04:00
Tyler Goodlet a88564549a Yahh, add `.devx` package to installed subpkgs.. 2025-03-20 15:07:27 -04:00
Tyler Goodlet f028181e19 Add `stackscope` as dep, drop legacy `pdb` issue cruft 2025-03-20 15:07:27 -04:00
Tyler Goodlet 3a317c1581 Enable `stackscope` render via root in debug mode
If `stackscope` is importable and `debug_mode` is enabled then we by
default call `.devx.enable_stack_on_sig()` and report that it is set B)

This makes debugging unexpected (SIGINT ignoring) hangs a cinch!
2025-03-20 15:07:27 -04:00
Tyler Goodlet 65e49696e7 Woops, fix `_post_mortem()` type sig..
We're passing an `extra_frames_up_when_async=2` now (from prior attempt
to hide `CancelScope.__exit__()` when `shield=True`) and thus both
`debug_func`s must accept it 🤦

On the brighter side found out that the `TypeError` from the call-sig
mismatch was actually being swallowed entirely so add some
`.exception()` msgs for such cases to at least alert the dev they broke
stuff XD
2025-03-20 15:07:27 -04:00
Tyler Goodlet e834297503 Add `shield: bool` support to `.pause()`
It's been on the todo for a while and I've given up trying to properly
hide the `trio.CancelScope.__exit__()` frame for now instead opting to
just `log.pdb()` a big apology XD

Users can obvi still just not use the flag and wrap `tractor.pause()` in
their own cs block if they want to avoid having to hit `'up'` in the pdb
REPL if needed in a cancelled task-scope.

Impl deatz:
- factor orig `.pause()` impl into new `._pause()` so that we can more tersely
  wrap the original content depending on `shield: bool` input; only open
  the cancel-scope when shield is set to avoid aforementioned extra stack
  frame annoyance.
- pass through `shield` to underlying `_pause` and `debug_func()` so we
  can actually know when to log our apology.
- add a buncha notes to new `.pause()` wrapper regarding the inability
  to hide the cancel-scope `.__exit__()`, including that overriding the
  code in `trio._core._run.CancelScope` doesn't seem to solve the issue
  either..

Unrelated `maybe_wait_for_debugger()` tweaks:
- don't read `Lock.global_actor_in_debug` more than needed, rename local
  read var to `in_debug` (since it can also hold the root actor uid, not
  just sub-actors).
- shield the `await debug_complete.wait()` since ideally we avoid the
  root cancelling child-actors in debug even when the root calls this
  func in a cancelled scope.
2025-03-20 15:07:27 -04:00
Tyler Goodlet e3bb9c914c Mk debugger tests work for arbitrary pre-REPL format
Since this was changed as part of overall project wide logging format
updates, and i ended up changing both the crash and pause `.pdb()`
msgs to include some multi-line-ascii-"stuff", might as well make the
pre-prompt checks in the test suite more flexible to match.

As such, this exposes 2 new constants inside the `.devx._debug` mod:
- `._pause_msg: str` for the pre `tractor.pause()` header emitted via
  `log.pdb()` and,
- `._crash_msg: str` for the pre `._post_mortem()` equiv when handling
  errors in debug mode.

Adjust the test suite to use these values and thus make us more capable
to absorb changes in the future as well:
- add a new `in_prompt_msg()` predicate, very similar to `assert_before()`
  but minus `assert`s which takes in a `parts: list[str]` to match
  in the pre-prompt stdout.
- delegate to `in_prompt_msg()` in `assert_before()` since it was mostly
  duplicate minus `assert`.
- adjust all previous `<patt> in before` asserts to instead use
  `in_prompt_msg()` with separated pre-prompt-header vs. actor-name
  `parts`.
- use new `._pause/crash_msg` values in all such calls including any
  `assert_before()` cases.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 526add2cae Support `maybe_wait_for_debugger(header_msg: str)`
Allow callers to stick in a header to the `.pdb()` level emitted msg(s)
such that any "waiting status" content is only shown if the caller
actually gets blocked waiting for the debug lock; use it inside the
`._spawn` sub-process reaper call.

Also, return early if `Lock.global_actor_in_debug == None` and thus
only enter the poll loop when actually needed, consequently raise
if we fall through the loop without acquisition.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 1fb4d7318b Fix `.devx.maybe_wait_for_debugger()` polling deats
When entered by the root actor avoid excessive polling cycles by,
- blocking on the `Lock.no_remote_has_tty: trio.Event` and breaking
  *immediately* when set (though we should really also lock
  it from the root right?) to avoid extra loops..
- shielding the `await trio.sleep(poll_delay)` call to avoid any local
  cancellation causing the (presumably root-actor task) caller to move
  on (possibly to cancel its children) and instead to continue
  poll-blocking until the lock is actually released by its user.
- `break` the poll loop immediately if no remote locker is detected.
- use `.pdb()` level for reporting lock state changes.

Also add a #TODO to handle calls by non-root actors as it pertains to
2025-03-20 15:07:27 -04:00
Tyler Goodlet 199ca48cc4 Add `stackscope` tree pprinter triggered by SIGUSR1
Can be optionally enabled via a new `enable_stack_on_sig()` which will
swap in the SIGUSR1 handler. Much thanks to @oremanj for writing this
amazing project, it's thus far helped me fix some very subtle hangs
inside our new IPC-context cancellation machinery that would have
otherwise taken much more manual pdb-ing and hair pulling XD

Full credit for `dump_task_tree()` goes to the original project author
with some minor tweaks as was handed to me via the trio-general matrix
room B)

Slight changes from orig version:
- use a `log.pdb()` emission to pprint to console
- toss in an ex sh CLI cmd to trigger the dump from another terminal
  using `kill` + `pgrep`.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 5b3bcbaa7d Only use `greenback` if actor-runtime is up.. 2025-03-20 15:07:27 -04:00
Tyler Goodlet 8647421ef9 Ignore `greenback` import error if not installed 2025-03-20 15:07:27 -04:00
Tyler Goodlet ba9448d52f Change old `._debug._pause()` name, cherry to re `greenback` 2025-03-20 15:07:27 -04:00
Tyler Goodlet f5c35dca55 Runtime import `.get_root()` in stdin hijacker to avoid import cycle 2025-03-20 15:07:27 -04:00
Tyler Goodlet cebc2cb515 Ignore kbis in `open_crash_handler()` by default 2025-03-20 15:07:27 -04:00
Tyler Goodlet 5042f1fdb8 Comment all `.pause(shield=True)` attempts again, need to solve cancel scope `.__exit__()` frame hiding issue.. 2025-03-20 15:07:27 -04:00
Tyler Goodlet 5912fecdc9 Add shielding support to `.pause()`
Implement it like you'd expect using simply a wrapping
`trio.CancelScope` which is itself shielded by the input `shield: bool`
B)

There's seemingly still some issues with the frame selection when the
REPL engages and not sure how to resolve it yet but at least this does
indeed work for practical purposes. Still needs a test obviously!
2025-03-20 15:07:27 -04:00
Tyler Goodlet cca4f952ed Move `maybe_open_crash_handler()` CLI `--pdb`-driven wrapper to debug mod 2025-03-20 15:07:27 -04:00
Tyler Goodlet ab0c0fb71d Start `.devx.cli` extensions for pop CLI frameworks
Starting off with just a `typer` (and thus transitively `click`)
`typer.Typer.callback` hook which allows passthrough of the `--ll
<loglevel: str>` and `--pdb <debug_mode: bool>` flags for use when
building CLIs that use the runtime Bo

Still needs lotsa refinement and obviously better docs but, the doc
string for `load_runtime_vars()` shows how to use the underlying
`.devx._debug.open_crash_handler()` via a wrapper that can be passed the
`--pdb` flag and then enable debug mode throughout the entire actor
system.
2025-03-20 15:07:27 -04:00
Tyler Goodlet b00ba158f1 Kick off `.devx` subpkg for our dev tools B)
Where `.devx` is "developer experience", a hopefully broad enough subpkg
name for all the slick stuff planned to augment working on the actor
runtime 💥

Move the `._debug` module into the new subpkg and adjust rest of core
code base to reflect import path change. Also add a new
`.devx._debug.open_crash_handler()` manager for wrapping any sync code
outside a `trio.run()` which is handy for eventual CLI addons for
popular frameworks like `click`/`typer`.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 93f489e263 Expose `Channel` @ pkg level, drop `_debug.pp()` alias 2025-03-20 15:07:27 -04:00
Tyler Goodlet fa5f458de0 Move `.to_asyncio` to modern optional value type annots 2025-03-20 15:07:27 -04:00
Tyler Goodlet 6de4a5a9f3 Map `breakpoint()` built-in to new `.pause_from_sync()` ep 2025-03-20 15:07:27 -04:00
Tyler Goodlet ab8bd9b787 Fix frame-selection display on first REPL entry
For whatever reason pdb(p), and in general, will show the frame of the
*next* python instruction/LOC on initial entry (at least using
`.set_trace()`), as such remove the `try/finally` block in the sync
code entrypoint `.pause_from_sync()`, and also since it doesn't seem like
we really need it anyway.

Further, and to this end:
- enable hidden frames support in our default config.
- fix/drop/mask all the frame ref-ing/mangling we had prior since it's no
  longer needed as well as manual `Lock` releasing which seems to work
  already by having the `greenback` spawned task do its normal thing?
- move to no `Union` type annots.
- hide all frames that can add "this is the runtime confusion" to
  traces.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 1deed8dbee ._runtime: log level tweaks, use crit for stale debug lock detection 2025-03-20 15:07:27 -04:00
Tyler Goodlet 36d2aa1852 Add longer "required reading" list B) 2025-03-20 15:07:27 -04:00
Tyler Goodlet f0417d802b First proto: use `greenback` for sync func breakpointing
This works now for supporting a new `tractor.pause_from_sync()`
`tractor`-aware-replacement for `Pdb.set_trace()` from sync functions
which are also scheduled from our runtime. Uses `greenback` to do all
the magic of scheduling the bg `tractor._debug._pause()` task and
engaging the normal TTY locking machinery triggered by `await
tractor.breakpoint()`

Further this starts some public API renaming, making a switch to
`tractor.pause()` from `.breakpoint()` which IMO much better expresses
the semantics of the runtime intervention required to suffice
multi-process "breakpointing"; it also is an alternate name for the same
in computer science more generally: https://en.wikipedia.org/wiki/Breakpoint
It also avoids using the same name as the `breakpoint()` built-in which
is important since there **is a lot more going on** when you call our
equivalent API.

Deats of that:
- add deprecation warning for `tractor.breakpoint()`
- add `tractor.pause()` and a shorthand, easier-to-type, alias `.pp()`
  for "pause-point" B)
- add `pause_from_sync()` as the new `breakpoint()`-from-sync-function
  hack which does all the `greenback` stuff for the user.

Still TODO:
- figure out where in the runtime and when to call
  `greenback.ensure_portal()`.
- fix the frame selection issue where
  `trio._core._ki._ki_protection_decorator:wrapper` seems to be always
  shown on REPL start as the selected frame..
2025-03-20 15:07:27 -04:00
Tyler Goodlet 62a0fff2fd Add a debug-mode-breakpoint-causes-hang case!
Only found this by luck more or less (while working on something in
a client project) and it turns out we can actually get to (yet another)
hang state where SIGINT will be ignored by the root actor on teardown..

I've added all the necessary logic flags to reproduce. We obviously need
a follow up bug issue and a test suite to replicate!

It appears as though the following are required based on very light
tinkering:
- infected asyncio mode active
- debug mode active
- the `trio` context must breakpoint *before* `.started()`-ing
- the `asyncio` must **not** error
2025-03-20 15:07:26 -04:00
Tyler Goodlet d65e4bbad7 Add (first-draft) infected-`asyncio` actor task uses debugger example 2025-03-20 15:07:26 -04:00
goodboy ee372933a7
Merge pull request from goodboy/ctx_cancel_semantics_and_overruns
`Context` semantics for cross-actor-task cancellation and overruns
2025-03-20 15:06:57 -04:00
Tyler Goodlet 96cdcd8f39 Pin to exact `trio` version that still has loose egs 2025-03-20 14:24:21 -04:00
Tyler Goodlet bc13599e1f Revert "Port all tests to new `reg_addr` fixture name"
This reverts commit 715348c5c2.
2025-03-19 15:34:30 -04:00
Tyler Goodlet 54576851e9 Add a `debug_mode: bool` fixture via `--tpdb` flag
Allows tests (including any `@tractor_test`s) to subscribe to a CLI flag
`--tpdb` (for "tractor python debugger") which the session can provide
to tests which can then proxy the value to `open_root_actor()` (via
`open_nursery()`) when booting the runtime - thus enabling our debug
mode globally to any subscribers B)

This is real handy if you have some failures but can't determine the
root issue without jumping into a `pdbp` REPL inside a (sub-)actor's
spawned-task.
2025-03-19 15:34:30 -04:00
Tyler Goodlet 2a5ff82061 Only run CI on py3.11 2025-03-19 15:34:30 -04:00
Tyler Goodlet f2d3f0cc21 Backport skipping `examples/multihost/` in tests
This was actually fixed on a downstream dev branch (adding py3.13
support i think?); so backport it here to get us running again on 3.11.
2025-03-19 15:34:30 -04:00
Tyler Goodlet 6b282bfa06 Add `._testing` as subpkg.. 2025-03-19 15:34:30 -04:00
Tyler Goodlet 11bab13a06 Various adjustments to fix breakage after rebase
- Remove `exceptiongroup` import,
- pin to py 3.11 in `setup.py`
- revert any lingering `tractor.devx` imports; sub-pkg is coming in
  a downstream PR!
- remove weird double `@property` lingering from conflict reso..
- modern `pytest` requires conftest mods to be relative imported.
2025-03-19 15:30:59 -04:00
Tyler Goodlet 9a8cd13894 Another cancel-req-invalid log msg fmt tweak 2025-03-16 16:06:26 -04:00
Tyler Goodlet 3706abca71 Adjust advanced faults test(s) for absorbed EoCs
More or less just simplifies to not seeing the stream closure errors and
instead expecting KBIs from the simulated user who 'ctl-cs after hang'.

Toss in a little `stuff_hangin_ctlc()` to the script to wrap all that
and always check stream closure before sending the final KBI.
2025-03-16 16:06:26 -04:00
Tyler Goodlet 771fc33801 Absorb EoCs via `Context.open_stream()` silently
I swear long ago it used to operate this way but, I guess this finalizes
the design decision. It makes a lot more sense to *not* propagate any
`trio.EndOfChannel` raised from a `Context.open_stream() as stream:`
block when that EoC is due to graceful-explicit stream termination.
We use the EoC much like a `StopAsyncIteration` where the error
indicates termination of the stream due to either:
- reception of a stop IPC msg indicating the far end ended the stream
  (gracefully),
- closure of the underlying `Context._recv_chan` either by the runtime
  or due to user code having called `MsgStream.aclose()`.

User code shouldn't expect to handle EoC outside the block since the
`@acm` having closed should indicate exactly the same lifetime state
(of said stream) ;)

Deats:
- add special EoC handler in `.open_stream()` which silently "absorbs"
  the error only when the stream is already marked as closed (meaning
  the EoC indeed corresponds to IPC closure) with an assert for now
  ensuring the error is the same as set to `MsgStream._eoc`.
- in `MsgStream.receive()` break up the handlers for EoC and
  `trio.ClosedResourceError` since the error instances are saved to
  different variables and we **don't** want to rewrite the exception in
  the eoc case (normally to mask `trio` internals in tbs) bc we need the
  instance to be the exact one for doing checks inside
  `.open_stream().__aexit__()` to absorb it.

Other surrounding "improvements":
- start using the new `Context.maybe_raise()` helper where it can easily
  replace existing equivalent block-sections.
- use new `RemoteActorError.src_uid` as required.
2025-03-16 16:06:26 -04:00
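The "absorb only our own EoC" idea from the commit above can be sketched without any dependencies. Everything here is a hypothetical stand-in (`EndOfChannel` mimics `trio.EndOfChannel`, `FakeStream` mimics a msg stream); the point is only the identity check against the saved exception instance.

```python
import asyncio
from contextlib import asynccontextmanager


class EndOfChannel(Exception):
    """Stand-in for `trio.EndOfChannel` so the sketch needs no deps."""


class FakeStream:
    def __init__(self) -> None:
        self.closed = False
        self._eoc = None

    async def receive(self) -> None:
        # pretend the far end sent a stop msg: mark the stream closed and
        # keep the exact exception instance for the later identity check
        self.closed = True
        self._eoc = EndOfChannel('stream ended')
        raise self._eoc


@asynccontextmanager
async def open_stream():
    stream = FakeStream()
    try:
        yield stream
    except EndOfChannel as eoc:
        # absorb only the EoC that this stream itself recorded;
        # anything else keeps propagating to the caller
        if not (stream.closed and eoc is stream._eoc):
            raise


async def main() -> str:
    async with open_stream() as stream:
        await stream.receive()
    # reached only because the EoC was absorbed on block exit
    return 'absorbed'
```

This also shows why the receive side must not rewrite the exception in the EoC case: a re-wrapped error would fail the `is` check and leak out of the block.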
Tyler Goodlet a87df3009f Drop now-deprecated deps on modern `trio`/Python
- `trio_typing` is nearly obsolete since `trio >= 0.23`
- `exceptiongroup` is built-in to python 3.11
- `async_generator` primitives have lived in `contextlib` for quite
  a while!
2025-03-16 16:06:24 -04:00
Tyler Goodlet 05f28c8728 Pin to `trio>=0.24` to avoid `trio_typing` 2025-03-16 15:52:55 -04:00
Tyler Goodlet 85825cdd76 Add `.trionics._broadcast` todos for py 3.12 2025-03-16 15:52:55 -04:00
Tyler Goodlet a5bc113fde Start a `._rpc` module
Since `._runtime` was getting pretty long (> 2k LOC) and much of the RPC
low-level machinery is fairly isolated to a handful of task-funcs, it
makes sense to re-org the RPC task scheduling and driving msg loop to
its own code space.

The move includes:
- `process_messages()` which is the main IPC business logic.
- `try_ship_error_to_remote()` helper, to box local errors for the wire.
- `_invoke()`, the core task scheduler entrypoint used in the msg loop.
- `_invoke_non_context()`, holds impls for non-`@context` task starts.
- `_errors_relayed_via_ipc()` which does all error catch-n-boxing for
   wire-msg shipment using `try_ship_error_to_remote()` internally.

Also inside `._runtime` improve some `Actor` methods docs.
2025-03-16 15:52:53 -04:00
Tyler Goodlet 4f7823cf55 Move `Portal.open_context()` impl to `._context`
Finally, since normally you need the content from `._context.Context`
and surroundings in order to effectively grok `Portal.open_context()`
anyways, might as well move the impl to the ctx module as
`open_context_from_portal()` and just bind it on the `Portal` class def.

Associated/required tweaks:
- avoid circ import on `.devx` by only importing
  `.maybe_wait_for_debugger()` when debug mode is set.
- drop `async_generator` usage, not sure why this hadn't already been
  changed to `contextlib`?
- use `@acm` alias throughout `._portal`
2025-03-16 15:32:13 -04:00
Tyler Goodlet 544cb40533 Attempt at better internal traceback hiding
Previously i was trying to approach this using lots of
`__tracebackhide__`'s in various internal funcs but since it's not
exactly straight forward to do this inside core deps like `trio` and the
stdlib, it makes a bit more sense to optionally catch and re-raise
certain classes of errors from their originals using `raise from` syntax
as per:
https://docs.python.org/3/library/exceptions.html#exception-context

Deats:
- litter `._context` methods with `__tracebackhide__`/`hide_tb` which
  were previously being shown but that don't need to be to application
  code now that cancel semantics testing is finished up.
- i originally did the same but later commented it all out in `._ipc`
  since error catch and re-raise instead in higher level layers
  (above the transport) seems to be a much saner approach.
- add catch-n-reraise-from in `MsgStream.send()`/`.receive()` to avoid
  seeing the depths of `trio` and/or our `._ipc` layers on comms errors.

Further this patch adds some refactoring to use the
same remote-error shipper routine from both the actor-core in the RPC
invoker:
- rename it as `try_ship_error_to_remote()` and call it from
  `._invoke()` as well as its prior usage.
- make it optionally accept `cid: str`, a `remote_descr: str` and of
  course a `hide_tb: bool`.

Other misc tweaks:
- add some todo notes around `Actor.load_modules()` debug hooking.
- tweak the zombie reaper log msg and timeout value ;)
2025-03-16 15:30:08 -04:00
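As a reminder of what the `raise from` re-raise mentioned above does, here is a minimal sketch. The exception classes and functions are hypothetical and only illustrate the language feature, not the actual `MsgStream` code.

```python
class TransportClosed(Exception):
    """Hypothetical low-level error raised deep in a transport layer."""


class StreamError(Exception):
    """Hypothetical higher-level error shown to application code."""


def _low_level_send(data: bytes) -> None:
    raise TransportClosed('socket went away')


def send(data: bytes) -> None:
    try:
        _low_level_send(data)
    except TransportClosed as err:
        # re-raise a terse, high-level error; the original stays
        # reachable via `__cause__` for anyone who needs the detail
        raise StreamError('failed to send: channel closed') from err
```

Application code then only needs to handle `StreamError`, while the low-level cause is still chained for debugging.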
Tyler Goodlet 389b305d3b Add (back) a `tractor._testing` sub-pkg
Since importing from our top level `conftest.py` is not scalable
or as "future forward thinking" in terms of:
- LoC-wise (it's only one file),
- prevents "external" (aka non-test) example scripts from importing
  content easily,
- seemingly(?) can't be used via abs-import if using
  a `[tool.pytest.ini_options]` in a `pyproject.toml` vs.
  a `pytest.ini`, see:
  https://docs.pytest.org/en/8.0.x/reference/customize.html#pyproject-toml

=> Go back to having an internal "testing" pkg like `trio` (kinda) does.

Deats:
- move generic top level helpers into pkg-mod including the new
  `expect_ctxc()` (which i needed in the advanced faults testing script).
- move `@tractor_test` into `._testing.pytest` sub-mod.
- adjust all the helper imports to be a `from tractor._testing import <..>`

Rework `test_ipc_channel_break_during_stream()` and backing script:
- make test(s) pull `debug_mode` from new fixture (which is now
  controlled manually from `--tpdb` flag) and drop the previous
  parametrized input.
- update logic in ^ test for "which-side-fails" cases to better match
  recently updated/stricter cancel/failure semantics in terms of
  `ClosedResourceError` vs. `EndOfChannel` expectations.
- handle `ExceptionGroup`s with expected embedded errors in test.
- better pedantics around whether to expect a user simulated KBI.
- for `examples/advanced_faults/ipc_failure_during_stream.py` script:
  - generalize ipc breakage in new `break_ipc()` with support for diff
    internal `trio` methods and a #TODO for future disti frameworks
  - only make one sub-actor task break and the other just stream.
  - use new `._testing.expect_ctxc()` around ctx block.
  - add a bit of exception handling with `print()`s around ctxc (unused
    except if 'msg' break method is set) and eoc cases.
  - don't break parent side ipc in loop any more than once
    after first break, checked via flag var.
  - add a `pre_close: bool` flag to control whether
    `MsgStream.aclose()` is called *before* any ipc breakage method.

Still TODO:
- drop `pytest.ini` and add the alt section to `pyproject.toml`.
 -> currently can't get `--rootdir=` opt to work.. not showing in
   console header.
 -> ^ also breaks on 'tests' `enable_modules` imports in subactors
   during discovery tests?
2025-03-16 15:28:28 -04:00
Tyler Goodlet 1975b92dba Add `an: ActorNursery` var placeholder for final log msg 2025-03-16 15:22:01 -04:00
Tyler Goodlet 31ccdd79d7 Tweak some tests for spurious failures
The seeming cause is that some cases occasionally raise
`ExceptionGroup` instead of a (collapsed out) single error; in
those cases at least try to check that `.exceptions` has the original
error.
2025-03-16 15:22:01 -04:00
Tyler Goodlet cbaf4fc05b Add a open-ctx-with-self test
Found exactly why trying this won't work when playing around with
opening workspaces in `modden` using a `Portal.open_context()` back to
the 'bigd' root actor: the RPC machinery only registers one entry in
`Actor._contexts` which will get overwritten by each task's side and
then experience race-based IPC msging errors (eg. rxing `{'started': _}`
on the callee side..). Instead make opening a ctx back to the self-actor
a runtime error describing it as an invalid op.

To match:
- add a new test `test_ctx_with_self_actor()` to the context semantics
  suite.
- tried out adding a new `side: str` to the `Actor.get_context()` (and
  callers) but ran into not being able to determine the value from in
  `._push_result()` where it's needed to figure out which side to push
  to.. So, just leaving the commented arg (passing) in the runtime core
  for now in case we can come back to trying to make it work, tho i'm
  thinking it's not the right hack anyway XD
2025-03-16 15:19:51 -04:00
Tyler Goodlet 68a3969585 Let `MsgStream.receive_nowait()` take in msg key list
Call it `allow_msg_keys: list[str] = ['yield']` and set it to accept
`['yield', 'return']` from the drain loop in `.aclose()`. Only pass the
last key error to `_raise_from_no_key_in_msg()` in the fall-through
case.

Somehow this seems to prevent all the intermittent test failures i was
seeing in local runs including when running the entire suite all in
sequence; i ain't complaining B)
2025-03-16 14:01:50 -04:00
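The allowed-key lookup described above might be sketched like this. `pop_payload` is a hypothetical helper (the real method and its error routine are not reproduced here), and a tuple default is used instead of the list shown in the commit message purely to avoid a mutable default.

```python
def pop_payload(msg: dict, allow_msg_keys: tuple[str, ...] = ('yield',)):
    """Return the payload under the first allowed key found in `msg`."""
    last_err = None
    for key in allow_msg_keys:
        try:
            return msg[key]
        except KeyError as kerr:
            # remember only the most recent miss
            last_err = kerr
    # fall-through: no allowed key matched, so report just the last miss
    raise ValueError(f'no allowed key in msg: {msg!r}') from last_err
```

A drain loop that should also accept a final result would then pass `('yield', 'return')`, while normal receives keep the stricter default.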
Tyler Goodlet cf68e075c9 Unify some log msgs in `.to_asyncio`
Much like similar recent changes throughout the core, build out `msg:
str` depending on error cases and emit with `.cancel()` level as
appropes. Also mute (via level) some duplication in the cancel case
inside `_run_asyncio_task()` for console noise reduction.
2025-03-16 14:01:50 -04:00
Tyler Goodlet f730749dc9 Assign `ctx._local_error` ASAP from `.open_context()`
Such that `.outcome` related fields render nicely asap for logging
within `Portal.open_context()` itself.
2025-03-16 14:01:50 -04:00
Tyler Goodlet c8775dee41 Tweak `Context.repr_outcome()` for KBIs
Since apparently `str(KeyboardInterrupt()) == ''`? So instead add little
`<str> or repr(merr)` expressions throughout to avoid blank strings
rendering in various `repr()`/`.__str__()` outputs..
2025-03-16 14:01:50 -04:00
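The empty-string behaviour noted above is easy to confirm, and the `<str> or repr(..)` fallback can be sketched as a tiny helper. The name `describe_error` is hypothetical.

```python
def describe_error(err: BaseException) -> str:
    # `str()` of a message-less exception is empty, so fall back to `repr()`
    return str(err) or repr(err)


kbi_desc = describe_error(KeyboardInterrupt())
val_desc = describe_error(ValueError('bad input'))
```

With this, a bare `KeyboardInterrupt()` renders as its `repr()` while exceptions that carry a message render as that message.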
Tyler Goodlet fd2391539e Support a `._state.last_actor()` getter
Not sure if it's really that useful other than for reporting errors from
`current_actor()` but at least it alerts `tractor` devs and/or users
when the runtime has already terminated vs. hasn't been started
yet/correctly.

Set the `._last_actor_terminated: tuple` in the root's final block which
allows testing for an already terminated tree which is the case where
`._state._current_actor == None` and the last is set.
2025-03-16 14:01:50 -04:00
Tyler Goodlet 8e3a2a9297 Make `Actor._cancel_task(requesting_uid: tuple)` required arg 2025-03-16 14:01:50 -04:00
Tyler Goodlet f90ca0668b Woops, fix one last `ctx._cancelled_caught` in drain loop 2025-03-16 14:01:50 -04:00
Tyler Goodlet 36a81a60cd Adjust `asyncio` test for stricter ctx-self-cancels
Use `expect_ctxc()` around the portal cancellation case, toss in
a `'context'` parametrization and return just the `Context.outcome` from
`main()` B)
2025-03-16 14:01:50 -04:00
Tyler Goodlet c2480c2b97 Update ctx test suites to stricter semantics
Including mostly tweaking asserts on relayed `ContextCancelled`s and
the new pub ctx properties: `.outcome`, `.maybe_error`, etc. as it
pertains to graceful (absorbed) remote cancellation vs. loud ctxc cases
expected to be raised by any `Portal.cancel_actor()` style teardown.

Start checking a variety of internals like `._remote/local_error`,
`._is_self_cancelled()`, `._is_final_result_set()`, `._cancel_msg`
where applicable.

Also factor out the new `expect_ctxc()` checker to our `conftest.py` for
use in other suites.
2025-03-16 14:01:50 -04:00
Tyler Goodlet 7b1528abed (Even) more pedantic `.cancel_acked: bool` def
Changes the condition logic to be more strict and moves it to a private
`._is_self_cancelled() -> bool` predicate which can be used elsewhere
(instead of having almost similar duplicate checks all over the
place..) and allows taking in a specific `remote_error` just for
verification purposes (like for tests).

Main strictness distinctions are now:
- obvi that `.cancel_called` is set (this filters any
  `Portal.cancel_actor()` or other out-of-band RPC),
- the received `ContextCancelled` **must** have its `.canceller` set to
  this side's `Actor.uid` (indicating we are the requester).
- `.src_actor_uid` **must** be the same as the `.chan.uid` (so the error
  must have originated from the opposite side's task).
- `ContextCancelled.canceller` should be already set to the `.chan.uid`
  indicating we received the msg via the runtime calling
  `._deliver_msg()` -> `_maybe_cancel_and_set_remote_error()` which
  ensures the error is specifically destined for this ctx-task exactly
  the same as how `Actor._cancel_task()` sets it from an input
  `requesting_uid` arg.

In support of the above adjust some impl deats:
- add `Context._actor: Actor` which is set once in `mk_context()` to
  avoid issues (particularly in testing) where `current_actor()` raises
  after the root actor / runtime is already exited. Use `._actor.uid` in
  both `.cancel_acked` (obvi) and `_maybe_cancel_and_set_remote_error()`
  when deciding whether to call `._scope.cancel()`.
- always cast `.canceller` to `tuple` if not null.
- delegate `.cancel_acked` directly to new private predicate (obvi).
- always set `._canceller` from any `RemoteActorError.src_actor_uid` or
  failing over to the `.chan.uid` when a non-remote error (tho that
  shouldn't ever happen right?).
- more extensive doc-string for `.cancel()` detailing the new strictness
  rules about whether an eventual `.cancel_acked` might be set.

Also tossed in even more logging format tweaks by adding a
`type_only: bool` to `.repr_outcome()` as desired for simpler output in
the `state: <outcome-repr-here>` and `.repr_rpc()` sections of the
`.__str__()`.
2025-03-16 14:01:50 -04:00
Tyler Goodlet c5228e7be5 Set `._cancel_msg` to RPC `{cmd: 'self._cancel_task', ..}` msg
Like how we set `Context._cancel_msg` in `._deliver_msg()` (in
which case normally it's an `{'error': ..}` msg), do the same when any
RPC task is remotely cancelled via `Actor._cancel_task` where that task
doesn't have a cancel msg set yet.

This makes it much easier to distinguish between ctx cancellations due
to some remote error vs. explicit remote requests via any of
`Actor.cancel()`, `Portal.cancel_actor()` or `Context.cancel()`.
2025-03-16 14:01:50 -04:00
Tyler Goodlet 9966dbdfc1 Tweak inter-peer `._scope` state asserts
We don't expect `._scope.cancelled_caught` to be set really ever on
inter-peer cancellation since no ctx is ever cancelling itself, a peer
cancels some other and then bubbles back to all other peers.

Also add `ids: lambda` for `error_during_ctxerr_handling` param to
`test_peer_canceller()`
2025-03-16 14:01:50 -04:00
Tyler Goodlet 7fb1c45ac7 Tweak inter-peer tests for new/refined semantics
Buncha subtle details changed mostly to do with when `Context.cancel()`
gets called on "real" remote errors vs. (peer requested) cancellation
and then local side handling of `ContextCancelled`.

Specific changes to make tests pass:
- due to raciness with `sleeper_ctx.result()` raising the ctxc locally
  vs. the child-peers receiving similar ctxcs themselves (and then
  erroring and propagating back to the root parent), we might not see
  `._remote_error` set during the sub-ctx loops (except for the sleeper
  itself obvi).
- do not expect `.cancel_called`/`.cancel_caught` to be set on any
  sub-ctx since currently `Context.cancel()` is only called non-shielded
  and thus is not invoked when `._scope.cancel()` is called as part
  of each root-side ctx ref/block handling the inter-peer ctxc.
- do not expect `Context._scope.cancelled_caught` to be set in most cases
  (even the sleeper)

TODO Outstanding adjustments not fixed yet:
-[ ] `_scope.cancelled_caught` checks outside the `.open_context()`
  blocks.
2025-03-16 14:01:50 -04:00
Tyler Goodlet 59d6d0cd7f Woops, add `.msg` sub-pkg to install set 2025-03-16 14:01:50 -04:00
Tyler Goodlet ffed35e263 `._entry`: use same msg info in start/terminate log 2025-03-16 14:01:50 -04:00
Tyler Goodlet 885ba04908 Tweak `._portal` log content to use `Context.repr_outcome()` 2025-03-16 14:01:50 -04:00
Tyler Goodlet 1879243257 Flip rpc tests over to use `ExceptionGroup` on new `trio` 2025-03-16 14:01:50 -04:00
Tyler Goodlet 4fb34772e7 Mega-refactor on `._invoke()` targeting `@context`s
Since eventually we want to implement all other RPC "func types" as
contexts underneath this starts the rework to move all the other cases
into a separate func not only to simplify the main `._invoke()` body but
also as a reminder of the intention to do it XD

Details of re-factor:
- add a new `._invoke_non_context()` which just moves all the old blocks
  for non-context handling to a single def.
- factor what was basically just the `finally:` block handler (doing all
  the task bookkeeping) into a new `@acm`: `_errors_relayed_via_ipc()`
  with that content packed into the post-`yield` (also with a `hide_tb:
  bool` flag added of course).
  * include a `debug_kbis: bool` for when needed.
- since the `@context` block is the only type left in the main
  `_invoke()` body, de-dent it so it's more grok-able B)

Obviously this patch also includes a few improvements regarding
context-cancellation-semantics (for the `context` RPC case) on the
callee side in order to match previous changes to the `Context` api:
- always setting any ctxc as the `Context._local_error`.
- using the new convenience `.maybe_raise()` topically (for now).
- avoiding any previous reliance on `Context.cancelled_caught` for
  anything public of meaning.

Further included is more logging content updates:
- being pedantic in `.cancel()` msgs about whether termination is caused
  by error or ctxc.
- optional `._invoke()` traceback hiding via a `hide_tb: bool`.
- simpler log headers throughout instead leveraging new `.__repr__()` on
  primitives.
- buncha `<= <actor-uid>` sent some message emissions.
- simplified handshake statuses reporting.

Other subsys api changes we need to match:
- change to `Channel.transport`.
- avoiding any `local_nursery: ActorNursery` waiting when the
  `._implicit_runtime_started` is set.

And yes, lotsa more comments for #TODOs dawg.. since there's always
somethin!
2025-03-16 14:01:48 -04:00
Tyler Goodlet 1c9589cfc4 Avoid `ctx.cancel()` after ctxc rxed in `.open_context()`
In the case where the callee side delivers us a ctxc with `.canceller`
set we can presume that remote cancellation already has taken place and
thus we don't need to do the normal call-`Context.cancel()`-on-error
step. Further, in the case where we do call it also handle any
`trio.ClosedResourceError` gracefully with a `.warning()`.

Also, originally I had added a post-`yield`-maybe-raise to attempt
handling any remote ctxc the same as for the local case (i.e. raised
from `yield` line) wherein if we get a remote ctxc the same handler
branch-path would trigger, thus avoiding different behaviour in that
case. I ended up masking it out (but can't member why.. ) as it seems
the normal `.result()` call and its internal handling gets the same
behaviour? I've left in the heavily commented code in case it ends up
being the better way to go; likely making the move to having a single
code path in both cases is better even if it is just a matter of deciding
whether to swallow the ctxc or not in the `.cancel_acked` case.

Further teensie improvements:
- obvi improve/simplify log msg contents as in prior patches.
- use the new `maybe_wait_for_debugger(header_msg: str)` if/when waiting
  to exit in debug mode.
- another `hide_tb: bool` frame hider flag.
- rando type-annot updates of course :)
2025-03-15 00:08:13 -04:00
Tyler Goodlet 910c07db06 Deep `Context` refinements
Spanning from the pub API, to instance `repr()` customization (for
logging/REPL content), to the impl details around the notion of a "final
outcome" and surrounding IPC msg draining mechanics during teardown.

A few API and field updates:

- new `.cancel_acked: bool` to replace what we were mostly using
  `.cancelled_caught: bool` for but, for purposes of better mapping the
  semantics of remote cancellation of parallel executing tasks; it's set
  only when `.cancel_called` is set and a ctxc arrives with
  a `.canceller` field set to the current actor uid indicating we
  requested and received acknowledgement from the other side's task
  that is cancelled gracefully.

- strongly document and delegate (and prolly eventually remove as a pub
  attr) the `.cancelled_caught` property entirely to the underlying
  `._scope: trio.CancelScope`; the `trio` semantics don't really map
  well to the "parallel with IPC msging"  case in the sense that for
  us it breaks the concept of the ctx/scope closure having "caught"
  something instead of having "received" a msg that the other side has
  "acknowledged" (i.e. which for us is the completion of cancellation).

- new `.__repr__()`/`.__str__()` format that tries to tersely yet
  comprehensively as possible display everything you need to know about
  the 3 main layers of an SC-linked-IPC-context:
  * ipc: the transport + runtime layers net-addressing and prot info.
  * rpc: the specific linked caller-callee task signature details
    including task and msg-stream instances.
  * state: current execution and final outcome state of the task pair.
  * a teensie extra `.repr_rpc` for a condensed rpc signature.

- new `.dst_maddr` to get a `libp2p` style "multi-address" (though right
  now it's just showing the transport layers so maybe we should move it
  to our `Channel`?)

- new public instance-var fields supporting more granular remote
  cancellation/result/error state:
  * `.maybe_error: Exception|None` for any final (remote) error/ctxc
    which computes logic on the values of `._remote_error`/`._local_error`
    to determine the "final error" (if any) on termination.
  * `.outcome` to the final error or result (or `None` if un-terminated)
  * `.repr_outcome()` for a console/logging friendly version of the
    final result or error as needed for the `.__str__()`.

- new private interface bits to support all of ^:
  * a new "no result yet" sentinel value, `Unresolved`, using a module
    level class singleton that `._result` is set to (instead of
    `id(self)`) to both determine if and present when no final result
    from the callee has-yet-been/was delivered (ever).
    => really we should get rid of `.result()` and change it to
    `.wait_for_result()` (or something)
  * `_final_result_is_set()` predicate to avoid waiting for an already
    delivered result.
  * `._maybe_raise()` proto-impl that we should use to replace all the
    `if re:` blocks it can XD
  * new `._stream: MsgStream|None` for when a stream is opened to aid
    with the state repr mentioned above.
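The state fields described in the bullets above can be sketched roughly as follows. This is a minimal toy, not the actual `Context` impl: `ToyContext`, `ToyCtxCancelled` and `our_uid` are hypothetical stand-ins, and the "remote error wins over local error" ordering in `.maybe_error` is an assumption since the commit msg only says the value is computed from both.

```python
from __future__ import annotations
from dataclasses import dataclass


class Unresolved:
    '''Module level sentinel: no final result delivered (yet).'''


class ToyCtxCancelled(Exception):
    '''Hypothetical stand-in for `ContextCancelled`.'''
    def __init__(self, canceller):
        super().__init__(f'cancelled by {canceller}')
        self.canceller = canceller


@dataclass
class ToyContext:
    our_uid: tuple[str, str]
    cancel_called: bool = False
    _result: object = Unresolved
    _remote_error: BaseException | None = None
    _local_error: BaseException | None = None

    def _final_result_is_set(self) -> bool:
        # a `None` result is still a delivered result,
        # hence the dedicated sentinel.
        return self._result is not Unresolved

    @property
    def maybe_error(self) -> BaseException | None:
        # ASSUMPTION: prefer the remote error when both are set.
        if self._remote_error is not None:
            return self._remote_error
        return self._local_error

    @property
    def outcome(self):
        # final error, else final result, else `None` (un-terminated)
        if self.maybe_error is not None:
            return self.maybe_error
        if self._final_result_is_set():
            return self._result
        return None

    @property
    def cancel_acked(self) -> bool:
        # only when *we* requested the cancel and the ctxc
        # names us as the canceller.
        err = self._remote_error
        return (
            self.cancel_called
            and isinstance(err, ToyCtxCancelled)
            and err.canceller is not None
            and tuple(err.canceller) == self.our_uid
        )


ctx = ToyContext(our_uid=('root', '1234'), cancel_called=True)
ctx._remote_error = ToyCtxCancelled(canceller=['root', '1234'])
print(ctx.cancel_acked, type(ctx.outcome).__name__)
```

Casting `.canceller` to a `tuple` before comparing means a uid that arrives as a list still matches the local actor uid.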

Tweaks to the termination drain loop `_drain_to_final_msg()`:

- obviously (obvi) use all the changes above when determining whether or
  not a "final outcome" has arrived and thus breaking from the loop ;)
  * like the `.outcome`, `.maybe_error` and `._final_result_is_set()` in
    the `while` pred expression.

- drop the `_recv_chan.receive_nowait()` + guard logic since it seems
  with all the surrounding (and coming soon) changes to
  `Portal.open_context()` using all the new API stuff (mentioned in
  first bullet set above) we never hit the case of inf-block?

Oh right and obviously a ton of (hopefully improved) logging msg content
changes, commented code removal and detailed comment-docs strewn about!
2025-03-15 00:08:13 -04:00
Tyler Goodlet d8d206b93f Make stream draining status logs `.debug()` level 2025-03-15 00:08:11 -04:00
Tyler Goodlet fb55784798 Add `._implicit_runtime_started` mark, better logs
After some deep logging improvements to many parts of `._runtime`,
I realized a silly detail where we are always waiting on any opened
`local_nursery: ActorNursery` to signal exit from
`Actor._stream_handler()` even in the case of being an implicitly opened
root actor (`open_root_actor()` wasn't called by user/app code) via
`._supervise.open_nursery()`..

So, to address this add a `ActorNursery._implicit_runtime_started: bool`
that can be set and then checked to avoid doing the unnecessary
`.exited.wait()` (and any subsequent warn logging on an exit timeout) in
that special but most common case XD

Matching with other subsys log format refinements, improve readability
and simplicity of the actor-nursery supervisory log msgs, including:
- simplify and/or remove any content that more or less duplicates msg
  content found in emissions from lower-level primitives and sub-systems
  (like `._runtime`, `_context`, `_portal` etc.).
- add a specific `._open_and_supervise_one_cancels_all_nursery()`
  handler block for `ContextCancelled` to log with `.cancel()` level
  noting that the case is a "remote cancellation".
- put the nursery-exit and actor-tree shutdown status into a single msg
  in the `implicit_runtime` case.
2025-03-15 00:06:15 -04:00
Tyler Goodlet 1bc858cd00 Spawn naming and log format tweaks
- rename `.soft_wait()` -> `.soft_kill()`
- rename `.do_hard_kill()` -> `.hard_kill()`
- adjust any `trio.Process.__repr__()` log msg contents to have the
  little tree branch prefix: `'|_'`
2025-03-15 00:06:15 -04:00
Tyler Goodlet 04aea5c4db Add field-first subproc `.info()` to `._entry` 2025-03-15 00:06:13 -04:00
Tyler Goodlet 7bb44e6930 Add "fancier" remote-error `.__repr__()`-ing
Our remote error box types `RemoteActorError`, `ContextCancelled` and
`StreamOverrun` needed a console display makeover particularly for
logging content and `repr()` in higher level primitives like `Context`.

This adds a more "dramatic" str-representation to showcase the
underlying boxed traceback content more sensationally (via ascii-art
emphasis) as well as support a more terse `.reprol()` (representation
for one-line) format that can be used for types that track remote
errors/cancels like with `Context._remote_error`.

Impl deats:
- change `RemoteActorError.__repr__()` formatting to show (sub-type
  specific) `.msgdata` fields in a multi-line format (similar to our new
  `.msg.types.Struct` style) followed by some ascii accented delimiter
  lines to emphasize any `.msgdata["tb_str"]` packed by the remote
- for rme and subtypes allow picking the specifically relevant fields
  via a type defined `.reprol_fields: list[str]` and pick for each
  subtype:
   |_ `RemoteActorError.src_actor_uid`
   |_ `ContextCancelled.canceller`
   |_ `StreamOverrun.sender`

- add `.reprol()` to show a `repr()`-on-one-line formatted string that
  can be used by other multi-line-field-`repr()` styled composite types
  as needed in (high level) logging info.
- toss in some mod level `_body_fields: list[str]` for summary of such
  fields (if needed).
- add some new rae (remote-actor-error) props:
  - `.type` around a newly named `.boxed_type`
  - `.type_str: str`
  - `.tb_str: str`
2025-03-15 00:05:31 -04:00
Tyler Goodlet 2cc712cd81 Fix `Channel.__repr__()` safety, renames to `._transport`
Hit a really weird bug in the `._runtime` IPC msg handling loop where
it seems that by `str.format()`-ing a `Channel` before initializing it
would put the `._MsgTransport._agen()` in an already started state
causing an irrecoverable core startup failure..

I presume it's something to do with delegating to the
`MsgpackTCPStream.__repr__()` and, something something.. the
`.set_msg_transport(stream)` getting called too early such that
`.msgstream.__init__()` is called thus init-ing the `._agen()` before
necessary? I'm sure there's a design lesson to be learned in here
somewhere XD

This was discovered while trying to add more "fancy" logging throughout
said core for the purposes of cobbling together an init attempt at
libp2p style multi-address representations for our IPC primitives. Thus
I also tinker here with adding some new fields to `MsgpackTCPStream`:
- `layer_key`: int = 4
- `name_key`: str = 'tcp'
- `codec_key`: str = 'msgpack'

Anyway, just changed it so that if `.msgstream` ain't set then we just
return a little "null repr" `str` value thinger.

Also renames `Channel.msgstream` internally to `._transport` with
appropriate pub `@property`s added such that everything else won't break
;p

Also drops `Optional` typing vis-a-vis modern union syntax B)
2025-03-15 00:05:31 -04:00
Tyler Goodlet c421f7e722 Make `NamespacePath` kinda support methods..
Obviously we can't deterministically call `.load_ref()` (since you'd
have to point to an `id()` or something and presume a particular
py-runtime + virt-mem space for it to exist?) but it at least helps with
the `str` formatting for logging purposes (like `._cancel_rpc_tasks()`)
when `repr`-ing ctxs and their specific "rpc signatures".
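A rough sketch of the idea, assuming a `'<module>:<qualname>'` str format: `NsPath` and `Toy` are hypothetical names for illustration and not the real `NamespacePath` impl. It shows why a method can be formatted for logging even though resolving the path can only ever reach the class attribute, never a particular instance.

```python
from importlib import import_module
import json


class NsPath(str):
    '''
    Hypothetical 'module.path:Qual.name' pointer str.

    '''
    @classmethod
    def from_ref(cls, ref) -> 'NsPath':
        # works for funcs and (bound) methods alike since both
        # expose `.__module__` and `.__qualname__`.
        return cls(f'{ref.__module__}:{ref.__qualname__}')

    def load_ref(self):
        modpath, _, attrpath = self.partition(':')
        obj = import_module(modpath)
        # walk dotted attrs, e.g. 'Toy.ping' -> Toy -> ping
        for name in attrpath.split('.'):
            obj = getattr(obj, name)
        return obj


class Toy:
    def ping(self) -> str:
        return 'pong'


# module-level funcs round-trip fine
func_path = NsPath.from_ref(json.dumps)
assert func_path.load_ref() is json.dumps

# a bound method still gets a readable str for logs, but there is
# no way to encode *which* instance it was bound to.
meth_path = NsPath.from_ref(Toy().ping)
print(func_path, meth_path)
```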

Maybe in the future getting this working at least for singleton types
per process (like `Actor` XD ) will be a thing we can support and make
some sense of.. Bo
2025-03-15 00:05:31 -04:00
Tyler Goodlet 1c217ef36f Add #TODO for generating func-sig type-annots as `str` for pprinting 2025-03-14 22:49:38 -04:00
Tyler Goodlet d7f2f51f7f Bring in pretty-ified `msgspec.Struct` extension
Originally designed and used throughout `piker`, the subtype adds some
handy pprinting and field diffing extras often handy when viewing struct
types in logging or REPL console interfaces B)

Obvi this rejigs the `tractor.msg` mod into a sub-pkg and moves the
existing namespace obj-pointer stuff into a new `.msg.ptr` sub mod.
2025-03-14 22:49:21 -04:00
Tyler Goodlet cfcbc4da01 Add test for `modden` sub-spawner-server hangs on cancel
As per a lot of the recent refinements to `Context` cancellation, add
a new test case to replicate the original hang-on-cancel found with
`modden` when using a client actor to spawn a subactor in some other
tree where despite `Context.cancel()` being called the requesting client
would hang on the opened context with the server.

The specific scenario added here is to have,
- root actor spawns 2 children: a client and a spawn server.
- the spawn server opens with a spawn-request serve loop and begins to
  wait for the client.
- client spawns and connects to the sibling spawn server, requests to
  spawn a sub-actor, the "little bro", connects to it then does some
  echo streaming, cancels the request with its sibling (the spawn
  server) which should in turn cancel the root's-grandchild and result
  in a cancel-ack back to the client's `.open_context()`.
- root ensures that it can also connect to the grandchild (little bro),
  do the same echo streaming, then ensure everything tears down
  correctly after cancelling all the children.

More refinements to come here obvi in the specific cancellation
semantics and possibly causes.

Also tweaks the other tests in suite to use the new `Context` properties
recently introduced and similarly updated in the previous patch to the
ctx-semantics suite.
2025-03-14 22:18:31 -04:00
Tyler Goodlet 664ae87588 Make `@context`-cancelled tests more pedantic
In order to match a very significant and coming-soon patch set to the
IPC `Context` and `Channel` cancellation semantics with significant but
subtle changes to the primitives and runtime logic:

- a new set of `Context` state pub meth APIs for checking exact
  inter-actor-linked-task outcomes such as `.outcome`, `.maybe_error`,
  and `.cancel_acked`.

- trying to move away from `Context.cancelled_caught` usage since the
  semantics from `trio` don't really map well (in terms of cancel
  requests and how they result in cancel-scope graceful closure) and
  `.cancel_acked: bool` is a better approach for IPC req-resp msging.
  - change test usage to access `._scope.cancelled_caught` directly.

- more pedantic ctxc-raising expects around the "type of self
  cancellation" and final outcome in ctxc cases:
  - `ContextCancelled` is raised by ctx (`Context.result()`) consumer
    methods when `Portal.cancel_actor()` is called (since it's an
    out-of-band request) despite `Channel._cancel_called` being set.
  - also raised by `.open_context().__aexit__()` on close.
  - `.outcome` is always `.maybe_error` is always one of
    `._local/remote_error`.
2025-03-14 22:18:31 -04:00
Tyler Goodlet e1d7004aec Add a `pytest.ini` config 2025-03-14 22:18:31 -04:00
Tyler Goodlet a97b45d90b WIP final impl of ctx-cancellation-semantics 2025-03-14 22:18:31 -04:00
Tyler Goodlet a388d3185b Few more log msg tweaks in runtime 2025-03-14 22:18:31 -04:00
Tyler Goodlet 4d0df1bb4a Call `actor.cancel(None)` from root to avoid mismatch with (any future) meth sig changes 2025-03-14 22:18:31 -04:00
Tyler Goodlet 5eb62b3e9b Tweak broadcast fanout test to never inf loop
Since a bug in the new `MsgStream.aclose()` impl's drain block logic was
triggering an actual inf loop (by not ever cancelling the streamer child
actor), make sure we put a loop limit on the `inf_streamer()` XD

Also add a bit more deats to the test `print()`s in each actor and toss
in `debug_mode` fixture support.
2025-03-14 22:18:31 -04:00
Tyler Goodlet 1be296c725 Add note that maybe `Context._eoc` should be set by caller? 2025-03-14 22:18:31 -04:00
Tyler Goodlet 9420ea0c14 Tweak `Actor` cancel method signatures
Besides improving a bunch more log msg contents similarly as before this
changes the cancel method signatures slightly with different arg names:

for `.cancel()`:
- instead of `requesting_uid: str` take in a `req_chan: Channel`
  since we can always just read its `.uid: tuple` for logging and
  further we can then offer the `chan=None` case indicating a
  "self cancel" (since there's no "requesting channel").
- the semantics of "requesting" here better indicate that the IPC connection
  is an IPC peer and further (eventually) will allow permission checking
  against given peers for cancellation requests.
- when `chan==None` we also define a meth-internal `requester_type: str`
  differently for logging content :)
- add much more detailed `.cancel()` content around the requester, its
  type, and any debugger related locking steps.

for `._cancel_task()`:
- change the `chan` arg to `parent_chan: Channel` since "parent"
  correctly indicates that the channel is the parent of the locally
  spawned rpc task to cancel; in fact no other chan should be able to
  cancel tasks parented/spawned by other channels obvi!
- also add more extensive meth-internal `.cancel()` logging with a #TODO
  around showing only the "relevant/latest" `Context` state vars in such
  logging content.

for `.cancel_rpc_tasks()`:
- shorten `requesting_uid` -> `req_uid`.
- add `parent_chan: Channel` to be similar as above in `._cancel_task()`
  (since it's internally delegated to anyway) which replaces the prior
  `only_chan` and use it to filter to only tasks spawned by this channel
  (thus as their "parent") as before.
- instead of `if tasks:` to enter, invert and `return` early on
  `if not tasks`, for less indentation B)
- add WIP str-repr format (for `.cancel()` emissions) to show
  a multi-address (maddr) + task func (via the new `Context._nsf`) and
  report all cancel task targets with it as a "tree"; include #TODO to
  finalize and implement some utils for all this!

To match ensure we adjust `process_messages()` self/`Actor` cancel
handling blocks to provide the new `kwargs` (now with `dict`-merge
syntax) to `._invoke()`.
2025-03-14 22:18:29 -04:00
Tyler Goodlet 9194e5774b Fix overruns test to avoid return-beats-ctxc race
Turns out that py3.11 might be so fast that iterating a EoC-ed
`MsgStream` 1k times is faster than a `Context.cancel()` msg
transmission from a parent actor to its child (which i guess makes
sense). So tweak the test to delay 5ms between stream async-for iteration
attempts when the stream is detected to be `.closed: bool` (coming in
patch) or `ctx.cancel_called == True`.
2025-03-14 22:16:39 -04:00
Tyler Goodlet 51a3f1bef4 Add `pformat()` of `ActorNursery._children` to logging
Such that you see the children entries prior to exit instead of the
prior somewhat detail/use-less logging. Also, rename all `anursery` vars
to just `an` as is the convention in most examples.
2025-03-14 22:16:37 -04:00
Tyler Goodlet ca1b8e0224 Set any `._eoc` to the err in `_raise_from_no_key_in_msg()`
Since that's what we're now doing in `MsgStream._eoc` internal
assignments (coming in future patch), do the same in this exception
re-raise-helper and include more extensive doc string detailing all
the msg-type-to-raised-error cases. Also expose a `hide_tb: bool` like
we have already in `unpack_error()`.
2025-03-14 22:13:14 -04:00
Tyler Goodlet e403d63eb7 Better logging for cancel requests in IPC msg loop
As similarly improved in other parts of the runtime, adds much more
pedantic (`.cancel()`) logging content to indicate the src of remote
cancellation request particularly for `Actor.cancel()` and
`._cancel_task()` cases prior to `._invoke()` task scheduling. Also add
detailed case comments and much more info to the
"request-to-cancel-already-terminated-RPC-task" log emission to include
the `Channel` and `Context.cid` deats.

This helped me find the src of a race condition causing a test to fail
where a callee ctx task was returning a result *before* an expected
`ctx.cancel()` request arrived B). Adding much more pedantic
`.cancel()` msg contents around the requester's deats should ensure
these cases are much easier to detect going forward!

Also, simplify the `._invoke()` final result/error log msg to only put
*one of either* the final error or returned result above the `Context`
pprint.
2025-03-14 22:13:12 -04:00
Tyler Goodlet 3c385c6949 Use `NamespacePath` in `Context` mgmt internals
The only case where we can't is in `Portal.run_from_ns()` usage (since we
pass a path with `self:<Actor.meth>`) and because `.to_tuple()`
internally uses `.load_ref()` which will of course fail on such a path..

So for now impl as,
- mk `Actor.start_remote_task()` take a `nsf: NamespacePath` but also
  offer a `load_nsf: bool = False` such that by default we bypass ref
  loading (maybe this is fine for perf long run as well?) for the
  `Actor`/`'self:'` case mentioned above.
- mk `.get_context()` take an instance `nsf` obvi.

More logging msg format tweaks:
- change msg-flow related content to show the `Context._nsf`, which,
  right, is coming in a follow up commit..
- bunch more `.runtime()` format updates to show `msg: dict` contents
  and internal primitives with trailing `'\n'` for easier reading.
- report import loading `stackscope` in subactors.
2025-03-14 22:11:57 -04:00
Tyler Goodlet b28df738fe Drop extra "\n" when logging actor nursery errors
2025-03-14 21:49:15 -04:00
Tyler Goodlet 5fa040c7db Add `NamespacePath._ns` todo for `self:<ns.meth>` support 2025-03-14 21:49:15 -04:00
Tyler Goodlet 27b750e907 Emit warning on any `ContextCancelled.canceller == None` 2025-03-14 21:49:15 -04:00
Tyler Goodlet 96150600fb Make ctx tests support `debug_mode: bool` fixture
Such that with `--tpdb` passed (sub)actors will engage the `pdbp` REPL
automatically and so that we can use the new `stackscope` support when
complex cases hang Bo

Also,
- simplified some type-annots (ns paths),
- doc-ed an inter-peer test func with some ascii msg flows,
- added a bottom #TODO for replicating the scenario i hit in `modden`
  where a separate client actor-tree was hanging on cancelling a `bigd`
  sub-workspace..
2025-03-14 21:49:15 -04:00
Tyler Goodlet 338ea5529c .log: more multi-line styling 2025-03-14 16:41:08 -04:00
Tyler Goodlet 6bc67338cf Better subproc supervisor logging, todo for
Given i just similarly revamped a buncha `._runtime` log msg formatting,
might as well do something similar inside the spawning machinery such
that groking teardown sequences of each supervising task is much more
sane XD

Mostly this includes doing similar `'<field>: <value>\n'` multi-line
formatting when reporting various subproc supervision steps as well as
showing a detailed `trio.Process.__repr__()` as appropriate.

Also adds a detailed #TODO according to the needs of  for which
we're going to need some internal mechanism for intermediary parent
actors to determine if a given debug tty locker (sub-actor) is one of
*their* (transitive) children and thus stall the normal
cancellation/teardown sequence until that locker is complete.
2025-03-14 16:41:06 -04:00
Tyler Goodlet fd20004757 _supervise: iter nice expanded multi-line `._children` tups with typing 2025-03-14 16:34:17 -04:00
Tyler Goodlet ddc2e5f0f8 WIP: solved the modden client hang.. 2025-03-14 16:34:10 -04:00
Tyler Goodlet 4b0aa5e379 Baboso! fix `chan.send(None)` indent.. 2025-03-14 15:49:37 -04:00
Tyler Goodlet 6a303358df Improved log msg formatting in core
As part of solving some final edge cases to do with inter-peer remote
cancellation (particularly a remote cancel from a separate actor
tree-client hanging on the request side in `modden`..) I needed less
dense, more line-delimited log msg formats when understanding ipc
channel and context cancels from console logging; this adds a ton of
that to:
- `._invoke()` which now does,
  - better formatting of `Context`-task info as multi-line
    `'<field>: <value>\n'` messages,
  - use of `trio.Task` (from `.lowlevel.current_task()`) for full
    rpc-func namespace-path info,
  - better "msg flow annotations" with `<=` for understanding
    `ContextCancelled` flow.
- `Actor._stream_handler()` wherein we break down IPC peers reporting
  better as multi-line `|_<Channel>` log msgs instead of all jammed on
  one line..
- `._ipc.Channel.send()` use `pformat()` for repr of packet.

Also tweak some optional deps imports for debug mode:
- add `maybe_import_gb()` for attempting to import `greenback`.
- maybe enable `stackscope` tree pprinter on `SIGUSR1` if installed.

Add a further stale-debugger-lock guard before removal:
- read the `._debug.Lock.global_actor_in_debug: tuple` uid and possibly
  `maybe_wait_for_debugger()` when the child-user is known to have
  a live process in our tree.
- only cancel `Lock._root_local_task_cs_in_debug: CancelScope` when
  the disconnected channel maps to the `Lock.global_actor_in_debug`,
  though not sure this is correct yet?

Started adding missing type annots in sections that were modified.
2025-03-14 15:49:36 -04:00
Tyler Goodlet c85757aee1 Let `pack_error()` take a msg injected `cid: str|None` 2025-03-14 15:31:16 -04:00
Tyler Goodlet 9fc9b10b53 Add `StreamOverrun.sender: tuple` for better handling
Since it's generally useful to know who is the cause of an overrun (say
bc you want your system to then adjust the writer side to slow tf down)
might as well pack an extra `.sender: tuple[str, str]` actor uid field
which can be relayed through `RemoteActorError` boxing. Add an extra
case for the exc-type to `unpack_error()` to match B)
2025-03-14 14:14:54 -04:00
Tyler Goodlet a86275996c Offer `unpack_error(hide_tb: bool)` for `pdbp` REPL config 2025-03-14 14:14:54 -04:00
Tyler Goodlet b5431c0343 Never mask original `KeyError` in portal-error unwrapper, for now? 2025-03-14 14:14:54 -04:00
Tyler Goodlet cdee6f9354 Try allowing multi-pops of `_Cache.locks` for now? 2025-03-14 14:14:53 -04:00
Tyler Goodlet a2f1bcc23f Use `import <blah> as blah` over `__all__` in `.trionics` 2025-03-14 14:14:53 -04:00
Tyler Goodlet 4aa89bf391 Bump timeout on resource cache test a bitty bit. 2025-03-14 14:14:53 -04:00
Tyler Goodlet 45e9cb4d09 `_root`: drop unused `typing` import 2025-03-14 14:14:53 -04:00
Tyler Goodlet 27c5ffe5a7 Move missing-key-in-msg raiser to `._exceptions`
Since we use basically the exact same set of logic in
`Portal.open_context()` when expecting the first `'started'` msg factor
and generalize `._streaming._raise_from_no_yield_msg()` into a new
`._exceptions._raise_from_no_key_in_msg()` (as per the lingering todo)
which obvi requires a more generalized / optional signature including
a caller specific `log` obj. Obvi call the new func from all the other
modules X)
2025-03-14 14:14:50 -04:00
Tyler Goodlet 914efd80eb Fmt repr as multi-line style call 2025-03-14 14:14:11 -04:00
Tyler Goodlet 2d2d1ca1c4 Drop unused walrus assign of `re` 2025-03-14 14:14:11 -04:00
Tyler Goodlet 74aa5aa9cd `StackLevelAdapter._log(stacklevel: int)` for custom levels..
Apparently (and i don't know if this was always broken [i feel like no?]
or is a recent change to stdlib's `logging` stuff) we need to increment the
`stacklevel` input by one for our custom level methods now? Without this
you're going to see the path to the method's-callstack-frame on every
emission instead of to the caller's. I first noticed this when debugging
the workspace layer spawning in `modden.bigd` and then verified it in
other depended projects..
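The effect can be reproduced with just the stdlib, independent of the actual `StackLevelAdapter` code. A minimal sketch, assuming a hypothetical custom `RUNTIME` level and two wrapper funcs: without bumping `stacklevel` the record points at the wrapper itself, with `stacklevel=2` it points at the wrapper's caller.

```python
import logging

# hypothetical custom level for illustration
RUNTIME = 15
logging.addLevelName(RUNTIME, 'RUNTIME')


class ListHandler(logging.Handler):
    '''Collect emitted records for inspection.'''
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


log = logging.getLogger('stacklevel_demo')
log.setLevel(1)
log.propagate = False
handler = ListHandler()
log.addHandler(handler)


def runtime_unadjusted(msg: str) -> None:
    # default `stacklevel=1`: the record's frame is this wrapper
    log.log(RUNTIME, msg)


def runtime_adjusted(msg: str) -> None:
    # skip one extra frame so the record's frame is our caller
    log.log(RUNTIME, msg, stacklevel=2)


def app_code() -> None:
    runtime_unadjusted('from wrapper frame')
    runtime_adjusted('from caller frame')


app_code()
unadjusted, adjusted = handler.records
print(unadjusted.funcName, adjusted.funcName)
```

Going through a `logging.LoggerAdapter` adds further frames inside the `logging` module, and how those are counted has differed between Python versions, so the exact increment needed there is best confirmed with a test like the one above.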

I guess we should add some tests for this as well XD
2025-03-14 14:14:11 -04:00
Tyler Goodlet 44e386dd99 ._child: remove some unused imports.. 2025-03-14 13:56:25 -04:00
Tyler Goodlet 13fbcc723f Guarding for IPC failures in `._runtime._invoke()`
Took me longer than i wanted to figure out the source of
a failed-response to a remote-cancellation (in this case in `modden`
where a client was cancelling a workspace layer.. but disconnects before
receiving the ack msg) that was triggering an IPC error when sending the
error msg for the cancellation of a `Actor._cancel_task()`, but since
this (non-rpc) `._invoke()` task was trying to send to a now
disconnected canceller it was resulting in a `BrokenPipeError` (or similar)
error.

Now, we except for such IPC errors and only raise them when,
1. the transport `Channel` is for sure up (bc ow what's the point of
   trying to send an error on the thing that caused it..)
2. it's definitely for handling an RPC task
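The two conditions above amount to a guard like the following sketch. It is synchronous and uses a hypothetical `ToyChannel` and `relay_error()` purely for illustration; the real code path is async and lives in `._invoke()`.

```python
class ToyChannel:
    '''
    Hypothetical stand-in for an IPC channel.

    '''
    def __init__(
        self,
        connected: bool,
        fail_send: bool = False,
    ) -> None:
        self._connected = connected
        self._fail_send = fail_send
        self.sent: list[dict] = []

    def connected(self) -> bool:
        return self._connected

    def send(self, msg: dict) -> None:
        if self._fail_send or not self._connected:
            raise BrokenPipeError('peer went away')
        self.sent.append(msg)


def relay_error(
    chan: ToyChannel,
    err_msg: dict,
    is_rpc: bool,
) -> bool:
    '''
    Try to ship `err_msg` to the peer; return whether it was sent.

    A transport failure is only re-raised when the channel is
    still up AND the failing task is an RPC task.

    '''
    try:
        chan.send(err_msg)
        return True
    except ConnectionError:  # includes `BrokenPipeError`
        if chan.connected() and is_rpc:
            raise
        # peer is gone or it's a runtime-internal task:
        # nothing useful to report remotely.
        return False


print(relay_error(ToyChannel(connected=False), {'error': 'x'}, is_rpc=True))
```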

Similarly if the entire main invoke `try:` excepts,
- we only hide the call-stack frame from the debugger (with
  `__tracebackhide__: bool`) if it's an RPC task that has a connected
  channel since we always want to see the frame when debugging internal
  task or IPC failures.
- we don't bother trying to send errors to the context caller (actor)
  when it's a non-RPC request since failures on actor-runtime-internal
  tasks shouldn't really ever be reported remotely, only maybe raised
  locally.

Also some other tidying,
- this properly corrects for the self-cancel case where an RPC context
  is cancelled due to a local (runtime) task calling a method like
  `Actor.cancel_soon()`. We now set our own `.uid` as the
  `ContextCancelled.canceller` value so that other-end tasks know that
  the cancellation was due to a self-cancellation by the actor itself.
  We still need to properly test for this though!
- add a more detailed module doc-str.
- more explicit imports for `trio` core types throughout.
2025-03-14 13:56:23 -04:00
Tyler Goodlet 315f0fc7eb More thorough hard kill doc strings 2025-03-14 13:48:35 -04:00
Tyler Goodlet fea111e882 Tons of interpeer test cleanup
Drop all the nested `@acm` blocks and defunct comments from initial
validations. Add some todos for cases that are still unclear such as
whether the caller / streamer should have `.cancelled_caught == True` in
its teardown.
2025-03-14 13:44:09 -04:00
Tyler Goodlet a1bf4db1e3 Get inter-peer suite passing with all `Context` state checks!
Definitely needs some cleaning and refinement but this gets us to stage
1 of being pretty frickin correct i'd say 💃
2025-03-14 13:44:09 -04:00
Tyler Goodlet bac9523ecf Adjust test details where `Context.cancel()` is called
We can now make asserts on `.cancelled_caught` and `_remote_error` vs.
`_local_error`. Expect a runtime error when `Context.open_stream()` is
called AFTER `.cancel()` and the remote `ContextCancelled` hasn't
arrived (yet). Adjust to `'itself'` string in self-cancel case.
2025-03-14 13:44:09 -04:00
Tyler Goodlet abe31e9e2c Fix `Context.result()` call to be in runtime scope 2025-03-14 13:44:09 -04:00
Tyler Goodlet 0222180c11 Tweak `Channel._cancel_called` comment 2025-03-14 13:44:09 -04:00
Tyler Goodlet 7d5fda4485 Be ultra-correct in `Portal.open_context()`
This took way too long to get right but hopefully will give us grok-able
and correct context exit semantics going forward B)

The main fixes were:
- always shielding the `MsgStream.aclose()` call on teardown to avoid
  bubbling a `Cancelled`.
- properly absorbing any `ContextCancelled` in cases due to "self
  cancellation" using the new `Context.canceller` in the logic.
- capturing any error raised by the `Context.result()` call in the
  "normal exit, result received" case and setting it as the
  `Context._local_error` so that self-cancels can be easily measured via
  `Context.cancelled_caught` in same way as remote-error caused
  cancellations.
- extremely detailed comments around all of the cancellation-error cases
  to avoid ever getting confused about the control flow in the future XD
2025-03-14 13:44:08 -04:00
Tyler Goodlet f5fcd8ca2e Be mega-pedantic with `ContextCancelled` semantics
As part of extremely detailed inter-peer-actor testing, add much more
granular `Context` cancellation state tracking via the following (new)
fields:
- `.canceller: tuple[str, str]` the uuid of the actor responsible for
  the cancellation condition - always set by
  `Context._maybe_cancel_and_set_remote_error()` and replaces
  `._cancelled_remote` and `.cancel_called_remote`. If set, this value
  should normally always match a value from some `ContextCancelled`
  raised or caught by one side of the context.
- `._local_error` which is always set to the locally raised (and caller
  or callee task's scope-internal) error which caused any
  eventual cancellation/error condition and thus any closure of the
  context's per-task-side-`trio.Nursery`.
- `.cancelled_caught: bool` is now always `True` whenever the local task
  catches (or "silently absorbs") a `ContextCancelled` (a `ctxc`) that
  indeed originated from one of the context's linked tasks or any other
  context which raised its own `ctxc` in the current `.open_context()` scope.
  => whenever there is a case that no `ContextCancelled` was raised
  **in** the `.open_context().__aexit__()` (eg. `ctx.result()` called
  after a call to `ctx.cancel()`), we still consider the context as
  having "caught a cancellation" since the `ctxc` was indeed silently
  handled by the cancel requester; all other error cases are already
  represented by mirroring the state of the `._scope: trio.CancelScope`
  => IOW there should be **no case** where an error is **not raised** in
  the context's scope and `.cancelled_caught: bool == False`, i.e. no
  case where `._scope.cancelled_caught == False and ._local_error is not
  None`!
- always raise any `ctxc` from `.open_stream()` if `._cancel_called ==
  True` - if the cancellation request has not already resulted in
  a `._remote_error: ContextCancelled` we raise a `RuntimeError` to
  indicate improper usage to the guilty side's task code.
- make `._maybe_raise_remote_err()` a sync func and don't raise
  any `ctxc` which is matched against a `.canceller` determined to
  be the current actor, aka a "self cancel", and always set the
  `._local_error` to any such `ctxc`.
- `.side: str` taken from inside `.cancel()` and unused as of now since
  it might be better re-written as a similar `.is_opener() -> bool`?
- drop unused `._started_received: bool`..
- TONS and TONS of detailed comments/docs to attempt to explain all the
  possible cancellation/exit cases and how they should exhibit as either
  silent closes or raises from the `Context` API!

Adjust the `._runtime._invoke()` code to match:
- use `ctx._maybe_raise_remote_err()` in `._invoke()`.
- adjust to new `.canceller` property.
- more type hints.
- better `log.cancel()` msging around self-cancels vs. peer-cancels.
- always set the `._local_error: BaseException` for the "callee" task
  just like `Portal.open_context()` now will do B)

Prior we were raising any `Context._remote_error` directly and doing
(more or less) the same `ContextCancelled` "absorbing" logic (well
kinda) in block; instead delegate to the method
2025-03-14 13:42:55 -04:00
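
Note on the "self cancel" rule from f5fcd8ca2e: a rough sketch of the absorb-or-raise decision, assuming a stand-in `ContextCancelled` class and a free function in place of the real `Context._maybe_raise_remote_err()` method. Both are hypothetical simplifications and only model the `.canceller` comparison.

```python
class ContextCancelled(Exception):
    '''Stand-in for the real error type; only `.canceller` matters here.'''

    def __init__(self, msg: str, canceller: tuple) -> None:
        super().__init__(msg)
        self.canceller = canceller


def maybe_raise_remote_err(
    remote_error: BaseException,
    our_uid: tuple,
) -> BaseException:
    '''
    Absorb a cancellation that this actor requested itself and
    re-raise anything else.

    The absorbed error is returned so the caller can record it as
    its local error.
    '''
    if (
        isinstance(remote_error, ContextCancelled)
        and remote_error.canceller == our_uid
    ):
        return remote_error

    raise remote_error
```

A `ctxc` whose `.canceller` matches the current actor's uid is swallowed (and can be stored as the local error); one requested by a peer still propagates to the caller.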
Tyler Goodlet 04217f319a Raise a `MessagingError` from the src error on msging edge cases 2025-03-14 13:42:15 -04:00
Tyler Goodlet 8cb8390201 Move `MessagingError` into `._exceptions` set 2025-03-14 13:42:15 -04:00
Tyler Goodlet 5035617adf Dump `.msgdata` in `RemoteActorError.__repr__()` 2025-03-14 13:42:15 -04:00
Tyler Goodlet 715348c5c2 Port all tests to new `reg_addr` fixture name 2025-03-14 13:42:15 -04:00
Tyler Goodlet fdf0c43bfa Type out the full-fledged streaming ex. 2025-03-14 13:40:19 -04:00
Tyler Goodlet f895c96600 Add masked super timeout line to `do_hard_kill()` for would-be runtime hackers 2025-03-14 13:40:19 -04:00
Tyler Goodlet ca1a1476bb Add a first serious inter-peer remote cancel suite
Tests that appropriate `Context` exit state, the relay of
a `ContextCancelled` error and its `.canceller: tuple[str, str]` value
are set when an inter-peer cancellation happens via an "out of band"
request method (in this case using `Portal.cancel_actor()` and that
cancellation is propagated "horizontally" to other peers. Verify that
any such cancellation scenario which also experiences an "error during
`ContextCancelled` handling" DOES NOT result in that further error being
suppressed and that the user's exception bubbles out of the
`Portal.open_context()` block(s) appropriately!

Likely more tests to come as well as some factoring of the teardown
state checks where possible.

Pertains to serious testing the major work landing in 
2025-03-14 13:40:19 -04:00
Tyler Goodlet a7c36a9cbe Tidy/clarify another `._runtime` comment 2025-03-14 13:40:19 -04:00
Tyler Goodlet 22e4b324b1 Get mega-pedantic in `Portal.open_context()`
Specifically in the `.__aexit__()` phase to ensure remote,
runtime-internal, and locally raised error-during-cancelled-handling
exceptions are NEVER masked by a local `ContextCancelled` or any
exception group of `trio.Cancelled`s.

Also adds a ton of details to doc strings including extreme detail
surrounding the `ContextCancelled` raising cases and their processing
inside `.open_context()`'s exception handler blocks.

Details, details:
- internal rename `err`/`_err` stuff to just be `scope_err` since it's
  effectively the error bubbled up from the context's surrounding (and
  cross-actor) "scope".
- always shield `._recv_chan.aclose()` to avoid any `Cancelled` from
  masking the `scope_err` with a runtime related `trio.Cancelled`.
- explicitly catch the specific set of `scope_err: BaseException` that
  we can reasonably expect to handle instead of the catch-all parent
  type including exception groups, cancels and KBIs.
2025-03-14 13:40:18 -04:00
Tyler Goodlet 89ed8b67ff Drop `msg` kwarg from `Context.cancel()`
Well first off, turns out it's never used and generally speaking
doesn't seem to help much with "runtime hacking/debugging"; why would
we need to "fabricate" a msg when `.cancel()` is called to self-cancel?

Also (and since `._maybe_cancel_and_set_remote_error()` now takes an
`error: BaseException` as input and thus expects error-msg unpacking
prior to being called), we now manually set `Context._cancel_msg: dict`
just prior to any remote error assignment - so any case where we would
have fabbed a "cancel msg" near calling `.cancel()`, just do the manual
assign.

In this vein some other subtle changes:
- obviously don't set `._cancel_msg` in `.cancel()` since it's no longer
  an input.
- generally do walrus-style `error := unpack_error()` before applying
  and setting remote error-msg state.
- always raise any `._remote_error` in `.result()` instead of returning
  the exception instance and check before AND after the underlying mem
  chan read.
- add notes/todos around `raise self._remote_error from None` masking of
  (runtime) errors in `._maybe_raise_remote_err()` and use it inside
  `.result()` since we had the inverse duplicate logic there anyway..

Further, this adds and extends a ton of (internal) interface docs and
details comments around the `Context` API including many subtleties
pertaining to calling `._maybe_cancel_and_set_remote_error()`.
2025-03-14 13:37:55 -04:00
Tyler Goodlet 11bbf15817 `._exceptions`: typing and error unpacking updates
Bump type annotations to 3.10+ style throughout module as well as fill
out doc strings a bit. Inside `unpack_error()` pop any `error_dict: dict`
and,
- return `None` early if not found,
- versus pass directly as `**error_dict` to the error constructor
  instead of a double field read.
2025-03-14 13:36:16 -04:00
Tyler Goodlet a18663213a Add comments around diff between `C/context` refs 2025-03-14 13:36:16 -04:00
Tyler Goodlet d4d09b6071 Factor non-yield stream msg processing into helper
Since both `MsgStream.receive()` and `.receive_nowait()` need the same
raising logic when a non-stream msg arrives (so that maybe an
appropriate IPC translated error can be raised) move the `KeyError`
handler code into a new `._streaming._raise_from_no_yield_msg()` func
and call it from both methods to make the error-interface-raising
symmetrical across both methods.
2025-03-14 13:36:16 -04:00
Tyler Goodlet 6d10f0c516 Always raise remote (cancelled) error if set
Previously we weren't raising a remote error if the local scope was
cancelled during a call to `Context.result()` which is problematic if
the caller WAS NOT the requester for said remote cancellation; in that
case we still want a `ContextCancelled` raised with the `.canceller:
str` set to the cancelling actor uid.

Further fix a naming bug where the (seemingly older) `._remote_err` was
being set to such an error instead of `._remote_error` XD
2025-03-14 13:36:16 -04:00
Tyler Goodlet fa9b57bae0 Write more comprehensive `Portal.cancel_actor()` doc str 2025-03-14 13:36:16 -04:00
Tyler Goodlet 81776a6238 Drop pause line from ctx cancel handler block in test 2025-03-14 13:36:16 -04:00
Tyler Goodlet 144d1f4d94 Msg-ified `ContextCancelled`s sub-error type should always be just, its type.. 2025-03-14 13:36:16 -04:00
Tyler Goodlet 51fdf3524c Start inter-peer cancellation test mod
Move over relevant test from the "context semantics" test module which
was already verifying peer-caused-`ContextCancelled.canceller: tuple`
error info and propagation during an inter-peer cancellation scenario.

Also begin a more general set of inter-peer cancellation tests starting
with the simplest case where when a peer is cancelled the parent should
NOT get a "muted" `trio.Cancelled` and instead
a `tractor.ContextCancelled` with a `.canceller: tuple` which points to
the sibling actor which requested the peer cancel.
2025-03-14 13:36:16 -04:00
Tyler Goodlet cff69d07fe Mk `gather_contexts()` support `@acm`s yielding `None`
We were using a `all(<yielded values>)` condition which obviously won't
work if the batched managers yield any non-truthy value. So instead seed
the `unwrapped: dict` with the `id(mngrs)` and only unblock once all
values have been filled in to be something that is not that value.
2025-03-14 13:36:16 -04:00
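
Note on the `gather_contexts()` fix in cff69d07fe: a small sketch of why a truthiness check fails when a manager yields `None`, and how a seed value avoids that. The `mngrs` tuple and both helper functions are hypothetical stand-ins, not the real implementation.

```python
# Hypothetical stand-ins for two context managers being entered.
mngrs = ('mngr_a', 'mngr_b')

# Seed every slot with a value that no manager is expected to yield.
seed = id(mngrs)
unwrapped = {i: seed for i in range(len(mngrs))}


def all_entered_naive(values) -> bool:
    # Old-style check: breaks as soon as a manager yields a falsy value.
    return all(values)


def all_entered(values) -> bool:
    # Seed-based check: a slot counts as filled once it no longer
    # holds the seed object, whatever the yielded value is.
    return all(value is not seed for value in values)


before = all_entered(unwrapped.values())

# Both managers "yield" `None`.
unwrapped[0] = None
partial = all_entered(unwrapped.values())
unwrapped[1] = None

naive_done = all_entered_naive(unwrapped.values())
seeded_done = all_entered(unwrapped.values())
```

Once every slot is filled the naive check still reports "not ready" while the seed-based one unblocks.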
Tyler Goodlet ee94d6d62c Teensie tidy up on actor doc string 2025-03-14 13:36:16 -04:00
Tyler Goodlet 89b84ed6c0 Make `NamespacePath` work on object refs
Detect if the input ref is a non-func (like an `object` instance) in
which case grab its type name using `type()`. Wrap all the name-getting
into a new `_mk_fqpn()` static meth: gets the "fully qualified path
name" and returns path and name in tuple; port other methods to use it.
Refine and update the docs B)
2025-03-14 13:36:16 -04:00
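
Note on the `NamespacePath` change in 89b84ed6c0: an illustrative sketch of resolving a name for either a function or a plain object instance. The free function `mk_fqpn()`, its `'module:name'` path format and the `Workspace` class are assumptions for the example; the real static method may differ, and this sketch only handles functions and instances.

```python
from inspect import isfunction


def mk_fqpn(ref: object) -> tuple:
    '''
    Return a `(path, name)` pair for a function or an object instance.
    '''
    if isfunction(ref):
        name = ref.__name__
        mod_name = ref.__module__
    else:
        # Non-func input (eg. an instance): fall back to its type.
        name = type(ref).__name__
        mod_name = type(ref).__module__

    return f'{mod_name}:{name}', name


class Workspace:
    '''Hypothetical user type.'''


def open_layer() -> None:
    '''Hypothetical user function.'''


func_path, func_name = mk_fqpn(open_layer)
obj_path, obj_name = mk_fqpn(Workspace())
```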
Tyler Goodlet f33f689f34 .log: more correct handling for `get_logger(__name__)` usage 2025-03-14 13:36:16 -04:00
Tyler Goodlet 7507e269ec Just import `mp` top level in `._spawn` 2023-06-14 15:32:15 -04:00
Tyler Goodlet 17ae449160 Tidy up `typing` imports in broadcaster mod 2023-06-14 15:31:52 -04:00
Tyler Goodlet 6495688730 Drop `Optional` style from runtime mod 2023-05-25 16:00:05 -04:00
Tyler Goodlet a0276f41c2 Remote cancellation runtime-internal vars renames
- `Context._cancel_called_remote` -> `._cancelled_remote` since "called"
  implies the cancellation was "requested" when it could be due to
  another error and the actor uid is the value - only set once the far
  end task scope is terminated due to either error or cancel, which has
  nothing to do with *what* caused the cancellation.
- `Actor._cancel_called_remote` -> `._cancel_called_by_remote` which
  emphasizes that this variable is **only set** IFF some remote actor
  **requested that** this actor's runtime be cancelled via
  `Actor.cancel()`.
2023-05-19 14:31:55 -04:00
Tyler Goodlet ead9e418de Expose `allow_overruns` to `Portal.open_context()`
Turns out you can get a case where you might be opening multiple
ctx-streams concurrently and during the context opening phase you block
for all contexts to open, but then when you eventually start opening
streams some slow-to-start context has caused the others to end up in an
overrun state.. so we need to let the caller control whether that's an
error ;)

This also needs a test!
2023-05-15 10:00:45 -04:00
Tyler Goodlet 60791ed546 Oof, fix remaining `Actor.cancel()` in `Actor._from_parent()` 2023-05-15 10:00:45 -04:00
Tyler Goodlet 7293b82bcc Tweak doc string 2023-05-15 10:00:45 -04:00
Tyler Goodlet 20d75ff934 Move context code into new `._context` mod 2023-05-15 10:00:45 -04:00
Tyler Goodlet 041d7da721 Drop caller cancels overrun test; covered in new tests 2023-05-15 10:00:45 -04:00
Tyler Goodlet 04e4397a8f Ignore drainer-task nursery RTE during context exit 2023-05-15 10:00:45 -04:00
Tyler Goodlet 968f13f9ef Set `Context._scope_nursery` on callee side too
Because obviously we probably want to support `allow_overruns` on the
remote callee side as well XD

Only found the bugs fixed in this patch thanks to writing a much
more exhaustive test set for overrun cases B)
2023-05-15 10:00:45 -04:00
Tyler Goodlet f9911c22a4 Seriously cover all overrun cases
This actually caught further runtime bugs so it's gud i tried..
Add overrun-ignore enabled / disabled cases and error catching for all
of them. More or less this should cover every possible outcome when
it comes to setting `allow_overruns: bool` i hope XD
2023-05-15 10:00:45 -04:00
Tyler Goodlet 63adf73b4b Adjust aio test for silent cancellation by parent 2023-05-15 10:00:45 -04:00
Tyler Goodlet f1e9c0be93 Fix cluster test to use `allow_overruns` 2023-05-15 10:00:45 -04:00
Tyler Goodlet 6db656fecf Flip allocate log msgs to debug 2023-05-15 10:00:45 -04:00
Tyler Goodlet 6994d2026d Drop backpressure usage from fan out tests 2023-05-15 10:00:45 -04:00
Tyler Goodlet c72026091e Remote `Context` cancellation semantics rework B)
This adds remote cancellation semantics to our `tractor.Context`
machinery to more closely match that of `trio.CancelScope` but
with operational differences to handle the nature of parallel tasks interoperating
across multiple memory boundaries:

- if an actor task cancels some context it has opened via
  `Context.cancel()`, the remote (scope linked) task will be cancelled
  using the normal `CancelScope` semantics of `trio` meaning the remote
  cancel scope surrounding the far side task is cancelled and
  `trio.Cancelled`s are expected to be raised in that scope as per
  normal `trio` operation, and in the case where no error is raised
  in that remote scope, a `ContextCancelled` error is raised inside the
  runtime machinery and relayed back to the opener/caller side of the
  context.
- if any actor task cancels a full remote actor runtime using
  `Portal.cancel_actor()` the same semantics as above apply except every
  other remote actor task which also has an open context with the actor
  which was cancelled will also be sent a `ContextCancelled` **but**
  with the `.canceller` field set to the uid of the original cancel
  requesting actor.

This changeset also includes a more "proper" solution to the issue of
"allowing overruns" during streaming without attempting to implement any
form of IPC streaming backpressure. Implementing task-granularity
backpressure cross-process turns out to be more or less impossible
without augmenting our streaming protocol (likely at the cost of
performance). Further allowing overruns requires special care since
any blocking of the runtime RPC msg loop task effectively can block
control msgs such as cancels and stream terminations.

The implementation details per abstraction layer are as follows.

._streaming.Context:
- add a new constructor factory func `mk_context()` which provides
  a strictly private init-er whilst allowing us to not have to define
  an `.__init__()` on the type def.
- add public `.cancel_called` and `.cancel_called_remote` properties.
- general rename of what was the internal `._backpressure` var to
  `._allow_overruns: bool`.
- move the old contents of `Actor._push_result()` into a new
  `._deliver_msg()` allowing for better encapsulation of per-ctx
  msg handling.
 - always check for received 'error' msgs and process them with the new
   `_maybe_cancel_and_set_remote_error()` **before** any msg delivery to
   the local task, thus guaranteeing error and cancellation handling
   despite any overflow handling.
- add a new `._drain_overflows()` task-method for use with new
  `._allow_overruns: bool = True` mode.
 - add back a `._scope_nursery: trio.Nursery` (allocated in
   `Portal.open_context()`) whose sole purpose is to spawn a single task
   which runs the above method; anything else is an error.
 - augment `._deliver_msg()` to start a task and run the above method
   when operating in no overrun mode; the task queues overflow msgs and
   attempts to send them to the underlying mem chan using a blocking
   `.send()` call.
 - on context exit, any existing "drainer task" will be cancelled and
   remaining overflow queued msgs are discarded with a warning.
- rename `._error` -> `_remote_error` and set it in a new method
  `_maybe_cancel_and_set_remote_error()` which is called before
  processing
- adjust `.result()` to always call `._maybe_raise_remote_err()` at its
  start such that whenever a `ContextCancelled` arrives we do logic for
  whether or not to immediately raise that error or ignore it due to the
  current actor being the one who requested the cancel, by checking the
  error's `.canceller` field.
 - set the default value of `._result` to be `id(Context())` thus avoiding
   conflict with any `.result()` actually being `False`..

._runtime.Actor:
- augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to
  take a `requesting_uid: tuple` indicating the source actor of every
  cancellation request.
- pass through the new `Context._allow_overruns` through `.get_context()`
- call the new `Context._deliver_msg()` from `._push_result()` (since
  the factoring out that method's contents).

._runtime._invoke:
- `TaskStatus.started()` back a `Context` (unless an error is raised)
  instead of the cancel scope to make it easy to set/get state on that
  context for the purposes of cancellation and remote error relay.
- always raise any remote error via `Context._maybe_raise_remote_err()`
  before doing any `ContextCancelled` logic.
- assign any `Context._cancel_called_remote` set by the `requesting_uid`
  cancel methods (mentioned above) to the `ContextCancelled.canceller`.

._runtime.process_messages:
- always pass a `requesting_uid: tuple` to `Actor.cancel()` and
  `._cancel_task` so that any corresponding `ContextCancelled.canceller`
  can be set inside `._invoke()`.
2023-05-15 10:00:45 -04:00
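
Note on the "allow overruns" mode from c72026091e: a simplified sketch of the idea of queueing overflow msgs and draining them from a single background task, so that the delivering side never blocks. It deliberately uses stdlib `asyncio` as a stand-in for `trio` memory channels and nurseries to stay dependency-free; `OverrunBuffer` and all of its methods are hypothetical names, not the runtime's API.

```python
import asyncio
from collections import deque


class OverrunBuffer:
    '''Deliver msgs to a bounded queue without ever blocking the sender.'''

    def __init__(self, maxsize: int) -> None:
        self.chan = asyncio.Queue(maxsize=maxsize)
        self.overflow = deque()
        self._drainer = None

    def deliver(self, msg: object) -> None:
        # Called from the (hypothetical) msg loop, so it must not block.
        if not self.overflow and not self.chan.full():
            self.chan.put_nowait(msg)
            return

        # Overrun: queue the msg and ensure one drainer task is running.
        self.overflow.append(msg)
        if self._drainer is None or self._drainer.done():
            loop = asyncio.get_running_loop()
            self._drainer = loop.create_task(self._drain())

    async def _drain(self) -> None:
        while self.overflow:
            # Peek, block until there is room, then pop: keeps FIFO order
            # because `deliver()` keeps appending while a msg is in flight.
            await self.chan.put(self.overflow[0])
            self.overflow.popleft()

    def close(self) -> int:
        '''Cancel any drainer and return how many msgs were discarded.'''
        if self._drainer is not None:
            self._drainer.cancel()
        dropped = len(self.overflow)
        self.overflow.clear()
        return dropped


async def demo() -> tuple:
    buf = OverrunBuffer(maxsize=2)
    for i in range(5):
        buf.deliver(i)  # never blocks, even past capacity
    received = [await buf.chan.get() for _ in range(5)]

    # Exit with msgs still queued: they are dropped, not delivered.
    closing = OverrunBuffer(maxsize=1)
    for i in range(3):
        closing.deliver(i)
    return received, closing.close()


received, dropped = asyncio.run(demo())
```

The design point this tries to show is that the sender side only ever does non-blocking work, so control msgs such as cancels are not stuck behind a slow consumer.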
Tyler Goodlet 90e41016b9 Only tuplize `.canceller` if non-`None` 2023-05-15 10:00:45 -04:00
Tyler Goodlet f54c415060 Move `NoRuntime` import inside `current_actor()` to avoid cycle 2023-05-15 10:00:45 -04:00
Tyler Goodlet 03644f59cc Augment test cases for callee-returns-result early
Turns out stuff was totally broken in these cases because we're either
closing the underlying mem chan too early or not handling the
"allow_overruns" mode's cancellation correctly..
2023-05-15 10:00:45 -04:00
Tyler Goodlet 67f82c6ebd Add new remote error introspection attrs
To handle remote cancellation this adds `ContextCancelled.canceller:
tuple` the uid of the cancel requesting actor and is expected to be set
by the runtime when servicing any remote cancel request. This makes it
possible for `ContextCancelled` receivers to know whether "their actor
runtime" is the source of the cancellation.

Also add an explicit `RemoteActorError.src_actor_uid` which better formalizes
the notion of "which remote actor" the error originated from.

Both of these new attrs are expected to be packed in the `.msgdata` when
the errors are loaded locally.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 71cd445319 Add new set of context cancellation tests
These will verify new changes to the runtime/messaging core which allows
us to adopt an "ignore cancel if requested by us" style handling of
`ContextCancelled` more like how `trio` does with
`trio.Nursery.cancel_scope.cancel()`. We now expect
a `ContextCancelled.canceller: tuple` which is set to the actor uid of
the actor which requested the cancellation which eventually resulted in
the remote error-msg.

Also adds some experimental tweaks to the "backpressure" test which it
turns out is very problematic in coordination with context cancellation
since blocking on the feed mem chan to some task will block the ipc msg
loop and thus handling of cancellation.. More to come to both the test
and core to address this hopefully since right now this test is failing.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 220b244508 Log waiter task cancelling msg as cancel-level 2023-05-15 10:00:45 -04:00
Tyler Goodlet 831790377b Assign `RemoteActorError` boxed error type for context cancelleds 2023-05-15 10:00:45 -04:00
Tyler Goodlet e80e0a551f Change a bunch of log levels to cancel, including any `ContextCancelled` handling 2023-05-15 10:00:45 -04:00
Tyler Goodlet b3f9251eda Add some log-level method doc-strings 2023-05-15 10:00:45 -04:00
Tyler Goodlet 903537ce04 Tweak context doc str 2023-05-15 10:00:45 -04:00
Tyler Goodlet d75343106b More single doc-strs in discovery mod 2023-05-15 10:00:45 -04:00
Tyler Goodlet cfb2bc0fee Enable `Context` backpressure by default; avoid startup race-crashes? 2023-05-15 10:00:45 -04:00
goodboy e5ee2e3de8
Merge pull request from goodboy/switch_to_pdbp
Switch to `pdbp` 🏄🏼
2023-05-15 09:58:58 -04:00
Tyler Goodlet 41aa91c8eb Add news file 2023-05-15 09:35:59 -04:00
Tyler Goodlet 6758e4487c Drop lingering `pdbpp` comment-refs in tests 2023-05-15 09:14:42 -04:00
Tyler Goodlet 1c3893a383 Drop commented `pdbpp` import logic 2023-05-15 09:01:55 -04:00
Tyler Goodlet 73befac9bc Switch to `pdbp` in test reqs 2023-05-15 09:01:27 -04:00
Tyler Goodlet 79622bbeea Restore `breakpoint()` hook after runtime exits
Previously we were leaking our (pdb++) override into the Python runtime
which would always result in a runtime error whenever `breakpoint()` is
called outside our runtime, i.e. after exit of the root actor. This
explicitly restores any previous hook override (detected during startup)
or deletes the hook and restores the environment if none existed prior.

Also adds a new WIP debugging example script to ensure breakpointing
works as normal after runtime close; this will be added to the test
suite.
2023-05-15 00:47:29 -04:00
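
Note on restoring the `breakpoint()` hook (79622bbeea): a minimal sketch of the save-then-restore pattern using only `sys.breakpointhook` and the `PYTHONBREAKPOINT` env var from the stdlib. The context manager name, the `'mypkg.debug.pause'` path and the no-op hook are hypothetical; the runtime's actual startup and teardown code is not shown here.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def override_breakpoint_hook(hook_path: str, hook):
    '''
    Install a custom `breakpoint()` hook and put back whatever was
    configured before once the block exits.
    '''
    # Detect any pre-existing override during "startup".
    prev_env = os.environ.get('PYTHONBREAKPOINT')
    prev_hook = sys.breakpointhook

    os.environ['PYTHONBREAKPOINT'] = hook_path
    sys.breakpointhook = hook
    try:
        yield
    finally:
        sys.breakpointhook = prev_hook
        if prev_env is None:
            # No override existed prior: delete ours entirely.
            os.environ.pop('PYTHONBREAKPOINT', None)
        else:
            os.environ['PYTHONBREAKPOINT'] = prev_env


hook_before = sys.breakpointhook
env_before = os.environ.get('PYTHONBREAKPOINT')

with override_breakpoint_hook('mypkg.debug.pause', lambda *a, **kw: None):
    env_inside = os.environ['PYTHONBREAKPOINT']

hook_after = sys.breakpointhook
env_after = os.environ.get('PYTHONBREAKPOINT')
```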
Tyler Goodlet 95535b2226 Some more 3.10+ optional type sigs 2023-05-15 00:47:29 -04:00
Tyler Goodlet 87c6e09d6b Switch readme links to point @ `pdbp` B) 2023-05-14 22:52:24 -04:00
Tyler Goodlet 9ccd3a74b6 More detailed preface description 2023-05-14 22:38:47 -04:00
Tyler Goodlet ae4ff5dc8d pdbp: adding typing to config settings vars 2023-05-14 22:38:46 -04:00
Tyler Goodlet 705538398f `pdbp`: turn off line truncating by default, fixes terminal resizing stuff 2023-05-14 22:38:16 -04:00
Tyler Goodlet 86aef5238d Hide actor nursery exit frame 2023-05-14 21:24:26 -04:00
Tyler Goodlet cc82447db6 First try: switch debug machinery over to `pdbp` B) 2023-05-14 21:24:26 -04:00
Tyler Goodlet 23cffbd940 Use multiline import for debug mod 2023-05-14 21:24:26 -04:00
Tyler Goodlet 3d202272c4 Change over debugger tests to use `PROMPT` var.. 2023-05-14 21:24:26 -04:00
Tyler Goodlet 63cdb0891f Switch to `pdbp` since no one is maintaining `pdbpp` 2023-05-14 21:24:26 -04:00
goodboy 0f7db27b68
Merge pull request from goodboy/drop_proc_actxmngr
`trio.Process.aclose()`?
2023-05-14 20:59:53 -04:00
Tyler Goodlet c53d62d2f7 Add news file 2023-05-14 20:31:26 -04:00
Tyler Goodlet f667d16d66 Copy the now deprecated `trio.Process.aclose()`
Move it into our `_spawn.do_hard_kill()` since we do indeed rely on
the particular process killing sequence on "soft kill" failure cases.
2023-05-14 19:31:50 -04:00
Tyler Goodlet 24a062341e Just call `trio.Process.aclose()` directly for now? 2023-04-02 14:34:41 -04:00
goodboy e714bec8db
Merge pull request from kehrazy/patch-1
fixed the `Zombie` example having wrong indentation
2023-04-01 12:11:47 -04:00
Igor 009cd6552e
fixed the `Zombie` example having wrong indentation 2023-03-31 17:50:46 +03:00
goodboy 649c5e7504
Merge pull request from goodboy/breceiver_internals
Avoid inf recursion in `BroadcastReceiver.receive()`
2023-01-30 14:01:13 -05:00
Tyler Goodlet 203f95615c Add nooz 2023-01-30 12:42:26 -05:00
Tyler Goodlet efb8bec828 Add a basic no-raise-on lag test 2023-01-30 12:26:07 -05:00
Tyler Goodlet 8637778739 Expose `raise_on_lag: bool` flag through factory 2023-01-30 12:18:23 -05:00
Tyler Goodlet 47166e45f0 Be explicit with passthrough kwargs (there's so few) 2023-01-29 17:31:21 -05:00
Tyler Goodlet 4ce2dcd12b Switch back to raising `Lagged` by default
Makes the broadcast test suite not hang xD, and is our expected default
behaviour. Also removes a ton of commented legacy cruft from before the
refactor to remove the `.receive()` recursion and fixes some typing.

Oh right, and in the case where there's only one subscriber left we warn
log about it since in theory we could actually entirely unwind the
bcaster back to the original underlying, though not sure if that's sane
or works for some use cases (like wanting to have some other subscriber
get added dynamically later).
2023-01-29 15:03:34 -05:00
Tyler Goodlet 80f983818f Ignore monkey patched `.send()` type annot 2023-01-29 15:03:34 -05:00
Tyler Goodlet 6ba29f8d56 Recurse and get the last value when in warn mode 2023-01-29 15:03:34 -05:00
Tyler Goodlet 2707a0e971 Add `._raise_on_lag` flag to disable `Lag` raising 2023-01-29 15:03:34 -05:00
Tyler Goodlet c8efcdd0d3 Drop `ReceiveMsgStream` from test suite 2023-01-29 15:03:34 -05:00
Tyler Goodlet 9f9907271b Merge `ReceiveMsgStream` and `MsgStream`
Since one-way streaming can be accomplished by just *not* sending on one
side (and/or thus wrapping such usage in a more restrictive API), we
just drop the recv-only parent type. The only method different was
`MsgStream.send()`, now merged in. Further in usage of `.subscribe()`
we monkey patch the underlying stream's `.send()` onto the delivered
broadcast receiver so that subscriber tasks can two-way stream as though
using the stream directly.

This allows us to more definitively drop `tractor.open_stream_from()` in
the longer run if we so choose as well; note currently this will
potentially create an issue if a caller tries to `.send()` on such a one
way stream.
2023-01-29 15:03:34 -05:00
Tyler Goodlet c2367c1c5e Better `trio`-ize `BroadcastReceiver` internals
Driven by a bug found in `piker` where we'd get an inf recursion error
due to `BroadcastReceiver.receive()` being called when consumer tasks
are awoken but no value is ready to `.receive_nowait()`.

This new rework takes an approach closer to the interface and internals
of `trio.MemoryReceiveChannel` particularly in terms of,

- implementing a `BroadcastReceiver.receive_nowait()` and using it
  within the async `.receive()`.
- failing over to an internal `._receive_from_underlying()` when the
  `_nowait()` call raises `trio.WouldBlock`.
- adding `BroadcastState.statistics()` for debugging and testing
  dropping recursion from `.receive()`.
2023-01-29 15:03:34 -05:00
goodboy a777217674
Merge pull request from goodboy/ipc_failure_while_streaming
Ipc failure while streaming
2023-01-29 15:02:54 -05:00
Tyler Goodlet 13c9eadc8f Move result log msg up and drop else block 2023-01-29 14:55:02 -05:00
Tyler Goodlet af6c325072 Bump up legacy streaming timeout a smidgen 2023-01-29 14:55:02 -05:00
Tyler Goodlet 195d2f0ed4 Add nooz 2023-01-29 14:55:02 -05:00
Tyler Goodlet aa4871b13d Call `MsgStream.aclose()` in `Context.open_stream.__aexit__()`
We weren't doing this originally I *think* just because of the path
dependent nature of the way the code was developed (originally being
mega pedantic about one-way vs. bidirectional streams) but, it doesn't
seem like there's any issue just calling the stream's `.aclose()`; also
have the benefit of just being less code and logic checks B)
2023-01-29 14:55:02 -05:00
Tyler Goodlet 556f4626db Tweak warning msg for still-alive-after-cancelled actor 2023-01-29 14:55:02 -05:00
Tyler Goodlet 3967c0ed9e Add a simplified zombie lord specific process reaping test 2023-01-29 14:55:02 -05:00
Tyler Goodlet e34823aab4 Add parent vs. child cancels first cases 2023-01-29 14:55:02 -05:00
Tyler Goodlet 6c35ba2cb6 Add IPC breakage on both parent and child side
With the new fancy `_pytest.pathlib.import_path()` we can do real
parametrization of the example-script-module code and thus configure
whether the child, parent, or both silently break the IPC connection.

Parametrize the test for all the above mentioned cases as well as the
case where the IPC never breaks but we still simulate the user hammering
ctl-c / SIGINT to terminate the actor tree. Adjust expected errors based
on each case and heavily document each of these.
2023-01-29 14:55:02 -05:00
Tyler Goodlet 3a0817ff55 Skip `advanced_faults/` subset in docs examples tests 2023-01-29 14:55:02 -05:00
Tyler Goodlet 7fddb4416b Handle `mp` spawn method cases in test suite 2023-01-29 14:55:02 -05:00
Tyler Goodlet 1d92f2552a Adjust other examples tests to expect `pathlib` objects 2023-01-29 14:55:02 -05:00
Tyler Goodlet 4f8586a928 Wrap ex in new test, change dir helpers to use `pathlib.Path` 2023-01-29 14:55:02 -05:00
Tyler Goodlet fb9ff45745 Move example to a new `advanced_faults` egs subset dir 2023-01-29 14:55:02 -05:00
Tyler Goodlet 36a83cb306 Refine example to drop IPC mid-stream
Use a task nursery in the subactor to spawn tasks which cancel the IPC
channel mid stream to simulate the most concurrent case we're likely to
see. Make `main()` accept a `debug_mode: bool` for parametrization. Fill
out detailed comments/docs on this example.
2023-01-29 14:55:02 -05:00
Tyler Goodlet 7394a187e0 Name one-way streaming (con generators) what it is 2023-01-29 14:55:02 -05:00
Tyler Goodlet df01294bb2 Show more functiony syntax in ctx-cancelled log msgs 2023-01-29 14:55:02 -05:00
Tyler Goodlet ddf3d0d1b3 Show tracebacks for un-shipped/propagated errors 2023-01-29 14:55:02 -05:00
Tyler Goodlet 158569adae Add WIP example of silent IPC breaks while streaming 2023-01-29 14:55:02 -05:00
Tyler Goodlet 97d5f7233b Fix uid2nursery lookup table type annot 2023-01-29 14:55:02 -05:00
Tyler Goodlet d27c081a15 Ensure arbiter sockaddr type before usage 2023-01-29 14:55:02 -05:00
Tyler Goodlet a4874a3227 Always set the `parent_exit: trio.Event` on exit 2023-01-29 14:55:02 -05:00
Tyler Goodlet de04bbb2bb Don't raise on a broken IPC-context when sending stop msg 2023-01-29 14:55:02 -05:00
Tyler Goodlet 4f977189c0 Handle broken mem chan on `Actor._push_result()`
When backpressure is used and a feeder mem chan breaks during msg
delivery (usually because the IPC allocating task already terminated)
instead of raising we simply warn as we do for the non-backpressure
case.

Also, add a proper `Actor.is_arbiter` test inside `._invoke()` to avoid
doing an arbiter-registry lookup if the current actor **is** the
registrar.
2023-01-29 14:55:02 -05:00
goodboy 9fd62cf71f
Merge pull request from goodboy/deprecate_arbiter_addr
Begin deprecation of `arbiter_addr` -> `registry_addr`
2023-01-26 16:05:41 -05:00
Tyler Goodlet 606efa5bb7 Adjust daemon command to use new `registry_addr` 2023-01-26 16:00:08 -05:00
Tyler Goodlet 121a8cc891 Drop `Optional` usage from root mod 2023-01-26 16:00:08 -05:00
Tyler Goodlet c54b8ca4ba Begin deprecation of `arbiter_addr` -> `registry_addr` 2023-01-26 16:00:08 -05:00
goodboy de93c8257c
Merge pull request from goodboy/prompt_on_ctrlc
Re-draw `pdbpp` prompt on `SIGINT`
2023-01-26 15:56:37 -05:00
Tyler Goodlet 5b8a87d0f6 Slightly better `xonsh` check hack, fix typing 2023-01-26 15:48:15 -05:00
Tyler Goodlet 9e5c8ce6f6 Add nooz file 2023-01-26 15:39:03 -05:00
Tyler Goodlet 965cd406a2 Use std `pdbpp` release 2023-01-26 15:27:55 -05:00
Tyler Goodlet 2e278ceb74 Add a super hacky check for `xonsh`, smh.. 2023-01-26 15:26:43 -05:00
Tyler Goodlet 6d124db7c9 Never run ctlc-with-intermediary-actor cases locally either 2023-01-26 12:44:13 -05:00
Tyler Goodlet dba8118553 Always attempt prompt redraw on ctl-c in REPL
The stdlib has all sorts of muckery with ignoring SIGINT in the
`Pdb._cmdloop()` but here we just override all that since we don't trust
their decisions about cancellation handling whatsoever. Adds
a `Lock.repl: MultiActorPdb` attr which is set by any task which
acquires root TTY lock indicating (via actor global state) that the
current actor is using the debugger REPL and can be expected to re-draw
the prompt on SIGINT. Further we mask out log messages from any actor
who also has the `shield_sigint_handler()` enabled to avoid logging
noise when debugging.
2023-01-26 12:44:13 -05:00
Tyler Goodlet fca2e7c10e Simplify closed abruptly log msg 2023-01-26 12:44:13 -05:00
Tyler Goodlet 5ed62c5c54 Add note about intermediary-actor in debug issue 2023-01-26 12:44:13 -05:00
goodboy 588b7ca7bf
Merge pull request from goodboy/harden_cluster_tests
Harden cluster tests
2022-12-12 15:02:23 -05:00
Tyler Goodlet d8214735b9 Add bugfix nooz 2022-12-12 14:53:59 -05:00
Tyler Goodlet 48f6d514ef Handle earlier name error crash in debug test 2022-12-12 14:05:32 -05:00
Tyler Goodlet 6c8cacc9d1 Adjust all default is `None` annots (per new `mypy`) 2022-12-12 13:18:22 -05:00
Tyler Goodlet 38326e8c15 Avoid error on context double pops 2022-12-11 23:46:33 -05:00
Tyler Goodlet b5192cca8e Always greedily `list`-cast `mngrs` input sequence 2022-12-11 23:20:58 -05:00
Tyler Goodlet c606be8c64 Passthrough runtime kwargs from `open_actor_cluster()` 2022-12-11 19:56:08 -05:00
Tyler Goodlet d8e48e29ba Add `mngrs=(<gen_comprehension>)` test 2022-12-11 19:56:01 -05:00
goodboy a0f6668ce8
Merge pull request from goodboy/exceptiongroups
`ExceptionGroup`s and `trio>=0.22`
2022-10-14 20:11:26 -04:00
Tyler Goodlet 274c66cf9d Add nooz 2022-10-14 19:42:23 -04:00
Tyler Goodlet f2641c8964 Avoid "task never called `.started()`" runtime errors when cancelling 2022-10-14 19:42:23 -04:00
Tyler Goodlet c47575997a Expand nested case to include error prop and breakpointing 2022-10-14 19:42:23 -04:00
Tyler Goodlet f39414ce12 Drop error-repacking for `.run_in_actor()`s block
If we pack the nursery parent task's error into the `errors` table
directly in the handler, we don't need to specially handle packing that
same error into any exception group raised while handling sub-actor
cancellation; drops some ugly indentation ;)
2022-10-14 19:42:23 -04:00
Tyler Goodlet 0a1bf8e57d Tolerate eg in runtime test teardown 2022-10-14 19:42:23 -04:00
Tyler Goodlet e298b70edf Drop added `.pdp()` level msgs used during dev 2022-10-14 19:42:23 -04:00
Tyler Goodlet c0dd5d7ffc Adjust multi-daemon test to be more deterministic 2022-10-14 19:42:23 -04:00
Tyler Goodlet 347591c348 Expect egs in tests which retrieve portal results 2022-10-14 19:42:23 -04:00
Tyler Goodlet 38f9d35dee Fix errors table type annot 2022-10-14 19:42:23 -04:00
Tyler Goodlet 88448f7281 Fix handler type annot 2022-10-14 19:42:23 -04:00
Tyler Goodlet 0956d5f461 Restore the `trio` SIGINT handler, cancel root lock tasks on no-peers
Pretty sure this is the final touch to alleviate all our debug lock
headaches! Instead of trying to revert to the "last" handler (as `pdb`
does internally in the stdlib) we always just revert to the handler
`trio` registers during startup. Further this seems to allow cancelling
the root-side locking task if it's detected as stale IFF we only do this
when the root actor is in a "no more IPC peers" state.

Deatz:
- (always) set `._debug.Lock._trio_handler` as the `trio` version, not
  some last used handler to make sure we're getting the ctrl-c handling
  we want when not in debug mode.
- assign the trio handler in `open_root_actor()` and
  `._runtime._async_main()` to be sure it's applied in subactors as well
  as the root.
- only do debug lock blocking and root-side-locking-task cancels when
  a "no peers" condition is detected in the root actor: i.e. no IPC
  channels are detected by the root meaning it's impossible any actor
  has a sane lock-state ongoing for debug mode.
2022-10-14 18:18:01 -04:00
Tyler Goodlet c646c79a82 Adjust root-errors debug tests for blocking and egs 2022-10-14 18:18:01 -04:00
Tyler Goodlet 33f2234baf Hide some stack layers the user doesn't really need to see 2022-10-14 18:18:01 -04:00
Tyler Goodlet 7521bded3d Pack error from the parent task into the actor nursery 2022-10-14 18:16:51 -04:00
Tyler Goodlet 0f523b65fb Change cancel test over the exception group 2022-10-14 18:16:51 -04:00
Tyler Goodlet 50fe098e06 First pass, swap `MultiError` for `BaseExceptionGroup` 2022-10-14 18:16:51 -04:00
Tyler Goodlet d87d6af7e1 Add `exceptiongroup` (3.11 backport lib) as dep 2022-10-14 18:16:51 -04:00
Tyler Goodlet df69aedcd5 Pin to latest `trio` version 2022-10-14 18:16:51 -04:00
Tyler Goodlet b15e4ed9ce Adjust "no arbiter" test for new runtime defaults
Turns out this test was being silently ignored due to incorrect usage of
sync opening of our `.open_nursery()` block (with a `with` not `async
with`) and thus was a noop XD

Instead this fixes the test to call a `tractor` discovery built-in
without starting the runtime (which is now done implicitly when a user
opens a nursery) which should result in the prior expected outcome,
a `RuntimeError`.
2022-10-12 12:46:20 -04:00
Tyler Goodlet 98056f6ed7 Move logging context map into `log.py` module 2022-10-12 12:46:20 -04:00
goodboy 247d3448ae
Merge pull request from goodboy/debug_lock_blocking
Debug lock blocking
2022-10-12 12:41:14 -04:00
Tyler Goodlet fc17f6790e Bump `towncrier` alpha version 2022-10-12 12:36:09 -04:00
Tyler Goodlet b81b6be98a Drop extra log msgs, some old commented code 2022-10-12 12:35:35 -04:00
Tyler Goodlet 72fbda4cef Add nooz file 2022-10-12 12:35:11 -04:00
Tyler Goodlet fb721f36ef Support debug-lock blocking, use on no-more IPC
This is a lingering debugger locking race case we needed to handle:

- child crashes, acquires TTY lock in root and attaches to `pdb`
- child IPC goes down such that all channels to the root are broken
  / non-functional.
- root is stuck thinking the child is still in debug even though it
  can't be contacted and the child actor machinery hasn't been
  cancelled by its parent.
- root gets stuck in deadlock with child since it won't send a cancel
  request until the child is finished debugging, but the child can't
  unlock the debugger bc IPC is down.

To avoid this scenario add a debug lock blocking list via
`._debug.Lock._blocked: set[tuple]` which holds actor uids for any actor
that is detected by the root as having no transport channel connections
with said root (of which at least one should exist if this sub-actor at
some point acquired the debug lock). The root consequently checks this
list for any actor that tries to (re)acquire the lock and blocks with
a `ContextCancelled`. When a debug condition is tested in
`._runtime._invoke` the context's `._enter_debugger_on_cancel` is
checked; it is set to `False` if the actor is on the block list, in which
case the post-mortem entry is skipped.

Further this adds a root-locking-task side cancel scope to
`Lock._root_local_task_cs_in_debug` which can be cancelled by the root
runtime when a stale lock is detected after all IPC channels for the
actor have been torn down. NOTE: right now we're NOT doing this since it
seems to cause test failures, likely because it may cause premature
cancellation and maybe needs a bit more experimenting?
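
A minimal sketch of the block-list check described above, not the actual
`._debug` code: `on_channels_gone()`, `acquire_debug_lock()` and the local
`ContextCancelled` class are hypothetical stand-ins; only the
`Lock._blocked` attribute name comes from this commit.

```python
class ContextCancelled(Exception):
    """Stand-in for the runtime's context-cancelled error."""


class Lock:
    # uids of actors the root considers unreachable over IPC
    _blocked: set[tuple[str, str]] = set()


def on_channels_gone(uid: tuple[str, str], open_channels: int) -> None:
    # hypothetical hook: block an actor once the root sees zero
    # transport channels left for it
    if open_channels == 0:
        Lock._blocked.add(uid)


def acquire_debug_lock(uid: tuple[str, str]) -> tuple[str, str]:
    # hypothetical entry point: refuse (re)acquisition for blocked actors
    if uid in Lock._blocked:
        raise ContextCancelled(f'{uid} has no IPC channel to the root')
    return uid
```

The point of keeping a set of uids is that the check on (re)acquire stays
a cheap membership test in the root.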
2022-10-11 20:00:05 -04:00
Tyler Goodlet 734d8dd663 Move `trio` scope outside first inter-task-chan receive 2022-10-11 20:00:05 -04:00
Tyler Goodlet 30ea7a06b0 Avoid inf nursery hang by reversing `async with` ordering 2022-10-11 20:00:05 -04:00
Tyler Goodlet 3398153c52 Add timeout around `trio`-callee-task 2022-10-11 20:00:05 -04:00
Tyler Goodlet 1c480e6c92 Add `Context` cancel message and debug toggle flag
In the case of a callee-side context cancelling itself it can be handy
to let the caller-side task know (even if through logging) that the
cancel was due to some known reason. Make `.cancel()` accept such
a message on the callee side and have it included in the
`._runtime._invoke()` raised `ContextCancelled` emission.

Also add a `Context._trigger_debugger_on_cancel: bool` flag which can be
set to `False` to avoid the debugger post-mortem crash mode from
engaging on cross-context tasks which cancel themselves for a known
reason (as is needed for blocked tasks in the debug TTY-lock machinery).
2022-10-11 20:00:05 -04:00
goodboy dfdad4d1fa
Merge pull request from goodboy/callable_key_maybe_open_context
Callable key input to maybe open context
2022-10-10 00:32:27 -04:00
Tyler Goodlet b892bc74f6 Add trivial news snippet 2022-10-09 21:27:23 -04:00
Tyler Goodlet 44b59f3338 Go back to a `global` single-ton nursery per actor
Turns out the lifetime mgmt of separate nurseries per delegate manager
is tricky; a new nursery can't be naively allocated on cache-misses since
it may get closed by some early terminating task instead of by the "last
using" consumer task. In theory if we allocate using the same logic as
that used for the last-task-triggers-exit then this should work?

For now just go back to a single global nursery per `_Cache` which still
avoids use of the internal actor service nursery.
2022-10-09 21:27:23 -04:00
Tyler Goodlet 7a719ac2a7 Use one nursery per unique manager (signature)
Instead of sticking all `trionics.maybe_open_context()` tasks inside the
actor's (root) service nursery, open a unique one per manager function
instance (id).

Further, accept a callable for the `key` such that a user can have
more flexible control on the caching logic and move the
`maybe_open_nursery()` helper out of the portal mod and into this
trionics "managers" module.
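
Roughly what a callable `key` buys you, shown as a synchronous stand-in
(the real helper is an async context manager; `cached_call()`,
`open_feed()` and `by_symbol` are made-up names for illustration only):

```python
from typing import Any, Callable, Hashable

# one shared cache; entries are namespaced per factory function id
_cache: dict[Hashable, Any] = {}


def cached_call(
    factory: Callable[..., Any],
    kwargs: dict[str, Any],
    key: Any = None,
) -> tuple[bool, Any]:
    if callable(key):
        # user-supplied function derives the cache key from the kwargs
        ctx_key = key(**kwargs)
    elif key is not None:
        ctx_key = key
    else:
        # fallback (an assumption here): key on the full kwargs
        ctx_key = tuple(sorted(kwargs.items()))

    full_key = (id(factory), ctx_key)
    if full_key in _cache:
        return True, _cache[full_key]

    value = factory(**kwargs)
    _cache[full_key] = value
    return False, value


def open_feed(symbol: str, depth: int) -> str:
    return f'{symbol}@{depth}'


def by_symbol(symbol: str, depth: int) -> str:
    # ignore `depth` so every caller of one symbol shares an entry
    return symbol
```

With `key=by_symbol` two calls that differ only in `depth` share one
cached value, which a plain kwargs-derived key would not allow.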
2022-10-09 21:27:23 -04:00
goodboy 9e6266dda3
Merge pull request from goodboy/spawn_backend_table
Spawn backend table
2022-10-09 21:26:28 -04:00
Tyler Goodlet b1abec543f Add trivial news snippet 2022-10-09 18:51:31 -04:00
Tyler Goodlet 93b9d2dc2d Drop dynamic backend-spawn-method test generation 2022-10-09 18:29:50 -04:00
Tyler Goodlet 4d808757a6 Fix start method name in logging propagation test 2022-10-09 18:22:55 -04:00
Tyler Goodlet 7e5bb0437e Go to latest `mypy` version in CI 2022-10-09 18:13:45 -04:00
Tyler Goodlet b19f08d9f0 Fill out new backend names in ci script 2022-10-09 18:08:07 -04:00
Tyler Goodlet 2c20b2d64f Fix import to load from `conftest.py` 2022-10-09 18:03:17 -04:00
Tyler Goodlet 023b6fc845 Drop `tractor.testing` sub-package 2022-10-09 17:57:02 -04:00
Tyler Goodlet d24fae8381 'Rename mp spawn methods to have a `'mp_'` prefix' 2022-10-09 17:54:55 -04:00
Tyler Goodlet 5ab98513b7 Move `@tractor_test` into `conftest.py` 2022-10-09 17:14:20 -04:00
Tyler Goodlet 90f4912580 Organize process spawning into lookup table
Instead of the logic branching create a table `._spawn._methods`
which is used to lookup the desired backend framework (in this case
still only one of `multiprocessing` or `trio`) and make the top level
`.new_proc()` do the lookup and any common logic. Use a `typing.Literal`
to define the lookup table's key set.

Repair and ignore a bunch of type-annot related stuff to do with `mypy`
updates and backend-specific process typing.
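
A sketch of the table-lookup shape, assuming nothing about the real
backend signatures: the key names (`'trio'`, `'mp'`), the `*_proc()`
functions and their string return values are placeholders; only
`_methods`, `new_proc()` and the use of `typing.Literal` are taken from
the description above.

```python
from typing import Callable, Literal, get_args

# hypothetical key set for the lookup table
SpawnMethodKey = Literal['trio', 'mp']


def trio_proc(name: str) -> str:
    # placeholder backend
    return f'trio:{name}'


def mp_proc(name: str) -> str:
    # placeholder backend
    return f'mp:{name}'


_methods: dict[SpawnMethodKey, Callable[[str], str]] = {
    'trio': trio_proc,
    'mp': mp_proc,
}


def new_proc(name: str, method: SpawnMethodKey = 'trio') -> str:
    try:
        spawn = _methods[method]
    except KeyError:
        raise ValueError(
            f'unknown spawn method {method!r}, '
            f'expected one of {get_args(SpawnMethodKey)}'
        ) from None
    # logic common to every backend would live here
    return spawn(name)
```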
2022-10-09 16:51:21 -04:00
goodboy 6e24e16068
Merge pull request from goodboy/pin_pre_trio_0.22
Pin pre-0.22 bc exception groups break everything
2022-10-09 16:26:56 -04:00
Tyler Goodlet 15047341bd Ignore forkserver override attrs with `mypy` 2022-10-09 16:14:11 -04:00
Tyler Goodlet dc295ab227 Pin pre-0.22 bc exception groups break everything 2022-10-09 16:11:06 -04:00
goodboy 6a0337b69d
Merge pull request from goodboy/lifetime_stack_tests
Expose lifetime stack as class attr, add base test suite
2022-09-16 18:09:24 -04:00
Tyler Goodlet e609183242 Expose lifetime stack as class attr, add base test suite 2022-09-15 23:50:15 -04:00
goodboy 368e9f3f7c
Merge pull request from goodboy/we_bein_all_matchy
3.10 and friends
2022-09-15 23:49:34 -04:00
Tyler Goodlet 10eeda2d2b Use built-ins for all data-structure-type annotations 2022-09-15 23:41:28 -04:00
Tyler Goodlet a113e22bb9 Add trivial nooz snippet 2022-09-15 23:41:28 -04:00
Tyler Goodlet ad19bf2cf1 Remove `tractor.run()` once and for all
It's been deprecated for a while now and all docs and tests have been
changed.

Closes 
2022-09-15 23:41:28 -04:00
Tyler Goodlet 9aef03772a Expose `Actor` at pkg level, adjust debug type annots 2022-09-15 23:41:28 -04:00
Tyler Goodlet 7548dba8f2 Change to new doc string style 2022-09-15 23:41:28 -04:00
Tyler Goodlet ba4d4e9af3 Change test import 2022-09-15 23:41:28 -04:00
Tyler Goodlet 208d56af2c Make `async_main()` a module func 2022-09-15 23:41:28 -04:00
Tyler Goodlet a3a5bc267e Make `process_messages()` a mod func 2022-09-15 23:41:28 -04:00
Tyler Goodlet d4084b2032 Rename our core module to `_runtime` 2022-09-15 23:41:28 -04:00
Tyler Goodlet 1e6b4d5dd4 Drop `msgspec` min pin 2022-09-15 23:41:28 -04:00
Tyler Goodlet c613acfe5c Start alpha 6 dev, ensure py3.10+ 2022-09-15 23:41:28 -04:00
goodboy fea9dc7065
Merge pull request from goodboy/debug_event_guard
Add debug complete event `None`-guard for when already reset
2022-09-15 23:20:38 -04:00
goodboy e558c427de
Merge pull request from goodboy/disable_win_ci
Disable win tests in CI
2022-09-15 23:20:26 -04:00
Tyler Goodlet f07c3aa4a1 Add nooz 2022-09-15 19:39:34 -04:00
Tyler Goodlet bafd10a260 Make `maybe_open_context()` re-entrant safe, use per factory locks 2022-09-15 19:02:02 -04:00
Tyler Goodlet 5ad540c417 Add debug complete event `None`-guard for when already reset 2022-09-15 19:02:02 -04:00
Tyler Goodlet 83b44cf469 Flip over PR number in readme 2022-09-15 18:54:51 -04:00
Tyler Goodlet 1f2001020e Mention disabled windows CI in readme 2022-09-15 18:46:34 -04:00
Tyler Goodlet 71f9881a60 Drop windows from CI until we get a collab that actually uses it XD 2022-09-15 18:36:45 -04:00
Tyler Goodlet e24645eec8 Drop `pytest` 3.10 issue comment, add todo for `pyreadline3` 2022-09-15 18:36:37 -04:00
Tyler Goodlet c3cdeeb3ba Drop `pytest` full trace flag, use `pip list` 2022-09-15 18:36:27 -04:00
Tyler Goodlet 9bd534df83 Drop 3.9 from CI jobs 2022-09-15 18:36:15 -04:00
goodboy c1d700f257
Merge pull request from goodboy/alpha5
`alpha5` release!
2022-08-03 14:36:52 -04:00
Tyler Goodlet 14c6e34658 Add summary section 2022-08-03 11:42:53 -04:00
Tyler Goodlet 3393bc23e4 Generate release news 2022-08-03 11:41:23 -04:00
Tyler Goodlet 171f1bc243 Move to using `pyproject.toml` for `towncrier`
Add explicit fragment types based on `pytest`'s config
and don't manually spec the version.
2022-08-03 11:36:23 -04:00
Tyler Goodlet ee02cd2496 Move misplaced fragment for 2022-08-03 10:54:22 -04:00
Tyler Goodlet 4c5d435aac Fix towncrier bug entry suffix 2022-08-03 10:21:37 -04:00
Tyler Goodlet a9b4a61620 Flip to non-dev version tag 2022-08-03 10:21:07 -04:00
goodboy 641ed7a32a
Merge pull request from goodboy/signint_saviour
Ignore SIGINT when in a debugger REPL
2022-08-03 09:26:54 -04:00
Tyler Goodlet cc5f60bba0 List deps in CI 2022-08-02 18:19:03 -04:00
Tyler Goodlet 8f1fe2376a Simplify all hooks to a common `Lock.release()` 2022-08-02 18:14:05 -04:00
Tyler Goodlet 65540f3e2a Add nooz 2022-08-02 15:29:33 -04:00
Tyler Goodlet 650313dfef Drop legacy handler blocks factored into `_acquire_debug_lock()` 2022-08-02 12:50:27 -04:00
Tyler Goodlet e4006da6f4 Drop `pdbpp` bug notes, add follow up issue note 2022-08-02 12:48:40 -04:00
Tyler Goodlet 7f6169a050 Drop legacy commented/todo remote debug helper block 2022-08-02 12:43:14 -04:00
Tyler Goodlet 2d387f2610 Add in issue link for nested cases 2022-08-02 12:17:34 -04:00
Tyler Goodlet 8115759984 Mark final nested-actor debugger test 2022-08-02 12:17:34 -04:00
Tyler Goodlet 02c3b9a672 Put `pygments` back to default 2022-08-02 12:17:34 -04:00
Tyler Goodlet fa4388835c Add an expect wrapper, use in hanging CI test 2022-08-02 12:17:34 -04:00
Tyler Goodlet 54de72d8df Loosen timeout on nested child re-locking 2022-08-02 12:17:34 -04:00
Tyler Goodlet c5c7a9027c Line len lint and drop rpc log msg level again 2022-08-02 12:17:34 -04:00
Tyler Goodlet e4771eec16 Go back to skipping since xfail is wack 2022-08-02 12:17:28 -04:00
Tyler Goodlet a9aaee9dbd Use xfails for nested cases, revert prompt expect 2022-08-02 12:17:28 -04:00
Tyler Goodlet acfbae4b95 Drop verbose level, report xfails 2022-08-02 12:17:28 -04:00
Tyler Goodlet aca9a6b99a Try just skipping nested actor tests in CI 2022-08-02 12:17:28 -04:00
Tyler Goodlet 8896ba2bf8 Use `assert_before` more extensively 2022-08-02 12:17:28 -04:00
Tyler Goodlet 87b2ccb86a Try less times for EOF 2022-08-02 12:17:28 -04:00
Tyler Goodlet 937ed99e39 Factor sigint overriding into lock methods 2022-08-02 12:17:28 -04:00
Tyler Goodlet 91f034a136 Move all module vars into a `Lock` type 2022-08-02 12:17:28 -04:00
Tyler Goodlet 08cf03cd9e Handle missing prompt render case? 2022-08-02 12:17:28 -04:00
Tyler Goodlet 5e23b3ca0d Drop pytest full-tracing in CI again 2022-08-02 12:17:28 -04:00
Tyler Goodlet 6f01c78122 Disable `pygments` highlighting on ctlc tests 2022-08-02 12:17:28 -04:00
Tyler Goodlet 457499bc2e Avoid infinite wait for EOF 2022-08-02 12:17:28 -04:00
Tyler Goodlet a4bac135d9 Use `pytest-timeout` plug to try and prevent CI hang 2022-08-02 12:17:28 -04:00
Tyler Goodlet 20c660faa7 Add timeout on spawn error msg check 2022-08-02 12:17:28 -04:00
Tyler Goodlet 1d4d55f5cd Increase verbosity in ci tests for now 2022-08-02 12:17:28 -04:00
Tyler Goodlet c0cd99e374 Timeout on arbiter ping, avoid TCP SYN hangs in CI? 2022-08-02 12:17:28 -04:00
Tyler Goodlet a4538a3d84 Drop ctlc tests on Py3.9...
After many tries I just don't think it's worth it to make the tests work
since the repl UX in `pdbpp` is so unreliable in the latest release and
honestly we're trying to go 3.10+ ASAP.

Further,
- entirely drop the pattern matching inside the `do_ctlc()` for now.
- add a `subactor_error` parametrization that catches a case that
  previously caused a hang (when you use 'next' immediately after the
  first crash/debug lock); the fix was pushed just before this commit.
2022-08-02 12:17:28 -04:00
Tyler Goodlet b01daa5319 Factor lock-state release logic into helper
The common logic to both remove our custom SIGINT handler as well
as signal the actor global event that pdb is complete. Call this
whenever we exit a post mortem call and thus any time some rpc task
gets debugged inside `._actor._invoke()`.

Further, we have to manually print the REPL prompt on 3.9 for some wack
reason, so stick a version guard in the sigint handler for that..
2022-08-02 12:17:28 -04:00
Tyler Goodlet bd362a05f0 Run release hook around `next` repl commands as well 2022-08-02 12:17:28 -04:00
Tyler Goodlet cb0c47c42a Try disabling prompt expect in ctrlc cases 2022-08-02 12:17:28 -04:00
Tyler Goodlet 808d7ae2c6 Add timeout guard around caller side context open 2022-08-02 12:17:28 -04:00
Tyler Goodlet b21f2e16ad Always consider the debugger when exiting contexts
When in an uncertain teardown state and in debug mode a context can be
popped from actor runtime before a child finished debugging (the case
when the parent is tearing down but the child hasn't closed/completed
its tty lock IPC exit phase) and the child sends the "stop" message to
unlock the debugger but it's ignored bc the parent has already dropped
the ctx. Instead we call `._debug.maybe_wait_for_debugger()` before these
context removals to avoid the root getting stuck thinking the lock was
never released.

Further, add special `Actor._cancel_task()` handling code inside
`_invoke()` which continues to execute the method despite the IPC
channel to the caller being broken and thus avoiding potential hangs due
to a target (child) actor task remaining alive.
2022-08-02 12:17:28 -04:00
Tyler Goodlet 4779badd96 Add before assert helper and print console bytes on fail 2022-08-02 12:17:28 -04:00
Tyler Goodlet 6bdcbdb96f Do child decode on `do_ctlc` exit? 2022-08-02 12:17:28 -04:00
Tyler Goodlet adbebd3f06 Add ctl-c to remaining tests, only expect prompt in non-CI 2022-08-02 12:17:28 -04:00
Tyler Goodlet a2e90194bc Add ctl-c case to `subactor_breakpoint` example test 2022-08-02 12:17:28 -04:00
Tyler Goodlet ba7b355d9c Add note about default behaviour of `fancycompleter` 2022-08-02 12:17:28 -04:00
Tyler Goodlet 617d57dc35 Disable ctl-c prompt checks again 2022-08-02 12:17:28 -04:00
Tyler Goodlet dadd5e6148 Add back prompt expect via flag 2022-08-02 12:17:28 -04:00
Tyler Goodlet a72350118c Test: drop expect prompt 2022-08-02 12:17:28 -04:00
Tyler Goodlet ef8dc0204c Just drop all longlisting for now and leave comments 2022-08-02 12:17:28 -04:00
Tyler Goodlet a101971027 Go back to original longlist code 2022-08-02 12:17:28 -04:00
Tyler Goodlet 835836123b Just don't call longlist on 3.10+ for now 2022-08-02 12:17:28 -04:00
Tyler Goodlet 70ad0f6b8e Add longer delays around ctl-c loop, don't expect longlist 2022-08-02 12:17:28 -04:00
Tyler Goodlet 56b30a9a53 Add sleep around ctl-c iteration loop 2022-08-02 12:17:27 -04:00
Tyler Goodlet 925d5c1ceb Pin to specific `pdbpp` master commit 2022-08-02 12:17:27 -04:00
Tyler Goodlet b9eb601265 General typing fixes for `mypy` 2022-08-02 12:17:27 -04:00
Tyler Goodlet 4dcc21234e Only call `.poll()` if a method on the spawn backend 2022-08-02 12:17:27 -04:00
Tyler Goodlet 64909e676e Fix loglevel in subactor test; actually pass the level XD 2022-08-02 12:17:27 -04:00
Tyler Goodlet 19fb77f698 Pin to `trio >= 0.20` 2022-08-02 12:17:27 -04:00
Tyler Goodlet 8b9f342eef Port to new `.lowlevel.open_process()` API 2022-08-02 12:17:27 -04:00
Tyler Goodlet bd7d507153 Guard against `asyncio` cancelled logged to console 2022-08-02 12:17:16 -04:00
Tyler Goodlet 9bc38cbf04 Add slight delay 2nd ctlc round.. 2022-08-02 12:17:06 -04:00
Tyler Goodlet a90ca4b384 Call longlist normally when on py < 3.10 2022-08-02 12:17:06 -04:00
Tyler Goodlet d0dcd55f47 Only report disconnected actors if proc is still alive? 2022-08-02 12:17:06 -04:00
Tyler Goodlet 4e08605b0d Only do `pdbpp` from `git` install on 3.10+ 2022-08-02 12:17:06 -04:00
Tyler Goodlet 519f4c300b I dunno, seems like `breakpoint()` needs this? 2022-08-02 12:17:06 -04:00
Tyler Goodlet 56c19093bb Add basic module-not-found when opening a ctx eg. 2022-08-02 12:17:06 -04:00
Tyler Goodlet ff3f5959e9 Always enable debug level logging if mode enabled 2022-08-02 12:16:58 -04:00
Tyler Goodlet abb00531d3 Add help msg for non `__main__` modules as well 2022-08-02 12:16:58 -04:00
Tyler Goodlet 439d320a25 Add basic ctl-c testing cases to suite 2022-08-02 12:16:58 -04:00
Tyler Goodlet 18c525d2f1 Hack around double long list print issue..
See https://github.com/pdbpp/pdbpp/issues/496
2022-08-02 12:16:58 -04:00
Tyler Goodlet 201c026284 Show full KBI trace for help with CI hangs 2022-08-02 12:16:58 -04:00
Tyler Goodlet 2a61aa099b Move pydantic-click hang example to new dir, skip in test suite 2022-08-02 12:16:58 -04:00
Tyler Goodlet e2453fd3da Add spaces before values in log msg 2022-08-02 12:16:58 -04:00
Tyler Goodlet b29def8b5d Add runtime level msg around channel draining 2022-08-02 12:16:58 -04:00
Tyler Goodlet f07e9dbb2f Always undo SIGINT overrides, cancel detached children
Ensure that even when `pdb` resumption methods are called during a crash
where `trio`'s runtime has already terminated (eg. `Event.set()` will
raise) we always revert our sigint handler to the original. Further
inside the handler if we hit a case where a child is in debug and
(thinks it) has the global pdb lock, if it has no IPC connection to
a parent, simply presume tty sync-coordination is now lost and cancel
the child immediately.
2022-08-02 12:16:49 -04:00
Tyler Goodlet 2f5a6049a4 Readme formatting tweaks 2022-07-27 11:40:02 -04:00
Tyler Goodlet 418e74eee7 Pin to `pdbpp` upstream master, 3.10 problem?
See issues:
- https://github.com/pdbpp/pdbpp/issues/480
- https://github.com/pdbpp/pdbpp/pull/482
2022-07-27 11:40:02 -04:00
Tyler Goodlet c7035be2fc Tolerate double `.remove()`s of stream on portal teardowns 2022-07-27 11:40:02 -04:00
Tyler Goodlet deaca7d6cc Always propagate SIGINT when no locking peer found
A hopefully significant fix here is to always avoid suppressing a SIGINT
when the root actor can not detect an active IPC connection (via
a connected channel) to the supposed debug lock holding actor. In that
case it is most likely that the actor has either terminated or has lost
its connection for debugger control and there is no way the root can
verify the lock is in use; thus we choose to allow KBI cancellation.

Drop the (by comment) `try`-`finally` block in
`_hijack_stdin_for_child()` around the `_acquire_debug_lock()` call
since all that logic should now be handled internal to that locking
manager. Try to catch a weird error around the `.do_longlist()` method
call that seems to sometimes break on py3.10 and latest `pdbpp`.
2022-07-27 11:40:02 -04:00
Tyler Goodlet d47d0e7c37 Always call pdb hook even if tty locking fails 2022-07-27 11:40:02 -04:00
Tyler Goodlet 0062c96a3c Log cancels with appropriate level 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4be13b7387 Just warn on IPC breaks 2022-07-27 11:40:02 -04:00
Tyler Goodlet 7bb5addd4c Only warn on `trio.BrokenResourceError`s from `_invoke()` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4fd924cfd2 Make example a subpkg for `python -m <mod>` testing 2022-07-27 11:40:02 -04:00
Tyler Goodlet fe0fd1a1c1 Add example that triggers bug 2022-07-27 11:40:02 -04:00
Tyler Goodlet dd23e78de1 Add back in async gen loop 2022-07-27 11:40:02 -04:00
Tyler Goodlet 89b44f8163 Pre-declare disconnected flag 2022-07-27 11:40:02 -04:00
Tyler Goodlet 2819b6a5b2 Avoid attr error XD 2022-07-27 11:40:02 -04:00
Tyler Goodlet f2671ed026 Type annot updates 2022-07-27 11:40:02 -04:00
Tyler Goodlet 41924c86a6 Drop unneeded backframe traceback hide annotation 2022-07-27 11:40:02 -04:00
Tyler Goodlet 206c7c0720 Make `Actor._process_messages()` report disconnects
The method now returns a `bool` which flags whether the transport died
to the caller and allows for reporting a disconnect in the
channel-transport handler task. This is something a user will normally
want to know about on the caller side especially after seeing
a traceback from the peer (if in tree) on console.
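
The return-a-flag idea in miniature, as a synchronous sketch rather than
the actual method: the `chan` iterable, the `handle` callback and the use
of `ConnectionError` as the "transport died" signal are all assumptions
made for illustration.

```python
from typing import Callable, Iterable


def process_messages(
    chan: Iterable[dict],
    handle: Callable[[dict], None],
) -> bool:
    # returns True only when the transport broke underneath the loop
    try:
        for msg in chan:
            handle(msg)
    except ConnectionError:
        return True
    return False


def broken_chan():
    # stand-in transport that dies after one message
    yield {'cmd': 'ping'}
    raise ConnectionResetError('peer went away')


seen: list[dict] = []
if process_messages(broken_chan(), seen.append):
    print('transport disconnected after', len(seen), 'msg(s)')
```

The caller (the channel handler task) decides how loudly to report; the
loop itself only returns the flag.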
2022-07-27 11:40:02 -04:00
Tyler Goodlet bf0ac3116c Only cancel/get-result from a ctx if transport is up
There's no point in sending a cancel message to the remote linked task
and especially no reason to block waiting on a result from that task if
the transport layer is detected to be disconnected. We expect that the
transport shouldn't go down at the layer of the message loop
(reconnection logic should be handled in the transport layer itself) so
if we detect the channel is not connected we don't bother requesting
cancels nor waiting on a final result message.

Why?

- if the connection goes down in error the caller side won't have a way
  to know "how long" it should block to wait for a cancel ack or result
  and causes a potential hang that may require an additional ctrl-c from
  the user especially if using the debugger or if the traceback is not
  seen on console.
- obviously there's no point in waiting for messages when there's no
  transport to deliver them XD

Further, add some more detailed cancel logging detailing the task and
actor ids.
2022-07-27 11:40:02 -04:00
Tyler Goodlet bb732cefd0 Drop high log level in ctx example 2022-07-27 11:40:02 -04:00
Tyler Goodlet 74b819a857 Typing fixes, simplify `_set_trace()` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 8892204c84 Add notes around py3.10 stdlib bug from `pdb++`
There's a bug that's triggered in the stdlib without latest `pdb++`
installed; add a note for that.

Further inside `wait_for_parent_stdin_hijack()` don't `.started()` until
the interactor stream has been opened to avoid races when debugging this
`._debug.py` module (at the least) since we usually don't want the
spawning (parent) task to resume until we know for sure the tty lock has
been acquired. Also, drop the random checkpoint we had inside
`_breakpoint()`, not sure it was actually adding anything useful since
we're (mostly) carefully shielded throughout this func.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 8f4bbf1cbf Add and use a pdb instance factory 2022-07-27 11:40:02 -04:00
Tyler Goodlet 21dccb2e79 A `.open_context()` example that causes a hang!
Finally! I think this may be the root issue we've been seeing in
production in a client project.

No idea yet why this is happening but the fault-causing sequence seems
to be:
- `.open_context()` in a child actor
- enter the debugger via `tractor.breakpoint()`
- continue from that entry via `c` command in REPL
- raise an error just after inside the context task's body

Looking at logging it appears as though the child thinks it has the tty
but no input is accepted on the REPL and a further `ctrl-c` results in
some teardown but also a further hang where both parent and child become
unresponsive..
2022-07-27 11:40:02 -04:00
Tyler Goodlet aea8f63bae Drop all the `@cm.__exit__()` override attempts..
None of it worked (you still will see `.__exit__()` frames on debugger
entry - you'd think this would have been solved by now but, shrug) so
instead wrap the debugger entry-point in a `try:` and put the SIGINT
handler restoration inside `MultiActorPdb` teardown hooks.

This seems to restore the UX as it was prior but with also giving the
desired SIGINT override handler behaviour.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 7964a9f6f8 Try overriding `_GeneratorContextManager.__exit__()`; didn't work..
Using either of `@pdb.hideframe` or `__tracebackhide__` on stdlib
methods doesn't seem to work either.. This all seems to have something
to do with async generator usage I think ?
2022-07-27 11:40:02 -04:00
Tyler Goodlet 99c4319940 Fix example name typo 2022-07-27 11:40:02 -04:00
Tyler Goodlet e5195264a1 Handle a context cancel? Might be a noop 2022-07-27 11:40:02 -04:00
Tyler Goodlet 42f9d10252 Add a pre-started breakpoint example 2022-07-27 11:40:02 -04:00
Tyler Goodlet 345573e602 Make `mypy` happy 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4e60c17375 Refine the handler for child vs. root cases
This gets very close to avoiding any possible hangs to do with tty
locking and SIGINT handling minus a special case that will be detailed
below.

Summary of implementation changes:

- convert `_mk_pdb()` -> `with _open_pdb() as pdb:` which implicitly
  handles the `bdb.BdbQuit` case such that debugger teardown hooks are
  always called.
- rename the handler to `shield_sigint()` and handle a variety of new
  cases:
  * the root is in debug but hasn't been cancelled -> call
    `Actor.cancel_soon()`
  * the root is in debug but *has* been cancelled (`Actor.cancel_soon()`
    already called) -> raise KBI
  * a child is in debug *and* has a task locking the debugger -> ignore
    SIGINT in child *and* the root actor.
- if the debugger instance is provided to the handler at acquire time,
  on SIGINT handling completion re-print the last pdb++ REPL output so
  that the user realizes they are still actively in debug.
- ignore the unlock case where a race condition of "no task" holding the
  lock causes the `RuntimeError` normally associated with the "wrong
  task" doing so (not sure if this is a `trio` bug?).
- change debug logs to runtime level.

Unhandled case(s):

- a child is maybe in debug mode but does not itself have any task using
  the debugger.
    * ToDo: we need a way to decide what to do with
      "intermediate" child actors who themselves are not in
      `debug_mode=True` but have children who *are*, such that a SIGINT
      won't cause cancellation of that child-as-parent-of-another-child
      **iff** any of their children are in debug mode.
2022-07-27 11:40:02 -04:00
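
The handler cases listed in the commit above can be read as a small decision table. The sketch below is a hypothetical, pure-function restatement of that table (the names `SigintAction` and `decide_sigint_action()` are made up for illustration and are not part of `tractor`); the real `shield_sigint()` works against live actor and lock state.

```python
from enum import Enum, auto


class SigintAction(Enum):
    CANCEL_SOON = auto()  # root in debug, not yet cancelled
    RAISE_KBI = auto()    # root in debug, cancel already requested
    IGNORE = auto()       # a child task holds the debugger lock
    UNHANDLED = auto()    # the "unhandled case(s)" from the message


def decide_sigint_action(
    root_in_debug: bool,
    cancel_requested: bool,
    child_holds_lock: bool,
) -> SigintAction:
    # a child with a task locking the debugger wins: both the child
    # and the root should ignore the signal.
    if child_holds_lock:
        return SigintAction.IGNORE

    if root_in_debug:
        if cancel_requested:
            return SigintAction.RAISE_KBI
        return SigintAction.CANCEL_SOON

    # eg. a child maybe in debug mode with no task using the debugger
    return SigintAction.UNHANDLED
```

Checking the child-lock case first is an assumption about precedence; the commit message lists the cases but does not state an ordering.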
Tyler Goodlet 6b7b58346f (facepalm) Reraise `BdbQuit` and discard ownerless lock releases 2022-07-27 11:40:02 -04:00
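
As a rough, stdlib-only analogy for "reraise `BdbQuit` and discard ownerless lock releases": the sketch below uses `threading.Lock` as a stand-in for the `trio` lock (an assumption; both raise `RuntimeError` on an invalid release, but their ownership rules differ), and `release_if_held()` / `run_repl()` are hypothetical helper names.

```python
import bdb
import threading


def release_if_held(lock) -> bool:
    # releasing a lock nobody holds raises `RuntimeError`; treat that
    # as "nothing to do" instead of crashing the teardown path.
    try:
        lock.release()
    except RuntimeError:
        return False
    return True


def run_repl(interact, lock) -> None:
    lock.acquire()
    try:
        # `interact()` may raise `bdb.BdbQuit` when the user quits;
        # it is deliberately NOT swallowed and propagates to the caller.
        interact()
    finally:
        release_if_held(lock)
```

The point of the pattern is that the lock is always released on the way out while the quit signal still reaches the caller.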
Tyler Goodlet 3cac323421 Add WIP while-debugger-active SIGINT ignore handler 2022-07-27 11:40:02 -04:00
137 changed files with 29973 additions and 6076 deletions
.github/workflows
notes_to_self

View File

@ -20,14 +20,16 @@ jobs:
- name: Setup python
uses: actions/setup-python@v2
with:
python-version: '3.10'
python-version: '3.11'
- name: Install dependencies
run: pip install -U . --upgrade-strategy eager -r requirements-test.txt
- name: Run MyPy check
run: mypy tractor/ --ignore-missing-imports
run: mypy tractor/ --ignore-missing-imports --show-traceback
# test that we can generate a software distribution and install it
# thus avoid missing file issues after packaging.
sdist-linux:
name: 'sdist'
runs-on: ubuntu-latest
@ -39,7 +41,7 @@ jobs:
- name: Setup python
uses: actions/setup-python@v2
with:
python-version: '3.10'
python-version: '3.11'
- name: Build sdist
run: python setup.py sdist --formats=zip
@ -57,8 +59,12 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest]
python: ['3.9', '3.10']
spawn_backend: ['trio', 'mp']
python: ['3.11']
spawn_backend: [
'trio',
'mp_spawn',
'mp_forkserver',
]
steps:
@ -73,42 +79,53 @@ jobs:
- name: Install dependencies
run: pip install -U . -r requirements-test.txt -r requirements-docs.txt --upgrade-strategy eager
- name: Run tests
run: pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rs
- name: List dependencies
run: pip list
# We skip 3.10 on windows for now due to
# https://github.com/pytest-dev/pytest/issues/8733
# some kinda weird `pyreadline` issue..
- name: Run tests
run: pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rsx
# We skip 3.10 on windows for now due to not having any collabs to
# debug the CI failures. Anyone wanting to hack and solve them is very
# welcome, but our primary user base is not using that OS.
# TODO: use job filtering to accomplish instead of repeated
# boilerplate as is above XD:
# - https://docs.github.com/en/actions/learn-github-actions/managing-complex-workflows
# - https://docs.github.com/en/actions/learn-github-actions/managing-complex-workflows#using-a-build-matrix
# - https://docs.github.com/en/actions/learn-github-actions/workflow-syntax-for-github-actions#jobsjob_idif
testing-windows:
name: '${{ matrix.os }} Python ${{ matrix.python }} - ${{ matrix.spawn_backend }}'
timeout-minutes: 12
runs-on: ${{ matrix.os }}
# testing-windows:
# name: '${{ matrix.os }} Python ${{ matrix.python }} - ${{ matrix.spawn_backend }}'
# timeout-minutes: 12
# runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [windows-latest]
python: ['3.9', '3.10']
spawn_backend: ['trio', 'mp']
# strategy:
# fail-fast: false
# matrix:
# os: [windows-latest]
# python: ['3.10']
# spawn_backend: ['trio', 'mp']
steps:
# steps:
- name: Checkout
uses: actions/checkout@v2
# - name: Checkout
# uses: actions/checkout@v2
- name: Setup python
uses: actions/setup-python@v2
with:
python-version: '${{ matrix.python }}'
# - name: Setup python
# uses: actions/setup-python@v2
# with:
# python-version: '${{ matrix.python }}'
- name: Install dependencies
run: pip install -U . -r requirements-test.txt -r requirements-docs.txt --upgrade-strategy eager
# - name: Install dependencies
# run: pip install -U . -r requirements-test.txt -r requirements-docs.txt --upgrade-strategy eager
- name: Run tests
run: pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rs
# # TODO: pretty sure this solves debugger deps-issues on windows, but it needs to
# # be verified by someone with a native setup.
# # - name: Force pyreadline3
# # run: pip uninstall pyreadline; pip install -U pyreadline3
# - name: List dependencies
# run: pip list
# - name: Run tests
# run: pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rsx

View File

@ -1,7 +0,0 @@
Add ``tractor.query_actor()`` an addr looker-upper which doesn't deliver
a ``Portal`` instance and instead just a socket address ``tuple``.
Sometimes it's handy to just have a simple way to figure out if
a "service" actor is up, so add this discovery helper for that. We'll
prolly just leave it undocumented for now until we figure out
a longer-term/better discovery system.

142
NEWS.rst
View File

@ -4,6 +4,148 @@ Changelog
.. towncrier release notes start
tractor 0.1.0a5 (2022-08-03)
============================
This is our final release supporting Python 3.9 since we will be moving
internals to the new `match:` syntax from 3.10 going forward and
further, we have officially dropped usage of the `msgpack` library and
happily adopted `msgspec`.
Features
--------
- `#165 <https://github.com/goodboy/tractor/issues/165>`_: Add SIGINT
protection to our `pdbpp` based debugger subsystem such that for
(single-depth) actor trees in debug mode we ignore interrupts in any
actor currently holding the TTY lock thus avoiding clobbering IPC
connections and/or task and process state when working in the REPL.
As a big note currently so called "nested" actor trees (trees with
actors having more than one parent/ancestor) are not fully supported
since we don't yet have a mechanism to relay the debug mode knowledge
"up" the actor tree (for eg. when handling a crash in a leaf actor).
As such currently there is a set of tests and known scenarios which will
result in process clobbering by the zombie reaping machinery and these
have been documented in https://github.com/goodboy/tractor/issues/320.
The implementation details include:
- utilizing a custom SIGINT handler which we apply whenever an actor's
runtime enters the debug machinery, which we also make sure the
stdlib's `pdb` configuration doesn't override (which it does by
default without special instance config).
- litter the runtime with `maybe_wait_for_debugger()` mostly in spots
where the root actor should block before doing embedded nursery
teardown ops which both cancel potential-children-in-debug as well
as eventually trigger zombie reaping machinery.
- hardening of the TTY locking semantics/API both in terms of IPC
terminations and cancellation and lock release determinism from
sync debugger instance methods.
- factoring of locking infrastructure into a new `._debug.Lock` global
which encapsulates all details of the ``trio`` sync primitives and
task/actor uid management and tracking.
We also add `ctrl-c` cases throughout the test suite though these are
disabled for py3.9 (`pdbpp` UX differences that don't seem worth
compensating for, especially since this will be our last 3.9 supported
release) and there are a slew of marked cases that aren't expected to
work in CI more generally (as mentioned in the "nested" tree note
above) despite seemingly working when run manually on linux.
- `#304 <https://github.com/goodboy/tractor/issues/304>`_: Add a new
``to_asyncio.LinkedTaskChannel.subscribe()`` which gives task-oriented
broadcast functionality semantically equivalent to
``tractor.MsgStream.subscribe()``; this makes it possible for multiple
``trio``-side tasks to consume ``asyncio``-side task msgs in tandem.
Further improvements to the test suite were added in this patch set
including a new scenario test for a sub-actor managed "service nursery"
(implementing the basics of a "service manager") including use of
*infected asyncio* mode. Further we added a lower level
``test_trioisms.py`` to start to track issues we need to work around in
``trio`` itself which in this case included a bug we were trying to
solve related to https://github.com/python-trio/trio/issues/2258.
Bug Fixes
---------
- `#318 <https://github.com/goodboy/tractor/issues/318>`_: Fix
a previously undetected ``trio``-``asyncio`` task lifetime linking
issue with the ``to_asyncio.open_channel_from()`` api where both sides
were not properly waiting/signalling termination and it was possible
for ``asyncio``-side errors to not propagate due to a race condition.
The implementation fix summary is:
- add state to signal the end of the ``trio`` side task to be
read by the ``asyncio`` side and always cancel any ongoing
task in such cases.
- always wait on the ``asyncio`` task termination from the ``trio``
side on error before maybe raising said error.
- always close the ``trio`` mem chan on exit to ensure the other
side can detect it and follow.
Trivial/Internal Changes
------------------------
- `#248 <https://github.com/goodboy/tractor/issues/248>`_: Adjust the
`tractor._spawn.soft_wait()` strategy to avoid sending an actor cancel
request (via `Portal.cancel_actor()`) if either the child process is
detected as having terminated or the IPC channel is detected to be
closed.
This ensures (even) more deterministic inter-actor cancellation by
avoiding the timeout condition where possible when a child never
successfully spawned, crashed, or became un-contactable over IPC.
- `#295 <https://github.com/goodboy/tractor/issues/295>`_: Add an
experimental ``tractor.msg.NamespacePath`` type for passing Python
objects by "reference" through a ``str``-subtype message and using the
new ``pkgutil.resolve_name()`` for reference loading.
- `#298 <https://github.com/goodboy/tractor/issues/298>`_: Add a new
`tractor.experimental` subpackage for staging new high level APIs and
subsystems that we might eventually make built-ins.
- `#300 <https://github.com/goodboy/tractor/issues/300>`_: Update to and
pin latest ``msgpack`` (1.0.3) and ``msgspec`` (0.4.0) both of which
required adjustments for backwards incompatible API tweaks.
- `#303 <https://github.com/goodboy/tractor/issues/303>`_: Fence off
``multiprocessing`` imports until absolutely necessary in an effort to
avoid "resource tracker" spawning side effects that seem to have
varying degrees of unreliability per Python release. Port to new
``msgspec.DecodeError``.
- `#305 <https://github.com/goodboy/tractor/issues/305>`_: Add
``tractor.query_actor()`` an addr looker-upper which doesn't deliver
a ``Portal`` instance and instead just a socket address ``tuple``.
Sometimes it's handy to just have a simple way to figure out if
a "service" actor is up, so add this discovery helper for that. We'll
prolly just leave it undocumented for now until we figure out
a longer-term/better discovery system.
- `#316 <https://github.com/goodboy/tractor/issues/316>`_: Run windows
CI jobs on python 3.10 after some hacks for ``pdbpp`` dependency
issues.
Issue was to do with the now deprecated `pyreadline` project which
should be changed over to `pyreadline3`.
- `#317 <https://github.com/goodboy/tractor/issues/317>`_: Drop use of
the ``msgpack`` package and instead move fully to the ``msgspec``
codec library.
We've now used ``msgspec`` extensively in production and there's no
reason to not use it as default. Further this change preps us for the up
and coming typed messaging semantics (#196), dialog-unprotocol system
(#297), and caps-based messaging-protocols (#299) planned before our
first beta.
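
To make the #295 entry above more concrete, here is a minimal sketch of a ``str``-subtype "reference" that round-trips a Python object through ``pkgutil.resolve_name()`` (stdlib, Python 3.9+). The class name ``NsPath`` and its two methods are hypothetical stand-ins chosen for illustration; they are not claimed to match the actual ``tractor.msg.NamespacePath`` API.

```python
import json
from pkgutil import resolve_name


class NsPath(str):
    '''
    A plain `str` of the form "pkg.mod:qualified.name" which can be
    sent as-is in a message and resolved back to the object later.

    '''
    @classmethod
    def from_ref(cls, obj) -> 'NsPath':
        # assumes `obj` is importable via its module + qualified name
        return cls(f'{obj.__module__}:{obj.__qualname__}')

    def load_ref(self):
        # `resolve_name()` imports the module part and then walks
        # the attribute path after the colon.
        return resolve_name(self)


path = NsPath.from_ref(json.dumps)
func = path.load_ref()
```

Because the value is just a ``str`` it needs no special codec support on the wire; only the receiving side has to be able to import the named module.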
tractor 0.1.0a4 (2021-12-18)
============================

View File

@ -1,39 +1,122 @@
|logo| ``tractor``: next-gen Python parallelism
|logo| ``tractor``: distributed structured concurrency
|gh_actions|
|docs|
``tractor`` is a `structured concurrent`_, multi-processing_ runtime built on trio_.
``tractor`` is a `structured concurrency`_ (SC), multi-processing_ runtime built on trio_.
Fundamentally ``tractor`` gives you parallelism via ``trio``-"*actors*":
our nurseries_ let you spawn new Python processes which each run a ``trio``
scheduled runtime - a call to ``trio.run()``.
Fundamentally, ``tractor`` provides parallelism via
``trio``-"*actors*": independent Python **processes** (i.e.
*non-shared-memory threads*) which can schedule ``trio`` tasks whilst
maintaining *end-to-end SC* inside a *distributed supervision tree*.
We believe the system adhere's to the `3 axioms`_ of an "`actor model`_"
but likely *does not* look like what *you* probably think an "actor
model" looks like, and that's *intentional*.
Cross-process (and thus cross-host) SC is accomplished through the
combined use of our,
The first step to grok ``tractor`` is to get the basics of ``trio`` down.
A great place to start is the `trio docs`_ and this `blog post`_.
- "actor nurseries_" which provide for spawning multiple, and
possibly nested, Python processes each running a ``trio`` scheduled
runtime - a call to ``trio.run()``,
- an "SC-transitive supervision protocol" enforced as an
IPC-message-spec encapsulating all RPC-dialogs.
We believe the system adheres to the `3 axioms`_ of an "`actor model`_"
but likely **does not** look like what **you** probably *think* an "actor
model" looks like, and that's **intentional**.
Where do i start!?
------------------
The first step to grok ``tractor`` is to get an intermediate
knowledge of ``trio`` and **structured concurrency** B)
Some great places to start are,
- the seminal `blog post`_
- obviously the `trio docs`_
- wikipedia's nascent SC_ page
- the fancy diagrams @ libdill-docs_
Features
--------
- **It's just** a ``trio`` API
- *Infinitely nesteable* process trees
- Builtin IPC streaming APIs with task fan-out broadcasting
- A (first ever?) "native" multi-core debugger UX for Python using `pdb++`_
- Support for a swappable, OS specific, process spawning layer
- A modular transport stack, allowing for custom serialization (eg. with
`msgspec`_), communications protocols, and environment specific IPC
primitives
- Support for spawning process-level-SC, inter-loop one-to-one-task oriented
``asyncio`` actors via "infected ``asyncio``" mode
- `structured chadcurrency`_ from the ground up
- **It's just** a ``trio`` API!
- *Infinitely nestable* process trees running embedded ``trio`` tasks.
- Swappable, OS-specific, process spawning via multiple backends.
- Modular IPC stack, allowing for custom interchange formats (eg.
as offered from `msgspec`_), varied transport protocols (TCP, RUDP,
QUIC, wireguard), and OS-env specific higher-perf primitives (UDS,
shm-ring-buffers).
- Optionally distributed_: all IPC and RPC APIs work over multi-host
transports the same as local.
- Builtin high-level streaming API that enables your app to easily
leverage the benefits of a "`cheap or nasty`_" `(un)protocol`_.
- A "native UX" around a multi-process safe debugger REPL using
`pdbp`_ (a fork & fix of `pdb++`_)
- "Infected ``asyncio``" mode: support for starting an actor's
runtime as a `guest`_ on the ``asyncio`` loop allowing us to
provide stringent SC-style ``trio.Task``-supervision around any
``asyncio.Task`` spawned via our ``tractor.to_asyncio`` APIs.
- A **very naive** and still very much work-in-progress inter-actor
`discovery`_ sys with plans to support multiple `modern protocol`_
approaches.
- Various ``trio`` extension APIs via ``tractor.trionics`` such as,
- task fan-out `broadcasting`_,
- multi-task-single-resource-caching and fan-out-to-multi
``__aenter__()`` APIs for ``@acm`` functions,
- (WIP) a ``TaskMngr``: one-cancels-one style nursery supervisor.
Install
-------
``tractor`` is still in an *alpha-near-beta-stage* for many
of its subsystems, however we are very close to having a stable
lowlevel runtime and API.
As such, it's currently recommended that you clone and install the
repo from source::
pip install git+git://github.com/goodboy/tractor.git
We use the very hip `uv`_ for project mgmt::
git clone https://github.com/goodboy/tractor.git
cd tractor
uv sync --dev
uv run python examples/rpc_bidir_streaming.py
Consider activating a virtual/project-env before starting to hack on
the code base::
# you could use plain ol' venvs
# https://docs.astral.sh/uv/pip/environments/
uv venv tractor_py313 --python 3.13
# but @goodboy prefers the more explicit (and shell agnostic)
# https://docs.astral.sh/uv/configuration/environment/#uv_project_environment
UV_PROJECT_ENVIRONMENT="tractor_py313"
# hint hint, enter @goodboy's fave shell B)
uv run --dev xonsh
Alongside all this we ofc offer "releases" on PyPi::
pip install tractor
Just note that YMMV since the main git branch is often much further
ahead than any latest release.
Example codez
-------------
In ``tractor``'s (very lacking) documentation we prefer to point to
example scripts in the repo over duplicating them in docs, but with
that in mind here are some definitive snippets to try and hook you
into digging deeper.
Run a func in a process
-----------------------
***********************
Use ``trio``'s style of focussing on *tasks as functions*:
.. code:: python
@ -91,7 +174,7 @@ might want to check out `trio-parallel`_.
Zombie safe: self-destruct a process tree
-----------------------------------------
*****************************************
``tractor`` tries to protect you from zombies, no matter what.
.. code:: python
@ -117,7 +200,7 @@ Zombie safe: self-destruct a process tree
f"running in pid {os.getpid()}"
)
await trio.sleep_forever()
await trio.sleep_forever()
async def main():
@ -147,8 +230,8 @@ it **is a bug**.
"Native" multi-process debugging
--------------------------------
Using the magic of `pdb++`_ and our internal IPC, we've
********************************
Using the magic of `pdbp`_ and our internal IPC, we've
been able to create a native feeling debugging experience for
any (sub-)process in your ``tractor`` tree.
@ -202,7 +285,7 @@ We're hoping to add a respawn-from-repl system soon!
SC compatible bi-directional streaming
--------------------------------------
**************************************
Yes, you saw it here first; we provide 2-way streams
with reliable, transitive setup/teardown semantics.
@ -294,7 +377,7 @@ hear your thoughts on!
Worker poolz are easy peasy
---------------------------
***************************
The initial ask from most new users is *"how do I make a worker
pool thing?"*.
@ -316,10 +399,10 @@ This uses no extra threads, fancy semaphores or futures; all we need
is ``tractor``'s IPC!
"Infected ``asyncio``" mode
---------------------------
***************************
Have a bunch of ``asyncio`` code you want to force to be SC at the process level?
Check out our experimental system for `guest-mode`_ controlled
Check out our experimental system for `guest`_-mode controlled
``asyncio`` actors:
.. code:: python
@ -425,7 +508,7 @@ We need help refining the `asyncio`-side channel API to be more
Higher level "cluster" APIs
---------------------------
***************************
To be extra terse the ``tractor`` devs have started hacking some "higher
level" APIs for managing actor trees/clusters. These interfaces should
generally be considered provisional for now but we encourage you to try
@ -482,18 +565,6 @@ spawn a flat cluster:
.. _full worker pool re-implementation: https://github.com/goodboy/tractor/blob/master/examples/parallelism/concurrent_actors_primes.py
Install
-------
From PyPi::
pip install tractor
From git::
pip install git+git://github.com/goodboy/tractor.git
Under the hood
--------------
``tractor`` is an attempt to pair trionic_ `structured concurrency`_ with
@ -566,6 +637,13 @@ Help us push toward the future of distributed `Python`.
- Typed capability-based (dialog) protocols ( see `#196
<https://github.com/goodboy/tractor/issues/196>`_ with draft work
started in `#311 <https://github.com/goodboy/tractor/pull/311>`_)
- We **recently disabled CI-testing on windows** and need help getting
it running again! (see `#327
<https://github.com/goodboy/tractor/pull/327>`_). **We do have windows
support** (and have for quite a while) but since no active hacker
exists in the user-base to help test on that OS, for now we're not
actively maintaining testing due to the added hassle and general
latency..
Feel like saying hi?
@ -577,31 +655,39 @@ say hi, please feel free to reach us in our `matrix channel`_. If
matrix seems too hip, we're also mostly all in the `trio gitter
channel`_!
.. _structured concurrent: https://trio.discourse.group/t/concise-definition-of-structured-concurrency/228
.. _distributed: https://en.wikipedia.org/wiki/Distributed_computing
.. _multi-processing: https://en.wikipedia.org/wiki/Multiprocessing
.. _trio: https://github.com/python-trio/trio
.. _nurseries: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#nurseries-a-structured-replacement-for-go-statements
.. _actor model: https://en.wikipedia.org/wiki/Actor_model
.. _trio: https://github.com/python-trio/trio
.. _multi-processing: https://en.wikipedia.org/wiki/Multiprocessing
.. _trionic: https://trio.readthedocs.io/en/latest/design.html#high-level-design-principles
.. _async sandwich: https://trio.readthedocs.io/en/latest/tutorial.html#async-sandwich
.. _structured concurrent: https://trio.discourse.group/t/concise-definition-of-structured-concurrency/228
.. _3 axioms: https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=162s
.. .. _3 axioms: https://en.wikipedia.org/wiki/Actor_model#Fundamental_concepts
.. _adherance to: https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=1821s
.. _trio gitter channel: https://gitter.im/python-trio/general
.. _matrix channel: https://matrix.to/#/!tractor:matrix.org
.. _broadcasting: https://github.com/goodboy/tractor/pull/229
.. _modern procotol: https://en.wikipedia.org/wiki/Rendezvous_protocol
.. _pdbp: https://github.com/mdmintz/pdbp
.. _pdb++: https://github.com/pdbpp/pdbpp
.. _guest mode: https://trio.readthedocs.io/en/stable/reference-lowlevel.html?highlight=guest%20mode#using-guest-mode-to-run-trio-on-top-of-other-event-loops
.. _cheap or nasty: https://zguide.zeromq.org/docs/chapter7/#The-Cheap-or-Nasty-Pattern
.. _(un)protocol: https://zguide.zeromq.org/docs/chapter7/#Unprotocols
.. _discovery: https://zguide.zeromq.org/docs/chapter8/#Discovery
.. _modern protocol: https://en.wikipedia.org/wiki/Rendezvous_protocol
.. _messages: https://en.wikipedia.org/wiki/Message_passing
.. _trio docs: https://trio.readthedocs.io/en/latest/
.. _blog post: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
.. _structured concurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _structured chadcurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _structured concurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _SC: https://en.wikipedia.org/wiki/Structured_concurrency
.. _libdill-docs: https://sustrik.github.io/libdill/structured-concurrency.html
.. _unrequirements: https://en.wikipedia.org/wiki/Actor_model#Direct_communication_and_asynchrony
.. _async generators: https://www.python.org/dev/peps/pep-0525/
.. _trio-parallel: https://github.com/richardsheridan/trio-parallel
.. _uv: https://docs.astral.sh/uv/
.. _msgspec: https://jcristharif.com/msgspec/
.. _guest-mode: https://trio.readthedocs.io/en/stable/reference-lowlevel.html?highlight=guest%20mode#using-guest-mode-to-run-trio-on-top-of-other-event-loops
.. _guest: https://trio.readthedocs.io/en/stable/reference-lowlevel.html?highlight=guest%20mode#using-guest-mode-to-run-trio-on-top-of-other-event-loops
.. |gh_actions| image:: https://img.shields.io/endpoint.svg?url=https%3A%2F%2Factions-badge.atrox.dev%2Fgoodboy%2Ftractor%2Fbadge&style=popout-square

View File

@ -396,7 +396,7 @@ tasks spawned via multiple RPC calls to an actor can modify
# a per process cache
_actor_cache: Dict[str, bool] = {}
_actor_cache: dict[str, bool] = {}
def ping_endpoints(endpoints: List[str]):

View File

@ -0,0 +1,259 @@
'''
Complex edge case where during real-time streaming the IPC transport
channels are wiped out (purposely in this example though it could have
been an outage) and we want to ensure that despite being in debug mode
(or not) the user can send SIGINT once they notice the hang and the
actor tree will eventually be cancelled without leaving any zombies.
'''
from contextlib import asynccontextmanager as acm
from functools import partial
from tractor import (
open_nursery,
context,
Context,
ContextCancelled,
MsgStream,
_testing,
)
import trio
import pytest
async def break_ipc_then_error(
stream: MsgStream,
break_ipc_with: str|None = None,
pre_close: bool = False,
):
await _testing.break_ipc(
stream=stream,
method=break_ipc_with,
pre_close=pre_close,
)
async for msg in stream:
await stream.send(msg)
assert 0
async def iter_ipc_stream(
stream: MsgStream,
break_ipc_with: str|None = None,
pre_close: bool = False,
):
async for msg in stream:
await stream.send(msg)
@context
async def recv_and_spawn_net_killers(
ctx: Context,
break_ipc_after: bool|int = False,
pre_close: bool = False,
) -> None:
'''
Receive stream msgs and spawn some IPC killers mid-stream.
'''
broke_ipc: bool = False
await ctx.started()
async with (
ctx.open_stream() as stream,
trio.open_nursery(
strict_exception_groups=False,
) as tn,
):
async for i in stream:
print(f'child echoing {i}')
if not broke_ipc:
await stream.send(i)
else:
await trio.sleep(0.01)
if (
break_ipc_after
and
i >= break_ipc_after
):
broke_ipc = True
tn.start_soon(
iter_ipc_stream,
stream,
)
tn.start_soon(
partial(
break_ipc_then_error,
stream=stream,
pre_close=pre_close,
)
)
@acm
async def stuff_hangin_ctlc(timeout: float = 1) -> None:
with trio.move_on_after(timeout) as cs:
yield timeout
if cs.cancelled_caught:
# pretend to be a user seeing no streaming action
# thinking it's a hang, and then hitting ctl-c..
print(
f"i'm a user on the PARENT side and thingz hangin "
f'after timeout={timeout} ???\n\n'
'MASHING CTRL-C..!?\n'
)
raise KeyboardInterrupt
async def main(
debug_mode: bool = False,
start_method: str = 'trio',
loglevel: str = 'cancel',
# by default we break the parent IPC first (if configured to break
# at all), but this can be changed so the child does first (even if
# both are set to break).
break_parent_ipc_after: int|bool = False,
break_child_ipc_after: int|bool = False,
pre_close: bool = False,
) -> None:
async with (
open_nursery(
start_method=start_method,
# NOTE: even if the debugger is used we shouldn't get
# a hang since it never engages due to broken IPC
debug_mode=debug_mode,
loglevel=loglevel,
) as an,
):
sub_name: str = 'chitty_hijo'
portal = await an.start_actor(
sub_name,
enable_modules=[__name__],
)
async with (
stuff_hangin_ctlc(timeout=2) as timeout,
_testing.expect_ctxc(
yay=(
break_parent_ipc_after
or break_child_ipc_after
),
# TODO: we CAN'T remove this right?
# since we need the ctxc to bubble up from either
# the stream API after the `None` msg is sent
# (which actually implicitly cancels all remote
# tasks in the hijo) or from simulated
# KBI-mash-from-user
# or should we expect that a KBI triggers the ctxc
# and KBI in an eg?
reraise=True,
),
portal.open_context(
recv_and_spawn_net_killers,
break_ipc_after=break_child_ipc_after,
pre_close=pre_close,
) as (ctx, sent),
):
rx_eoc: bool = False
ipc_break_sent: bool = False
async with ctx.open_stream() as stream:
for i in range(1000):
if (
break_parent_ipc_after
and
i > break_parent_ipc_after
and
not ipc_break_sent
):
print(
'#################################\n'
'Simulating PARENT-side IPC BREAK!\n'
'#################################\n'
)
# TODO: other methods? see break func above.
# await stream._ctx.chan.send(None)
# await stream._ctx.chan.transport.stream.send_eof()
await stream._ctx.chan.transport.stream.aclose()
ipc_break_sent = True
# it actually breaks right here in the
# mp_spawn/forkserver backends and thus the
# zombie reaper never even kicks in?
try:
print(f'parent sending {i}')
await stream.send(i)
except ContextCancelled as ctxc:
print(
'parent received ctxc on `stream.send()`\n'
f'{ctxc}\n'
)
assert 'root' in ctxc.canceller
assert sub_name in ctx.canceller
# TODO: is this needed or no?
raise
except trio.ClosedResourceError:
# NOTE: don't send if we already broke the
# connection to avoid raising a closed-error
# such that we drop through to the ctl-c
# mashing by user.
await trio.sleep(0.01)
# timeout: int = 1
# with trio.move_on_after(timeout) as cs:
async with stuff_hangin_ctlc() as timeout:
print(
f'PARENT `stream.receive()` with timeout={timeout}\n'
)
# NOTE: in the parent side IPC failure case this
# will raise an ``EndOfChannel`` after the child
# is killed and sends a stop msg back to its
# caller/this-parent.
try:
rx = await stream.receive()
print(
"I'm a happy PARENT user and echoed to me is\n"
f'{rx}\n'
)
except trio.EndOfChannel:
rx_eoc: bool = True
print('MsgStream got EoC for PARENT')
raise
print(
'Streaming finished and we got Eoc.\n'
'Canceling `.open_context()` in root with\n'
'CTRL-C..'
)
if rx_eoc:
assert stream.closed
try:
await stream.send(i)
pytest.fail('stream not closed?')
except (
trio.ClosedResourceError,
trio.EndOfChannel,
) as send_err:
if rx_eoc:
assert send_err is stream._eoc
else:
assert send_err is stream._closed
raise KeyboardInterrupt
if __name__ == '__main__':
trio.run(main)

View File

@ -0,0 +1,136 @@
'''
Examples of using the builtin `breakpoint()` from an `asyncio.Task`
running in a subactor spawned with `infect_asyncio=True`.
'''
import asyncio
import trio
import tractor
from tractor import (
to_asyncio,
Portal,
)
async def aio_sleep_forever():
await asyncio.sleep(float('inf'))
async def bp_then_error(
to_trio: trio.MemorySendChannel,
from_trio: asyncio.Queue,
raise_after_bp: bool = True,
) -> None:
# sync with `trio`-side (caller) task
to_trio.send_nowait('start')
# NOTE: what happens here inside the hook needs some refinement..
# => seems like it's still `._debug._set_trace()` but
# we set `Lock.local_task_in_debug = 'sync'`, we probably want
# some further, at least, meta-data about the task/actor in debug
# in terms of making it clear it's `asyncio` mucking about.
breakpoint() # asyncio-side
# short checkpoint / delay
await asyncio.sleep(0.5) # asyncio-side
if raise_after_bp:
raise ValueError('asyncio side error!')
# TODO: test case with this so that it gets cancelled?
else:
# XXX NOTE: this is required in order to get the SIGINT-ignored
# hang case documented in the module script section!
await aio_sleep_forever()
@tractor.context
async def trio_ctx(
ctx: tractor.Context,
bp_before_started: bool = False,
):
# this will block until the ``asyncio`` task sends a "first"
# message, see first line in above func.
async with (
to_asyncio.open_channel_from(
bp_then_error,
# raise_after_bp=not bp_before_started,
) as (first, chan),
trio.open_nursery() as tn,
):
assert first == 'start'
if bp_before_started:
await tractor.pause() # trio-side
await ctx.started(first) # trio-side
tn.start_soon(
to_asyncio.run_task,
aio_sleep_forever,
)
await trio.sleep_forever()
async def main(
bps_all_over: bool = True,
# TODO, WHICH OF THESE HAZ BUGZ?
cancel_from_root: bool = False,
err_from_root: bool = False,
) -> None:
async with tractor.open_nursery(
debug_mode=True,
maybe_enable_greenback=True,
# loglevel='devx',
) as an:
ptl: Portal = await an.start_actor(
'aio_daemon',
enable_modules=[__name__],
infect_asyncio=True,
debug_mode=True,
# loglevel='cancel',
)
async with ptl.open_context(
trio_ctx,
bp_before_started=bps_all_over,
) as (ctx, first):
assert first == 'start'
# pause in parent to ensure no cross-actor
# locking problems exist!
await tractor.pause() # trio-root
if cancel_from_root:
await ctx.cancel()
if err_from_root:
assert 0
else:
await trio.sleep_forever()
# TODO: case where we cancel from trio-side while asyncio task
# has debugger lock?
# await ptl.cancel_actor()
if __name__ == '__main__':
# works fine B)
trio.run(main)
# will hang and ignores SIGINT !!
# NOTE: you'll need to send a SIGQUIT (via ctl-\) to kill it
# manually..
# trio.run(main, True)

View File

@ -0,0 +1,9 @@
'''
Reproduce a bug where enabling debug mode for a sub-actor actually causes
a hang on teardown...
'''
import asyncio
import trio
import tractor

View File

@ -1,5 +1,5 @@
'''
Fast fail test with a context.
Fast fail test with a `Context`.
Ensure the partially initialized sub-actor process
doesn't cause a hang on error/cancel of the parent

View File

@ -4,9 +4,15 @@ import trio
async def breakpoint_forever():
"Indefinitely re-enter debugger in child actor."
while True:
yield 'yo'
await tractor.breakpoint()
try:
while True:
yield 'yo'
await tractor.pause()
except BaseException:
tractor.log.get_console_log().exception(
'Cancelled while trying to enter pause point!'
)
raise
async def name_error():
@ -15,11 +21,14 @@ async def name_error():
async def main():
"""Test breakpoint in a streaming actor.
"""
'''
Test breakpoint in a streaming actor.
'''
async with tractor.open_nursery(
debug_mode=True,
loglevel='error',
loglevel='cancel',
# loglevel='devx',
) as n:
p0 = await n.start_actor('bp_forever', enable_modules=[__name__])
@ -27,7 +36,18 @@ async def main():
# retrieve results
async with p0.open_stream_from(breakpoint_forever) as stream:
await p1.run(name_error)
# triggers the first name error
try:
await p1.run(name_error)
except tractor.RemoteActorError as rae:
assert rae.boxed_type is NameError
async for i in stream:
# a second time try the failing subactor and this time
# let error propagate up to the parent/nursery.
await p1.run(name_error)
if __name__ == '__main__':

View File

@ -10,7 +10,12 @@ async def name_error():
async def breakpoint_forever():
"Indefinitely re-enter debugger in child actor."
while True:
await tractor.breakpoint()
await tractor.pause()
# NOTE: if the test never sent 'q'/'quit' commands
# on the pdb repl, without this checkpoint line the
# repl would spin in this actor forever.
# await trio.sleep(0)
async def spawn_until(depth=0):
@ -18,12 +23,20 @@ async def spawn_until(depth=0):
"""
async with tractor.open_nursery() as n:
if depth < 1:
# await n.run_in_actor('breakpoint_forever', breakpoint_forever)
await n.run_in_actor(
await n.run_in_actor(breakpoint_forever)
p = await n.run_in_actor(
name_error,
name='name_error'
)
await trio.sleep(0.5)
# rx and propagate error from child
await p.result()
else:
# recursive call to spawn another process branching layer of
# the tree
depth -= 1
await n.run_in_actor(
spawn_until,
@ -32,6 +45,7 @@ async def spawn_until(depth=0):
)
# TODO: notes on the new boxed-relayed errors through proxy actors
async def main():
"""The main ``tractor`` routine.
@ -53,6 +67,7 @@ async def main():
"""
async with tractor.open_nursery(
debug_mode=True,
# loglevel='cancel',
) as n:
# spawn both actors
@ -67,8 +82,16 @@ async def main():
name='spawner1',
)
# TODO: test this case as well where the parent doesn't see
# the sub-actor errors by default and instead expect a user
# ctrl-c to kill the root.
with trio.move_on_after(3):
await trio.sleep_forever()
# gah still an issue here.
await portal.result()
# should never get here
await portal1.result()

View File

@ -40,7 +40,7 @@ async def main():
"""
async with tractor.open_nursery(
debug_mode=True,
# loglevel='cancel',
loglevel='devx',
) as n:
# spawn both actors

View File

@ -6,7 +6,7 @@ async def breakpoint_forever():
"Indefinitely re-enter debugger in child actor."
while True:
await trio.sleep(0.1)
await tractor.breakpoint()
await tractor.pause()
async def name_error():
@ -38,6 +38,7 @@ async def main():
"""
async with tractor.open_nursery(
debug_mode=True,
# loglevel='runtime',
) as n:
# Spawn both actors, don't bother with collecting results

View File

@ -0,0 +1,40 @@
import trio
import tractor


@tractor.context
async def just_sleep(
    ctx: tractor.Context,
    **kwargs,
) -> None:
    '''
    Start and sleep.

    '''
    await ctx.started()
    await trio.sleep_forever()


async def main() -> None:
    async with tractor.open_nursery(
        debug_mode=True,
    ) as n:
        portal = await n.start_actor(
            'ctx_child',

            # XXX: we don't enable the current module in order
            # to trigger `ModuleNotFound`.
            enable_modules=[],
        )

        async with portal.open_context(
            just_sleep, # taken from pytest parameterization
        ) as (ctx, sent):
            raise KeyboardInterrupt


if __name__ == '__main__':
    trio.run(main)

View File

@ -23,5 +23,6 @@ async def main():
n.start_soon(debug_actor.run, die)
n.start_soon(crash_boi.run, die)
if __name__ == '__main__':
trio.run(main)

View File

@ -0,0 +1,56 @@
import trio
import tractor


@tractor.context
async def name_error(
    ctx: tractor.Context,
):
    '''
    Raise a `NameError`, catch it and enter `.post_mortem()`, then
    expect the `._rpc._invoke()` crash handler to also engage.

    '''
    try:
        getattr(doggypants) # noqa (on purpose)
    except NameError:
        await tractor.post_mortem()
        raise


async def main():
    '''
    Test 3 `PdbREPL` entries:
      - one in the child due to manual `.post_mortem()`,
      - another in the child due to runtime RPC crash handling.
      - final one here in parent from the RAE.

    '''
    # XXX NOTE: ideally the REPL arrives at this frame in the parent
    # ONE UP FROM the inner ctx block below!
    async with tractor.open_nursery(
        debug_mode=True,
        # loglevel='cancel',
    ) as an:
        p: tractor.Portal = await an.start_actor(
            'child',
            enable_modules=[__name__],
        )

        # XXX should raise `RemoteActorError[NameError]`
        # AND be the active frame when REPL enters!
        try:
            async with p.open_context(name_error) as (ctx, first):
                assert first
        except tractor.RemoteActorError as rae:
            assert rae.boxed_type is NameError

            # manually handle in root's parent task
            await tractor.post_mortem()
            raise
        else:
            raise RuntimeError('IPC ctx should have remote errored!?')


if __name__ == '__main__':
    trio.run(main)

View File

@ -0,0 +1,49 @@
import os
import sys

import trio
import tractor


async def main() -> None:

    # initially unset, no entry.
    orig_pybp_var: str|None = os.environ.get('PYTHONBREAKPOINT')
    assert orig_pybp_var in {None, "0"}

    async with tractor.open_nursery(
        debug_mode=True,
    ) as an:
        assert an
        assert (
            (pybp_var := os.environ['PYTHONBREAKPOINT'])
            ==
            'tractor.devx._debug._sync_pause_from_builtin'
        )

        # TODO: an assert that verifies the hook has indeed been, hooked
        # XD
        assert (
            (pybp_hook := sys.breakpointhook)
            is not tractor.devx._debug._set_trace
        )

        print(
            f'$PYTHONBREAKPOINT: {pybp_var!r}\n'
            f'`sys.breakpointhook`: {pybp_hook!r}\n'
        )
        breakpoint() # first bp, tractor hook set.

    # XXX AFTER EXIT (of actor-runtime) verify the hook is unset..
    #
    # YES, this is weird but it's how stdlib docs say to do it..
    # https://docs.python.org/3/library/sys.html#sys.breakpointhook
    assert os.environ.get('PYTHONBREAKPOINT') is orig_pybp_var
    assert sys.breakpointhook

    # now ensure a regular builtin pause still works
    breakpoint() # last bp, stdlib hook restored


if __name__ == '__main__':
    trio.run(main)
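The script above asserts that the `breakpoint()` hook is overridden while the actor runtime is up and restored afterwards. A minimal stdlib-only sketch of that save/override/restore pattern follows; the `override_breakpointhook()` helper and its arguments are hypothetical names for illustration and are not part of `tractor`.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def override_breakpointhook(hook, env_path):
    '''
    Temporarily install `hook` as `sys.breakpointhook` and advertise
    it via `$PYTHONBREAKPOINT`, restoring both on exit.

    (Hypothetical helper; an assumption about how such a swap could
    be written, not the library's implementation.)
    '''
    orig_hook = sys.breakpointhook
    orig_env = os.environ.get('PYTHONBREAKPOINT')

    os.environ['PYTHONBREAKPOINT'] = env_path
    sys.breakpointhook = hook
    try:
        yield
    finally:
        # put back whatever hook was active before we entered
        sys.breakpointhook = orig_hook
        if orig_env is None:
            os.environ.pop('PYTHONBREAKPOINT', None)
        else:
            os.environ['PYTHONBREAKPOINT'] = orig_env
```

Inside the `with` block the builtin `breakpoint()` dispatches to the supplied callable, since the builtin always calls whatever `sys.breakpointhook` currently is.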

View File

@ -10,7 +10,7 @@ async def main():
await trio.sleep(0.1)
await tractor.breakpoint()
await tractor.pause()
await trio.sleep(0.1)

View File

@ -2,13 +2,16 @@ import trio
import tractor
async def main():
async def main(
registry_addrs: tuple[str, int]|None = None
):
async with tractor.open_root_actor(
debug_mode=True,
# loglevel='runtime',
):
while True:
await tractor.breakpoint()
await tractor.pause()
if __name__ == '__main__':

View File

@ -0,0 +1,83 @@
'''
Verify we can dump a `stackscope` tree on a hang.
'''
import os
import signal
import trio
import tractor
@tractor.context
async def start_n_shield_hang(
ctx: tractor.Context,
):
# actor: tractor.Actor = tractor.current_actor()
# sync to parent-side task
await ctx.started(os.getpid())
print('Entering shield sleep..')
with trio.CancelScope(shield=True):
await trio.sleep_forever() # in subactor
# XXX NOTE ^^^ since this shields, we expect
# the zombie reaper (aka T800) to engage on
# SIGINT from the user and eventually hard-kill
# this subprocess!
async def main(
from_test: bool = False,
) -> None:
async with (
tractor.open_nursery(
debug_mode=True,
enable_stack_on_sig=True,
# maybe_enable_greenback=False,
loglevel='devx',
) as an,
):
ptl: tractor.Portal = await an.start_actor(
'hanger',
enable_modules=[__name__],
debug_mode=True,
)
async with ptl.open_context(
start_n_shield_hang,
) as (ctx, cpid):
_, proc, _ = an._children[ptl.chan.uid]
assert cpid == proc.pid
print(
'Yo my child hanging..?\n'
# "i'm a user who wants to see a `stackscope` tree!\n"
)
# XXX simulate the wrapping test's "user actions"
# (i.e. if a human didn't run this manually but wants to
# know what they should do to reproduce test behaviour)
if from_test:
print(
f'Sending SIGUSR1 to {cpid!r}!\n'
)
os.kill(
cpid,
signal.SIGUSR1,
)
# simulate user cancelling program
await trio.sleep(0.5)
os.kill(
os.getpid(),
signal.SIGINT,
)
else:
# actually let user send the ctl-c
await trio.sleep_forever() # in root
if __name__ == '__main__':
trio.run(main)

View File

@ -0,0 +1,88 @@
import trio
import tractor
async def cancellable_pause_loop(
task_status: trio.TaskStatus[trio.CancelScope] = trio.TASK_STATUS_IGNORED
):
with trio.CancelScope() as cs:
task_status.started(cs)
for _ in range(3):
try:
# ON first entry, there is no level triggered
# cancellation yet, so this cp does a parent task
# ctx-switch so that this scope raises for the NEXT
# checkpoint we hit.
await trio.lowlevel.checkpoint()
await tractor.pause()
cs.cancel()
# parent should have called `cs.cancel()` by now
await trio.lowlevel.checkpoint()
except trio.Cancelled:
print('INSIDE SHIELDED PAUSE')
await tractor.pause(shield=True)
else:
# should raise it again, bubbling up to parent
print('BUBBLING trio.Cancelled to parent task-nursery')
await trio.lowlevel.checkpoint()
async def pm_on_cancelled():
async with trio.open_nursery() as tn:
tn.cancel_scope.cancel()
try:
await trio.sleep_forever()
except trio.Cancelled:
# should also raise `Cancelled` since
# we didn't pass `shield=True`.
try:
await tractor.post_mortem(hide_tb=False)
except trio.Cancelled as taskc:
# should enter just fine, in fact it should
# be debugging the internals of the previous
# sin-shield call above Bo
await tractor.post_mortem(
hide_tb=False,
shield=True,
)
raise taskc
else:
raise RuntimeError('Dint cancel as expected!?')
async def cancelled_before_pause(
):
'''
Verify that using a shielded pause works despite surrounding
cancellation called state in the calling task.
'''
async with trio.open_nursery() as tn:
cs: trio.CancelScope = await tn.start(cancellable_pause_loop)
await trio.sleep(0.1)
assert cs.cancelled_caught
await pm_on_cancelled()
async def main():
async with tractor.open_nursery(
debug_mode=True,
) as n:
portal: tractor.Portal = await n.run_in_actor(
cancelled_before_pause,
)
await portal.result()
# ensure the same works in the root actor!
await pm_on_cancelled()
if __name__ == '__main__':
trio.run(main)

View File

@ -0,0 +1,50 @@
import tractor
import trio
async def gen():
yield 'yo'
await tractor.pause()
yield 'yo'
await tractor.pause()
@tractor.context
async def just_bp(
ctx: tractor.Context,
) -> None:
await ctx.started()
await tractor.pause()
# TODO: bps and errors in this call..
async for val in gen():
print(val)
# await trio.sleep(0.5)
# prematurely destroy the connection
await ctx.chan.aclose()
# THIS CAUSES AN UNRECOVERABLE HANG
# without latest ``pdbpp``:
assert 0
async def main():
async with tractor.open_nursery(
debug_mode=True,
) as n:
p = await n.start_actor(
'bp_boi',
enable_modules=[__name__],
)
async with p.open_context(
just_bp,
) as (ctx, first):
await trio.sleep_forever()
if __name__ == '__main__':
trio.run(main)

View File

@ -3,17 +3,20 @@ import tractor
async def breakpoint_forever():
"""Indefinitely re-enter debugger in child actor.
"""
'''
Indefinitely re-enter debugger in child actor.
'''
while True:
await trio.sleep(0.1)
await tractor.breakpoint()
await tractor.pause()
async def main():
async with tractor.open_nursery(
debug_mode=True,
loglevel='cancel',
) as n:
portal = await n.run_in_actor(

View File

@ -3,16 +3,26 @@ import tractor
async def name_error():
getattr(doggypants)
getattr(doggypants) # noqa (on purpose)
async def main():
async with tractor.open_nursery(
debug_mode=True,
) as n:
# loglevel='transport',
) as an:
portal = await n.run_in_actor(name_error)
await portal.result()
# TODO: ideally the REPL arrives at this frame in the parent,
# ABOVE the @api_frame of `Portal.run_in_actor()` (which
# should eventually not even be a portal method ... XD)
# await tractor.pause()
p: tractor.Portal = await an.run_in_actor(name_error)
# with this style, should raise on this line
await p.result()
# with this alt style should raise at `open_nursery()`
# return await p.result()
if __name__ == '__main__':

View File

@ -0,0 +1,169 @@
from functools import partial
import time
import trio
import tractor
# TODO: only import these when not running from test harness?
# can we detect `pexpect` usage maybe?
# from tractor.devx._debug import (
# get_lock,
# get_debug_req,
# )
def sync_pause(
use_builtin: bool = False,
error: bool = False,
hide_tb: bool = True,
pre_sleep: float|None = None,
):
if pre_sleep:
time.sleep(pre_sleep)
if use_builtin:
breakpoint(hide_tb=hide_tb)
else:
# TODO: maybe for testing some kind of cm style interface
# where the `._set_trace()` call doesn't happen until block
# exit?
# assert get_lock().ctx_in_debug is None
# assert get_debug_req().repl is None
tractor.pause_from_sync()
# assert get_debug_req().repl is None
if error:
raise RuntimeError('yoyo sync code error')
@tractor.context
async def start_n_sync_pause(
ctx: tractor.Context,
):
actor: tractor.Actor = tractor.current_actor()
# sync to parent-side task
await ctx.started()
print(f'Entering `sync_pause()` in subactor: {actor.uid}\n')
sync_pause()
print(f'Exited `sync_pause()` in subactor: {actor.uid}\n')
async def main() -> None:
async with (
tractor.open_nursery(
debug_mode=True,
maybe_enable_greenback=True,
enable_stack_on_sig=True,
# loglevel='warning',
# loglevel='devx',
) as an,
trio.open_nursery() as tn,
):
# just from root task
sync_pause()
p: tractor.Portal = await an.start_actor(
'subactor',
enable_modules=[__name__],
# infect_asyncio=True,
debug_mode=True,
)
# TODO: 3 sub-actor usage cases:
# -[x] via a `.open_context()`
# -[ ] via a `.run_in_actor()` call
# -[ ] via a `.run()`
# -[ ] via a `.to_thread.run_sync()` in subactor
async with p.open_context(
start_n_sync_pause,
) as (ctx, first):
assert first is None
# TODO: handle bg-thread-in-root-actor special cases!
#
# there are a couple very subtle situations possible here
# and they are likely to become more important as cpython
# moves to support no-GIL.
#
# Cases:
# 1. root-actor bg-threads that call `.pause_from_sync()`
# whilst an in-tree subactor also is using ` .pause()`.
# |_ since the root-actor bg thread can not
# `Lock._debug_lock.acquire_nowait()` without running
# a `trio.Task`, AND because the
# `PdbREPL.set_continue()` is called from that
# bg-thread, we can not `._debug_lock.release()`
# either!
# |_ this results in no actor-tree `Lock` being used
# on behalf of the bg-thread and thus the subactor's
# task and the thread trying to use stdio
# simultaneously which results in the classic TTY
# clobbering!
#
# 2. multiple sync-bg-threads that call
# `.pause_from_sync()` where one is scheduled via
# `Nursery.start_soon(to_thread.run_sync)` in a bg
# task.
#
# Due to the GIL, the threads never truly try to step
# through the REPL simultaneously, BUT their `logging`
# and traceback outputs are interleaved since the GIL
# (seemingly) on every REPL-input from the user
# switches threads..
#
# Soo, the context switching semantics of the GIL
# result in a very confusing and messy interaction UX
# since eval and (tb) print output is NOT synced to
# each REPL-cycle (like we normally make it via
# a `.set_continue()` callback triggering the
# `Lock.release()`). Ideally we can solve this
# usability issue NOW because this will of course be
# that much more important when eventually there is no
# GIL!
# XXX should cause double REPL entry and thus TTY
# clobbering due to case 1. above!
tn.start_soon(
partial(
trio.to_thread.run_sync,
partial(
sync_pause,
use_builtin=False,
# pre_sleep=0.5,
),
abandon_on_cancel=True,
thread_name='start_soon_root_bg_thread',
)
)
await tractor.pause()
# XXX should cause double REPL entry and thus TTY
# clobbering due to case 2. above!
await trio.to_thread.run_sync(
partial(
sync_pause,
# NOTE this already works fine since in the new
# thread the `breakpoint()` built-in is never
# overloaded, thus NO locking is used, HOWEVER
# the case 2. from above still exists!
use_builtin=True,
),
# TODO: with this `False` we can hang!??!
# abandon_on_cancel=False,
abandon_on_cancel=True,
thread_name='inline_root_bg_thread',
)
await ctx.cancel()
# TODO: case where we cancel from trio-side while asyncio task
# has debugger lock?
await p.cancel_actor()
if __name__ == '__main__':
trio.run(main)

View File

@ -1,6 +1,11 @@
import time
import trio
import tractor
from tractor import (
ActorNursery,
MsgStream,
Portal,
)
# this is the first 2 actors, streamer_1 and streamer_2
@ -12,14 +17,18 @@ async def stream_data(seed):
# this is the third actor; the aggregator
async def aggregate(seed):
"""Ensure that the two streams we receive match but only stream
'''
Ensure that the two streams we receive match but only stream
a single set of values to the parent.
"""
async with tractor.open_nursery() as nursery:
portals = []
'''
an: ActorNursery
async with tractor.open_nursery() as an:
portals: list[Portal] = []
for i in range(1, 3):
# fork point
portal = await nursery.start_actor(
# fork/spawn call
portal = await an.start_actor(
name=f'streamer_{i}',
enable_modules=[__name__],
)
@ -43,7 +52,11 @@ async def aggregate(seed):
async with trio.open_nursery() as n:
for portal in portals:
n.start_soon(push_to_chan, portal, send_chan.clone())
n.start_soon(
push_to_chan,
portal,
send_chan.clone(),
)
# close this local task's reference to send side
await send_chan.aclose()
@ -60,26 +73,36 @@ async def aggregate(seed):
print("FINISHED ITERATING in aggregator")
await nursery.cancel()
await an.cancel()
print("WAITING on `ActorNursery` to finish")
print("AGGREGATOR COMPLETE!")
# this is the main actor and *arbiter*
async def main():
# a nursery which spawns "actors"
async def main() -> list[int]:
'''
This is the "root" actor's main task's entrypoint.
By default (and if not otherwise specified) that root process
also acts as a "registry actor" / "registrar" on the localhost
for the purposes of multi-actor "service discovery".
'''
# yes, a nursery which spawns `trio`-"actors" B)
an: ActorNursery
async with tractor.open_nursery(
arbiter_addr=('127.0.0.1', 1616)
) as nursery:
loglevel='cancel',
# debug_mode=True,
) as an:
seed = int(1e3)
pre_start = time.time()
portal = await nursery.start_actor(
portal: Portal = await an.start_actor(
name='aggregator',
enable_modules=[__name__],
)
stream: MsgStream
async with portal.open_stream_from(
aggregate,
seed=seed,
@ -88,11 +111,12 @@ async def main():
start = time.time()
# the portal call returns exactly what you'd expect
# as if the remote "aggregate" function was called locally
result_stream = []
result_stream: list[int] = []
async for value in stream:
result_stream.append(value)
await portal.cancel_actor()
cancelled: bool = await portal.cancel_actor()
assert cancelled
print(f"STREAM TIME = {time.time() - start}")
print(f"STREAM + SPAWN TIME = {time.time() - pre_start}")

View File

@ -0,0 +1,49 @@
import trio
import click
import tractor
import pydantic
# from multiprocessing import shared_memory
@tractor.context
async def just_sleep(
ctx: tractor.Context,
**kwargs,
) -> None:
'''
Test a small ping-pong 2-way streaming server.
'''
await ctx.started()
await trio.sleep_forever()
async def main() -> None:
proc = await trio.open_process( (
'python',
'-c',
'import trio; trio.run(trio.sleep_forever)',
))
await proc.wait()
# await trio.sleep_forever()
# async with tractor.open_nursery() as n:
# portal = await n.start_actor(
# 'rpc_server',
# enable_modules=[__name__],
# )
# async with portal.open_context(
# just_sleep, # taken from pytest parameterization
# ) as (ctx, sent):
# await trio.sleep_forever()
if __name__ == '__main__':
import time
# time.sleep(999)
trio.run(main)

View File

@ -8,15 +8,17 @@ This uses no extra threads, fancy semaphores or futures; all we need
is ``tractor``'s channels.
"""
from contextlib import asynccontextmanager
from typing import List, Callable
from contextlib import (
asynccontextmanager as acm,
aclosing,
)
from typing import Callable
import itertools
import math
import time
import tractor
import trio
from async_generator import aclosing
PRIMES = [
@ -44,7 +46,7 @@ async def is_prime(n):
return True
@asynccontextmanager
@acm
async def worker_pool(workers=4):
"""Though it's a trivial special case for ``tractor``, the well
known "worker pool" seems to be the defacto "but, I want this
@ -71,8 +73,8 @@ async def worker_pool(workers=4):
async def _map(
worker_func: Callable[[int], bool],
sequence: List[int]
) -> List[bool]:
sequence: list[int]
) -> list[bool]:
# define an async (local) task to collect results from workers
async def send_result(func, value, portal):

View File

@ -3,20 +3,18 @@ import trio
import tractor
async def sleepy_jane():
uid = tractor.current_actor().uid
async def sleepy_jane() -> None:
uid: tuple = tractor.current_actor().uid
print(f'Yo i am actor {uid}')
await trio.sleep_forever()
async def main():
'''
Spawn a flat actor cluster, with one process per
detected core.
Spawn a flat actor cluster, with one process per detected core.
'''
portal_map: dict[str, tractor.Portal]
results: dict[str, str]
# look at this hip new syntax!
async with (
@ -25,11 +23,16 @@ async def main():
modules=[__name__]
) as portal_map,
trio.open_nursery() as n,
trio.open_nursery(
strict_exception_groups=False,
) as tn,
):
for (name, portal) in portal_map.items():
n.start_soon(portal.run, sleepy_jane)
tn.start_soon(
portal.run,
sleepy_jane,
)
await trio.sleep(0.5)
@ -41,4 +44,4 @@ if __name__ == '__main__':
try:
trio.run(main)
except KeyboardInterrupt:
pass
print('trio cancelled by KBI')

View File

@ -13,7 +13,7 @@ async def simple_rpc(
'''
# signal to parent that we're up much like
# ``trio_typing.TaskStatus.started()``
# ``trio.TaskStatus.started()``
await ctx.started(data + 1)
async with ctx.open_stream() as stream:

View File

@ -9,7 +9,7 @@ async def main(service_name):
async with tractor.open_nursery() as an:
await an.start_actor(service_name)
async with tractor.get_arbiter('127.0.0.1', 1616) as portal:
async with tractor.get_registry('127.0.0.1', 1616) as portal:
print(f"Arbiter is listening on {portal.channel}")
async with tractor.wait_for_actor(service_name) as sockaddr:

View File

@ -1,8 +0,0 @@
Adjust the `tractor._spawn.soft_wait()` strategy to avoid sending an
actor cancel request (via `Portal.cancel_actor()`) if either the child
process is detected as having terminated or the IPC channel is detected
to be closed.
This ensures (even) more deterministic inter-actor cancellation by
avoiding the timeout condition where possible when a child never
successfully spawned, crashed, or became un-contactable over IPC.

View File

@ -1,3 +0,0 @@
Add an experimental ``tractor.msg.NamespacePath`` type for passing Python
objects by "reference" through a ``str``-subtype message and using the
new ``pkgutil.resolve_name()`` for reference loading.
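To illustrate the pass-by-"reference" idea, here is a minimal sketch of a ``str`` subtype whose value is a ``module:object`` address that the receiver resolves with ``pkgutil.resolve_name()``. The class name ``NsPath`` and its two methods are assumptions made for this example and may not match the real ``tractor.msg.NamespacePath`` interface.

```python
import json
from pkgutil import resolve_name


class NsPath(str):
    '''
    Hypothetical stand-in for a `str`-subtype reference message:
    the string itself is a `module.path:qualified.name` address.
    '''
    @classmethod
    def from_ref(cls, ref) -> 'NsPath':
        # build the address from the object's module and qualified name
        return cls(f'{ref.__module__}:{ref.__qualname__}')

    def load_ref(self):
        # resolve the address back into the live object (Python 3.9+)
        return resolve_name(self)


# sender side: only a plain string needs to cross the wire
path = NsPath.from_ref(json.dumps)

# receiver side: the same string is loaded back into the function
func = path.load_ref()
```

Because the message is just a string, any codec that can carry ``str`` can carry the reference; only importable, module-level objects can be addressed this way.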

View File

@ -1,2 +0,0 @@
Add a new `tractor.experimental` subpackage for staging new high level
APIs and subsystems that we might eventually make built-ins.

View File

@ -1,3 +0,0 @@
Update to and pin latest ``msgpack`` (1.0.3) and ``msgspec`` (0.4.0)
both of which required adjustments for backwards-incompatible API
tweaks.

View File

@ -1,4 +0,0 @@
Fence off ``multiprocessing`` imports until absolutely necessary in an
effort to avoid "resource tracker" spawning side effects that seem to
have varying degrees of unreliability per Python release. Port to new
``msgspec.DecodeError``.

View File

@ -1,12 +0,0 @@
Add a new ``to_asyncio.LinkedTaskChannel.subscribe()`` which gives
task-oriented broadcast functionality semantically equivalent to
``tractor.MsgStream.subscribe()``; this makes it possible for multiple
``trio``-side tasks to consume ``asyncio``-side task msgs in tandem.
Further improvements to the test suite were added in this patch set
including a new scenario test for a sub-actor managed "service nursery"
(implementing the basics of a "service manager") including use of
*infected asyncio* mode. Further we added a lower level
``test_trioisms.py`` to start to track issues we need to work around in
``trio`` itself which in this case included a bug we were trying to
solve related to https://github.com/python-trio/trio/issues/2258.

View File

@ -1,5 +0,0 @@
Run windows CI jobs on python 3.10 after some
hacks for ``pdbpp`` dependency issues.
Issue was to do with the now deprecated `pyreadline` project which
should be changed over to `pyreadline3`.

View File

@ -1,8 +0,0 @@
Drop use of the ``msgpack`` package and instead move fully to the
``msgspec`` codec library.
We've now used ``msgspec`` extensively in production and there's no
reason to not use it as default. Further this change preps us for the up
and coming typed messaging semantics (#196), dialog-unprotocol system
(#297), and caps-based messaging-protocols (#299) planned before our
first beta.

View File

@ -1,13 +0,0 @@
Fix a previously undetected ``trio``-``asyncio`` task lifetime linking
issue with the ``to_asyncio.open_channel_from()`` api where both sides
were not properly waiting/signalling termination and it was possible
for ``asyncio``-side errors to not propagate due to a race condition.
The implementation fix summary is:
- add state to signal the end of the ``trio`` side task to be
read by the ``asyncio`` side and always cancel any ongoing
task in such cases.
- always wait on the ``asyncio`` task termination from the ``trio``
side on error before maybe raising said error.
- always close the ``trio`` mem chan on exit to ensure the other
side can detect it and follow.

View File

@ -0,0 +1,16 @@
Strictly support Python 3.10+, start runtime machinery reorg
Since we want to push forward using the new `match:` syntax for our
internal RPC-msg loops, we officially drop 3.9 support for the next
release which should coincide well with the first release of 3.11.
This patch set also officially removes the ``tractor.run()`` API (which
has been deprecated for some time) as well as starts an initial re-org
of the internal runtime core by:
- renaming ``tractor._actor`` -> ``._runtime``
- moving the ``._runtime.Actor._process_messages()`` and
``._async_main()`` to be module level singleton-task-functions since
they are only started once for each connection and actor spawn
respectively; this internal API thus looks more similar to (at the
time of writing) the ``trio``-internals in ``trio._core._run``.
- officially remove ``tractor.run()``, now deprecated for some time.

View File

@ -0,0 +1,4 @@
Only set `._debug.Lock.local_pdb_complete` if it has been created.
This can be triggered by a very rare race condition (and thus we have no
working test yet) but it is known to exist in (a) consumer project(s).

View File

@ -0,0 +1,25 @@
Add support for ``trio >= 0.22`` and support for the new Python 3.11
``[Base]ExceptionGroup`` from `pep 654`_ via the backported
`exceptiongroup`_ package and some final fixes to the debug mode
subsystem.
This port ended up driving some (hopefully) final fixes to our debugger
subsystem including the solution to all lingering stdstreams locking
race-conditions and deadlock scenarios. This includes extending the
debugger tests suite as well as cancellation and ``asyncio`` mode cases.
Some of the notable details:
- always reverting to the ``trio`` SIGINT handler when leaving debug
mode.
- bypassing child attempts to acquire the debug lock when detected
to be amidst actor-runtime-cancellation.
- allowing the root actor to cancel local but IPC-stale subactor
requests-tasks for the debug lock when in a "no IPC peers" state.
Further we refined our ``ActorNursery`` semantics to be more similar to
``trio`` in the sense that parent task errors are always packed into the
actor-nursery emitted exception group and adjusted all tests and
examples accordingly.
.. _pep 654: https://peps.python.org/pep-0654/#handling-exception-groups
.. _exceptiongroup: https://github.com/python-trio/exceptiongroup

View File

@ -0,0 +1,5 @@
Establish an explicit "backend spawning" method table; use it from CI
More clearly lays out the current set of (3) backends: ``['trio',
'mp_spawn', 'mp_forkserver']`` and adjusts the ``._spawn.py`` internals
as well as the test suite to accommodate.

View File

@ -0,0 +1,4 @@
Add ``key: Callable[..., Hashable]`` support to ``.trionics.maybe_open_context()``
Gives users finer grained control over cache hit behaviour using
a callable which receives the input ``kwargs: dict``.

View File

@ -0,0 +1,41 @@
Add support for debug-lock blocking using a ``._debug.Lock._blocked:
set[tuple]`` and add ids when no-more IPC connections with the
root actor are detected.
This is an enhancement which (mostly) solves a lingering debugger
locking race case we needed to handle:
- child crashes, acquires TTY lock in root and attaches to ``pdb``
- child IPC goes down such that all channels to the root are broken
/ non-functional.
- root is stuck thinking the child is still in debug even though it
can't be contacted and the child actor machinery hasn't been
cancelled by its parent.
- root gets stuck in deadlock with child since it won't send a cancel
request until the child is finished debugging (to avoid clobbering
a child that is actually using the debugger), but the child can't
unlock the debugger bc IPC is down and it can't contact the root.
To avoid this scenario add a debug lock blocking list via
`._debug.Lock._blocked: set[tuple]` which holds actor uids for any actor
that is detected by the root as having no transport channel connections
(of which at least one should exist if this sub-actor at some point
acquired the debug lock). The root consequently checks this list for any
actor that tries to (re)acquire the lock and blocks with
a ``ContextCancelled``. Further, when a debug condition is tested in
``._runtime._invoke``, the context's ``._enter_debugger_on_cancel`` is
set to `False` if the actor was put on the block list, so that all
post-mortem / crash handling will be bypassed for that task.
In theory this approach to block list management may cause problems
where some nested child actor acquires and releases the lock multiple
times and it gets stuck on the block list after the first use? If this
turns out to be an issue we can try changing the strat so blocks are
only added when the root has zero IPC peers left?
Further, this adds a root-locking-task side cancel scope,
``Lock._root_local_task_cs_in_debug``, which can be ``.cancel()``-ed by the root
runtime when a stale lock is detected during the IPC channel testing.
However, right now we're NOT using this since it seems to cause test
failures likely due to causing premature cancellation and maybe needs
a bit more experimenting?
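
The block-list behaviour described above can be modelled with a toy, in-process sketch. Everything here is hypothetical (``LockSketch``, ``Blocked``, ``on_channels_checked()``); in particular, dropping the stale hold when zero channels are found is an assumption of this sketch, and the real runtime raises a ``ContextCancelled`` over IPC rather than a local exception:

```python
from typing import Optional


class Blocked(Exception):
    '''Stand-in for the cancellation delivered to a blocked actor.'''


class LockSketch:
    '''Toy model of a root-side debug lock with a block list.'''

    def __init__(self) -> None:
        self.holder: Optional[tuple] = None
        self.blocked: set = set()

    def acquire(self, uid: tuple) -> bool:
        # Any actor on the block list is refused outright.
        if uid in self.blocked:
            raise Blocked(uid)
        # Only one actor may hold the lock at a time.
        if self.holder not in (None, uid):
            return False
        self.holder = uid
        return True

    def on_channels_checked(self, uid: tuple, n_live_channels: int) -> None:
        # The root found no transport channels left to this actor,
        # so it can never be asked to release: block-list it.
        if n_live_channels == 0:
            self.blocked.add(uid)
            if self.holder == uid:
                # Assumption: the stale hold is dropped here.
                self.holder = None
```

The point of the sketch is only the ordering: the uid lands in the block list before any later (re)acquire attempt, so the root is never left waiting on an unreachable child.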


@ -0,0 +1,19 @@
Rework our ``.trionics.BroadcastReceiver`` internals to avoid method
recursion and approach a design and interface closer to ``trio``'s
``MemoryReceiveChannel``.
The details of the internal changes include:
- implementing a ``BroadcastReceiver.receive_nowait()`` and using it
within the async ``.receive()`` thus avoiding recursion from
``.receive()``.
- failing over to an internal ``._receive_from_underlying()`` when the
``_nowait()`` call raises ``trio.WouldBlock``
- adding ``BroadcastState.statistics()`` for debugging and testing both
internals and by users.
- add an internal ``BroadcastReceiver._raise_on_lag: bool`` which can be
set to avoid ``Lagged`` raising for possible use cases where a user
wants to choose between a [cheap or nasty
pattern](https://zguide.zeromq.org/docs/chapter7/#The-Cheap-or-Nasty-Pattern)
for the particular stream (we use this in ``piker``'s dark clearing
engine to avoid fast feeds breaking during HFT periods).
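
The "try non-blocking first, then fall back" shape described above can be sketched as follows. This deliberately uses the stdlib ``asyncio.Queue`` as a stand-in so that it runs without ``trio``: ``asyncio.QueueEmpty`` plays the role of ``trio.WouldBlock``, and ``ReceiverSketch`` is a hypothetical name, not the real ``BroadcastReceiver``:

```python
import asyncio


class ReceiverSketch:
    '''Non-recursive receive built on a `_nowait()` fast path.'''

    def __init__(self, queue: asyncio.Queue) -> None:
        self._queue = queue

    def receive_nowait(self):
        # Raises `asyncio.QueueEmpty` when nothing is buffered,
        # analogous to `trio.WouldBlock`.
        return self._queue.get_nowait()

    async def receive(self):
        try:
            # Fast path: return an already buffered value.
            return self.receive_nowait()
        except asyncio.QueueEmpty:
            # Slow path: wait on the underlying source instead of
            # calling `.receive()` again.
            return await self._receive_from_underlying()

    async def _receive_from_underlying(self):
        return await self._queue.get()
```

Keeping the blocking wait in its own method means ``.receive()`` never re-enters itself, which is the recursion the rework set out to remove.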


@ -0,0 +1,11 @@
Always ``list``-cast the ``mngrs`` input to
``.trionics.gather_contexts()`` and ensure it is non-empty, otherwise raise
a ``ValueError``.
Turns out that trying to pass an inline-style generator comprehension
doesn't seem to work inside the ``async with`` expression? Further, in
such a case we can get a hang waiting on the all-entered event
completion when the internal mngrs iteration is a noop. Instead we
always greedily check a size and error on empty input; the lazy
iteration of a generator input is not beneficial anyway since we're
entering all manager instances in concurrent tasks.
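
A minimal sketch of the greedy cast-and-check, assuming a hypothetical ``gather_sketch()`` helper. Unlike the real ``gather_contexts()`` it enters the managers sequentially via ``AsyncExitStack`` rather than in concurrent tasks, and it uses ``asyncio`` only so that it runs on the stdlib alone:

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


@asynccontextmanager
async def gather_sketch(mngrs):
    # Greedily consume the input so a generator is iterated exactly
    # once and its size can be checked up front.
    mngrs = list(mngrs)
    if not mngrs:
        raise ValueError(
            '`mngrs` must contain at least one context manager'
        )

    async with AsyncExitStack() as stack:
        # NOTE: sequential entry, a simplification for this sketch.
        values = [
            await stack.enter_async_context(mngr)
            for mngr in mngrs
        ]
        yield tuple(values)
```

Failing fast on empty input replaces the hang that would otherwise occur while waiting for an "all entered" signal that no task will ever set.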


@ -0,0 +1,15 @@
Fixes to ensure IPC (channel) breakage doesn't result in hung actor
trees; the zombie reaping and general supervision machinery will always
clean up and terminate.
This includes not only the (mostly minor) fixes to solve these cases but
also a new extensive test suite in `test_advanced_faults.py` with an
accompanying highly configurable example module-script in
`examples/advanced_faults/ipc_failure_during_stream.py`. Tests ensure we
never get hangs or zombies despite operating in debug mode and attempt to
simulate all possible IPC transport failure cases for a local-host actor
tree.
Further we simplify `Context.open_stream.__aexit__()` to just call
`MsgStream.aclose()` directly more or less avoiding a pure duplicate
code path.


@ -0,0 +1,10 @@
Always redraw the `pdbpp` prompt on `SIGINT` during REPL use.
There were recent changes to do with Python 3.10 that required us to pin
to a specific commit in `pdbpp`; these have recently been fixed minus
this last issue with `SIGINT` shielding: not clobbering or not
showing the `(Pdb++)` prompt on ctrl-c by the user. This repairs all
that by firstly removing the standard KBI intercepting of the std lib's
`pdb.Pdb._cmdloop()` as well as ensuring that only the actor with REPL
control ever reports `SIGINT` handler log msgs and prompt redraws. With
this we move back to using pypi `pdbpp` release.


@ -0,0 +1,7 @@
Drop `trio.Process.aclose()` usage, copy into our spawning code.
The details are laid out in https://github.com/goodboy/tractor/issues/330.
`trio` changed its process running quite some time ago, this just copies
out the small bit we needed (from the old `.aclose()`) for hard kills
where a soft runtime cancel request fails and our "zombie killer"
implementation kicks in.
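
The soft-then-hard escalation can be sketched with the stdlib alone. Here ``reap()`` is a hypothetical helper, ``Popen.terminate()`` merely stands in for the runtime's soft cancel request (which in ``tractor`` is an IPC message, not a signal), and the timeout value is arbitrary:

```python
import subprocess


def reap(
    proc: subprocess.Popen,
    soft_timeout: float = 1.0,
) -> int:
    # Soft request first; stand-in for a graceful cancel.
    proc.terminate()
    try:
        return proc.wait(timeout=soft_timeout)
    except subprocess.TimeoutExpired:
        # The child ignored the soft request: hard kill it and
        # collect the exit status so no zombie is left behind.
        proc.kill()
        return proc.wait()
```

The final ``proc.wait()`` matters as much as the kill itself, since an unreaped child lingers as a zombie entry in the process table.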


@ -0,0 +1,15 @@
Switch to using the fork & fix of `pdb++`, `pdbp`:
https://github.com/mdmintz/pdbp
Allows us to sidestep a variety of issues that aren't being maintained
in the upstream project thanks to the hard work of @mdmintz!
We also include some default settings adjustments as per recent
development on the fork:
- sticky mode is still turned on by default but now activates when
using the `ll` repl command.
- turn off line truncation by default to avoid inter-line gaps when
resizing the terminal during use.
- when using the backtrace cmd either by `w` or `bt`, the config
automatically switches to non-sticky mode.


@ -0,0 +1,18 @@
First generate a built disti:
```
python -m pip install --upgrade build
python -m build --sdist --outdir dist/alpha5/
```
Then try a test ``pypi`` upload:
```
python -m twine upload --repository testpypi dist/alpha5/*
```
Then push to `pypi` for realz:
```
python -m twine upload --repository pypi dist/alpha5/*
```

158
pyproject.toml 100644

@ -0,0 +1,158 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
# ------ build-system ------
[project]
name = "tractor"
version = "0.1.0a6dev0"
description = 'structured concurrent `trio`-"actors"'
authors = [{ name = "Tyler Goodlet", email = "goodboy_foss@protonmail.com" }]
requires-python = ">= 3.11"
readme = "docs/README.rst"
license = "AGPL-3.0-or-later"
keywords = [
"trio",
"async",
"concurrency",
"structured concurrency",
"actor model",
"distributed",
"multiprocessing",
]
classifiers = [
"Development Status :: 3 - Alpha",
"Operating System :: POSIX :: Linux",
"Framework :: Trio",
"License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)",
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.11",
"Topic :: System :: Distributed Computing",
]
dependencies = [
# trio runtime and friends
# (poetry) proper range specs,
# https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/#id5
# TODO, for 3.13 we must go to `0.27` which means we have to
# disable strict egs or port to handling them internally!
"trio>0.27",
"tricycle>=0.4.1,<0.5",
"wrapt>=1.16.0,<2",
"colorlog>=6.8.2,<7",
# built-in multi-actor `pdb` REPL
"pdbp>=1.6,<2", # windows only (from `pdbp`)
# typed IPC msging
"msgspec>=0.19.0",
]
# ------ project ------
[dependency-groups]
dev = [
# test suite
# TODO: maybe some of these layout choices?
# https://docs.pytest.org/en/8.0.x/explanation/goodpractices.html#choosing-a-test-layout-import-rules
"pytest>=8.3.5",
"pexpect>=4.9.0,<5",
# `tractor.devx` tooling
"greenback>=1.2.1,<2",
"stackscope>=0.2.2,<0.3",
"pyperclip>=1.9.0",
"prompt-toolkit>=3.0.50",
"xonsh>=0.19.2",
]
# TODO, add these with sane versions; were originally in
# `requirements-docs.txt`..
# docs = [
# "sphinx>="
# "sphinx_book_theme>="
# ]
# ------ dependency-groups ------
# ------ dependency-groups ------
[tool.uv.sources]
# XXX NOTE, only for @goodboy's hacking on `pprint(sort_dicts=False)`
# for the `pp` alias..
# pdbp = { path = "../pdbp", editable = true }
# ------ tool.uv.sources ------
# TODO, distributed (multi-host) extensions
# linux kernel networking
# 'pyroute2
# ------ tool.uv.sources ------
[tool.uv]
# XXX NOTE, prefer the sys python bc apparently the distis from
# `astral` are built in a way that breaks `pdbp`+`tabcompleter`'s
# likely due to linking against `libedit` over `readline`..
# |_https://docs.astral.sh/uv/concepts/python-versions/#managed-python-distributions
# |_https://gregoryszorc.com/docs/python-build-standalone/main/quirks.html#use-of-libedit-on-linux
#
# https://docs.astral.sh/uv/reference/settings/#python-preference
python-preference = 'system'
# ------ tool.uv ------
[tool.hatch.build.targets.sdist]
include = ["tractor"]
[tool.hatch.build.targets.wheel]
include = ["tractor"]
# ------ tool.hatch ------
[tool.towncrier]
package = "tractor"
filename = "NEWS.rst"
directory = "nooz/"
version = "0.1.0a6"
title_format = "tractor {version} ({project_date})"
template = "nooz/_template.rst"
all_bullets = true
[[tool.towncrier.type]]
directory = "feature"
name = "Features"
showcontent = true
[[tool.towncrier.type]]
directory = "bugfix"
name = "Bug Fixes"
showcontent = true
[[tool.towncrier.type]]
directory = "doc"
name = "Improved Documentation"
showcontent = true
[[tool.towncrier.type]]
directory = "trivial"
name = "Trivial/Internal Changes"
showcontent = true
# ------ tool.towncrier ------
[tool.pytest.ini_options]
minversion = '6.0'
testpaths = [
'tests'
]
addopts = [
# TODO: figure out why this isn't working..
'--rootdir=./tests',
'--import-mode=importlib',
# don't show frickin captured logs AGAIN in the report..
'--show-capture=no',
]
log_cli = false
# TODO: maybe some of these layout choices?
# https://docs.pytest.org/en/8.0.x/explanation/goodpractices.html#choosing-a-test-layout-import-rules
# pythonpath = "src"
# ------ tool.pytest ------

8
pytest.ini 100644

@ -0,0 +1,8 @@
# vim: ft=ini
# pytest.ini for tractor
[pytest]
# don't show frickin captured logs AGAIN in the report..
addopts = --show-capture='no'
log_cli = false
; minversion = 6.0


@ -1,2 +0,0 @@
sphinx
sphinx_book_theme


@ -1,7 +0,0 @@
pytest
pytest-trio
pdbpp
mypy<0.920
trio_typing<0.7.0
pexpect
towncrier

82
ruff.toml 100644

@ -0,0 +1,82 @@
# from default `ruff.toml` @
# https://docs.astral.sh/ruff/configuration/
# Exclude a variety of commonly ignored directories.
exclude = [
".bzr",
".direnv",
".eggs",
".git",
".git-rewrite",
".hg",
".ipynb_checkpoints",
".mypy_cache",
".nox",
".pants.d",
".pyenv",
".pytest_cache",
".pytype",
".ruff_cache",
".svn",
".tox",
".venv",
".vscode",
"__pypackages__",
"_build",
"buck-out",
"build",
"dist",
"node_modules",
"site-packages",
"venv",
]
# Same as Black.
line-length = 88
indent-width = 4
# Assume Python 3.11
target-version = "py311"
[lint]
# Enable Pyflakes (`F`) and a subset of the pycodestyle (`E`) codes by default.
# Unlike Flake8, Ruff doesn't enable pycodestyle warnings (`W`) or
# McCabe complexity (`C901`) by default.
select = ["E4", "E7", "E9", "F"]
ignore = [
'E402', # https://docs.astral.sh/ruff/rules/module-import-not-at-top-of-file/
]
# Allow fix for all enabled rules (when `--fix`) is provided.
fixable = ["ALL"]
unfixable = []
# Allow unused variables when underscore-prefixed.
# dummy-variable-rgx = "^(_+|(_+[a-zA-Z0-9_]*[a-zA-Z0-9]+?))$"
[format]
# Use single quotes in `ruff format`.
quote-style = "single"
# Like Black, indent with spaces, rather than tabs.
indent-style = "space"
# Like Black, respect magic trailing commas.
skip-magic-trailing-comma = false
# Like Black, automatically detect the appropriate line ending.
line-ending = "auto"
# Enable auto-formatting of code examples in docstrings. Markdown,
# reStructuredText code/literal blocks and doctests are all supported.
#
# This is currently disabled by default, but it is planned for this
# to be opt-out in the future.
docstring-code-format = false
# Set the line length limit used when formatting code snippets in
# docstrings.
#
# This only has an effect when the `docstring-code-format` setting is
# enabled.
docstring-code-line-length = "dynamic"


@ -1,92 +0,0 @@
#!/usr/bin/env python
#
# tractor: structured concurrent "actors".
#
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
from setuptools import setup
with open('docs/README.rst', encoding='utf-8') as f:
readme = f.read()
setup(
name="tractor",
version='0.1.0a5.dev', # alpha zone
description='structured concurrrent "actors"',
long_description=readme,
license='AGPLv3',
author='Tyler Goodlet',
maintainer='Tyler Goodlet',
maintainer_email='jgbt@protonmail.com',
url='https://github.com/goodboy/tractor',
platforms=['linux', 'windows'],
packages=[
'tractor',
'tractor.experimental',
'tractor.trionics',
'tractor.testing',
],
install_requires=[
# trio related
'trio>0.8',
'async_generator',
'trio_typing',
# tooling
'tricycle',
'trio_typing',
# tooling
'colorlog',
'wrapt',
'pdbpp',
# windows deps workaround for ``pdbpp``
# https://github.com/pdbpp/pdbpp/issues/498
# https://github.com/pdbpp/fancycompleter/issues/37
'pyreadline3 ; platform_system == "Windows"',
# serialization
'msgspec >= "0.4.0"'
],
tests_require=['pytest'],
python_requires=">=3.9",
keywords=[
'trio',
'async',
'concurrency',
'structured concurrency',
'actor model',
'distributed',
'multiprocessing'
],
classifiers=[
"Development Status :: 3 - Alpha",
"Operating System :: POSIX :: Linux",
"Operating System :: Microsoft :: Windows",
"Framework :: Trio",
"License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)",
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.9",
"Intended Audience :: Science/Research",
"Intended Audience :: Developers",
"Topic :: System :: Distributed Computing",
],
)


@ -11,14 +11,14 @@ import time
import pytest
import tractor
from tractor._testing import (
examples_dir as examples_dir,
tractor_test as tractor_test,
expect_ctxc as expect_ctxc,
)
# export for tests
from tractor.testing import tractor_test # noqa
# TODO: include wtv plugin(s) we build in `._testing.pytest`?
pytest_plugins = ['pytester']
_arb_addr = '127.0.0.1', random.randint(1000, 9999)
# Sending signal.SIGINT on subprocess fails on windows. Use CTRL_* alternatives
if platform.system() == 'Windows':
@ -39,36 +39,46 @@ no_windows = pytest.mark.skipif(
)
def repodir():
"""Return the abspath to the repo directory.
"""
dirname = os.path.dirname
dirpath = os.path.abspath(
dirname(dirname(os.path.realpath(__file__)))
)
return dirpath
def pytest_addoption(parser):
parser.addoption(
"--ll", action="store", dest='loglevel',
"--ll",
action="store",
dest='loglevel',
default='ERROR', help="logging level to set when testing"
)
parser.addoption(
"--spawn-backend", action="store", dest='spawn_backend',
"--spawn-backend",
action="store",
dest='spawn_backend',
default='trio',
help="Processing spawning backend to use for test run",
)
parser.addoption(
"--tpdb", "--debug-mode",
action="store_true",
dest='tractor_debug_mode',
# default=False,
help=(
'Enable a flag that can be used by tests to set the '
'`debug_mode: bool` for engaging the internal '
'multi-proc debugger sys.'
),
)
def pytest_configure(config):
backend = config.option.spawn_backend
tractor._spawn.try_set_start_method(backend)
if backend == 'mp':
tractor._spawn.try_set_start_method('spawn')
elif backend == 'trio':
tractor._spawn.try_set_start_method(backend)
@pytest.fixture(scope='session')
def debug_mode(request):
debug_mode: bool = request.config.option.tractor_debug_mode
# if debug_mode:
# breakpoint()
return debug_mode
@pytest.fixture(scope='session', autouse=True)
@ -81,42 +91,84 @@ def loglevel(request):
@pytest.fixture(scope='session')
def spawn_backend(request):
def spawn_backend(request) -> str:
return request.config.option.spawn_backend
# @pytest.fixture(scope='function', autouse=True)
# def debug_enabled(request) -> str:
# from tractor import _state
# if _state._runtime_vars['_debug_mode']:
# breakpoint()
_ci_env: bool = os.environ.get('CI', False)
@pytest.fixture(scope='session')
def ci_env() -> bool:
"""Detect CI environment.
"""
return os.environ.get('TRAVIS', False) or os.environ.get('CI', False)
'''
Detect CI environment.
'''
return _ci_env
# TODO: also move this to `._testing` for now?
# -[ ] possibly generalize and re-use for multi-tree spawning
# along with the new stuff for multi-addrs in distribute_dis
# branch?
#
# choose randomly at import time
_reg_addr: tuple[str, int] = (
'127.0.0.1',
random.randint(1000, 9999),
)
@pytest.fixture(scope='session')
def arb_addr():
return _arb_addr
def reg_addr() -> tuple[str, int]:
# globally override the runtime to the per-test-session-dynamic
# addr so that all tests never conflict with any other actor
# tree using the default.
from tractor import _root
_root._default_lo_addrs = [_reg_addr]
return _reg_addr
def pytest_generate_tests(metafunc):
spawn_backend = metafunc.config.option.spawn_backend
if not spawn_backend:
# XXX some weird windows bug with `pytest`?
spawn_backend = 'mp'
assert spawn_backend in ('mp', 'trio')
spawn_backend = 'trio'
# TODO: maybe just use the literal `._spawn.SpawnMethodKey`?
assert spawn_backend in (
'mp_spawn',
'mp_forkserver',
'trio',
)
# NOTE: used to be used to dynamically parametrize tests for when
# you just passed --spawn-backend=`mp` on the cli, but now we expect
# that cli input to be manually specified, BUT, maybe we'll do
# something like this again in the future?
if 'start_method' in metafunc.fixturenames:
if spawn_backend == 'mp':
from multiprocessing import get_all_start_methods
methods = get_all_start_methods()
if 'fork' in methods:
# fork not available on windows, so check before
# removing XXX: the fork method is in general
# incompatible with trio's global scheduler state
methods.remove('fork')
elif spawn_backend == 'trio':
methods = ['trio']
metafunc.parametrize("start_method", [spawn_backend], scope='module')
metafunc.parametrize("start_method", methods, scope='module')
# TODO: a way to let test scripts (like from `examples/`)
# guarantee they won't registry addr collide!
# @pytest.fixture
# def open_test_runtime(
# reg_addr: tuple,
# ) -> AsyncContextManager:
# return partial(
# tractor.open_nursery,
# registry_addrs=[reg_addr],
# )
def sig_prog(proc, sig):
@ -131,28 +183,40 @@ def sig_prog(proc, sig):
assert ret
# TODO: factor into @cm and move to `._testing`?
@pytest.fixture
def daemon(loglevel, testdir, arb_addr):
"""Run a daemon actor as a "remote arbiter".
"""
if loglevel in ('trace', 'debug'):
# too much logging will lock up the subproc (smh)
loglevel = 'info'
def daemon(
loglevel: str,
testdir,
reg_addr: tuple[str, int],
):
'''
Run a daemon root actor as a separate actor-process tree and
"remote registrar" for discovery-protocol related tests.
cmdargs = [
sys.executable, '-c',
"import tractor; tractor.run_daemon([], arbiter_addr={}, loglevel={})"
.format(
arb_addr,
"'{}'".format(loglevel) if loglevel else None)
'''
if loglevel in ('trace', 'debug'):
# XXX: too much logging will lock up the subproc (smh)
loglevel: str = 'info'
code: str = (
"import tractor; "
"tractor.run_daemon([], registry_addrs={reg_addrs}, loglevel={ll})"
).format(
reg_addrs=str([reg_addr]),
ll="'{}'".format(loglevel) if loglevel else None,
)
cmd: list[str] = [
sys.executable,
'-c', code,
]
kwargs = dict()
kwargs = {}
if platform.system() == 'Windows':
# without this, tests hang on windows forever
kwargs['creationflags'] = subprocess.CREATE_NEW_PROCESS_GROUP
proc = testdir.popen(
cmdargs,
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
**kwargs,


@ -0,0 +1,243 @@
'''
`tractor.devx.*` tooling sub-pkg test space.
'''
import time
from typing import (
Callable,
)
import pytest
from pexpect.exceptions import (
TIMEOUT,
)
from pexpect.spawnbase import SpawnBase
from tractor._testing import (
mk_cmd,
)
from tractor.devx._debug import (
_pause_msg as _pause_msg,
_crash_msg as _crash_msg,
_repl_fail_msg as _repl_fail_msg,
_ctlc_ignore_header as _ctlc_ignore_header,
)
from ..conftest import (
_ci_env,
)
@pytest.fixture
def spawn(
start_method,
testdir: pytest.Pytester,
reg_addr: tuple[str, int],
) -> Callable[[str], None]:
'''
Use the `pexpect` module shipped via `testdir.spawn()` to
run an `./examples/..` script by name.
'''
if start_method != 'trio':
pytest.skip(
'`pexpect` based tests only supported on `trio` backend'
)
def unset_colors():
'''
Python 3.13 introduced colored tracebacks that break patt
matching,
https://docs.python.org/3/using/cmdline.html#envvar-PYTHON_COLORS
https://docs.python.org/3/using/cmdline.html#using-on-controlling-color
'''
import os
os.environ['PYTHON_COLORS'] = '0'
def _spawn(
cmd: str,
**mkcmd_kwargs,
):
unset_colors()
return testdir.spawn(
cmd=mk_cmd(
cmd,
**mkcmd_kwargs,
),
expect_timeout=3,
# preexec_fn=unset_colors,
# ^TODO? get `pytest` core to expose underlying
# `pexpect.spawn()` stuff?
)
# such that test-dep can pass input script name.
return _spawn
@pytest.fixture(
params=[False, True],
ids='ctl-c={}'.format,
)
def ctlc(
request,
ci_env: bool,
) -> bool:
use_ctlc = request.param
node = request.node
markers = node.own_markers
for mark in markers:
if mark.name == 'has_nested_actors':
pytest.skip(
f'Test {node} has nested actors and fails with Ctrl-C.\n'
f'The test can sometimes run fine locally but until'
' we solve this issue this CI test will be xfail:\n'
'https://github.com/goodboy/tractor/issues/320'
)
if mark.name == 'ctlcs_bish':
pytest.skip(
f'Test {node} prolly uses something from the stdlib (namely `asyncio`..)\n'
f'The test and/or underlying example script can *sometimes* run fine '
f'locally but more than likely until the cpython peeps get their sh#$ together, '
f'this test will definitely not behave like `trio` under SIGINT..\n'
)
if use_ctlc:
# XXX: disable pygments highlighting for auto-tests
# since some envs (like actions CI) will struggle
# with the added color-char encoding..
from tractor.devx._debug import TractorConfig
TractorConfig.use_pygments = False
yield use_ctlc
def expect(
child,
# normally a `pdb` prompt by default
patt: str,
**kwargs,
) -> None:
'''
Expect wrapper that prints last seen console
data before failing.
'''
try:
child.expect(
patt,
**kwargs,
)
except TIMEOUT:
before = str(child.before.decode())
print(before)
raise
PROMPT = r"\(Pdb\+\)"
def in_prompt_msg(
child: SpawnBase,
parts: list[str],
pause_on_false: bool = False,
err_on_false: bool = False,
print_prompt_on_false: bool = True,
) -> bool:
'''
Predicate check if (the prompt's) std-streams output has all
`str`-parts in it.
Can be used in test asserts for bulk matching expected
log/REPL output for a given `pdb` interact point.
'''
__tracebackhide__: bool = False
before: str = str(child.before.decode())
for part in parts:
if part not in before:
if pause_on_false:
import pdbp
pdbp.set_trace()
if print_prompt_on_false:
print(before)
if err_on_false:
raise ValueError(
f'Could not find pattern in `before` output?\n'
f'part: {part!r}\n'
)
return False
return True
# TODO: support terminal color-chars stripping so we can match
# against call stack frame output from the 'll' command and the like!
# -[ ] SO answer for stripping ANSI codes: https://stackoverflow.com/a/14693789
def assert_before(
child: SpawnBase,
patts: list[str],
**kwargs,
) -> None:
__tracebackhide__: bool = False
assert in_prompt_msg(
child=child,
parts=patts,
# since this is an "assert" helper ;)
err_on_false=True,
**kwargs
)
def do_ctlc(
child,
count: int = 3,
delay: float = 0.1,
patt: str|None = None,
# expect repl UX to reprint the prompt after every
# ctrl-c send.
# XXX: no idea but, in CI this never seems to work even on 3.10 so
# needs some further investigation potentially...
expect_prompt: bool = not _ci_env,
) -> str|None:
before: str|None = None
# make sure ctl-c sends don't do anything but repeat output
for _ in range(count):
time.sleep(delay)
child.sendcontrol('c')
# TODO: figure out why this makes CI fail..
# if you run this test manually it works just fine..
if expect_prompt:
time.sleep(delay)
child.expect(PROMPT)
before = str(child.before.decode())
time.sleep(delay)
if patt:
# should see the last line on console
assert patt in before
# return the console content up to the final prompt
return before
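
The color-char stripping TODO in this module could be approached with a small regex helper along these lines. This is only a sketch under stated assumptions: ``strip_ansi()`` is a hypothetical name that does not exist in the test suite, and the pattern only covers CSI sequences (which include the SGR color codes), not every possible terminal escape:

```python
import re

# Matches CSI escape sequences: ESC, `[`, optional parameter and
# intermediate bytes, then one final byte (e.g. `m` for colors).
_ANSI_CSI = re.compile(r'\x1b\[[0-?]*[ -/]*[@-~]')


def strip_ansi(text: str) -> str:
    # Remove every CSI sequence, leaving the printable text.
    return _ANSI_CSI.sub('', text)
```

Applying such a helper to the decoded ``child.before`` content ahead of the substring checks would let patterns match regardless of highlighting.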

File diff suppressed because it is too large Load Diff


@ -0,0 +1,381 @@
'''
That "foreign loop/thread" debug REPL support better ALSO WORK!
Same as `test_native_pause.py`.
All these tests can be understood (somewhat) by running the
equivalent `examples/debugging/` scripts manually.
'''
from contextlib import (
contextmanager as cm,
)
# from functools import partial
# import itertools
import time
# from typing import (
# Iterator,
# )
import pytest
from pexpect.exceptions import (
TIMEOUT,
EOF,
)
from .conftest import (
# _ci_env,
do_ctlc,
PROMPT,
# expect,
in_prompt_msg,
assert_before,
_pause_msg,
_crash_msg,
_ctlc_ignore_header,
# _repl_fail_msg,
)
@cm
def maybe_expect_timeout(
ctlc: bool = False,
) -> None:
try:
yield
except TIMEOUT:
# breakpoint()
if ctlc:
pytest.xfail(
'Some kinda redic threading SIGINT bug i think?\n'
'See the notes in `examples/debugging/sync_bp.py`..\n'
)
raise
@pytest.mark.ctlcs_bish
def test_pause_from_sync(
spawn,
ctlc: bool,
):
'''
Verify we can use the `pdbp` REPL from sync functions AND from
any thread spawned with `trio.to_thread.run_sync()`.
`examples/debugging/sync_bp.py`
'''
child = spawn('sync_bp')
# first `sync_pause()` after nurseries open
child.expect(PROMPT)
assert_before(
child,
[
# pre-prompt line
_pause_msg,
"<Task '__main__.main'",
"('root'",
]
)
if ctlc:
do_ctlc(child)
# ^NOTE^ subactor not spawned yet; don't need extra delay.
child.sendline('c')
# first `await tractor.pause()` inside `p.open_context()` body
child.expect(PROMPT)
# XXX shouldn't see gb loaded message with PDB loglevel!
# assert not in_prompt_msg(
# child,
# ['`greenback` portal opened!'],
# )
# should be same root task
assert_before(
child,
[
_pause_msg,
"<Task '__main__.main'",
"('root'",
]
)
if ctlc:
do_ctlc(
child,
# NOTE: setting this to 0 (or some other sufficiently
# small val) can cause the test to fail since the
# `subactor` suffers a race where the root/parent
# sends an actor-cancel prior to it hitting its pause
# point; by def the value is 0.1
delay=0.4,
)
# XXX, fwiw without a brief sleep here the SIGINT might actually
# trigger "subactor" cancellation by its parent before the
# shield-handler is engaged.
#
# => similar to the `delay` input to `do_ctlc()` below, setting
# this too low can cause the test to fail since the `subactor`
# suffers a race where the root/parent sends an actor-cancel
# prior to the context task hitting its pause point (and thus
# engaging the `sigint_shield()` handler in time); this value
# seems to be good enuf?
time.sleep(0.6)
# one of the bg thread or subactor should have
# `Lock.acquire()`-ed
# (NOT both, which will result in REPL clobbering!)
attach_patts: dict[str, list[str]] = {
'subactor': [
"'start_n_sync_pause'",
"('subactor'",
],
'inline_root_bg_thread': [
"<Thread(inline_root_bg_thread",
"('root'",
],
'start_soon_root_bg_thread': [
"<Thread(start_soon_root_bg_thread",
"('root'",
],
}
conts: int = 0 # for debugging below matching logic on failure
while attach_patts:
child.sendline('c')
conts += 1
child.expect(PROMPT)
before = str(child.before.decode())
for key in attach_patts:
if key in before:
attach_key: str = key
expected_patts: list[str] = attach_patts.pop(key)
assert_before(
child,
[_pause_msg]
+
expected_patts
)
break
else:
pytest.fail(
f'No keys found?\n\n'
f'{attach_patts.keys()}\n\n'
f'{before}\n'
)
# ensure no other task/threads engaged a REPL
# at the same time as the one that was detected above.
for key, other_patts in attach_patts.copy().items():
assert not in_prompt_msg(
child,
other_patts,
)
if ctlc:
do_ctlc(
child,
patt=attach_key,
# NOTE same as comment above
delay=0.4,
)
child.sendline('c')
# XXX TODO, weird threading bug it seems despite the
# `abandon_on_cancel: bool` setting to
# `trio.to_thread.run_sync()`..
with maybe_expect_timeout(
ctlc=ctlc,
):
child.expect(EOF)
def expect_any_of(
attach_patts: dict[str, list[str]],
child, # what type?
ctlc: bool = False,
prompt: str = _ctlc_ignore_header,
ctlc_delay: float = .4,
) -> list[str]:
'''
Receive any of a `list[str]` of patterns provided in
`attach_patts`.
Used to test racing prompts from multiple actors and/or
tasks using a common root process' `pdbp` REPL.
'''
assert attach_patts
child.expect(PROMPT)
before = str(child.before.decode())
for attach_key in attach_patts:
if attach_key in before:
expected_patts: list[str] = attach_patts.pop(attach_key)
assert_before(
child,
expected_patts
)
break # from for
else:
pytest.fail(
f'No keys found?\n\n'
f'{attach_patts.keys()}\n\n'
f'{before}\n'
)
# ensure no other task/threads engaged a REPL
# at the same time as the one that was detected above.
for key, other_patts in attach_patts.copy().items():
assert not in_prompt_msg(
child,
other_patts,
)
if ctlc:
do_ctlc(
child,
patt=prompt,
# NOTE same as comment above
delay=ctlc_delay,
)
return expected_patts
@pytest.mark.ctlcs_bish
def test_sync_pause_from_aio_task(
spawn,
ctlc: bool
# ^TODO, fix for `asyncio`!!
):
'''
Verify we can use the `pdbp` REPL from an `asyncio.Task` spawned using
APIs in `.to_asyncio`.
`examples/debugging/asyncio_bp.py`
'''
child = spawn('asyncio_bp')
# RACE on whether trio/asyncio task bps first
attach_patts: dict[str, list[str]] = {
# first pause in guest-mode (aka "infecting")
# `trio.Task`.
'trio-side': [
_pause_msg,
"<Task 'trio_ctx'",
"('aio_daemon'",
],
# `breakpoint()` from `asyncio.Task`.
'asyncio-side': [
_pause_msg,
"<Task pending name='Task-2' coro=<greenback_shim()",
"('aio_daemon'",
],
}
while attach_patts:
expect_any_of(
attach_patts=attach_patts,
child=child,
ctlc=ctlc,
)
child.sendline('c')
# NOW in race order,
# - the asyncio-task will error
# - the root-actor parent task will pause
#
attach_patts: dict[str, list[str]] = {
# error raised in `asyncio.Task`
"raise ValueError('asyncio side error!')": [
_crash_msg,
"<Task 'trio_ctx'",
"@ ('aio_daemon'",
"ValueError: asyncio side error!",
# XXX, we no longer show this frame by default!
# 'return await chan.receive()', # `.to_asyncio` impl internals in tb
],
# parent-side propagation via actor-nursery/portal
# "tractor._exceptions.RemoteActorError: remote task raised a 'ValueError'": [
"remote task raised a 'ValueError'": [
_crash_msg,
"src_uid=('aio_daemon'",
"('aio_daemon'",
],
# a final pause in root-actor
"<Task '__main__.main'": [
_pause_msg,
"<Task '__main__.main'",
"('root'",
],
}
while attach_patts:
expect_any_of(
attach_patts=attach_patts,
child=child,
ctlc=ctlc,
)
child.sendline('c')
assert not attach_patts
# final boxed error propagates to root
assert_before(
child,
[
_crash_msg,
"<Task '__main__.main'",
"('root'",
"remote task raised a 'ValueError'",
"ValueError: asyncio side error!",
]
)
if ctlc:
do_ctlc(
child,
# NOTE: setting this to 0 (or some other sufficiently
# small val) can cause the test to fail since the
# `subactor` suffers a race where the root/parent
# sends an actor-cancel prior to it hitting its pause
# point; by def the value is 0.1
delay=0.4,
)
child.sendline('c')
# with maybe_expect_timeout():
child.expect(EOF)
def test_sync_pause_from_non_greenbacked_aio_task():
'''
Where the `breakpoint()` caller task is NOT spawned by
`tractor.to_asyncio` and thus never activates
a `greenback.ensure_portal()` beforehand, presumably bc the task
was started by some lib/dep as is often seen in the field.
Ensure sync pausing works when the pause is in,
- the root actor running in infected-mode?
|_ since we don't need any IPC to acquire the debug lock?
|_ is there some way to handle this like the non-main-thread case?
All other cases need to error out appropriately right?
- for any subactor we can't avoid needing the repl lock..
|_ is there a way to hook into `asyncio.ensure_future(obj)`?
'''
pass


@ -0,0 +1,172 @@
'''
That "native" runtime-hackin toolset better be dang useful!
Verify the function of a variety of "developer-experience" tools we
offer from the `.devx` sub-pkg:
- use of the lovely `stackscope` for dumping actor `trio`-task trees
during operation and hangs.
TODO:
- demonstration of `CallerInfo` call stack frame filtering such that
for logging and REPL purposes a user sees exactly the layers needed
when debugging a problem inside the stack vs. in their app.
'''
import os
import signal
import time
from .conftest import (
expect,
assert_before,
in_prompt_msg,
PROMPT,
_pause_msg,
)
from pexpect.exceptions import (
# TIMEOUT,
EOF,
)
def test_shield_pause(
spawn,
):
'''
Verify the `tractor.pause()/.post_mortem()` API works inside an
already cancelled `trio.CancelScope` and that you can step to the
next checkpoint wherein the `trio.Cancelled` will get raised.
'''
child = spawn(
'shield_hang_in_sub'
)
expect(
child,
'Yo my child hanging..?',
)
assert_before(
child,
[
'Entering shield sleep..',
'Enabling trace-trees on `SIGUSR1` since `stackscope` is installed @',
]
)
script_pid: int = child.pid
print(
f'Sending SIGUSR1 to {script_pid}\n'
f'(kill -s SIGUSR1 {script_pid})\n'
)
os.kill(
script_pid,
signal.SIGUSR1,
)
time.sleep(0.2)
expect(
child,
# end-of-tree delimiter
r"end-of-\('root'",
)
assert_before(
child,
[
# 'Trying to dump `stackscope` tree..',
# 'Dumping `stackscope` tree for actor',
"('root'", # uid line
# TODO!? this used to show?
# -[ ] mk reproducible for @oremanj?
#
# parent block point (non-shielded)
# 'await trio.sleep_forever() # in root',
]
)
expect(
child,
# end-of-tree delimiter
r"end-of-\('hanger'",
)
assert_before(
child,
[
# relay to the sub should be reported
'Relaying `SIGUSR1`[10] to sub-actor',
"('hanger'", # uid line
# TODO!? SEE ABOVE
# hanger LOC where it's shield-halted
# 'await trio.sleep_forever() # in subactor',
]
)
# simulate the user sending a ctl-c to the hanging program.
# this should result in the terminator kicking in since
# the sub is shield blocking and can't respond to SIGINT.
os.kill(
child.pid,
signal.SIGINT,
)
expect(
child,
'Shutting down actor runtime',
timeout=6,
)
assert_before(
child,
[
'raise KeyboardInterrupt',
# 'Shutting down actor runtime',
'#T-800 deployed to collect zombie B0',
"'--uid', \"('hanger',",
]
)
def test_breakpoint_hook_restored(
spawn,
):
'''
Ensures our actor runtime sets a custom `breakpoint()` hook
on open then restores the stdlib's default on close.
The hook state validation is done via `assert`s inside the
invoked script with only `breakpoint()` (not `tractor.pause()`)
calls used.
'''
child = spawn('restore_builtin_breakpoint')
child.expect(PROMPT)
assert_before(
child,
[
_pause_msg,
"<Task '__main__.main'",
"('root'",
"first bp, tractor hook set",
]
)
child.sendline('c')
child.expect(PROMPT)
assert_before(
child,
[
"last bp, stdlib hook restored",
]
)
# since the stdlib hook was already restored there should be NO
# `tractor` `log.pdb()` content from console!
assert not in_prompt_msg(
child,
[
_pause_msg,
"<Task '__main__.main'",
"('root'",
],
)
child.sendline('c')
child.expect(EOF)
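
The hook swap that `test_breakpoint_hook_restored` checks can be sketched with the stdlib alone. This is a minimal illustration, not `tractor`'s implementation: `override_breakpoint_hook` and `demo` are hypothetical names, and the only assumption is that the builtin `breakpoint()` dispatches to whatever `sys.breakpointhook` currently points at.

```python
import sys
from contextlib import contextmanager


@contextmanager
def override_breakpoint_hook(hook):
    '''
    Install `hook` as `sys.breakpointhook` and restore whatever
    was set before on exit (hypothetical helper, NOT tractor's).

    '''
    orig = sys.breakpointhook
    sys.breakpointhook = hook
    try:
        yield
    finally:
        sys.breakpointhook = orig


def demo() -> list[str]:
    calls: list[str] = []

    def custom_hook(*args, **kwargs) -> None:
        # stand-in for a runtime-provided REPL entrypoint
        calls.append('custom hook called')

    before = sys.breakpointhook
    with override_breakpoint_hook(custom_hook):
        assert sys.breakpointhook is custom_hook
        breakpoint()  # dispatches to `custom_hook`, no REPL opens

    # the previous hook is back in place after the block
    assert sys.breakpointhook is before
    return calls
```

Restoring in a `finally:` keeps the previous hook intact even when the wrapped block raises, which is the property the test script's own `assert`s rely on.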


@ -0,0 +1,290 @@
'''
Sketchy network blackoutz, ugly byzantine gens, can you hear the
cancellation?..
'''
from functools import partial
from types import ModuleType
import pytest
from _pytest.pathlib import import_path
import trio
import tractor
from tractor._testing import (
examples_dir,
break_ipc,
)
@pytest.mark.parametrize(
'pre_aclose_msgstream',
[
False,
True,
],
ids=[
'no_msgstream_aclose',
'pre_aclose_msgstream',
],
)
@pytest.mark.parametrize(
'ipc_break',
[
# no breaks
{
'break_parent_ipc_after': False,
'break_child_ipc_after': False,
},
# only parent breaks
{
'break_parent_ipc_after': 500,
'break_child_ipc_after': False,
},
# only child breaks
{
'break_parent_ipc_after': False,
'break_child_ipc_after': 500,
},
# both: break parent first
{
'break_parent_ipc_after': 500,
'break_child_ipc_after': 800,
},
# both: break child first
{
'break_parent_ipc_after': 800,
'break_child_ipc_after': 500,
},
],
ids=[
'no_break',
'break_parent',
'break_child',
'break_both_parent_first',
'break_both_child_first',
],
)
def test_ipc_channel_break_during_stream(
debug_mode: bool,
loglevel: str,
spawn_backend: str,
ipc_break: dict|None,
pre_aclose_msgstream: bool,
):
'''
Ensure we can have an IPC channel break its connection during
streaming and it's still possible for the (simulated) user to kill
the actor tree using SIGINT.
We also verify the type of connection error expected in the parent
depending on which side of the IPC breaks first.
'''
if spawn_backend != 'trio':
if debug_mode:
pytest.skip('`debug_mode` only supported on `trio` spawner')
# non-`trio` spawners should never hit the hang condition that
# requires the user to do ctl-c to cancel the actor tree.
# expect_final_exc = trio.ClosedResourceError
expect_final_exc = tractor.TransportClosed
mod: ModuleType = import_path(
examples_dir() / 'advanced_faults'
/ 'ipc_failure_during_stream.py',
root=examples_dir(),
consider_namespace_packages=False,
)
# by def we expect KBI from user after a simulated "hang
# period" wherein the user eventually hits ctl-c to kill the
# root-actor tree.
expect_final_exc: type[BaseException] = KeyboardInterrupt
if (
# only expect EoC if trans is broken on the child side,
ipc_break['break_child_ipc_after'] is not False
# AND we tell the child to call `MsgStream.aclose()`.
and pre_aclose_msgstream
):
# expect_final_exc = trio.EndOfChannel
# ^XXX NOPE! XXX^ since now `.open_stream()` absorbs this
# gracefully!
expect_final_exc = KeyboardInterrupt
# NOTE when ONLY the child breaks or it breaks BEFORE the
# parent we expect the parent to get a closed resource error
# on the next `MsgStream.receive()` and then fail out and
# cancel the child from there.
#
# ONLY CHILD breaks
if (
ipc_break['break_child_ipc_after']
and
ipc_break['break_parent_ipc_after'] is False
):
# NOTE: we DO NOT expect this any more since
# the child side's channel will be broken silently
# and nothing on the parent side will indicate this!
# expect_final_exc = trio.ClosedResourceError
# NOTE: child will send a 'stop' msg before it breaks
# the transport channel BUT, that will be absorbed by the
# `ctx.open_stream()` block and thus the `.open_context()`
# should hang, after which the test script simulates
# a user sending ctl-c by raising a KBI.
if pre_aclose_msgstream:
expect_final_exc = KeyboardInterrupt
# XXX OLD XXX
# if child calls `MsgStream.aclose()` then expect EoC.
# ^ XXX not any more ^ since eoc is always absorbed
# gracefully and NOT bubbled to the `.open_context()`
# block!
# expect_final_exc = trio.EndOfChannel
# BOTH but, CHILD breaks FIRST
elif (
ipc_break['break_child_ipc_after'] is not False
and (
ipc_break['break_parent_ipc_after']
> ipc_break['break_child_ipc_after']
)
):
if pre_aclose_msgstream:
expect_final_exc = KeyboardInterrupt
# NOTE when the parent IPC side dies (even if the child does as well
# but the child fails BEFORE the parent) we always expect the
# IPC layer to raise a closed-resource, NEVER do we expect
# a stop msg since the parent-side ctx apis will error out
# IMMEDIATELY before the child ever sends any 'stop' msg.
#
# ONLY PARENT breaks
elif (
ipc_break['break_parent_ipc_after']
and
ipc_break['break_child_ipc_after'] is False
):
# expect_final_exc = trio.ClosedResourceError
expect_final_exc = tractor.TransportClosed
# BOTH but, PARENT breaks FIRST
elif (
ipc_break['break_parent_ipc_after'] is not False
and (
ipc_break['break_child_ipc_after']
>
ipc_break['break_parent_ipc_after']
)
):
# expect_final_exc = trio.ClosedResourceError
expect_final_exc = tractor.TransportClosed
with pytest.raises(
expected_exception=(
expect_final_exc,
ExceptionGroup,
),
) as excinfo:
try:
trio.run(
partial(
mod.main,
debug_mode=debug_mode,
start_method=spawn_backend,
loglevel=loglevel,
pre_close=pre_aclose_msgstream,
**ipc_break,
)
)
except KeyboardInterrupt as _kbi:
kbi = _kbi
if expect_final_exc is not KeyboardInterrupt:
pytest.fail(
'Rxed unexpected KBI !?\n'
f'{repr(kbi)}'
)
raise
except tractor.TransportClosed as _tc:
tc = _tc
if expect_final_exc is KeyboardInterrupt:
pytest.fail(
'Unexpected transport failure !?\n'
f'{repr(tc)}'
)
cause: Exception = tc.__cause__
assert (
type(cause) is trio.ClosedResourceError
and
cause.args[0] == 'another task closed this fd'
)
raise
# get raw instance from pytest wrapper
value = excinfo.value
if isinstance(value, ExceptionGroup):
excs = value.exceptions
assert len(excs) == 1
final_exc = excs[0]
assert isinstance(final_exc, expect_final_exc)
@tractor.context
async def break_ipc_after_started(
ctx: tractor.Context,
) -> None:
await ctx.started()
async with ctx.open_stream() as stream:
# TODO: make a test which verifies the error
# for this, i.e. raises a `MsgTypeError`
# await ctx.chan.send(None)
await break_ipc(
stream=stream,
pre_close=True,
)
print('child broke IPC and terminating')
def test_stream_closed_right_after_ipc_break_and_zombie_lord_engages():
'''
Verify that if a subactor's IPC goes down just after bringing up
a stream the parent can trigger a SIGINT and the child will be
reaped out-of-IPC by the localhost process supervision machinery:
aka "zombie lord".
'''
async def main():
with trio.fail_after(3):
async with tractor.open_nursery() as an:
portal = await an.start_actor(
'ipc_breaker',
enable_modules=[__name__],
)
with trio.move_on_after(1):
async with (
portal.open_context(
break_ipc_after_started
) as (ctx, sent),
):
async with ctx.open_stream():
await trio.sleep(0.5)
print('parent waiting on context')
print(
'parent exited context\n'
'parent raising KBI..\n'
)
raise KeyboardInterrupt
with pytest.raises(KeyboardInterrupt):
trio.run(main)


@ -5,8 +5,8 @@ Advanced streaming patterns using bidirectional streams and contexts.
from collections import Counter
import itertools
import platform
from typing import Set, Dict, List
import pytest
import trio
import tractor
@ -15,7 +15,7 @@ def is_win():
return platform.system() == 'Windows'
_registry: Dict[str, Set[tractor.ReceiveMsgStream]] = {
_registry: dict[str, set[tractor.MsgStream]] = {
'even': set(),
'odd': set(),
}
@ -77,7 +77,7 @@ async def subscribe(
async def consumer(
subs: List[str],
subs: list[str],
) -> None:
@ -144,8 +144,16 @@ def test_dynamic_pub_sub():
try:
trio.run(main)
except trio.TooSlowError:
pass
except (
trio.TooSlowError,
ExceptionGroup,
) as err:
if isinstance(err, ExceptionGroup):
for suberr in err.exceptions:
if isinstance(suberr, trio.TooSlowError):
break
else:
pytest.fail('Never got a `TooSlowError` ?')
@tractor.context
@ -299,44 +307,77 @@ async def inf_streamer(
async with (
ctx.open_stream() as stream,
trio.open_nursery() as n,
# XXX TODO, INTERESTING CASE!!
# - if we don't collapse the eg then the embedded
# `trio.EndOfChannel` doesn't propagate directly to the above
# .open_stream() parent, resulting in it also raising instead
# of gracefully absorbing as normal.. so how to handle?
trio.open_nursery(
strict_exception_groups=False,
) as tn,
):
async def bail_on_sentinel():
async def close_stream_on_sentinel():
async for msg in stream:
if msg == 'done':
print(
'streamer RXed "done" sentinel msg!\n'
'CLOSING `MsgStream`!'
)
await stream.aclose()
else:
print(f'streamer received {msg}')
else:
print('streamer exited recv loop')
# start termination detector
n.start_soon(bail_on_sentinel)
tn.start_soon(close_stream_on_sentinel)
for val in itertools.count():
cap: int = 10000 # so that we don't spin forever when bug..
for val in range(cap):
try:
print(f'streamer sending {val}')
await stream.send(val)
if val > cap:
raise RuntimeError(
'Streamer never cancelled by sentinel?'
)
await trio.sleep(0.001)
# close out the stream gracefully
except trio.ClosedResourceError:
# close out the stream gracefully
print('transport closed on streamer side!')
assert stream.closed
break
else:
raise RuntimeError(
'Streamer not cancelled before finished sending?'
)
print('terminating streamer')
print('streamer exited .open_stream() block')
def test_local_task_fanout_from_stream():
def test_local_task_fanout_from_stream(
debug_mode: bool,
):
'''
Single stream with multiple local consumer tasks using the
``MsgStream.subscribe()`` api.
Ensure all tasks receive all values after stream completes sending.
Ensure all tasks receive all values after stream completes
sending.
'''
consumers = 22
consumers: int = 22
async def main():
counts = Counter()
async with tractor.open_nursery() as tn:
p = await tn.start_actor(
async with tractor.open_nursery(
debug_mode=debug_mode,
) as tn:
p: tractor.Portal = await tn.start_actor(
'inf_streamer',
enable_modules=[__name__],
)
@ -344,7 +385,6 @@ def test_local_task_fanout_from_stream():
p.open_context(inf_streamer) as (ctx, _),
ctx.open_stream() as stream,
):
async def pull_and_count(name: str):
# name = trio.lowlevel.current_task().name
async with stream.subscribe() as recver:
@ -353,7 +393,7 @@ def test_local_task_fanout_from_stream():
tractor.trionics.BroadcastReceiver
)
async for val in recver:
# print(f'{name}: {val}')
print(f'bx {name} rx: {val}')
counts[name] += 1
print(f'{name} bcaster ended')
@ -363,10 +403,14 @@ def test_local_task_fanout_from_stream():
with trio.fail_after(3):
async with trio.open_nursery() as nurse:
for i in range(consumers):
nurse.start_soon(pull_and_count, i)
nurse.start_soon(
pull_and_count,
i,
)
# delay to let bcast consumers pull msgs
await trio.sleep(0.5)
print('\nterminating')
print('terminating nursery of bcast rxer consumers!')
await stream.send('done')
print('closed stream connection')


@ -11,8 +11,10 @@ from itertools import repeat
import pytest
import trio
import tractor
from conftest import tractor_test, no_windows
from tractor._testing import (
tractor_test,
)
from .conftest import no_windows
def is_win():
@ -43,45 +45,82 @@ async def do_nuthin():
],
ids=['no_args', 'unexpected_args'],
)
def test_remote_error(arb_addr, args_err):
"""Verify an error raised in a subactor that is propagated
def test_remote_error(reg_addr, args_err):
'''
Verify an error raised in a subactor that is propagated
to the parent nursery, contains the underlying boxed builtin
error type info and causes cancellation and reraising all the
way up the stack.
"""
'''
args, errtype = args_err
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as nursery:
# on a remote type error caused by bad input args
# this should raise directly which means we **don't** get
# an exception group outside the nursery since the error
# here and the far end task error are one and the same?
portal = await nursery.run_in_actor(
assert_err, name='errorer', **args
assert_err,
name='errorer',
**args
)
# get result(s) from main task
try:
# this means the root actor will also raise a local
# parent task error and thus an eg will propagate out
# of this actor nursery.
await portal.result()
except tractor.RemoteActorError as err:
assert err.type == errtype
assert err.boxed_type == errtype
print("Look Maa that actor failed hard, hehh")
raise
with pytest.raises(tractor.RemoteActorError) as excinfo:
trio.run(main)
# ensure boxed errors
if args:
with pytest.raises(tractor.RemoteActorError) as excinfo:
trio.run(main)
# ensure boxed error is correct
assert excinfo.value.type == errtype
assert excinfo.value.boxed_type == errtype
else:
# the root task will also error on the `Portal.result()`
# call so we expect an error from there AND the child.
# |_ tho seems like on new `trio` this doesn't always
# happen?
with pytest.raises((
BaseExceptionGroup,
tractor.RemoteActorError,
)) as excinfo:
trio.run(main)
# ensure boxed errors are `errtype`
err: BaseException = excinfo.value
if isinstance(err, BaseExceptionGroup):
suberrs: list[BaseException] = err.exceptions
else:
suberrs: list[BaseException] = [err]
for exc in suberrs:
assert exc.boxed_type == errtype
def test_multierror(arb_addr):
"""Verify we raise a ``trio.MultiError`` out of a nursery where
def test_multierror(
reg_addr: tuple[str, int],
):
'''
Verify we raise a ``BaseExceptionGroup`` out of a nursery where
more than one actor errors.
"""
'''
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as nursery:
await nursery.run_in_actor(assert_err, name='errorer1')
@ -91,14 +130,14 @@ def test_multierror(arb_addr):
try:
await portal2.result()
except tractor.RemoteActorError as err:
assert err.type == AssertionError
assert err.boxed_type is AssertionError
print("Look Maa that first actor failed hard, hehh")
raise
# here we should get a `trio.MultiError` containing exceptions
# here we should get a ``BaseExceptionGroup`` containing exceptions
# from both subactors
with pytest.raises(trio.MultiError):
with pytest.raises(BaseExceptionGroup):
trio.run(main)
@ -106,14 +145,14 @@ def test_multierror(arb_addr):
@pytest.mark.parametrize(
'num_subactors', range(25, 26),
)
def test_multierror_fast_nursery(arb_addr, start_method, num_subactors, delay):
"""Verify we raise a ``trio.MultiError`` out of a nursery where
def test_multierror_fast_nursery(reg_addr, start_method, num_subactors, delay):
"""Verify we raise a ``BaseExceptionGroup`` out of a nursery where
more than one actor errors and also with a delay before failure
to test failure during an ongoing spawning.
"""
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as nursery:
for i in range(num_subactors):
@ -123,10 +162,11 @@ def test_multierror_fast_nursery(arb_addr, start_method, num_subactors, delay):
delay=delay
)
with pytest.raises(trio.MultiError) as exc_info:
# with pytest.raises(trio.MultiError) as exc_info:
with pytest.raises(BaseExceptionGroup) as exc_info:
trio.run(main)
assert exc_info.type == tractor.MultiError
assert exc_info.type == ExceptionGroup
err = exc_info.value
exceptions = err.exceptions
@ -142,7 +182,7 @@ def test_multierror_fast_nursery(arb_addr, start_method, num_subactors, delay):
for exc in exceptions:
assert isinstance(exc, tractor.RemoteActorError)
assert exc.type == AssertionError
assert exc.boxed_type is AssertionError
async def do_nothing():
@ -150,15 +190,20 @@ async def do_nothing():
@pytest.mark.parametrize('mechanism', ['nursery_cancel', KeyboardInterrupt])
def test_cancel_single_subactor(arb_addr, mechanism):
"""Ensure a ``ActorNursery.start_actor()`` spawned subactor
def test_cancel_single_subactor(reg_addr, mechanism):
'''
Ensure a ``ActorNursery.start_actor()`` spawned subactor
cancels when the nursery is cancelled.
"""
'''
async def spawn_actor():
"""Spawn an actor that blocks indefinitely.
"""
'''
Spawn an actor that blocks indefinitely then cancel via
either `ActorNursery.cancel()` or an exception raise.
'''
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as nursery:
portal = await nursery.start_actor(
@ -214,8 +259,8 @@ async def test_cancel_infinite_streamer(start_method):
[
# daemon actors sit idle while single task actors error out
(1, tractor.RemoteActorError, AssertionError, (assert_err, {}), None),
(2, tractor.MultiError, AssertionError, (assert_err, {}), None),
(3, tractor.MultiError, AssertionError, (assert_err, {}), None),
(2, BaseExceptionGroup, AssertionError, (assert_err, {}), None),
(3, BaseExceptionGroup, AssertionError, (assert_err, {}), None),
# 1 daemon actor errors out while single task actors sleep forever
(3, tractor.RemoteActorError, AssertionError, (sleep_forever, {}),
@ -226,7 +271,7 @@ async def test_cancel_infinite_streamer(start_method):
(do_nuthin, {}), (assert_err, {'delay': 1}, True)),
# daemons complete quickly while single task
# actors error after a brief delay
(3, tractor.MultiError, AssertionError,
(3, BaseExceptionGroup, AssertionError,
(assert_err, {'delay': 1}), (do_nuthin, {}, False)),
],
ids=[
@ -278,7 +323,7 @@ async def test_some_cancels_all(num_actors_and_errs, start_method, loglevel):
await portal.run(func, **kwargs)
except tractor.RemoteActorError as err:
assert err.type == err_type
assert err.boxed_type == err_type
# we only expect this first error to propagate
# (all other daemons are cancelled before they
# can be scheduled)
@ -293,15 +338,15 @@ async def test_some_cancels_all(num_actors_and_errs, start_method, loglevel):
# should error here with a ``RemoteActorError`` or ``BaseExceptionGroup``
except first_err as err:
if isinstance(err, tractor.MultiError):
if isinstance(err, BaseExceptionGroup):
assert len(err.exceptions) == num_actors
for exc in err.exceptions:
if isinstance(exc, tractor.RemoteActorError):
assert exc.type == err_type
assert exc.boxed_type == err_type
else:
assert isinstance(exc, trio.Cancelled)
elif isinstance(err, tractor.RemoteActorError):
assert err.type == err_type
assert err.boxed_type == err_type
assert n.cancelled is True
assert not n._children
@ -337,7 +382,7 @@ async def spawn_and_error(breadth, depth) -> None:
@tractor_test
async def test_nested_multierrors(loglevel, start_method):
'''
Test that failed actor sets are wrapped in `trio.MultiError`s. This
Test that failed actor sets are wrapped in `BaseExceptionGroup`s. This
test goes only 2 nurseries deep but we should eventually have tests
for arbitrary n-depth actor trees.
@ -365,7 +410,7 @@ async def test_nested_multierrors(loglevel, start_method):
breadth=subactor_breadth,
depth=depth,
)
except trio.MultiError as err:
except BaseExceptionGroup as err:
assert len(err.exceptions) == subactor_breadth
for subexc in err.exceptions:
@ -380,21 +425,21 @@ async def test_nested_multierrors(loglevel, start_method):
elif isinstance(subexc, tractor.RemoteActorError):
# on windows it seems we can't exactly be sure wtf
# will happen..
assert subexc.type in (
assert subexc.boxed_type in (
tractor.RemoteActorError,
trio.Cancelled,
trio.MultiError
BaseExceptionGroup,
)
elif isinstance(subexc, trio.MultiError):
elif isinstance(subexc, BaseExceptionGroup):
for subsub in subexc.exceptions:
if subsub in (tractor.RemoteActorError,):
subsub = subsub.type
subsub = subsub.boxed_type
assert type(subsub) in (
trio.Cancelled,
trio.MultiError,
BaseExceptionGroup,
)
else:
assert isinstance(subexc, tractor.RemoteActorError)
@ -405,16 +450,16 @@ async def test_nested_multierrors(loglevel, start_method):
# we get back the (sent) cancel signal instead
if is_win():
if isinstance(subexc, tractor.RemoteActorError):
assert subexc.type in (
trio.MultiError,
assert subexc.boxed_type in (
BaseExceptionGroup,
tractor.RemoteActorError
)
else:
assert isinstance(subexc, trio.MultiError)
assert isinstance(subexc, BaseExceptionGroup)
else:
assert subexc.type is trio.MultiError
assert subexc.boxed_type is ExceptionGroup
else:
assert subexc.type in (
assert subexc.boxed_type in (
tractor.RemoteActorError,
trio.Cancelled
)
@ -435,7 +480,7 @@ def test_cancel_via_SIGINT(
with trio.fail_after(2):
async with tractor.open_nursery() as tn:
await tn.start_actor('sucka')
if spawn_backend == 'mp':
if 'mp' in spawn_backend:
time.sleep(0.1)
os.kill(pid, signal.SIGINT)
await trio.sleep_forever()
@ -459,7 +504,9 @@ def test_cancel_via_SIGINT_other_task(
if is_win(): # smh
timeout += 1
async def spawn_and_sleep_forever(task_status=trio.TASK_STATUS_IGNORED):
async def spawn_and_sleep_forever(
task_status=trio.TASK_STATUS_IGNORED
):
async with tractor.open_nursery() as tn:
for i in range(3):
await tn.run_in_actor(
@ -472,9 +519,11 @@ def test_cancel_via_SIGINT_other_task(
async def main():
# should never timeout since SIGINT should cancel the current program
with trio.fail_after(timeout):
async with trio.open_nursery() as n:
async with trio.open_nursery(
strict_exception_groups=False,
) as n:
await n.start(spawn_and_sleep_forever)
if spawn_backend == 'mp':
if 'mp' in spawn_backend:
time.sleep(0.1)
os.kill(pid, signal.SIGINT)
@ -565,6 +614,12 @@ def test_fast_graceful_cancel_when_spawn_task_in_soft_proc_wait_for_daemon(
nurse.start_soon(delayed_kbi)
await p.run(do_nuthin)
# need to explicitly re-raise the lone kbi..now
except* KeyboardInterrupt as kbi_eg:
assert (len(excs := kbi_eg.exceptions) == 1)
raise excs[0]
finally:
duration = time.time() - start
if duration > timeout:


@ -6,14 +6,15 @@ sub-sub-actor daemons.
'''
from typing import Optional
import asyncio
from contextlib import asynccontextmanager as acm
from contextlib import (
asynccontextmanager as acm,
aclosing,
)
import pytest
import trio
from trio_typing import TaskStatus
import tractor
from tractor import RemoteActorError
from async_generator import aclosing
async def aio_streamer(
@ -94,8 +95,8 @@ async def trio_main(
# stash a "service nursery" as "actor local" (aka a Python global)
global _nursery
n = _nursery
assert n
tn = _nursery
assert tn
async def consume_stream():
async with wrapper_mngr() as stream:
@ -103,10 +104,10 @@ async def trio_main(
print(msg)
# run 2 tasks to ensure broadcaster chan use
n.start_soon(consume_stream)
n.start_soon(consume_stream)
tn.start_soon(consume_stream)
tn.start_soon(consume_stream)
n.start_soon(trio_sleep_and_err)
tn.start_soon(trio_sleep_and_err)
await trio.sleep_forever()
@ -116,8 +117,10 @@ async def open_actor_local_nursery(
ctx: tractor.Context,
):
global _nursery
async with trio.open_nursery() as n:
_nursery = n
async with trio.open_nursery(
strict_exception_groups=False,
) as tn:
_nursery = tn
await ctx.started()
await trio.sleep(10)
# await trio.sleep(1)
@ -131,7 +134,7 @@ async def open_actor_local_nursery(
# never yields back.. aka a scenario where the
# ``tractor.context`` task IS NOT in the service n's cancel
# scope.
n.cancel_scope.cancel()
tn.cancel_scope.cancel()
@pytest.mark.parametrize(
@ -141,7 +144,7 @@ async def open_actor_local_nursery(
)
def test_actor_managed_trio_nursery_task_error_cancels_aio(
asyncio_mode: bool,
arb_addr
reg_addr: tuple,
):
'''
Verify that a ``trio`` nursery created and managed in a child actor
@ -156,7 +159,7 @@ def test_actor_managed_trio_nursery_task_error_cancels_aio(
async with tractor.open_nursery() as n:
p = await n.start_actor(
'nursery_mngr',
infect_asyncio=asyncio_mode,
infect_asyncio=asyncio_mode, # TODO, is this enabling debug mode?
enable_modules=[__name__],
)
async with (
@ -170,4 +173,4 @@ def test_actor_managed_trio_nursery_task_error_cancels_aio(
# verify boxed error
err = excinfo.value
assert isinstance(err.type(), NameError)
assert err.boxed_type is NameError


@ -1,36 +1,81 @@
import itertools
import pytest
import trio
import tractor
from tractor import open_actor_cluster
from tractor.trionics import gather_contexts
from conftest import tractor_test
from tractor._testing import tractor_test
MESSAGE = 'tractoring at full speed'
def test_empty_mngrs_input_raises() -> None:
async def main():
with trio.fail_after(1):
async with (
open_actor_cluster(
modules=[__name__],
# NOTE: ensure we can passthrough runtime opts
loglevel='info',
# debug_mode=True,
) as portals,
gather_contexts(
# NOTE: it's the use of inline-generator syntax
# here that causes the empty input.
mngrs=(
p.open_context(worker) for p in portals.values()
),
),
):
assert 0
with pytest.raises(ValueError):
trio.run(main)
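
The NOTE in `test_empty_mngrs_input_raises` attributes the empty input to the inline-generator syntax. One generic Python behaviour that could produce this is that a generator can only be iterated once, so any second pass over it sees nothing. That explanation is an assumption, since the internals of `gather_contexts` are not shown in this diff. A minimal sketch with a hypothetical `consume_twice` helper:

```python
def consume_twice(items) -> tuple[int, int]:
    '''
    Iterate `items` two times and report how many elements each
    pass produced (hypothetical helper for illustration only).

    '''
    first_pass = [id(item) for item in items]
    second_pass = list(items)
    return len(first_pass), len(second_pass)


names = ['a', 'b', 'c']

# a list can be re-iterated any number of times
list_counts = consume_twice([n.upper() for n in names])

# a generator is exhausted by the first pass
gen_counts = consume_twice(n.upper() for n in names)
```

This is also why `test_streaming_to_actor_cluster` passes list comprehensions as `mngrs=[...]` rather than generator expressions.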
@tractor.context
async def worker(ctx: tractor.Context) -> None:
async def worker(
ctx: tractor.Context,
) -> None:
await ctx.started()
async with ctx.open_stream(backpressure=True) as stream:
async with ctx.open_stream(
allow_overruns=True,
) as stream:
# TODO: this with the below assert causes a hang bug?
# with trio.move_on_after(1):
async for msg in stream:
# do something with msg
print(msg)
assert msg == MESSAGE
# TODO: does this ever cause a hang
# assert 0
@tractor_test
async def test_streaming_to_actor_cluster() -> None:
async with (
open_actor_cluster(modules=[__name__]) as portals,
gather_contexts(
mngrs=[p.open_context(worker) for p in portals.values()],
) as contexts,
gather_contexts(
mngrs=[ctx[0].open_stream() for ctx in contexts],
) as streams,
):
with trio.move_on_after(1):
for stream in itertools.cycle(streams):

File diff suppressed because it is too large


@ -1,598 +0,0 @@
"""
That native debug better work!
All these tests can be understood (somewhat) by running the equivalent
`examples/debugging/` scripts manually.
TODO:
- none of these tests have been run successfully on windows yet but
there's been manual testing that verified it works.
- wonder if any of it'll work on OS X?
"""
import time
from os import path
import platform
import pytest
import pexpect
from conftest import repodir
# TODO: The next great debugger audit could be done by you!
# - recurrent entry to breakpoint() from single actor *after* an
# error in another task?
# - root error before child errors
# - root error after child errors
# - root error before child breakpoint
# - root error after child breakpoint
# - recurrent root errors
if platform.system() == 'Windows':
pytest.skip(
'Debugger tests have no windows support (yet)',
allow_module_level=True,
)
def examples_dir():
"""Return the abspath to the examples directory.
"""
return path.join(repodir(), 'examples', 'debugging/')
def mk_cmd(ex_name: str) -> str:
"""Generate a command suitable to pass to ``pexpect.spawn()``.
"""
return ' '.join(
['python',
path.join(examples_dir(), f'{ex_name}.py')]
)
@pytest.fixture
def spawn(
start_method,
testdir,
arb_addr,
) -> 'pexpect.spawn':
if start_method != 'trio':
pytest.skip(
"Debugger tests are only supported on the trio backend"
)
def _spawn(cmd):
return testdir.spawn(
cmd=mk_cmd(cmd),
expect_timeout=3,
)
return _spawn
@pytest.mark.parametrize(
'user_in_out',
[
('c', 'AssertionError'),
('q', 'AssertionError'),
],
ids=lambda item: f'{item[0]} -> {item[1]}',
)
def test_root_actor_error(spawn, user_in_out):
"""Demonstrate crash handler entering pdbpp from basic error in root actor.
"""
user_input, expect_err_str = user_in_out
child = spawn('root_actor_error')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
# make sure expected logging and error arrives
assert "Attaching to pdb in crashed actor: ('root'" in before
assert 'AssertionError' in before
# send user command
child.sendline(user_input)
# process should exit
child.expect(pexpect.EOF)
assert expect_err_str in str(child.before)
@pytest.mark.parametrize(
'user_in_out',
[
('c', None),
('q', 'bdb.BdbQuit'),
],
ids=lambda item: f'{item[0]} -> {item[1]}',
)
def test_root_actor_bp(spawn, user_in_out):
"""Demonstrate breakpoint from within the root actor.
"""
user_input, expect_err_str = user_in_out
child = spawn('root_actor_breakpoint')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
assert 'Error' not in str(child.before)
# send user command
child.sendline(user_input)
child.expect('\r\n')
# process should exit
child.expect(pexpect.EOF)
if expect_err_str is None:
assert 'Error' not in str(child.before)
else:
assert expect_err_str in str(child.before)
def test_root_actor_bp_forever(spawn):
"Re-enter a breakpoint from the root actor-task."
child = spawn('root_actor_breakpoint_forever')
# do some "next" commands to demonstrate recurrent breakpoint
# entries
for _ in range(10):
child.sendline('next')
child.expect(r"\(Pdb\+\+\)")
# do one continue which should trigger a new task to lock the tty
child.sendline('continue')
child.expect(r"\(Pdb\+\+\)")
# XXX: this previously caused a bug!
child.sendline('n')
child.expect(r"\(Pdb\+\+\)")
child.sendline('n')
child.expect(r"\(Pdb\+\+\)")
def test_subactor_error(spawn):
"Single subactor raising an error"
child = spawn('subactor_error')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error'" in before
# send user command
# (in this case it's the same for 'continue' vs. 'quit')
child.sendline('continue')
# the debugger should enter a second time in the nursery
# creating actor
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
# root actor gets debugger engaged
assert "Attaching to pdb in crashed actor: ('root'" in before
# error is a remote error propagated from the subactor
assert "RemoteActorError: ('name_error'" in before
child.sendline('c')
child.expect('\r\n')
# process should exit
child.expect(pexpect.EOF)
def test_subactor_breakpoint(spawn):
"Single subactor with an infinite breakpoint loop"
child = spawn('subactor_breakpoint')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
# do some "next" commands to demonstrate recurrent breakpoint
# entries
for _ in range(10):
child.sendline('next')
child.expect(r"\(Pdb\+\+\)")
# now run some "continues" to show re-entries
for _ in range(5):
child.sendline('continue')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
# finally quit the loop
child.sendline('q')
# child process should exit but parent will capture pdb.BdbQuit
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "RemoteActorError: ('breakpoint_forever'" in before
assert 'bdb.BdbQuit' in before
# quit the parent
child.sendline('c')
# process should exit
child.expect(pexpect.EOF)
before = str(child.before.decode())
assert "RemoteActorError: ('breakpoint_forever'" in before
assert 'bdb.BdbQuit' in before
def test_multi_subactors(spawn):
"""
Multiple subactors, both erroring and breakpointing as well as
a nested subactor erroring.
"""
child = spawn(r'multi_subactors')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
# do some "next" commands to demonstrate recurrent breakpoint
# entries
for _ in range(10):
child.sendline('next')
child.expect(r"\(Pdb\+\+\)")
# continue to next error
child.sendline('c')
# first name_error failure
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error'" in before
assert "NameError" in before
# continue again
child.sendline('c')
# 2nd name_error failure
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error_1'" in before
assert "NameError" in before
# breakpoint loop should re-engage
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
# wait for spawn error to show up
spawn_err = "Attaching to pdb in crashed actor: ('spawn_error'"
while spawn_err not in before:
child.sendline('c')
time.sleep(0.1)
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
# 2nd depth nursery should trigger
# child.sendline('c')
# child.expect(r"\(Pdb\+\+\)")
# before = str(child.before.decode())
assert spawn_err in before
assert "RemoteActorError: ('name_error_1'" in before
# now run some "continues" to show re-entries
for _ in range(5):
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
# quit the loop and expect parent to attach
child.sendline('q')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
# debugger attaches to root
assert "Attaching to pdb in crashed actor: ('root'" in before
# expect a multierror with exceptions for each sub-actor
assert "RemoteActorError: ('breakpoint_forever'" in before
assert "RemoteActorError: ('name_error'" in before
assert "RemoteActorError: ('spawn_error'" in before
assert "RemoteActorError: ('name_error_1'" in before
assert 'bdb.BdbQuit' in before
# process should exit
child.sendline('c')
child.expect(pexpect.EOF)
# repeat of previous multierror for final output
before = str(child.before.decode())
assert "RemoteActorError: ('breakpoint_forever'" in before
assert "RemoteActorError: ('name_error'" in before
assert "RemoteActorError: ('spawn_error'" in before
assert "RemoteActorError: ('name_error_1'" in before
assert 'bdb.BdbQuit' in before
def test_multi_daemon_subactors(spawn, loglevel):
"""Multiple daemon subactors, both erroring and breakpointing within a
stream.
"""
child = spawn('multi_daemon_subactors')
child.expect(r"\(Pdb\+\+\)")
# there is a race for which subactor will acquire
# the root's tty lock first
before = str(child.before.decode())
bp_forever_msg = "Attaching pdb to actor: ('bp_forever'"
name_error_msg = "NameError"
if bp_forever_msg in before:
next_msg = name_error_msg
elif name_error_msg in before:
next_msg = bp_forever_msg
else:
raise ValueError("Neither log msg was found !?")
# NOTE: previously since we did not have clobber prevention
# in the root actor this final resume could result in the debugger
# tearing down since both child actors would be cancelled and it was
# unlikely that `bp_forever` would re-acquire the tty lock again.
# Now, we should have a final resumption in the root plus a possible
# second entry by `bp_forever`.
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert next_msg in before
# XXX: hooray the root clobbering the child here was fixed!
# IMO, this demonstrates the true power of SC system design.
# now the root actor won't clobber the bp_forever child
# during its first access to the debug lock, but will instead
# wait for the lock to release, by the edge triggered
# ``_debug._no_remote_has_tty`` event before sending cancel messages
# (via portals) to its underlings B)
# at some point here there should have been some warning msg from
# the root announcing it avoided a clobber of the child's lock, but
# it seems unreliable in testing here to grab it:
# assert "in use by child ('bp_forever'," in before
# wait for final error in root
while True:
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
try:
# root error should be packed as remote error
assert "_exceptions.RemoteActorError: ('name_error'" in before
break
except AssertionError:
assert bp_forever_msg in before
try:
child.sendline('c')
child.expect(pexpect.EOF)
except pexpect.exceptions.TIMEOUT:
# Failed to exit using continue..?
child.sendline('q')
child.expect(pexpect.EOF)
def test_multi_subactors_root_errors(spawn):
'''
Multiple subactors, both erroring and breakpointing as well as
a nested subactor erroring.
'''
child = spawn('multi_subactor_root_errors')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
# at most one subactor should attach before the root is cancelled
before = str(child.before.decode())
assert "NameError: name 'doggypants' is not defined" in before
# continue again to catch 2nd name error from
# actor 'name_error_1' (which is 2nd depth).
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error_1'" in before
assert "NameError" in before
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('spawn_error'" in before
# boxed error from previous step
assert "RemoteActorError: ('name_error_1'" in before
assert "NameError" in before
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('root'" in before
# boxed error from first level failure
assert "RemoteActorError: ('name_error'" in before
assert "NameError" in before
# warnings assert we probably don't need
# assert "Cancelling nursery in ('spawn_error'," in before
# continue again
child.sendline('c')
child.expect(pexpect.EOF)
before = str(child.before.decode())
# error from root actor and root task that created top level nursery
assert "AssertionError" in before
def test_multi_nested_subactors_error_through_nurseries(spawn):
"""Verify deeply nested actors that error trigger debugger entries
at each actor nursery (level) all the way up the tree.
"""
# NOTE: previously, inside this script was a bug where if the
# parent errors before a 2-levels-lower actor has released the lock,
# the parent tries to cancel it but it's stuck in the debugger?
# A test (below) has now been added to explicitly verify this is
# fixed.
child = spawn('multi_nested_subactors_error_up_through_nurseries')
timed_out_early: bool = False
for i in range(12):
try:
child.expect(r"\(Pdb\+\+\)")
child.sendline('c')
time.sleep(0.1)
except pexpect.exceptions.EOF:
# race conditions on how fast the continue is sent?
print(f"Failed early on {i}?")
timed_out_early = True
break
else:
child.expect(pexpect.EOF)
if not timed_out_early:
before = str(child.before.decode())
assert "NameError" in before
def test_root_nursery_cancels_before_child_releases_tty_lock(
spawn,
start_method
):
"""Test that when the root sends a cancel message before a nested
child has unblocked (which can happen when it has the tty lock and
is engaged in pdb) it is indeed cancelled after exiting the debugger.
"""
timed_out_early = False
child = spawn('root_cancelled_but_child_is_in_tty_lock')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "NameError: name 'doggypants' is not defined" in before
assert "tractor._exceptions.RemoteActorError: ('name_error'" not in before
time.sleep(0.5)
child.sendline('c')
for i in range(4):
time.sleep(0.5)
try:
child.expect(r"\(Pdb\+\+\)")
except (
pexpect.exceptions.EOF,
pexpect.exceptions.TIMEOUT,
):
# races all over..
print(f"Failed early on {i}?")
before = str(child.before.decode())
timed_out_early = True
# race conditions on how fast the continue is sent?
break
before = str(child.before.decode())
assert "NameError: name 'doggypants' is not defined" in before
child.sendline('c')
while True:
try:
child.expect(pexpect.EOF)
break
except pexpect.exceptions.TIMEOUT:
child.sendline('c')
print('child was able to grab tty lock again?')
if not timed_out_early:
before = str(child.before.decode())
assert "tractor._exceptions.RemoteActorError: ('spawner0'" in before
assert "tractor._exceptions.RemoteActorError: ('name_error'" in before
assert "NameError: name 'doggypants' is not defined" in before
def test_root_cancels_child_context_during_startup(
spawn,
):
'''Verify a fast fail in the root doesn't lock up the child reaping
and all while using the new context api.
'''
child = spawn('fast_error_in_root_after_spawn')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "AssertionError" in before
child.sendline('c')
child.expect(pexpect.EOF)
def test_different_debug_mode_per_actor(
spawn,
):
child = spawn('per_actor_debug')
child.expect(r"\(Pdb\+\+\)")
# only one actor should enter the debugger
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('debugged_boi'" in before
assert "RuntimeError" in before
child.sendline('c')
child.expect(pexpect.EOF)
before = str(child.before.decode())
# NOTE: this debugged actor error currently WON'T show up since the
# root will actually cancel and terminate the nursery before the error
# msg reported back from the debug mode actor is processed.
# assert "tractor._exceptions.RemoteActorError: ('debugged_boi'" in before
assert "tractor._exceptions.RemoteActorError: ('crash_boi'" in before
# the crash boi should not have made a debugger request but
# instead crashed completely
assert "tractor._exceptions.RemoteActorError: ('crash_boi'" in before
assert "RuntimeError" in before
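Every test above waits on the same prompt regex, `r"\(Pdb\+\+\)"`, and then inspects the text that arrived before it. The following is a minimal stdlib-only sketch of that matching step, assuming nothing about `pexpect` internals; the helper name `text_before_prompt` is hypothetical and only illustrates why the parentheses and plus signs need escaping.

```python
import re
from typing import Optional

# Same pattern the tests pass to `child.expect()`: the parens and
# plus signs are regex metacharacters, so each one is escaped.
PROMPT = r"\(Pdb\+\+\)"


def text_before_prompt(output: str) -> Optional[str]:
    '''
    Return everything printed before the first pdb++ prompt, or
    `None` when no prompt is present (hypothetical helper).

    '''
    match = re.search(PROMPT, output)
    if match is None:
        return None
    return output[:match.start()]


captured = "Attaching to pdb in crashed actor: ('name_error', 'abc')\n(Pdb++) "
before = text_before_prompt(captured)
```

The returned text plays the role of `child.before` in the tests, which is what the `assert ... in before` checks run against.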


@@ -9,25 +9,24 @@ import itertools
import pytest
import tractor
from tractor._testing import tractor_test
import trio
from conftest import tractor_test
@tractor_test
async def test_reg_then_unreg(arb_addr):
async def test_reg_then_unreg(reg_addr):
actor = tractor.current_actor()
assert actor.is_arbiter
assert len(actor._registry) == 1 # only self is registered
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as n:
portal = await n.start_actor('actor', enable_modules=[__name__])
uid = portal.channel.uid
async with tractor.get_arbiter(*arb_addr) as aportal:
async with tractor.get_registry(*reg_addr) as aportal:
# this local actor should be the arbiter
assert actor is aportal.actor
@@ -53,15 +52,27 @@ async def hi():
return the_line.format(tractor.current_actor().name)
async def say_hello(other_actor):
async def say_hello(
other_actor: str,
reg_addr: tuple[str, int],
):
await trio.sleep(1) # wait for other actor to spawn
async with tractor.find_actor(other_actor) as portal:
async with tractor.find_actor(
other_actor,
registry_addrs=[reg_addr],
) as portal:
assert portal is not None
return await portal.run(__name__, 'hi')
async def say_hello_use_wait(other_actor):
async with tractor.wait_for_actor(other_actor) as portal:
async def say_hello_use_wait(
other_actor: str,
reg_addr: tuple[str, int],
):
async with tractor.wait_for_actor(
other_actor,
registry_addr=reg_addr,
) as portal:
assert portal is not None
result = await portal.run(__name__, 'hi')
return result
@@ -69,21 +80,29 @@ async def say_hello_use_wait(other_actor):
@tractor_test
@pytest.mark.parametrize('func', [say_hello, say_hello_use_wait])
async def test_trynamic_trio(func, start_method, arb_addr):
"""Main tractor entry point, the "master" process (for now
acts as the "director").
"""
async def test_trynamic_trio(
func,
start_method,
reg_addr,
):
'''
Root actor acting as the "director" and running one-shot-task-actors
for the directed subs.
'''
async with tractor.open_nursery() as n:
print("Alright... Action!")
donny = await n.run_in_actor(
func,
other_actor='gretchen',
reg_addr=reg_addr,
name='donny',
)
gretchen = await n.run_in_actor(
func,
other_actor='donny',
reg_addr=reg_addr,
name='gretchen',
)
print(await gretchen.result())
@@ -131,7 +150,7 @@ async def unpack_reg(actor_or_portal):
async def spawn_and_check_registry(
arb_addr: tuple,
reg_addr: tuple,
use_signal: bool,
remote_arbiter: bool = False,
with_streaming: bool = False,
@@ -139,9 +158,9 @@ async def spawn_and_check_registry(
) -> None:
async with tractor.open_root_actor(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
):
async with tractor.get_arbiter(*arb_addr) as portal:
async with tractor.get_registry(*reg_addr) as portal:
# runtime needs to be up to call this
actor = tractor.current_actor()
@@ -162,7 +181,9 @@ async def spawn_and_check_registry(
try:
async with tractor.open_nursery() as n:
async with trio.open_nursery() as trion:
async with trio.open_nursery(
strict_exception_groups=False,
) as trion:
portals = {}
for i in range(3):
@@ -213,17 +234,19 @@ async def spawn_and_check_registry(
def test_subactors_unregister_on_cancel(
start_method,
use_signal,
arb_addr,
reg_addr,
with_streaming,
):
"""Verify that cancelling a nursery results in all subactors
'''
Verify that cancelling a nursery results in all subactors
deregistering themselves with the arbiter.
"""
'''
with pytest.raises(KeyboardInterrupt):
trio.run(
partial(
spawn_and_check_registry,
arb_addr,
reg_addr,
use_signal,
remote_arbiter=False,
with_streaming=with_streaming,
@@ -237,7 +260,7 @@ def test_subactors_unregister_on_cancel_remote_daemon(
daemon,
start_method,
use_signal,
arb_addr,
reg_addr,
with_streaming,
):
"""Verify that cancelling a nursery results in all subactors
@@ -248,7 +271,7 @@ def test_subactors_unregister_on_cancel_remote_daemon(
trio.run(
partial(
spawn_and_check_registry,
arb_addr,
reg_addr,
use_signal,
remote_arbiter=True,
with_streaming=with_streaming,
@@ -262,7 +285,7 @@ async def streamer(agen):
async def close_chans_before_nursery(
arb_addr: tuple,
reg_addr: tuple,
use_signal: bool,
remote_arbiter: bool = False,
) -> None:
@@ -275,9 +298,9 @@ async def close_chans_before_nursery(
entries_at_end = 1
async with tractor.open_root_actor(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
):
async with tractor.get_arbiter(*arb_addr) as aportal:
async with tractor.get_registry(*reg_addr) as aportal:
try:
get_reg = partial(unpack_reg, aportal)
@@ -295,7 +318,9 @@ async def close_chans_before_nursery(
async with portal2.open_stream_from(
stream_forever
) as agen2:
async with trio.open_nursery() as n:
async with trio.open_nursery(
strict_exception_groups=False,
) as n:
n.start_soon(streamer, agen1)
n.start_soon(cancel, use_signal, .5)
try:
@@ -329,7 +354,7 @@ async def close_chans_before_nursery(
def test_close_channel_explicit(
start_method,
use_signal,
arb_addr,
reg_addr,
):
"""Verify that closing a stream explicitly and killing the actor's
"root nursery" **before** the containing nursery tears down also
@@ -339,7 +364,7 @@ def test_close_channel_explicit(
trio.run(
partial(
close_chans_before_nursery,
arb_addr,
reg_addr,
use_signal,
remote_arbiter=False,
),
@@ -351,7 +376,7 @@ def test_close_channel_explicit_remote_arbiter(
daemon,
start_method,
use_signal,
arb_addr,
reg_addr,
):
"""Verify that closing a stream explicitly and killing the actor's
"root nursery" **before** the containing nursery tears down also
@@ -361,7 +386,7 @@ def test_close_channel_explicit_remote_arbiter(
trio.run(
partial(
close_chans_before_nursery,
arb_addr,
reg_addr,
use_signal,
remote_arbiter=True,
),
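The hunks above consistently swap the single `arbiter_addr=arb_addr` keyword for the list-valued `registry_addrs=[reg_addr]`. As a rough illustration of that migration, here is a hypothetical normalizing helper; `to_registry_addrs` is not part of `tractor` and the example address is made up. It only shows how a legacy single address maps onto the new list form.

```python
from typing import Optional

Addr = tuple[str, int]


def to_registry_addrs(
    arbiter_addr: Optional[Addr] = None,
    registry_addrs: Optional[list[Addr]] = None,
) -> list[Addr]:
    '''
    Normalize the legacy single-address kwarg into the list-valued
    form (hypothetical shim, for illustration only).

    '''
    if arbiter_addr is not None and registry_addrs is not None:
        raise ValueError(
            'pass only one of `arbiter_addr` or `registry_addrs`'
        )
    if registry_addrs is not None:
        return list(registry_addrs)
    if arbiter_addr is not None:
        # the old single address becomes a one-element list
        return [arbiter_addr]
    return []


reg_addr: Addr = ('127.0.0.1', 1616)  # made-up example address
addrs = to_registry_addrs(arbiter_addr=reg_addr)
```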


@@ -11,18 +11,17 @@ import platform
import shutil
import pytest
from conftest import repodir
def examples_dir():
"""Return the abspath to the examples directory.
"""
return os.path.join(repodir(), 'examples')
from tractor._testing import (
examples_dir,
)
@pytest.fixture
def run_example_in_subproc(loglevel, testdir, arb_addr):
def run_example_in_subproc(
loglevel: str,
testdir: pytest.Pytester,
reg_addr: tuple[str, int],
):
@contextmanager
def run(script_code):
@@ -32,8 +31,8 @@ def run_example_in_subproc(loglevel, testdir, arb_addr):
# on windows we need to create a special __main__.py which will
# be executed with ``python -m <modulename>`` on windows..
shutil.copyfile(
os.path.join(examples_dir(), '__main__.py'),
os.path.join(str(testdir), '__main__.py')
examples_dir() / '__main__.py',
str(testdir / '__main__.py'),
)
# drop the ``if __name__ == '__main__'`` guard onwards from
@@ -81,24 +80,37 @@ def run_example_in_subproc(loglevel, testdir, arb_addr):
'example_script',
# walk yields: (dirpath, dirnames, filenames)
[(p[0], f) for p in os.walk(examples_dir()) for f in p[2]
if '__' not in f
and f[0] != '_'
and 'debugging' not in p[0]],
[
(p[0], f)
for p in os.walk(examples_dir())
for f in p[2]
if (
'__' not in f
and f[0] != '_'
and 'debugging' not in p[0]
and 'integration' not in p[0]
and 'advanced_faults' not in p[0]
and 'multihost' not in p[0]
)
],
ids=lambda t: t[1],
)
def test_example(run_example_in_subproc, example_script):
"""Load and run scripts from this repo's ``examples/`` dir as a user
def test_example(
run_example_in_subproc,
example_script,
):
'''
Load and run scripts from this repo's ``examples/`` dir as a user
would by copying and pasting them into their editor.
On windows a little more "finessing" is done to make
``multiprocessing`` play nice: we copy the ``__main__.py`` into the
test directory and invoke the script as a module with ``python -m
test_example``.
"""
ex_file = os.path.join(*example_script)
'''
ex_file: str = os.path.join(*example_script)
if 'rpc_bidir_streaming' in ex_file and sys.version_info < (3, 9):
pytest.skip("2-way streaming example requires py3.9 async with syntax")
@@ -113,9 +125,20 @@ def test_example(run_example_in_subproc, example_script):
# print(f'STDOUT: {out}')
# if we get some gnarly output let's aggregate and raise
errmsg = err.decode()
errlines = errmsg.splitlines()
if err and 'Error' in errlines[-1]:
raise Exception(errmsg)
if err:
errmsg = err.decode()
errlines = errmsg.splitlines()
last_error = errlines[-1]
if (
'Error' in last_error
# XXX: currently we print this to console, but maybe
# shouldn't eventually once we figure out what's
# a better way to be explicit about aio side
# cancels?
and
'asyncio.exceptions.CancelledError' not in last_error
):
raise Exception(errmsg)
assert proc.returncode == 0
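The reworked stderr check in the last hunk only raises when the final stderr line mentions `Error` and is not the `asyncio.exceptions.CancelledError` line that is currently printed to the console. A minimal sketch of that predicate as a standalone function follows; the name `stderr_indicates_failure` is an assumption for illustration and does not appear in the test module.

```python
def stderr_indicates_failure(err: bytes) -> bool:
    '''
    Mirror the final-line check from the hunk above as a pure
    function (hypothetical helper name).

    '''
    if not err:
        return False
    errlines: list[str] = err.decode().splitlines()
    last_error: str = errlines[-1]
    return (
        'Error' in last_error
        # the aio-side cancel line is tolerated for now
        and 'asyncio.exceptions.CancelledError' not in last_error
    )


sample = b'Traceback (most recent call last):\nValueError: boom'
failed: bool = stderr_indicates_failure(sample)
```

Only the last line is inspected, so an error followed by unrelated trailing output is not treated as a failure by this check.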


@@ -0,0 +1,946 @@
'''
Low-level functional audits for our
"capability based messaging"-spec feats.
B~)
'''
from contextlib import (
contextmanager as cm,
# nullcontext,
)
import importlib
from typing import (
Any,
Type,
Union,
)
from msgspec import (
# structs,
# msgpack,
Raw,
# Struct,
ValidationError,
)
import pytest
import trio
import tractor
from tractor import (
Actor,
# _state,
MsgTypeError,
Context,
)
from tractor.msg import (
_codec,
_ctxvar_MsgCodec,
_exts,
NamespacePath,
MsgCodec,
MsgDec,
mk_codec,
mk_dec,
apply_codec,
current_codec,
)
from tractor.msg.types import (
log,
Started,
# _payload_msgs,
# PayloadMsg,
# mk_msg_spec,
)
from tractor.msg._ops import (
limit_plds,
)
def enc_nsp(obj: Any) -> Any:
actor: Actor = tractor.current_actor(
err_on_no_runtime=False,
)
uid: tuple[str, str]|None = None if not actor else actor.uid
print(f'{uid} ENC HOOK')
match obj:
# case NamespacePath()|str():
case NamespacePath():
encoded: str = str(obj)
print(
f'----- ENCODING `NamespacePath` as `str` ------\n'
f'|_obj:{type(obj)!r} = {obj!r}\n'
f'|_encoded: str = {encoded!r}\n'
)
# if type(obj) != NamespacePath:
# breakpoint()
return encoded
case _:
logmsg: str = (
f'{uid}\n'
'FAILED ENCODE\n'
f'obj-> `{obj}: {type(obj)}`\n'
)
raise NotImplementedError(logmsg)
def dec_nsp(
obj_type: Type,
obj: Any,
) -> Any:
# breakpoint()
actor: Actor = tractor.current_actor(
err_on_no_runtime=False,
)
uid: tuple[str, str]|None = None if not actor else actor.uid
print(
f'{uid}\n'
'CUSTOM DECODE\n'
f'type-arg-> {obj_type}\n'
f'obj-arg-> `{obj}`: {type(obj)}\n'
)
nsp = None
# XXX, never happens right?
if obj_type is Raw:
breakpoint()
if (
obj_type is NamespacePath
and isinstance(obj, str)
and ':' in obj
):
nsp = NamespacePath(obj)
# TODO: we could build a generic handler using
# JUST matching the obj_type part?
# nsp = obj_type(obj)
if nsp:
print(f'Returning NSP instance: {nsp}')
return nsp
logmsg: str = (
f'{uid}\n'
'FAILED DECODE\n'
f'type-> {obj_type}\n'
f'obj-arg-> `{obj}`: {type(obj)}\n\n'
f'current codec:\n'
f'{current_codec()}\n'
)
# TODO: figure out the ignore subsys for this!
# -[ ] option whether to defense-relay back the msg
# inside an `Invalid`/`Ignore`
# -[ ] how to make this handling pluggable such that a
# `Channel`/`MsgTransport` can intercept and process
# back msgs either via exception handling or some other
# signal?
log.warning(logmsg)
# NOTE: this delivers the invalid
# value up to `msgspec`'s decoding
# machinery for error raising.
return obj
# raise NotImplementedError(logmsg)
def ex_func(*args):
'''
A mod level func we can ref and load via our `NamespacePath`
python-object pointer `str` subtype.
'''
print(f'ex_func({args})')
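The hooks above treat `NamespacePath` as a `str` subtype holding a `module:name` pointer that is encoded to a plain `str` and rebuilt on decode from the target type. The sketch below imitates that round trip with the stdlib only. `RefPath` and `dec_ref` are hypothetical stand-ins, `json` is swapped in for `msgspec`, and none of this is `tractor`'s actual implementation.

```python
import importlib
import json
from typing import Any


class RefPath(str):
    '''
    Hypothetical stand-in for a "module:name" object pointer.

    '''
    @classmethod
    def from_ref(cls, obj: Any) -> 'RefPath':
        return cls(f'{obj.__module__}:{obj.__qualname__}')

    def load_ref(self) -> Any:
        modname, _, attr = self.partition(':')
        obj: Any = importlib.import_module(modname)
        # walk dotted attribute names, eg. `Class.method`
        for part in attr.split('.'):
            obj = getattr(obj, part)
        return obj


def dec_ref(obj_type: type, obj: Any) -> Any:
    # type-directed decode hook with the same `(type, obj)` shape
    # as `dec_nsp()` above
    if (
        obj_type is RefPath
        and isinstance(obj, str)
        and ':' in obj
    ):
        return RefPath(obj)
    # hand anything else back unchanged
    return obj


ref = RefPath.from_ref(json.dumps)
wire: str = json.dumps(str(ref))  # encoded as a plain `str`
decoded = dec_ref(RefPath, json.loads(wire))
```

As in `dec_nsp()`, a value the hook does not recognize is returned untouched so the caller's own validation can reject it.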
@pytest.mark.parametrize(
'add_codec_hooks',
[
True,
False,
],
ids=['use_codec_hooks', 'no_codec_hooks'],
)
def test_custom_extension_types(
debug_mode: bool,
add_codec_hooks: bool
):
'''
Verify that a `MsgCodec` (used for encoding all outbound IPC msgs
and decoding all inbound `PayloadMsg`s) and a paired `MsgDec`
(used for decoding the `PayloadMsg.pld: Raw` received within a given
task's ipc `Context` scope) can both send and receive "extension types"
as supported via custom converter hooks passed to `msgspec`.
'''
nsp_pld_dec: MsgDec = mk_dec(
spec=None, # ONLY support the ext type
dec_hook=dec_nsp if add_codec_hooks else None,
ext_types=[NamespacePath],
)
nsp_codec: MsgCodec = mk_codec(
# ipc_pld_spec=Raw, # default!
# NOTE XXX: the encode hook MUST be used no matter what since
# our `NamespacePath` is neither one of the natively supported
# (`Any`-decodable) builtin types nor a `msgspec.Struct` subtype
# - so `msgspec` has no way to know how to encode it unless we
# provide the custom hook.
#
# AGAIN that is, regardless of whether we spec an
# `Any`-decoded-pld the enc has no knowledge (by default)
# how to enc `NamespacePath` (nsp), so we add a custom
# hook to do that ALWAYS.
enc_hook=enc_nsp if add_codec_hooks else None,
# XXX NOTE: pretty sure this is mutex with the `type=` to
# `Decoder`? so it won't work in tandem with the
# `ipc_pld_spec` passed above?
ext_types=[NamespacePath],
# TODO? is it useful to have the `.pld` decoded *prior* to
# the `PldRx`?? like perf or mem related?
# ext_dec=nsp_pld_dec,
)
if add_codec_hooks:
assert nsp_codec.dec.dec_hook is None
# TODO? if we pass `ext_dec` above?
# assert nsp_codec.dec.dec_hook is dec_nsp
assert nsp_codec.enc.enc_hook is enc_nsp
nsp = NamespacePath.from_ref(ex_func)
try:
nsp_bytes: bytes = nsp_codec.encode(nsp)
nsp_rt_sin_msg = nsp_pld_dec.decode(nsp_bytes)
assert nsp_rt_sin_msg.load_ref() is ex_func
except TypeError:
if not add_codec_hooks:
pass
try:
msg_bytes: bytes = nsp_codec.encode(
Started(
cid='cid',
pld=nsp,
)
)
# since the ext-type obj should also be set as the msg.pld
assert nsp_bytes in msg_bytes
started_rt: Started = nsp_codec.decode(msg_bytes)
pld: Raw = started_rt.pld
assert isinstance(pld, Raw)
nsp_rt: NamespacePath = nsp_pld_dec.decode(pld)
assert isinstance(nsp_rt, NamespacePath)
# in obj comparison terms they should be the same
assert nsp_rt == nsp
# ensure we've decoded to ext type!
assert nsp_rt.load_ref() is ex_func
except TypeError:
if not add_codec_hooks:
pass
@tractor.context
async def sleep_forever_in_sub(
ctx: Context,
) -> None:
await trio.sleep_forever()
def mk_custom_codec(
add_hooks: bool,
) -> MsgCodec:
'''
Create custom `msgpack` enc/dec-hooks and set a `Decoder`
which only loads `pld_spec` (like `NamespacePath`) types.
'''
# XXX NOTE XXX: despite defining `NamespacePath` as a type
# field on our `PayloadMsg.pld`, we still need an enc/dec_hook() pair
# to cast to/from that type on the wire. See the docs:
# https://jcristharif.com/msgspec/extending.html#mapping-to-from-native-types
# if pld_spec is Any:
# pld_spec = Raw
nsp_codec: MsgCodec = mk_codec(
# ipc_pld_spec=Raw, # default!
# NOTE XXX: the encode hook MUST be used no matter what since
# our `NamespacePath` is neither one of the natively supported
# (`Any`-decodable) builtin types nor a `msgspec.Struct` subtype
# - so `msgspec` has no way to know how to encode it unless we
# provide the custom hook.
#
# AGAIN that is, regardless of whether we spec an
# `Any`-decoded-pld the enc has no knowledge (by default)
# how to enc `NamespacePath` (nsp), so we add a custom
# hook to do that ALWAYS.
enc_hook=enc_nsp if add_hooks else None,
# XXX NOTE: pretty sure this is mutex with the `type=` to
# `Decoder`? so it won't work in tandem with the
# `ipc_pld_spec` passed above?
ext_types=[NamespacePath],
)
# dec_hook=dec_nsp if add_hooks else None,
return nsp_codec
@pytest.mark.parametrize(
'limit_plds_args',
[
(
{'dec_hook': None, 'ext_types': None},
None,
),
(
{'dec_hook': dec_nsp, 'ext_types': None},
TypeError,
),
(
{'dec_hook': dec_nsp, 'ext_types': [NamespacePath]},
None,
),
(
{'dec_hook': dec_nsp, 'ext_types': [NamespacePath|None]},
None,
),
],
ids=[
'no_hook_no_ext_types',
'only_hook',
'hook_and_ext_types',
'hook_and_ext_types_w_null',
]
)
def test_pld_limiting_usage(
limit_plds_args: tuple[dict, type[Exception]|None],
):
'''
Verify that `dec_hook()` and `ext_types` must be provided
together, otherwise we raise an explanatory type-error.
'''
kwargs, maybe_err = limit_plds_args
async def main():
async with tractor.open_nursery() as an: # just to open runtime
# XXX SHOULD NEVER WORK outside an ipc ctx scope!
try:
with limit_plds(**kwargs):
pass
except RuntimeError:
pass
p: tractor.Portal = await an.start_actor(
'sub',
enable_modules=[__name__],
)
async with (
p.open_context(
sleep_forever_in_sub
) as (ctx, first),
):
try:
with limit_plds(**kwargs):
pass
except maybe_err as exc:
assert type(exc) is maybe_err
pass
def chk_codec_applied(
expect_codec: MsgCodec|None,
enter_value: MsgCodec|None = None,
) -> MsgCodec:
'''
buncha sanity checks ensuring that the IPC channel's
context-vars are set to the expected codec and that our
ctx-var wrapper APIs match the same.
'''
# TODO: play with tricycle again, bc this is supposed to work
# the way we want?
#
# TreeVar
# task: trio.Task = trio.lowlevel.current_task()
# curr_codec = _ctxvar_MsgCodec.get_in(task)
# ContextVar
# task_ctx: Context = task.context
# assert _ctxvar_MsgCodec in task_ctx
# curr_codec: MsgCodec = task.context[_ctxvar_MsgCodec]
if expect_codec is None:
assert enter_value is None
return
# NOTE: currently we use this!
# RunVar
curr_codec: MsgCodec = current_codec()
last_read_codec = _ctxvar_MsgCodec.get()
# assert curr_codec is last_read_codec
assert (
(same_codec := expect_codec) is
# returned from `mk_codec()`
# yielded value from `apply_codec()`
# read from current task's `contextvars.Context`
curr_codec is
last_read_codec
# the default `msgspec` settings
is not _codec._def_msgspec_codec
is not _codec._def_tractor_codec
)
if enter_value:
assert enter_value is same_codec
@tractor.context
async def send_back_values(
ctx: Context,
rent_pld_spec_type_strs: list[str],
add_hooks: bool,
) -> None:
'''
Set up a custom codec to load instances of `NamespacePath`
and ensure we can round trip a func ref with our parent.
'''
uid: tuple = tractor.current_actor().uid
# init state in sub-actor should be default
chk_codec_applied(
expect_codec=_codec._def_tractor_codec,
)
# load pld spec from input str
rent_pld_spec = _exts.dec_type_union(
rent_pld_spec_type_strs,
mods=[
importlib.import_module(__name__),
],
)
rent_pld_spec_types: set[Type] = _codec.unpack_spec_types(
rent_pld_spec,
)
# ONLY add ext-hooks if the rent specified a non-std type!
add_hooks: bool = (
NamespacePath in rent_pld_spec_types
and
add_hooks
)
# same as on parent side config.
nsp_codec: MsgCodec|None = None
if add_hooks:
nsp_codec = mk_codec(
enc_hook=enc_nsp,
ext_types=[NamespacePath],
)
with (
maybe_apply_codec(nsp_codec) as codec,
limit_plds(
rent_pld_spec,
dec_hook=dec_nsp if add_hooks else None,
ext_types=[NamespacePath] if add_hooks else None,
) as pld_dec,
):
# ?XXX? SHOULD WE NOT be swapping the global codec since it
# breaks `Context.started()` roundtripping checks??
chk_codec_applied(
expect_codec=nsp_codec,
enter_value=codec,
)
# ?TODO, mismatch case(s)?
#
# ensure pld spec matches on both sides
ctx_pld_dec: MsgDec = ctx._pld_rx._pld_dec
assert pld_dec is ctx_pld_dec
child_pld_spec: Type = pld_dec.spec
child_pld_spec_types: set[Type] = _codec.unpack_spec_types(
child_pld_spec,
)
assert (
child_pld_spec_types.issuperset(
rent_pld_spec_types
)
)
# ?TODO, try loop for each of the types in pld-superset?
#
# for send_value in [
# nsp,
# str(nsp),
# None,
# ]:
nsp = NamespacePath.from_ref(ex_func)
try:
print(
f'{uid}: attempting to `.started({nsp})`\n'
f'\n'
f'rent_pld_spec: {rent_pld_spec}\n'
f'child_pld_spec: {child_pld_spec}\n'
f'codec: {codec}\n'
)
# await tractor.pause()
await ctx.started(nsp)
except tractor.MsgTypeError as _mte:
mte = _mte
# false -ve case
if add_hooks:
raise RuntimeError(
f'EXPECTED to `.started()` value given spec ??\n\n'
f'child_pld_spec -> {child_pld_spec}\n'
f'value = {nsp}: {type(nsp)}\n'
)
# true -ve case
raise mte
# TODO: maybe we should add our own wrapper error so as to
# be interchange-lib agnostic?
# -[ ] the error type is wtv is raised from the hook so we
# could also require a type-class of errors for
# indicating whether the hook-failure can be handled by
# a nasty-dialog-unprot sub-sys?
except TypeError as typerr:
# false -ve
if add_hooks:
raise RuntimeError('Should have been able to send `nsp`??')
# true -ve
print('Failed to send `nsp` due to no ext hooks set!')
raise typerr
# now try sending a set of valid and invalid plds to ensure
# the pld spec is respected.
sent: list[Any] = []
async with ctx.open_stream() as ipc:
print(
f'{uid}: streaming all pld types to rent..'
)
# for send_value, expect_send in iter_send_val_items:
for send_value in [
nsp,
str(nsp),
None,
]:
send_type: Type = type(send_value)
print(
f'{uid}: SENDING NEXT pld\n'
f'send_type: {send_type}\n'
f'send_value: {send_value}\n'
)
try:
await ipc.send(send_value)
sent.append(send_value)
except ValidationError as valerr:
print(f'{uid} FAILED TO SEND {send_value}!')
# false -ve
if add_hooks:
raise RuntimeError(
f'EXPECTED to roundtrip value given spec:\n'
f'rent_pld_spec -> {rent_pld_spec}\n'
f'child_pld_spec -> {child_pld_spec}\n'
f'value = {send_value}: {send_type}\n'
)
# true -ve
raise valerr
# continue
else:
print(
f'{uid}: finished sending all values\n'
'Should be exiting stream block!\n'
)
print(f'{uid}: exited streaming block!')
@cm
def maybe_apply_codec(codec: MsgCodec|None) -> MsgCodec|None:
if codec is None:
yield None
return
with apply_codec(codec) as codec:
yield codec
@pytest.mark.parametrize(
'pld_spec',
[
Any,
NamespacePath,
NamespacePath|None, # the "maybe" spec Bo
],
ids=[
'any_type',
'only_nsp_ext',
'maybe_nsp_ext',
]
)
@pytest.mark.parametrize(
'add_hooks',
[
True,
False,
],
ids=[
'use_codec_hooks',
'no_codec_hooks',
],
)
def test_ext_types_over_ipc(
debug_mode: bool,
pld_spec: Union[Type],
add_hooks: bool,
):
'''
Ensure we can support extension types converted using
`enc/dec_hook()`s passed to the `.msg.limit_plds()` API
and that sane errors happen when we try to do the same without
the codec hooks.
'''
pld_types: set[Type] = _codec.unpack_spec_types(pld_spec)
async def main():
# sanity check the default pld-spec beforehand
chk_codec_applied(
expect_codec=_codec._def_tractor_codec,
)
# extension type we want to send as msg payload
nsp = NamespacePath.from_ref(ex_func)
# ^NOTE, 2 cases:
# - codec hooks not added -> decode nsp as `str`
# - codec with hooks -> decode nsp as `NamespacePath`
nsp_codec: MsgCodec|None = None
if (
NamespacePath in pld_types
and
add_hooks
):
nsp_codec = mk_codec(
enc_hook=enc_nsp,
ext_types=[NamespacePath],
)
async with tractor.open_nursery(
debug_mode=debug_mode,
) as an:
p: tractor.Portal = await an.start_actor(
'sub',
enable_modules=[__name__],
)
with (
maybe_apply_codec(nsp_codec) as codec,
):
chk_codec_applied(
expect_codec=nsp_codec,
enter_value=codec,
)
rent_pld_spec_type_strs: list[str] = _exts.enc_type_union(pld_spec)
# XXX should raise an mte (`MsgTypeError`)
# when `add_hooks == False` bc the input
# `expect_ipc_send` kwarg has a nsp which can't be
# serialized!
#
# TODO: can we ensure this happens from the
# `Return`-side (aka the sub) as well?
try:
ctx: tractor.Context
ipc: tractor.MsgStream
async with (
# XXX should raise an mte (`MsgTypeError`)
# when `add_hooks == False`..
p.open_context(
send_back_values,
# expect_debug=debug_mode,
rent_pld_spec_type_strs=rent_pld_spec_type_strs,
add_hooks=add_hooks,
# expect_ipc_send=expect_ipc_send,
) as (ctx, first),
ctx.open_stream() as ipc,
):
with (
limit_plds(
pld_spec,
dec_hook=dec_nsp if add_hooks else None,
ext_types=[NamespacePath] if add_hooks else None,
) as pld_dec,
):
ctx_pld_dec: MsgDec = ctx._pld_rx._pld_dec
assert pld_dec is ctx_pld_dec
# if (
# not add_hooks
# and
# NamespacePath in
# ):
# pytest.fail('ctx should fail to open without custom enc_hook!?')
await ipc.send(nsp)
nsp_rt = await ipc.receive()
assert nsp_rt == nsp
assert nsp_rt.load_ref() is ex_func
# this test passes bc we can go no further!
except MsgTypeError as mte:
# if not add_hooks:
# # teardown nursery
# await p.cancel_actor()
# return
raise mte
await p.cancel_actor()
if (
NamespacePath in pld_types
and
add_hooks
):
trio.run(main)
else:
with pytest.raises(
expected_exception=tractor.RemoteActorError,
) as excinfo:
trio.run(main)
exc = excinfo.value
# bc `.started(nsp: NamespacePath)` will raise
assert exc.boxed_type is TypeError
# def chk_pld_type(
# payload_spec: Type[Struct]|Any,
# pld: Any,
# expect_roundtrip: bool|None = None,
# ) -> bool:
# pld_val_type: Type = type(pld)
# # TODO: verify that the overridden subtypes
# # DO NOT have modified type-annots from original!
# # 'Start', .pld: FuncSpec
# # 'StartAck', .pld: IpcCtxSpec
# # 'Stop', .pld: UNSEt
# # 'Error', .pld: ErrorData
# codec: MsgCodec = mk_codec(
# # NOTE: this ONLY accepts `PayloadMsg.pld` fields of a specified
# # type union.
# ipc_pld_spec=payload_spec,
# )
# # make a one-off dec to compare with our `MsgCodec` instance
# # which does the below `mk_msg_spec()` call internally
# ipc_msg_spec: Union[Type[Struct]]
# msg_types: list[PayloadMsg[payload_spec]]
# (
# ipc_msg_spec,
# msg_types,
# ) = mk_msg_spec(
# payload_type_union=payload_spec,
# )
# _enc = msgpack.Encoder()
# _dec = msgpack.Decoder(
# type=ipc_msg_spec or Any, # like `PayloadMsg[Any]`
# )
# assert (
# payload_spec
# ==
# codec.pld_spec
# )
# # assert codec.dec == dec
# #
# # ^-XXX-^ not sure why these aren't "equal" but when cast
# # to `str` they seem to match ?? .. kk
# assert (
# str(ipc_msg_spec)
# ==
# str(codec.msg_spec)
# ==
# str(_dec.type)
# ==
# str(codec.dec.type)
# )
# # verify the boxed-type for all variable payload-type msgs.
# if not msg_types:
# breakpoint()
# roundtrip: bool|None = None
# pld_spec_msg_names: list[str] = [
# td.__name__ for td in _payload_msgs
# ]
# for typedef in msg_types:
# skip_runtime_msg: bool = typedef.__name__ not in pld_spec_msg_names
# if skip_runtime_msg:
# continue
# pld_field = structs.fields(typedef)[1]
# assert pld_field.type is payload_spec # TODO-^ does this need to work to get all subtypes to adhere?
# kwargs: dict[str, Any] = {
# 'cid': '666',
# 'pld': pld,
# }
# enc_msg: PayloadMsg = typedef(**kwargs)
# _wire_bytes: bytes = _enc.encode(enc_msg)
# wire_bytes: bytes = codec.enc.encode(enc_msg)
# assert _wire_bytes == wire_bytes
# ve: ValidationError|None = None
# try:
# dec_msg = codec.dec.decode(wire_bytes)
# _dec_msg = _dec.decode(wire_bytes)
# # decoded msg and thus payload should be exactly same!
# assert (roundtrip := (
# _dec_msg
# ==
# dec_msg
# ==
# enc_msg
# ))
# if (
# expect_roundtrip is not None
# and expect_roundtrip != roundtrip
# ):
# breakpoint()
# assert (
# pld
# ==
# dec_msg.pld
# ==
# enc_msg.pld
# )
# # assert (roundtrip := (_dec_msg == enc_msg))
# except ValidationError as _ve:
# ve = _ve
# roundtrip: bool = False
# if pld_val_type is payload_spec:
# raise ValueError(
# 'Got `ValidationError` despite type-var match!?\n'
# f'pld_val_type: {pld_val_type}\n'
# f'payload_type: {payload_spec}\n'
# ) from ve
# else:
# # ow we good cuz the pld spec mismatched.
# print(
# 'Got expected `ValidationError` since,\n'
# f'{pld_val_type} is not {payload_spec}\n'
# )
# else:
# if (
# payload_spec is not Any
# and
# pld_val_type is not payload_spec
# ):
# raise ValueError(
# 'DID NOT `ValidationError` despite expected type match!?\n'
# f'pld_val_type: {pld_val_type}\n'
# f'payload_type: {payload_spec}\n'
# )
# # full code decode should always be attempted!
# if roundtrip is None:
# breakpoint()
# return roundtrip
# ?TODO? maybe remove since covered in the newer `test_pldrx_limiting`
# via end-2-end testing of all this?
# -[ ] IOW do we really NEED this lowlevel unit testing?
#
# def test_limit_msgspec(
# debug_mode: bool,
# ):
# '''
# Internals unit testing to verify that type-limiting an IPC ctx's
# msg spec with `Pldrx.limit_plds()` results in various
# encapsulated `msgspec` object settings and state.
# '''
# async def main():
# async with tractor.open_root_actor(
# debug_mode=debug_mode,
# ):
# # ensure we can round-trip a boxing `PayloadMsg`
# assert chk_pld_type(
# payload_spec=Any,
# pld=None,
# expect_roundtrip=True,
# )
# # verify that a mis-typed payload value won't decode
# assert not chk_pld_type(
# payload_spec=int,
# pld='doggy',
# )
# # parametrize the boxed `.pld` type as a custom-struct
# # and ensure that parametrization propagates
# # to all payload-msg-spec-able subtypes!
# class CustomPayload(Struct):
# name: str
# value: Any
# assert not chk_pld_type(
# payload_spec=CustomPayload,
# pld='doggy',
# )
# assert chk_pld_type(
# payload_spec=CustomPayload,
# pld=CustomPayload(name='doggy', value='urmom')
# )
# # yah, we can `.pause_from_sync()` now!
# # breakpoint()
# trio.run(main)
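
The hook round-trip that `test_ext_types_over_ipc` exercises can be sketched with the stdlib alone. This is an analogy, not the `tractor` or `msgspec` API: `FuncRef`, `enc_hook`, `dec_hook` and `roundtrip` are hypothetical names, and `json` stands in for the real codec. It shows a custom type surviving a round trip only when both hooks are installed, and a `TypeError` on encode otherwise.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FuncRef:
    '''Hypothetical extension type: a "module:function" reference.'''
    module: str
    name: str


def enc_hook(obj):
    # Called by `json.dumps()` only for objects it can't serialize natively.
    if isinstance(obj, FuncRef):
        return {'__funcref__': f'{obj.module}:{obj.name}'}
    raise TypeError(f'Unsupported type: {type(obj)!r}')


def dec_hook(dct: dict):
    # Called by `json.loads()` for every decoded JSON object.
    ref = dct.get('__funcref__')
    if ref is not None:
        module, _, name = ref.partition(':')
        return FuncRef(module, name)
    return dct


def roundtrip(obj, use_hooks: bool):
    # Without hooks the encoder has no way to handle `FuncRef`
    # and raises `TypeError`, which is the failure mode the test expects.
    wire = json.dumps(obj, default=enc_hook if use_hooks else None)
    return json.loads(wire, object_hook=dec_hook if use_hooks else None)
```

Calling `roundtrip(FuncRef('mod', 'func'), use_hooks=True)` returns an equal `FuncRef`, while `use_hooks=False` raises `TypeError` before anything is put on the wire.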

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -7,9 +7,10 @@ import platform
import trio
import tractor
from tractor.testing import tractor_test
import pytest
from tractor._testing import tractor_test
def test_must_define_ctx():
@ -37,10 +38,13 @@ async def async_gen_stream(sequence):
assert cs.cancelled_caught
# TODO: deprecated either remove entirely
# or re-impl in terms of `MsgStream` one-sides
# wrapper, but at least remove `Portal.open_stream_from()`
@tractor.stream
async def context_stream(
ctx: tractor.Context,
sequence
sequence: list[int],
):
for i in sequence:
await ctx.send_yield(i)
@ -54,7 +58,7 @@ async def context_stream(
async def stream_from_single_subactor(
arb_addr,
reg_addr,
start_method,
stream_func,
):
@ -63,7 +67,7 @@ async def stream_from_single_subactor(
# only one per host address, spawns an actor if None
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
start_method=start_method,
) as nursery:
@ -114,13 +118,13 @@ async def stream_from_single_subactor(
@pytest.mark.parametrize(
'stream_func', [async_gen_stream, context_stream]
)
def test_stream_from_single_subactor(arb_addr, start_method, stream_func):
def test_stream_from_single_subactor(reg_addr, start_method, stream_func):
"""Verify streaming from a spawned async generator.
"""
trio.run(
partial(
stream_from_single_subactor,
arb_addr,
reg_addr,
start_method,
stream_func=stream_func,
),
@ -224,14 +228,14 @@ async def a_quadruple_example():
return result_stream
async def cancel_after(wait, arb_addr):
async with tractor.open_root_actor(arbiter_addr=arb_addr):
async def cancel_after(wait, reg_addr):
async with tractor.open_root_actor(registry_addrs=[reg_addr]):
with trio.move_on_after(wait):
return await a_quadruple_example()
@pytest.fixture(scope='module')
def time_quad_ex(arb_addr, ci_env, spawn_backend):
def time_quad_ex(reg_addr, ci_env, spawn_backend):
if spawn_backend == 'mp':
"""no idea but the mp *nix runs are flaking out here often...
"""
@ -239,7 +243,7 @@ def time_quad_ex(arb_addr, ci_env, spawn_backend):
timeout = 7 if platform.system() in ('Windows', 'Darwin') else 4
start = time.time()
results = trio.run(cancel_after, timeout, arb_addr)
results = trio.run(cancel_after, timeout, reg_addr)
diff = time.time() - start
assert results
return results, diff
@ -250,7 +254,7 @@ def test_a_quadruple_example(time_quad_ex, ci_env, spawn_backend):
results, diff = time_quad_ex
assert results
this_fast = 6 if platform.system() in ('Windows', 'Darwin') else 2.666
this_fast = 6 if platform.system() in ('Windows', 'Darwin') else 3
assert diff < this_fast
@ -259,14 +263,14 @@ def test_a_quadruple_example(time_quad_ex, ci_env, spawn_backend):
list(map(lambda i: i/10, range(3, 9)))
)
def test_not_fast_enough_quad(
arb_addr, time_quad_ex, cancel_delay, ci_env, spawn_backend
reg_addr, time_quad_ex, cancel_delay, ci_env, spawn_backend
):
"""Verify we can cancel midway through the quad example and all actors
cancel gracefully.
"""
results, diff = time_quad_ex
delay = max(diff - cancel_delay, 0)
results = trio.run(cancel_after, delay, arb_addr)
results = trio.run(cancel_after, delay, reg_addr)
system = platform.system()
if system in ('Windows', 'Darwin') and results is not None:
# In CI environments it seems later runs are quicker than the first
@ -279,7 +283,7 @@ def test_not_fast_enough_quad(
@tractor_test
async def test_respawn_consumer_task(
arb_addr,
reg_addr,
spawn_backend,
loglevel,
):


@ -7,31 +7,24 @@ import pytest
import trio
import tractor
from conftest import tractor_test
from tractor._testing import tractor_test
@pytest.mark.trio
async def test_no_arbitter():
async def test_no_runtime():
"""An arbitter must be established before any nurseries
can be created.
(In other words ``tractor.open_root_actor()`` must be engaged at
some point?)
"""
with pytest.raises(RuntimeError):
with tractor.open_nursery():
with pytest.raises(RuntimeError) :
async with tractor.find_actor('doggy'):
pass
def test_no_main():
"""An async function **must** be passed to ``tractor.run()``.
"""
with pytest.raises(TypeError):
tractor.run(None)
@tractor_test
async def test_self_is_registered(arb_addr):
async def test_self_is_registered(reg_addr):
"Verify waiting on the arbiter to register itself using the standard api."
actor = tractor.current_actor()
assert actor.is_arbiter
@ -41,20 +34,20 @@ async def test_self_is_registered(arb_addr):
@tractor_test
async def test_self_is_registered_localportal(arb_addr):
async def test_self_is_registered_localportal(reg_addr):
"Verify waiting on the arbiter to register itself using a local portal."
actor = tractor.current_actor()
assert actor.is_arbiter
async with tractor.get_arbiter(*arb_addr) as portal:
async with tractor.get_registry(*reg_addr) as portal:
assert isinstance(portal, tractor._portal.LocalPortal)
with trio.fail_after(0.2):
sockaddr = await portal.run_from_ns(
'self', 'wait_for_actor', name='root')
assert sockaddr[0] == arb_addr
assert sockaddr[0] == reg_addr
def test_local_actor_async_func(arb_addr):
def test_local_actor_async_func(reg_addr):
"""Verify a simple async function in-process.
"""
nums = []
@ -62,7 +55,7 @@ def test_local_actor_async_func(arb_addr):
async def print_loop():
async with tractor.open_root_actor(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
):
# arbiter is started in-proc if dne
assert tractor.current_actor().is_arbiter


@ -7,8 +7,10 @@ import time
import pytest
import trio
import tractor
from conftest import (
from tractor._testing import (
tractor_test,
)
from .conftest import (
sig_prog,
_INT_SIGNAL,
_INT_RETURN_CODE,
@ -28,9 +30,9 @@ def test_abort_on_sigint(daemon):
@tractor_test
async def test_cancel_remote_arbiter(daemon, arb_addr):
async def test_cancel_remote_arbiter(daemon, reg_addr):
assert not tractor.current_actor().is_arbiter
async with tractor.get_arbiter(*arb_addr) as portal:
async with tractor.get_registry(*reg_addr) as portal:
await portal.cancel_actor()
time.sleep(0.1)
@ -39,16 +41,16 @@ async def test_cancel_remote_arbiter(daemon, arb_addr):
# no arbiter socket should exist
with pytest.raises(OSError):
async with tractor.get_arbiter(*arb_addr) as portal:
async with tractor.get_registry(*reg_addr) as portal:
pass
def test_register_duplicate_name(daemon, arb_addr):
def test_register_duplicate_name(daemon, reg_addr):
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as n:
assert not tractor.current_actor().is_arbiter


@ -0,0 +1,364 @@
'''
Audit sub-sys APIs from `.msg._ops`
mostly for ensuring correct `contextvars`
related settings around IPC contexts.
'''
from contextlib import (
asynccontextmanager as acm,
)
from msgspec import (
Struct,
)
import pytest
import trio
import tractor
from tractor import (
Context,
MsgTypeError,
current_ipc_ctx,
Portal,
)
from tractor.msg import (
_ops as msgops,
Return,
)
from tractor.msg import (
_codec,
)
from tractor.msg.types import (
log,
)
class PldMsg(
Struct,
# TODO: with multiple structs in-spec we need to tag them!
# -[ ] offer a built-in `PldMsg` type to inherit from which takes
# care of these details?
#
# https://jcristharif.com/msgspec/structs.html#tagged-unions
# tag=True,
# tag_field='msg_type',
):
field: str
maybe_msg_spec = PldMsg|None
@acm
async def maybe_expect_raises(
raises: BaseException|None = None,
ensure_in_message: list[str]|None = None,
post_mortem: bool = False,
timeout: int = 3,
) -> None:
'''
Async wrapper for ensuring errors propagate from the inner scope.
'''
if tractor._state.debug_mode():
timeout += 999
with trio.fail_after(timeout):
try:
yield
except BaseException as _inner_err:
inner_err = _inner_err
# wasn't-expected to error..
if raises is None:
raise
else:
assert type(inner_err) is raises
# maybe check for error txt content
if ensure_in_message:
part: str
err_repr: str = repr(inner_err)
for part in ensure_in_message:
for i, arg in enumerate(inner_err.args):
if part in err_repr:
break
# if part never matches an arg, then we're
# missing a match.
else:
raise ValueError(
'Failed to find error message content?\n\n'
f'expected: {ensure_in_message!r}\n'
f'part: {part!r}\n\n'
f'{inner_err.args}'
)
if post_mortem:
await tractor.post_mortem()
else:
if raises:
raise RuntimeError(
f'Expected a {raises.__name__!r} to be raised?'
)
@tractor.context(
pld_spec=maybe_msg_spec,
)
async def child(
ctx: Context,
started_value: int|PldMsg|None,
return_value: str|None,
validate_pld_spec: bool,
raise_on_started_mte: bool = True,
) -> None:
'''
Call ``Context.started()`` more than once (an error).
'''
expect_started_mte: bool = started_value == 10
# sanity check that child RPC context is the current one
curr_ctx: Context = current_ipc_ctx()
assert ctx is curr_ctx
rx: msgops.PldRx = ctx._pld_rx
curr_pldec: _codec.MsgDec = rx.pld_dec
ctx_meta: dict = getattr(
child,
'_tractor_context_meta',
None,
)
if ctx_meta:
assert (
ctx_meta['pld_spec']
is curr_pldec.spec
is curr_pldec.pld_spec
)
# 2 cases: handle send-side and recv-only validation
# - when `raise_on_started_mte == True`, send validate
# - else, parent-recv-side only validation
mte: MsgTypeError|None = None
try:
await ctx.started(
value=started_value,
validate_pld_spec=validate_pld_spec,
)
except MsgTypeError as _mte:
mte = _mte
log.exception('`started()` raised an MTE!\n')
if not expect_started_mte:
raise RuntimeError(
'Child-ctx-task SHOULD NOT HAVE raised an MTE for\n\n'
f'{started_value!r}\n'
)
boxed_div: str = '------ - ------'
assert boxed_div not in mte._message
assert boxed_div not in mte.tb_str
assert boxed_div not in repr(mte)
assert boxed_div not in str(mte)
mte_repr: str = repr(mte)
for line in mte.message.splitlines():
assert line in mte_repr
# since this is a *local error* there should be no
# boxed traceback content!
assert not mte.tb_str
# propagate to parent?
if raise_on_started_mte:
raise
# no-send-side-error fallthrough
if (
validate_pld_spec
and
expect_started_mte
):
raise RuntimeError(
'Child-ctx-task SHOULD HAVE raised an MTE for\n\n'
f'{started_value!r}\n'
)
assert (
not expect_started_mte
or
not validate_pld_spec
)
# if wait_for_parent_to_cancel:
# ...
#
# ^-TODO-^ logic for diff validation policies on each side:
#
# -[ ] ensure that if we don't validate on the send
# side, that we are eventually error-cancelled by our
# parent due to the bad `Started` payload!
# -[ ] the boxed error should be srced from the parent's
# runtime NOT ours!
# -[ ] we should still error on bad `return_value`s
# despite the parent not yet error-cancelling us?
# |_ how do we want the parent side to look in that
# case?
# -[ ] maybe the equiv of "during handling of the
# above error another occurred" for the case where
# the parent sends a MTE to this child and while
# waiting for the child to terminate it gets back
# the MTE for this case?
#
# XXX should always fail on recv side since we can't
# really do much else beside terminate and relay the
# msg-type-error from this RPC task ;)
return return_value
@pytest.mark.parametrize(
'return_value',
[
'yo',
None,
],
ids=[
'return[invalid-"yo"]',
'return[valid-None]',
],
)
@pytest.mark.parametrize(
'started_value',
[
10,
PldMsg(field='yo'),
],
ids=[
'Started[invalid-10]',
'Started[valid-PldMsg]',
],
)
@pytest.mark.parametrize(
'pld_check_started_value',
[
True,
False,
],
ids=[
'check-started-pld',
'no-started-pld-validate',
],
)
def test_basic_payload_spec(
debug_mode: bool,
loglevel: str,
return_value: str|None,
started_value: int|PldMsg,
pld_check_started_value: bool,
):
'''
Validate the most basic `PldRx` msg-type-spec semantics around
an IPC `Context` endpoint start, started-sync, and final return
value depending on set payload types and the currently applied
pld-spec.
'''
invalid_return: bool = return_value == 'yo'
invalid_started: bool = started_value == 10
async def main():
async with tractor.open_nursery(
debug_mode=debug_mode,
loglevel=loglevel,
) as an:
p: Portal = await an.start_actor(
'child',
enable_modules=[__name__],
)
# since not opened yet.
assert current_ipc_ctx() is None
if invalid_started:
msg_type_str: str = 'Started'
bad_value: int = 10
elif invalid_return:
msg_type_str: str = 'Return'
bad_value: str = 'yo'
else:
# XXX but should never be used below then..
msg_type_str: str = ''
bad_value: str = ''
maybe_mte: MsgTypeError|None = None
should_raise: Exception|None = (
MsgTypeError if (
invalid_return
or
invalid_started
) else None
)
async with (
maybe_expect_raises(
raises=should_raise,
ensure_in_message=[
f"invalid `{msg_type_str}` msg payload",
f'{bad_value}',
f'has type {type(bad_value)!r}',
'not match type-spec',
f'`{msg_type_str}.pld: PldMsg|NoneType`',
],
# only for debug
# post_mortem=True,
),
p.open_context(
child,
return_value=return_value,
started_value=started_value,
validate_pld_spec=pld_check_started_value,
) as (ctx, first),
):
# now opened with 'child' sub
assert current_ipc_ctx() is ctx
assert type(first) is PldMsg
assert first.field == 'yo'
try:
res: None|PldMsg = await ctx.result(hide_tb=False)
assert res is None
except MsgTypeError as mte:
maybe_mte = mte
if not invalid_return:
raise
# expected this invalid `Return.pld` so audit
# the error state + meta-data
assert mte.expected_msg_type is Return
assert mte.cid == ctx.cid
mte_repr: str = repr(mte)
for line in mte.message.splitlines():
assert line in mte_repr
assert mte.tb_str
# await tractor.pause(shield=True)
# verify expected remote mte deats
assert ctx._local_error is None
assert (
mte is
ctx._remote_error is
ctx.maybe_error is
ctx.outcome
)
if should_raise is None:
assert maybe_mte is None
await p.cancel_actor()
trio.run(main)


@ -4,8 +4,8 @@ from itertools import cycle
import pytest
import trio
import tractor
from tractor.testing import tractor_test
from tractor.experimental import msgpub
from tractor._testing import tractor_test
def test_type_checks():
@ -159,7 +159,7 @@ async def test_required_args(callwith_expecterror):
)
def test_multi_actor_subs_arbiter_pub(
loglevel,
arb_addr,
reg_addr,
pub_actor,
):
"""Try out the neato @pub decorator system.
@ -169,7 +169,7 @@ def test_multi_actor_subs_arbiter_pub(
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
enable_modules=[__name__],
) as n:
@ -254,12 +254,12 @@ def test_multi_actor_subs_arbiter_pub(
def test_single_subactor_pub_multitask_subs(
loglevel,
arb_addr,
reg_addr,
):
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
enable_modules=[__name__],
) as n:


@ -34,7 +34,6 @@ def test_resource_only_entered_once(key_on):
global _resource
_resource = 0
kwargs = {}
key = None
if key_on == 'key_value':
key = 'some_common_key'
@ -139,7 +138,7 @@ def test_open_local_sub_to_stream():
N local tasks using ``trionics.maybe_open_context():``.
'''
timeout = 3 if platform.system() != "Windows" else 10
timeout: float = 3.6 if platform.system() != "Windows" else 10
async def main():


@ -0,0 +1,248 @@
'''
Special attention cases for using "infect `asyncio`" mode from a root
actor; i.e. not using a std `trio.run()` bootstrap.
'''
import asyncio
from functools import partial
import pytest
import trio
import tractor
from tractor import (
to_asyncio,
)
from tests.test_infected_asyncio import (
aio_echo_server,
)
@pytest.mark.parametrize(
'raise_error_mid_stream',
[
False,
Exception,
KeyboardInterrupt,
],
ids='raise_error={}'.format,
)
def test_infected_root_actor(
raise_error_mid_stream: bool|Exception,
# conftest wide
loglevel: str,
debug_mode: bool,
):
'''
Verify you can run the `tractor` runtime with `Actor.is_infected_aio() == True`
in the root actor.
'''
async def _trio_main():
with trio.fail_after(2 if not debug_mode else 999):
first: str
chan: to_asyncio.LinkedTaskChannel
async with (
tractor.open_root_actor(
debug_mode=debug_mode,
loglevel=loglevel,
),
to_asyncio.open_channel_from(
aio_echo_server,
) as (first, chan),
):
assert first == 'start'
for i in range(1000):
await chan.send(i)
out = await chan.receive()
assert out == i
print(f'asyncio echoing {i}')
if (
raise_error_mid_stream
and
i == 500
):
raise raise_error_mid_stream
if out is None:
try:
out = await chan.receive()
except trio.EndOfChannel:
break
else:
raise RuntimeError(
'aio channel never stopped?'
)
if raise_error_mid_stream:
with pytest.raises(raise_error_mid_stream):
tractor.to_asyncio.run_as_asyncio_guest(
trio_main=_trio_main,
)
else:
tractor.to_asyncio.run_as_asyncio_guest(
trio_main=_trio_main,
)
async def sync_and_err(
# just signature placeholders for compat with
# ``to_asyncio.open_channel_from()``
to_trio: trio.MemorySendChannel,
from_trio: asyncio.Queue,
ev: asyncio.Event,
):
if to_trio:
to_trio.send_nowait('start')
await ev.wait()
raise RuntimeError('asyncio-side')
@pytest.mark.parametrize(
'aio_err_trigger',
[
'before_start_point',
'after_trio_task_starts',
'after_start_point',
],
ids='aio_err_triggered={}'.format
)
def test_trio_prestarted_task_bubbles(
aio_err_trigger: str,
# conftest wide
loglevel: str,
debug_mode: bool,
):
async def pre_started_err(
raise_err: bool = False,
pre_sleep: float|None = None,
aio_trigger: asyncio.Event|None = None,
task_status=trio.TASK_STATUS_IGNORED,
):
'''
Maybe pre-started error then sleep.
'''
if pre_sleep is not None:
print(f'Sleeping from trio for {pre_sleep!r}s !')
await trio.sleep(pre_sleep)
# signal aio-task to raise JUST AFTER this task
# starts but has not yet `.started()`
if aio_trigger:
print('Signalling aio-task to raise from `trio`!!')
aio_trigger.set()
if raise_err:
print('Raising from trio!')
raise TypeError('trio-side')
task_status.started()
await trio.sleep_forever()
async def _trio_main():
# with trio.fail_after(2):
with trio.fail_after(999):
first: str
chan: to_asyncio.LinkedTaskChannel
aio_ev = asyncio.Event()
async with (
tractor.open_root_actor(
debug_mode=False,
loglevel=loglevel,
),
):
# TODO, tests for this with 3.13 egs?
# from tractor.devx import open_crash_handler
# with open_crash_handler():
async with (
# where we'll start a sub-task that errors BEFORE
# calling `.started()` such that the error should
# bubble before the guest run terminates!
trio.open_nursery() as tn,
# THEN start an infect task which should error just
# after the trio-side's task does.
to_asyncio.open_channel_from(
partial(
sync_and_err,
ev=aio_ev,
)
) as (first, chan),
):
for i in range(5):
pre_sleep: float|None = None
last_iter: bool = (i == 4)
# TODO, missing cases?
# -[ ] error as well on
# 'after_start_point' case as well for
# another case?
raise_err: bool = False
if last_iter:
raise_err: bool = True
# trigger aio task to error on next loop
# tick/checkpoint
if aio_err_trigger == 'before_start_point':
aio_ev.set()
pre_sleep: float = 0
await tn.start(
pre_started_err,
raise_err,
pre_sleep,
(aio_ev if (
aio_err_trigger == 'after_trio_task_starts'
and
last_iter
) else None
),
)
if (
aio_err_trigger == 'after_start_point'
and
last_iter
):
aio_ev.set()
with pytest.raises(
expected_exception=ExceptionGroup,
) as excinfo:
tractor.to_asyncio.run_as_asyncio_guest(
trio_main=_trio_main,
)
eg = excinfo.value
rte_eg, rest_eg = eg.split(RuntimeError)
# ensure the trio-task's error bubbled despite the aio-side
# having (maybe) errored first.
if aio_err_trigger in (
'after_trio_task_starts',
'after_start_point',
):
assert len(errs := rest_eg.exceptions) == 1
typerr = errs[0]
assert (
type(typerr) is TypeError
and
'trio-side' in typerr.args
)
# when aio errors BEFORE (last) trio task is scheduled, we should
# never see anything but the aio-side.
else:
assert len(rtes := rte_eg.exceptions) == 1
assert 'asyncio-side' in rtes[0].args[0]


@ -1,6 +1,8 @@
"""
RPC related
"""
'''
RPC (or maybe better labelled as "RTS: remote task scheduling"?)
related API and error checks.
'''
import itertools
import pytest
@ -13,9 +15,19 @@ async def sleep_back_actor(
func_name,
func_defined,
exposed_mods,
*,
reg_addr: tuple,
):
if actor_name:
async with tractor.find_actor(actor_name) as portal:
async with tractor.find_actor(
actor_name,
# NOTE: must be set manually since
# the subactor doesn't have the reg_addr
# fixture code run in it!
# TODO: maybe we should just set this once in the
# _state mod and derive to all children?
registry_addrs=[reg_addr],
) as portal:
try:
await portal.run(__name__, func_name)
except tractor.RemoteActorError as err:
@ -24,7 +36,7 @@ async def sleep_back_actor(
if not exposed_mods:
expect = tractor.ModuleNotExposed
assert err.type is expect
assert err.boxed_type is expect
raise
else:
await trio.sleep(float('inf'))
@ -42,14 +54,25 @@ async def short_sleep():
(['tmp_mod'], 'import doggy', ModuleNotFoundError),
(['tmp_mod'], '4doggy', SyntaxError),
],
ids=['no_mods', 'this_mod', 'this_mod_bad_func', 'fail_to_import',
'fail_on_syntax'],
ids=[
'no_mods',
'this_mod',
'this_mod_bad_func',
'fail_to_import',
'fail_on_syntax',
],
)
def test_rpc_errors(arb_addr, to_call, testdir):
"""Test errors when making various RPC requests to an actor
def test_rpc_errors(
reg_addr,
to_call,
testdir,
):
'''
Test errors when making various RPC requests to an actor
that either doesn't have the requested module exposed or doesn't define
the named function.
"""
'''
exposed_mods, funcname, inside_err = to_call
subactor_exposed_mods = []
func_defined = globals().get(funcname, False)
@ -77,8 +100,13 @@ def test_rpc_errors(arb_addr, to_call, testdir):
# spawn a subactor which calls us back
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
enable_modules=exposed_mods.copy(),
# NOTE: will halt test in REPL if uncommented, so only
# do that if actually debugging subactor but keep it
# disabled for the test.
# debug_mode=True,
) as n:
actor = tractor.current_actor()
@ -95,6 +123,7 @@ def test_rpc_errors(arb_addr, to_call, testdir):
exposed_mods=exposed_mods,
func_defined=True if func_defined else False,
enable_modules=subactor_exposed_mods,
reg_addr=reg_addr,
)
def run():
@ -105,18 +134,20 @@ def test_rpc_errors(arb_addr, to_call, testdir):
run()
else:
# underlying errors aren't propagated upwards (yet)
with pytest.raises(remote_err) as err:
with pytest.raises(
expected_exception=(remote_err, ExceptionGroup),
) as err:
run()
# get raw instance from pytest wrapper
value = err.value
# might get multiple `trio.Cancelled`s as well inside an inception
if isinstance(value, trio.MultiError):
if isinstance(value, ExceptionGroup):
value = next(itertools.dropwhile(
lambda exc: not isinstance(exc, tractor.RemoteActorError),
value.exceptions
))
if getattr(value, 'type', None):
assert value.type is inside_err
assert value.boxed_type is inside_err


@ -0,0 +1,74 @@
"""
Verifying internal runtime state and undocumented extras.
"""
import os
import pytest
import trio
import tractor
from tractor._testing import tractor_test
_file_path: str = ''
def unlink_file():
print('Removing tmp file!')
os.remove(_file_path)
async def crash_and_clean_tmpdir(
tmp_file_path: str,
error: bool = True,
):
global _file_path
_file_path = tmp_file_path
actor = tractor.current_actor()
actor.lifetime_stack.callback(unlink_file)
assert os.path.isfile(tmp_file_path)
await trio.sleep(0.1)
if error:
assert 0
else:
actor.cancel_soon()
@pytest.mark.parametrize(
'error_in_child',
[True, False],
)
@tractor_test
async def test_lifetime_stack_wipes_tmpfile(
tmp_path,
error_in_child: bool,
):
child_tmp_file = tmp_path / "child.txt"
child_tmp_file.touch()
assert child_tmp_file.exists()
path = str(child_tmp_file)
try:
with trio.move_on_after(0.5):
async with tractor.open_nursery() as n:
await ( # inlined portal
await n.run_in_actor(
crash_and_clean_tmpdir,
tmp_file_path=path,
error=error_in_child,
)
).result()
except (
tractor.RemoteActorError,
# tractor.BaseExceptionGroup,
BaseExceptionGroup,
):
pass
# tmp file should have been wiped by
# teardown stack.
assert not child_tmp_file.exists()
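
The teardown behaviour checked here can be mimicked with `contextlib.ExitStack`, assuming `Actor.lifetime_stack` behaves like one (the diff does not show its type). `run_with_cleanup` and `demo` are illustrative names. The registered callback runs whether the body errors or exits cleanly.

```python
import os
import tempfile
from contextlib import ExitStack


def run_with_cleanup(path: str, error: bool) -> None:
    with ExitStack() as stack:
        # registered callbacks run on exit, error or not
        stack.callback(os.remove, path)
        assert os.path.isfile(path)
        if error:
            raise RuntimeError('crash')


def demo(error: bool) -> bool:
    # create a real tmp file, then report whether it survived
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        run_with_cleanup(path, error)
    except RuntimeError:
        pass
    return os.path.exists(path)
```

In both the `error=True` and `error=False` runs the tmp file is gone afterwards, mirroring the two parametrized cases of `test_lifetime_stack_wipes_tmpfile`.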


@ -1,38 +1,38 @@
"""
Spawning basics
"""
from typing import Dict, Tuple, Optional
from typing import (
Any,
)
import pytest
import trio
import tractor
from conftest import tractor_test
from tractor._testing import tractor_test
data_to_pass_down = {'doggy': 10, 'kitty': 4}
async def spawn(
is_arbiter: bool,
data: Dict,
arb_addr: Tuple[str, int],
data: dict,
reg_addr: tuple[str, int],
):
namespaces = [__name__]
await trio.sleep(0.1)
async with tractor.open_root_actor(
arbiter_addr=arb_addr,
arbiter_addr=reg_addr,
):
actor = tractor.current_actor()
assert actor.is_arbiter == is_arbiter
data = data_to_pass_down
if actor.is_arbiter:
async with tractor.open_nursery(
) as nursery:
async with tractor.open_nursery() as nursery:
# forks here
portal = await nursery.run_in_actor(
@ -40,7 +40,7 @@ async def spawn(
is_arbiter=False,
name='sub-actor',
data=data,
arb_addr=arb_addr,
reg_addr=reg_addr,
enable_modules=namespaces,
)
@ -54,12 +54,14 @@ async def spawn(
return 10
def test_local_arbiter_subactor_global_state(arb_addr):
def test_local_arbiter_subactor_global_state(
reg_addr,
):
result = trio.run(
spawn,
True,
data_to_pass_down,
arb_addr,
reg_addr,
)
assert result == 10
@ -93,7 +95,9 @@ async def test_movie_theatre_convo(start_method):
await portal.cancel_actor()
async def cellar_door(return_value: Optional[str]):
async def cellar_door(
return_value: str|None,
):
return return_value
@ -103,16 +107,18 @@ async def cellar_door(return_value: Optional[str]):
)
@tractor_test
async def test_most_beautiful_word(
start_method,
return_value
start_method: str,
return_value: Any,
debug_mode: bool,
):
'''
The main ``tractor`` routine.
'''
with trio.fail_after(1):
async with tractor.open_nursery() as n:
async with tractor.open_nursery(
debug_mode=debug_mode,
) as n:
portal = await n.run_in_actor(
cellar_door,
return_value=return_value,
@ -139,9 +145,9 @@ async def check_loglevel(level):
def test_loglevel_propagated_to_subactor(
start_method,
capfd,
arb_addr,
reg_addr,
):
if start_method == 'forkserver':
if start_method == 'mp_forkserver':
pytest.skip(
"a bug with `capfd` seems to make forkserver capture not work?")
@ -150,13 +156,13 @@ def test_loglevel_propagated_to_subactor(
async def main():
async with tractor.open_nursery(
name='arbiter',
loglevel=level,
start_method=start_method,
arbiter_addr=arb_addr,
arbiter_addr=reg_addr,
) as tn:
await tn.run_in_actor(
check_loglevel,
loglevel=level,
level=level,
)

@ -2,17 +2,23 @@
Broadcast channels for fan-out to local tasks.
"""
from contextlib import asynccontextmanager
from contextlib import (
asynccontextmanager as acm,
)
from functools import partial
from itertools import cycle
import time
from typing import Optional, List, Tuple
from typing import Optional
import pytest
import trio
from trio.lowlevel import current_task
import tractor
from tractor.trionics import broadcast_receiver, Lagged
from tractor.trionics import (
broadcast_receiver,
Lagged,
collapse_eg,
)
@tractor.context
@ -37,7 +43,7 @@ async def echo_sequences(
async def ensure_sequence(
stream: tractor.ReceiveMsgStream,
stream: tractor.MsgStream,
sequence: list,
delay: Optional[float] = None,
@ -59,21 +65,21 @@ async def ensure_sequence(
break
@asynccontextmanager
@acm
async def open_sequence_streamer(
sequence: List[int],
arb_addr: Tuple[str, int],
sequence: list[int],
reg_addr: tuple[str, int],
start_method: str,
) -> tractor.MsgStream:
async with tractor.open_nursery(
arbiter_addr=arb_addr,
arbiter_addr=reg_addr,
start_method=start_method,
) as tn:
) as an:
portal = await tn.start_actor(
portal = await an.start_actor(
'sequence_echoer',
enable_modules=[__name__],
)
@ -83,14 +89,14 @@ async def open_sequence_streamer(
) as (ctx, first):
assert first is None
async with ctx.open_stream(backpressure=True) as stream:
async with ctx.open_stream(allow_overruns=True) as stream:
yield stream
await portal.cancel_actor()
def test_stream_fan_out_to_local_subscriptions(
arb_addr,
reg_addr,
start_method,
):
@ -100,7 +106,7 @@ def test_stream_fan_out_to_local_subscriptions(
async with open_sequence_streamer(
sequence,
arb_addr,
reg_addr,
start_method,
) as stream:
@ -135,7 +141,7 @@ def test_stream_fan_out_to_local_subscriptions(
]
)
def test_consumer_and_parent_maybe_lag(
arb_addr,
reg_addr,
start_method,
task_delays,
):
@ -147,14 +153,17 @@ def test_consumer_and_parent_maybe_lag(
async with open_sequence_streamer(
sequence,
arb_addr,
reg_addr,
start_method,
) as stream:
try:
async with trio.open_nursery() as n:
async with (
collapse_eg(),
trio.open_nursery() as tn,
):
n.start_soon(
tn.start_soon(
ensure_sequence,
stream,
sequence.copy(),
@ -208,10 +217,11 @@ def test_consumer_and_parent_maybe_lag(
def test_faster_task_to_recv_is_cancelled_by_slower(
arb_addr,
reg_addr,
start_method,
):
'''Ensure that if a faster task consuming from a stream is cancelled
'''
Ensure that if a faster task consuming from a stream is cancelled
the slower task can continue to receive all expected values.
'''
@ -221,13 +231,13 @@ def test_faster_task_to_recv_is_cancelled_by_slower(
async with open_sequence_streamer(
sequence,
arb_addr,
reg_addr,
start_method,
) as stream:
async with trio.open_nursery() as n:
n.start_soon(
async with trio.open_nursery() as tn:
tn.start_soon(
ensure_sequence,
stream,
sequence.copy(),
@ -249,7 +259,7 @@ def test_faster_task_to_recv_is_cancelled_by_slower(
continue
print('cancelling faster subtask')
n.cancel_scope.cancel()
tn.cancel_scope.cancel()
try:
value = await stream.receive()
@ -267,7 +277,7 @@ def test_faster_task_to_recv_is_cancelled_by_slower(
# the faster subtask was cancelled
break
# await tractor.breakpoint()
# await tractor.pause()
# await stream.receive()
print(f'final value: {value}')
@ -298,7 +308,7 @@ def test_subscribe_errors_after_close():
def test_ensure_slow_consumers_lag_out(
arb_addr,
reg_addr,
start_method,
):
'''This is a pure local task test; no tractor
@ -367,13 +377,13 @@ def test_ensure_slow_consumers_lag_out(
f'on {lags}:{value}')
return
async with trio.open_nursery() as nursery:
async with trio.open_nursery() as tn:
for i in range(1, num_laggers):
task_name = f'sub_{i}'
laggers[task_name] = 0
nursery.start_soon(
tn.start_soon(
partial(
sub_and_print,
delay=i*0.001,
@ -409,8 +419,8 @@ def test_ensure_slow_consumers_lag_out(
seq = brx._state.subs[brx.key]
assert seq == len(brx._state.queue) - 1
# all backpressured entries in the underlying
# channel should have been copied into the caster
# all no_overruns entries in the underlying
# channel should have been copied into the bcaster
# queue trailing-window
async for i in rx:
print(f'bped: {i}')
@ -460,3 +470,52 @@ def test_first_recver_is_cancelled():
assert value == 1
trio.run(main)
def test_no_raise_on_lag():
'''
Run a simple 2-task broadcast where one task is slow but configured
so that it does not raise `Lagged` on overruns using
`raise_on_lag=False` and verify that the task does not raise.
'''
size = 100
tx, rx = trio.open_memory_channel(size)
brx = broadcast_receiver(rx, size)
async def slow():
async with brx.subscribe(
raise_on_lag=False,
) as br:
async for msg in br:
print(f'slow task got: {msg}')
await trio.sleep(0.1)
async def fast():
async with brx.subscribe() as br:
async for msg in br:
print(f'fast task got: {msg}')
async def main():
async with (
tractor.open_root_actor(
# NOTE: so we see the warning msg emitted by the bcaster
# internals when the no raise flag is set.
loglevel='warning',
),
collapse_eg(),
trio.open_nursery() as n,
):
n.start_soon(slow)
n.start_soon(fast)
for i in range(1000):
await tx.send(i)
# simulate user nailing ctl-c after realizing
# there's a lag in the slow task.
await trio.sleep(1)
raise KeyboardInterrupt
with pytest.raises(KeyboardInterrupt):
trio.run(main)
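
Reviewer note: the tests above depend on the idea of a trailing-window broadcast where a slow subscriber either gets a `Lagged` error or silently skips ahead. The sketch below is a minimal, synchronous illustration of that idea only. It is not the `tractor.trionics.broadcast_receiver` implementation: the `Broadcaster` class, its method names, and putting `raise_on_lag` on `receive()` (the test passes it to `subscribe()`) are all assumptions made for this example.

```python
from collections import deque


class Lagged(Exception):
    """Raised when a subscriber has fallen behind the retained window."""


class Broadcaster:
    """Hypothetical fan-out buffer keeping only the last `maxlen` values."""

    def __init__(self, maxlen: int) -> None:
        self._queue = deque(maxlen=maxlen)  # trailing window of values
        self._sent = 0                      # total values ever sent
        self._cursors = {}                  # subscriber key -> next absolute index

    def subscribe(self, key: str) -> None:
        # A new subscriber starts at the next value to be sent.
        self._cursors[key] = self._sent

    def send(self, value) -> None:
        # Appending past `maxlen` silently drops the oldest value.
        self._queue.append(value)
        self._sent += 1

    def receive(self, key: str, raise_on_lag: bool = True):
        cursor = self._cursors[key]
        oldest = self._sent - len(self._queue)  # absolute index of queue[0]
        if cursor < oldest:
            # Values were dropped before this subscriber read them:
            # fast-forward it to the oldest value still retained.
            self._cursors[key] = oldest
            if raise_on_lag:
                raise Lagged(f'{key} missed {oldest - cursor} value(s)')
            cursor = oldest
        if cursor >= self._sent:
            raise LookupError('no new values')
        self._cursors[key] = cursor + 1
        return self._queue[cursor - oldest]


bcast = Broadcaster(maxlen=3)
for key in ('slow', 'lenient'):
    bcast.subscribe(key)
for i in range(5):
    bcast.send(i)  # values 0 and 1 fall out of the 3-slot window

try:
    bcast.receive('slow')
    slow_lagged = False
except Lagged:
    slow_lagged = True

# With the flag off the subscriber skips ahead instead of raising.
lenient_value = bcast.receive('lenient', raise_on_lag=False)
```

After a `Lagged` error the subscriber's cursor has already been moved forward, so its next `receive()` call returns the oldest retained value.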

@ -3,9 +3,13 @@ Reminders for oddities in `trio` that we need to stay aware of and/or
want to see changed.
'''
from contextlib import (
asynccontextmanager as acm,
)
import pytest
import trio
from trio_typing import TaskStatus
from trio import TaskStatus
@pytest.mark.parametrize(
@ -60,7 +64,9 @@ def test_stashed_child_nursery(use_start_soon):
async def main():
async with (
trio.open_nursery() as pn,
trio.open_nursery(
strict_exception_groups=False,
) as pn,
):
cn = await pn.start(mk_child_nursery)
assert cn
@ -80,3 +86,118 @@ def test_stashed_child_nursery(use_start_soon):
with pytest.raises(NameError):
trio.run(main)
@pytest.mark.parametrize(
('unmask_from_canc', 'canc_from_finally'),
[
(True, False),
(True, True),
pytest.param(False, True,
marks=pytest.mark.xfail(reason="never raises!")
),
],
# TODO, ask ronny how to impl this .. XD
# ids='unmask_from_canc={0}, canc_from_finally={1}',#.format,
)
def test_acm_embedded_nursery_propagates_enter_err(
canc_from_finally: bool,
unmask_from_canc: bool,
debug_mode: bool,
):
'''
Demo how a masking `trio.Cancelled` could be handled by unmasking from the
`.__context__` field when a user (by accident) re-raises from a `finally:`.
'''
import tractor
@acm
async def maybe_raise_from_masking_exc(
tn: trio.Nursery,
unmask_from: BaseException|None = trio.Cancelled
# TODO, maybe offer a collection?
# unmask_from: set[BaseException] = {
# trio.Cancelled,
# },
):
if not unmask_from:
yield
return
try:
yield
except* unmask_from as be_eg:
# TODO, if we offer `unmask_from: set`
# for masker_exc_type in unmask_from:
matches, rest = be_eg.split(unmask_from)
if not matches:
raise
for exc_match in be_eg.exceptions:
if (
(exc_ctx := exc_match.__context__)
and
type(exc_ctx) not in {
# trio.Cancelled, # always by default?
unmask_from,
}
):
exc_ctx.add_note(
f'\n'
f'WARNING: the above error was masked by a {unmask_from!r} !?!\n'
f'Are you always cancelling? Say from a `finally:` ?\n\n'
f'{tn!r}'
)
raise exc_ctx from exc_match
@acm
async def wraps_tn_that_always_cancels():
async with (
trio.open_nursery() as tn,
maybe_raise_from_masking_exc(
tn=tn,
unmask_from=(
trio.Cancelled
if unmask_from_canc
else None
),
)
):
try:
yield tn
finally:
if canc_from_finally:
tn.cancel_scope.cancel()
await trio.lowlevel.checkpoint()
async def _main():
with tractor.devx.maybe_open_crash_handler(
pdb=debug_mode,
) as bxerr:
assert not bxerr.value
async with (
wraps_tn_that_always_cancels() as tn,
):
assert not tn.cancel_scope.cancel_called
assert 0
assert (
(err := bxerr.value)
and
type(err) is AssertionError
)
with pytest.raises(ExceptionGroup) as excinfo:
trio.run(_main)
eg: ExceptionGroup = excinfo.value
assert_eg, rest_eg = eg.split(AssertionError)
assert len(assert_eg.exceptions) == 1
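
Reviewer note: the unmasking in the test above relies on a plain Python rule. When a new exception is raised inside a `finally:` block while another is in flight, the original is kept on the new exception's `__context__`. The sketch below shows just that rule using the standard library. The `Cancelled` class is a stand-in for `trio.Cancelled` and `unmask()` is a hypothetical helper, neither is part of `trio` or `tractor`.

```python
class Cancelled(Exception):
    """Stand-in for a cancellation error such as `trio.Cancelled`."""


def body_that_gets_masked():
    try:
        raise AssertionError('the real bug')
    finally:
        # Raising here replaces the in-flight AssertionError; Python
        # stores the replaced error on the new one's `__context__`.
        raise Cancelled('raised during cleanup')


def unmask(exc: BaseException, masker: type) -> BaseException:
    """Return the error hidden behind a `masker` instance, if any."""
    ctx = exc.__context__
    if (
        isinstance(exc, masker)
        and ctx is not None
        and not isinstance(ctx, masker)
    ):
        return ctx
    return exc


try:
    body_that_gets_masked()
except Cancelled as err:
    recovered = unmask(err, Cancelled)
```

Here `recovered` is the original `AssertionError`, which is the error a user would normally want reported instead of the cancellation that masked it.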

@ -1,7 +0,0 @@
[tool.towncrier]
package = "tractor"
filename = "NEWS.rst"
directory = "nooz/"
title_format = "tractor {version} ({project_date})"
version = "0.1.0a4"
template = "nooz/_template.rst"

@ -15,64 +15,56 @@
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
tractor: structured concurrent "actors".
tractor: structured concurrent ``trio``-"actors".
"""
from trio import MultiError
from ._clustering import open_actor_cluster
from ._ipc import Channel
from ._clustering import (
open_actor_cluster as open_actor_cluster,
)
from ._context import (
Context as Context, # the type
context as context, # a func-decorator
)
from ._streaming import (
Context,
ReceiveMsgStream,
MsgStream,
stream,
context,
MsgStream as MsgStream,
stream as stream,
)
from ._discovery import (
get_arbiter,
find_actor,
wait_for_actor,
query_actor,
get_registry as get_registry,
find_actor as find_actor,
wait_for_actor as wait_for_actor,
query_actor as query_actor,
)
from ._supervise import (
open_nursery as open_nursery,
ActorNursery as ActorNursery,
)
from ._state import (
current_actor as current_actor,
is_root_process as is_root_process,
current_ipc_ctx as current_ipc_ctx,
debug_mode as debug_mode
)
from ._supervise import open_nursery
from ._state import current_actor, is_root_process
from ._exceptions import (
RemoteActorError,
ModuleNotExposed,
ContextCancelled,
ContextCancelled as ContextCancelled,
ModuleNotExposed as ModuleNotExposed,
MsgTypeError as MsgTypeError,
RemoteActorError as RemoteActorError,
TransportClosed as TransportClosed,
)
from ._debug import breakpoint, post_mortem
from . import msg
from ._root import run, run_daemon, open_root_actor
from ._portal import Portal
__all__ = [
'Channel',
'Context',
'ContextCancelled',
'ModuleNotExposed',
'MsgStream',
'MultiError',
'Portal',
'ReceiveMsgStream',
'RemoteActorError',
'breakpoint',
'context',
'current_actor',
'find_actor',
'get_arbiter',
'is_root_process',
'msg',
'open_actor_cluster',
'open_nursery',
'open_root_actor',
'post_mortem',
'query_actor',
'run',
'run_daemon',
'stream',
'to_asyncio',
'wait_for_actor',
]
from .devx import (
breakpoint as breakpoint,
pause as pause,
pause_from_sync as pause_from_sync,
post_mortem as post_mortem,
)
from . import msg as msg
from ._root import (
run_daemon as run_daemon,
open_root_actor as open_root_actor,
)
from ._ipc import Channel as Channel
from ._portal import Portal as Portal
from ._runtime import Actor as Actor
# from . import hilevel as hilevel

File diff suppressed because it is too large

@ -18,13 +18,11 @@
This is the "bootloader" for actors started using the native trio backend.
"""
import sys
import trio
import argparse
from ast import literal_eval
from ._actor import Actor
from ._runtime import Actor
from ._entry import _trio_main
@ -37,9 +35,8 @@ def parse_ipaddr(arg):
return (str(host), int(port))
from ._entry import _trio_main
if __name__ == "__main__":
__tracebackhide__: bool = True
parser = argparse.ArgumentParser()
parser.add_argument("--uid", type=parse_uid)

@ -19,10 +19,13 @@ Actor cluster helpers.
'''
from __future__ import annotations
from contextlib import asynccontextmanager as acm
from contextlib import (
asynccontextmanager as acm,
)
from multiprocessing import cpu_count
from typing import AsyncGenerator, Optional
from typing import (
AsyncGenerator,
)
import trio
import tractor
@ -32,9 +35,12 @@ import tractor
async def open_actor_cluster(
modules: list[str],
count: int = cpu_count(),
names: Optional[list[str]] = None,
start_method: Optional[str] = None,
names: list[str] | None = None,
hard_kill: bool = False,
# passed through verbatim to ``open_root_actor()``
**runtime_kwargs,
) -> AsyncGenerator[
dict[str, tractor.Portal],
None,
@ -49,7 +55,9 @@ async def open_actor_cluster(
raise ValueError(
f'Number of names is {len(names)} but count is {count}')
async with tractor.open_nursery(start_method=start_method) as an:
async with tractor.open_nursery(
**runtime_kwargs,
) as an:
async with trio.open_nursery() as n:
uid = tractor.current_actor().uid
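
Reviewer note: the hunk above replaces the explicit `start_method` parameter with `**runtime_kwargs` that are forwarded verbatim to the nursery opener. The sketch below shows that pass-through pattern with synchronous stand-ins. `open_runtime()` and `open_cluster()` are hypothetical names invented for this example and do not mirror the real `tractor` signatures.

```python
from contextlib import contextmanager


@contextmanager
def open_runtime(start_method: str = 'trio', loglevel=None):
    # Hypothetical stand-in for the lower-level opener that owns
    # the actual runtime options.
    yield {'start_method': start_method, 'loglevel': loglevel}


@contextmanager
def open_cluster(count: int, **runtime_kwargs):
    # The wrapper does not enumerate runtime options itself; anything
    # it does not recognise is handed through untouched, so new
    # options need no change here.
    with open_runtime(**runtime_kwargs) as runtime:
        yield [f'worker_{i}' for i in range(count)], runtime


with open_cluster(2, loglevel='info') as (workers, runtime):
    summary = (workers, runtime)
```

One trade-off of this design is that a misspelled option only fails inside the inner call, so the error message names the inner function rather than the wrapper.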

2618
tractor/_context.py 100644

File diff suppressed because it is too large

@ -1,652 +0,0 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Multi-core debugging for da peeps!
"""
import bdb
import sys
from functools import partial
from contextlib import asynccontextmanager as acm
from typing import (
Tuple,
Optional,
Callable,
AsyncIterator,
AsyncGenerator,
)
import tractor
import trio
from trio_typing import TaskStatus
from .log import get_logger
from . import _state
from ._discovery import get_root
from ._state import is_root_process, debug_mode
from ._exceptions import is_multi_cancelled
try:
# wtf: only exported when installed in dev mode?
import pdbpp
except ImportError:
# pdbpp is installed in regular mode...it monkey patches stuff
import pdb
assert pdb.xpm, "pdbpp is not installed?" # type: ignore
pdbpp = pdb
log = get_logger(__name__)
__all__ = ['breakpoint', 'post_mortem']
# TODO: wrap all these in a static global class: ``DebugLock`` maybe?
# placeholder for function to set a ``trio.Event`` on debugger exit
_pdb_release_hook: Optional[Callable] = None
# actor-wide variable pointing to current task name using debugger
_local_task_in_debug: Optional[str] = None
# actor tree-wide actor uid that supposedly has the tty lock
_global_actor_in_debug: Optional[Tuple[str, str]] = None
# lock in root actor preventing multi-access to local tty
_debug_lock: trio.StrictFIFOLock = trio.StrictFIFOLock()
_local_pdb_complete: Optional[trio.Event] = None
_no_remote_has_tty: Optional[trio.Event] = None
# XXX: set by the current task waiting on the root tty lock
# and must be cancelled if this actor is cancelled via message
# otherwise deadlocks with the parent actor may ensue
_debugger_request_cs: Optional[trio.CancelScope] = None
class TractorConfig(pdbpp.DefaultConfig):
"""Custom ``pdbpp`` goodness.
"""
# sticky_by_default = True
class PdbwTeardown(pdbpp.Pdb):
"""Add teardown hooks to the regular ``pdbpp.Pdb``.
"""
# override the pdbpp config with our coolio one
DefaultConfig = TractorConfig
# TODO: figure out how to disallow recursive .set_trace() entry
# since that'll cause deadlock for us.
def set_continue(self):
try:
super().set_continue()
finally:
global _local_task_in_debug
_local_task_in_debug = None
_pdb_release_hook()
def set_quit(self):
try:
super().set_quit()
finally:
global _local_task_in_debug
_local_task_in_debug = None
_pdb_release_hook()
# TODO: will be needed whenever we get to true remote debugging.
# XXX see https://github.com/goodboy/tractor/issues/130
# # TODO: is there some way to determine this programmatically?
# _pdb_exit_patterns = tuple(
# str.encode(patt + "\n") for patt in (
# 'c', 'cont', 'continue', 'q', 'quit')
# )
# def subactoruid2proc(
# actor: 'Actor', # noqa
# uid: Tuple[str, str]
# ) -> trio.Process:
# n = actor._actoruid2nursery[uid]
# _, proc, _ = n._children[uid]
# return proc
# async def hijack_stdin():
# log.info(f"Hijacking stdin from {actor.uid}")
# trap std in and relay to subproc
# async_stdin = trio.wrap_file(sys.stdin)
# async with aclosing(async_stdin):
# async for msg in async_stdin:
# log.runtime(f"Stdin input:\n{msg}")
# # encode to bytes
# bmsg = str.encode(msg)
# # relay bytes to subproc over pipe
# # await proc.stdin.send_all(bmsg)
# if bmsg in _pdb_exit_patterns:
# log.info("Closing stdin hijack")
# break
@acm
async def _acquire_debug_lock(
uid: Tuple[str, str]
) -> AsyncIterator[trio.StrictFIFOLock]:
'''Acquire a root-actor local FIFO lock which tracks mutex access of
the process tree's global debugger breakpoint.
This lock avoids tty clobbering (by preventing multiple processes
reading from stdstreams) and ensures multi-actor, sequential access
to the ``pdb`` repl.
'''
global _debug_lock, _global_actor_in_debug, _no_remote_has_tty
task_name = trio.lowlevel.current_task().name
log.debug(
f"Attempting to acquire TTY lock, remote task: {task_name}:{uid}"
)
we_acquired = False
if _no_remote_has_tty is None:
# mark the tty lock as being in use so that the runtime
# can try to avoid clobbering any connection from a child
# that's currently relying on it.
_no_remote_has_tty = trio.Event()
try:
log.debug(
f"entering lock checkpoint, remote task: {task_name}:{uid}"
)
we_acquired = True
await _debug_lock.acquire()
_global_actor_in_debug = uid
log.debug(f"TTY lock acquired, remote task: {task_name}:{uid}")
# NOTE: critical section: this yield is unshielded!
# IF we received a cancel during the shielded lock entry of some
# next-in-queue requesting task, then the resumption here will
# result in that ``trio.Cancelled`` being raised to our caller
# (likely from ``_hijack_stdin_for_child()`` below)! In
# this case the ``finally:`` below should trigger and the
# surrounding caller side context should cancel normally
# relaying back to the caller.
yield _debug_lock
finally:
# if _global_actor_in_debug == uid:
if we_acquired and _debug_lock.locked():
_debug_lock.release()
# IFF there are no more requesting tasks queued up, fire the
# "tty-unlocked" event, thereby alerting any monitors of the lock that
# we are now back in the "tty unlocked" state. This is basically
# an edge-triggered signal around an empty queue of sub-actor
# tasks that may have tried to acquire the lock.
stats = _debug_lock.statistics()
if (
not stats.owner
):
log.debug(f"No more tasks waiting on tty lock! says {uid}")
_no_remote_has_tty.set()
_no_remote_has_tty = None
_global_actor_in_debug = None
log.debug(f"TTY lock released, remote task: {task_name}:{uid}")
def handler(signum, frame, *args):
"""Specialized debugger compatible SIGINT handler.
In children we always ignore it to avoid deadlocks since cancellation
should always be managed by the parent supervising actor. The root
is always cancelled on ctrl-c.
"""
if is_root_process():
tractor.current_actor().cancel_soon()
else:
print(
"tractor ignores SIGINT while in debug mode\n"
"If you have a special need for it please open an issue.\n"
)
@tractor.context
async def _hijack_stdin_for_child(
ctx: tractor.Context,
subactor_uid: Tuple[str, str]
) -> str:
'''
Hijack the tty in the root process of an actor tree such that
the pdbpp debugger console can be allocated to a sub-actor for repl
bossing.
'''
task_name = trio.lowlevel.current_task().name
# TODO: when we get to true remote debugging
# this will deliver stdin data?
log.debug(
"Attempting to acquire TTY lock\n"
f"remote task: {task_name}:{subactor_uid}"
)
log.debug(f"Actor {subactor_uid} is WAITING on stdin hijack lock")
with trio.CancelScope(shield=True):
try:
lock = None
async with _acquire_debug_lock(subactor_uid) as lock:
# indicate to child that we've locked stdio
await ctx.started('Locked')
log.debug(f"Actor {subactor_uid} acquired stdin hijack lock")
# wait for unlock pdb by child
async with ctx.open_stream() as stream:
assert await stream.receive() == 'pdb_unlock'
# try:
# assert await stream.receive() == 'pdb_unlock'
except (
# BaseException,
trio.MultiError,
trio.BrokenResourceError,
trio.Cancelled, # by local cancellation
trio.ClosedResourceError, # by self._rx_chan
) as err:
# XXX: there may be a race with the portal teardown
# with the calling actor which we can safely ignore.
# The alternative would be sending an ack message
# and allowing the client to wait for us to teardown
# first?
if lock and lock.locked():
lock.release()
if isinstance(err, trio.Cancelled):
raise
finally:
log.debug(
"TTY lock released, remote task:"
f"{task_name}:{subactor_uid}")
return "pdb_unlock_complete"
async def wait_for_parent_stdin_hijack(
actor_uid: Tuple[str, str],
task_status: TaskStatus[trio.CancelScope] = trio.TASK_STATUS_IGNORED
):
'''
Connect to the root actor via a ctx and invoke a task which locks
a root-local TTY lock.
This function is used by any sub-actor to acquire mutex access to
pdb and the root's TTY for interactive debugging (see below inside
``_breakpoint()``). It can be used to ensure that an intermediate
nursery-owning actor does not clobber its children if they are in
debug (see below inside ``maybe_wait_for_debugger()``).
'''
global _debugger_request_cs
with trio.CancelScope(shield=True) as cs:
_debugger_request_cs = cs
try:
async with get_root() as portal:
# this syncs to child's ``Context.started()`` call.
async with portal.open_context(
tractor._debug._hijack_stdin_for_child,
subactor_uid=actor_uid,
) as (ctx, val):
log.pdb('locked context')
assert val == 'Locked'
async with ctx.open_stream() as stream:
# unblock local caller
task_status.started(cs)
try:
assert _local_pdb_complete
await _local_pdb_complete.wait()
finally:
# TODO: shielding currently can cause hangs...
with trio.CancelScope(shield=True):
await stream.send('pdb_unlock')
# sync with callee termination
assert await ctx.result() == "pdb_unlock_complete"
except tractor.ContextCancelled:
log.warning('Root actor cancelled debug lock')
finally:
log.debug(f"Exiting debugger for actor {actor_uid}")
global _local_task_in_debug
_local_task_in_debug = None
log.debug(f"Child {actor_uid} released parent stdio lock")
async def _breakpoint(
debug_func,
# TODO:
# shield: bool = False
) -> None:
'''``tractor`` breakpoint entry for engaging pdb machinery
in the root or a subactor.
'''
# TODO: is it possible to debug a trio.Cancelled except block?
# right now it seems like we can kinda do it by shielding
# around ``tractor.breakpoint()`` but not if we move the shielded
# scope here???
# with trio.CancelScope(shield=shield):
actor = tractor.current_actor()
task_name = trio.lowlevel.current_task().name
global _local_pdb_complete, _pdb_release_hook
global _local_task_in_debug, _global_actor_in_debug
await trio.lowlevel.checkpoint()
if not _local_pdb_complete or _local_pdb_complete.is_set():
_local_pdb_complete = trio.Event()
# TODO: need a more robust check for the "root" actor
if actor._parent_chan and not is_root_process():
if _local_task_in_debug:
if _local_task_in_debug == task_name:
# this task already has the lock and is
# likely recurrently entering a breakpoint
return
# if **this** actor is already in debug mode block here
# waiting for the control to be released - this allows
# support for recursive entries to `tractor.breakpoint()`
log.warning(f"{actor.uid} already has a debug lock, waiting...")
await _local_pdb_complete.wait()
await trio.sleep(0.1)
# mark local actor as "in debug mode" to avoid recurrent
# entries/requests to the root process
_local_task_in_debug = task_name
# assign unlock callback for debugger teardown hooks
_pdb_release_hook = _local_pdb_complete.set
# this **must** be awaited by the caller and is done using the
# root nursery so that the debugger can continue to run without
# being restricted by the scope of a new task nursery.
# NOTE: if we want to debug a trio.Cancelled triggered exception
# we have to figure out how to avoid having the service nursery
# cancel on this task start? I *think* this works below?
# actor._service_n.cancel_scope.shield = shield
with trio.CancelScope(shield=True):
await actor._service_n.start(
wait_for_parent_stdin_hijack,
actor.uid,
)
elif is_root_process():
# we also wait in the root-parent for any child that
# may have the tty locked prior
global _debug_lock
# TODO: wait, what about multiple root tasks acquiring it though?
# root process (us) already has it; ignore
if _global_actor_in_debug == actor.uid:
return
# XXX: since we need to enter pdb synchronously below,
# we have to release the lock manually from pdb completion
# callbacks. Can't think of a nicer way than this atm.
if _debug_lock.locked():
log.warning(
'Root actor attempting to shield-acquire active tty lock'
f' owned by {_global_actor_in_debug}')
# must shield here to avoid hitting a ``Cancelled`` and
# a child getting stuck bc we clobbered the tty
with trio.CancelScope(shield=True):
await _debug_lock.acquire()
else:
# may be cancelled
await _debug_lock.acquire()
_global_actor_in_debug = actor.uid
_local_task_in_debug = task_name
# the lock must be released on pdb completion
def teardown():
global _local_pdb_complete, _debug_lock
global _global_actor_in_debug, _local_task_in_debug
_debug_lock.release()
_global_actor_in_debug = None
_local_task_in_debug = None
_local_pdb_complete.set()
_pdb_release_hook = teardown
# block here one (at the appropriate frame *up*) where
# ``breakpoint()`` was awaited and begin handling stdio.
log.debug("Entering the synchronous world of pdb")
debug_func(actor)
def _mk_pdb() -> PdbwTeardown:
# XXX: setting these flags on the pdb instance are absolutely
# critical to having ctrl-c work in the ``trio`` standard way! The
# stdlib's pdb supports entering the current sync frame on a SIGINT,
# with ``trio`` we pretty much never want this and if we did we can
# handle it in the ``tractor`` task runtime.
pdb = PdbwTeardown()
pdb.allow_kbdint = True
pdb.nosigint = True
return pdb
def _set_trace(actor=None):
pdb = _mk_pdb()
if actor is not None:
log.pdb(f"\nAttaching pdb to actor: {actor.uid}\n")
pdb.set_trace(
# start 2 levels up in user code
frame=sys._getframe().f_back.f_back,
)
else:
# we entered the global ``breakpoint()`` built-in from sync code
global _local_task_in_debug, _pdb_release_hook
_local_task_in_debug = 'sync'
def nuttin():
pass
_pdb_release_hook = nuttin
pdb.set_trace(
# start 1 level up in user code
frame=sys._getframe().f_back,
)
breakpoint = partial(
_breakpoint,
_set_trace,
)
def _post_mortem(actor):
log.pdb(f"\nAttaching to pdb in crashed actor: {actor.uid}\n")
pdb = _mk_pdb()
# custom Pdb post-mortem entry
pdbpp.xpm(Pdb=lambda: pdb)
post_mortem = partial(
_breakpoint,
_post_mortem,
)
async def _maybe_enter_pm(err):
if (
debug_mode()
# NOTE: don't enter debug mode recursively after quitting pdb
# Iow, don't re-enter the repl if the `quit` command was issued
# by the user.
and not isinstance(err, bdb.BdbQuit)
# XXX: if the error is the likely result of runtime-wide
# cancellation, we don't want to enter the debugger since
# there's races between when the parent actor has killed all
# comms and when the child tries to contact said parent to
# acquire the tty lock.
# Really we just want to mostly avoid catching KBIs here so there
# might be a simpler check we can do?
and not is_multi_cancelled(err)
):
log.debug("Actor crashed, entering debug mode")
await post_mortem()
return True
else:
return False
@acm
async def acquire_debug_lock(
subactor_uid: Tuple[str, str],
) -> AsyncGenerator[None, tuple]:
'''
Grab root's debug lock on entry, release on exit.
This helper is for actors that don't actually need
to acquire the debugger but want to wait until the
lock is free in the tree root.
'''
if not debug_mode():
yield None
return
async with trio.open_nursery() as n:
cs = await n.start(
wait_for_parent_stdin_hijack,
subactor_uid,
)
yield None
cs.cancel()
async def maybe_wait_for_debugger(
poll_steps: int = 2,
poll_delay: float = 0.1,
child_in_debug: bool = False,
) -> None:
if not debug_mode() and not child_in_debug:
return
if (
is_root_process()
):
global _no_remote_has_tty, _global_actor_in_debug, _wait_all_tasks_lock
# If we error in the root but the debugger is
# engaged we don't want to prematurely kill (and
# thus clobber access to) the local tty since it
# will make the pdb repl unusable.
# Instead try to wait for pdb to be released before
# tearing down.
sub_in_debug = None
for _ in range(poll_steps):
if _global_actor_in_debug:
sub_in_debug = tuple(_global_actor_in_debug)
log.debug(
'Root polling for debug')
with trio.CancelScope(shield=True):
await trio.sleep(poll_delay)
# TODO: could this make things more deterministic? wait
# to see if a sub-actor task will be scheduled and grab
# the tty lock on the next tick?
# XXX: doesn't seem to work
# await trio.testing.wait_all_tasks_blocked(cushion=0)
debug_complete = _no_remote_has_tty
if (
(debug_complete and
not debug_complete.is_set())
):
log.debug(
'Root has errored but pdb is in use by '
f'child {sub_in_debug}\n'
'Waiting on tty lock to release..')
await debug_complete.wait()
await trio.sleep(poll_delay)
continue
else:
log.debug(
'Root acquired TTY LOCK'
)
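
Reviewer note: the removed `_acquire_debug_lock()` above follows a common shape: acquire a FIFO lock, record which actor holds the tty, yield, then always release and clear the holder in `finally:`. The sketch below shows that shape only. It uses the standard library's `asyncio.Lock` as a stand-in for `trio.StrictFIFOLock`, and `acquire_tty()`, `use_repl()` and the `state` dict are hypothetical names for this example.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def acquire_tty(lock: asyncio.Lock, state: dict, uid: str):
    await lock.acquire()
    state['holder'] = uid  # who currently owns the (pretend) tty
    try:
        yield
    finally:
        # Always clear the holder and release, even if the body
        # raised or was cancelled.
        state['holder'] = None
        lock.release()


async def use_repl(lock: asyncio.Lock, state: dict, uid: str, log: list):
    async with acquire_tty(lock, state, uid):
        log.append(f'{uid} enter')
        await asyncio.sleep(0)  # let other tasks run while "in the REPL"
        assert state['holder'] == uid
        log.append(f'{uid} exit')


async def main():
    lock = asyncio.Lock()
    state = {'holder': None}
    log = []
    await asyncio.gather(
        *(use_repl(lock, state, uid, log) for uid in ('a', 'b', 'c'))
    )
    return log, state


log, state = asyncio.run(main())
```

Each task's enter and exit entries end up adjacent in `log`, since no second task can enter while the lock is held.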

@ -15,46 +15,71 @@
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Actor discovery API.
Discovery (protocols) API for automatic addressing and location
management of (service) actors.
"""
from typing import Tuple, Optional, Union, AsyncGenerator
from __future__ import annotations
from typing import (
AsyncGenerator,
AsyncContextManager,
TYPE_CHECKING,
)
from contextlib import asynccontextmanager as acm
from tractor.log import get_logger
from .trionics import gather_contexts
from ._ipc import _connect_chan, Channel
from ._portal import (
Portal,
open_portal,
LocalPortal,
)
from ._state import current_actor, _runtime_vars
from ._state import (
current_actor,
_runtime_vars,
)
if TYPE_CHECKING:
from ._runtime import Actor
log = get_logger(__name__)
@acm
async def get_arbiter(
async def get_registry(
host: str,
port: int,
) -> AsyncGenerator[Union[Portal, LocalPortal], None]:
'''Return a portal instance connected to a local or remote
arbiter.
) -> AsyncGenerator[
Portal | LocalPortal | None,
None,
]:
'''
actor = current_actor()
Return a portal instance connected to a local or remote
registry-service actor; if a connection already exists re-use it
(presumably to call a `.register_actor()` registry runtime RPC
ep).
if not actor:
raise RuntimeError("No actor instance has been defined yet?")
if actor.is_arbiter:
'''
actor: Actor = current_actor()
if actor.is_registrar:
# we're already the arbiter
# (likely a re-entrant call from the arbiter actor)
yield LocalPortal(actor, Channel((host, port)))
yield LocalPortal(
actor,
Channel((host, port))
)
else:
async with _connect_chan(host, port) as chan:
# TODO: try to look up a pre-existing connection from
# `Actor._peers` and use it instead?
async with (
_connect_chan(host, port) as chan,
open_portal(chan) as regstr_ptl,
):
yield regstr_ptl
async with open_portal(chan) as arb_portal:
yield arb_portal
@acm
@ -62,51 +87,125 @@ async def get_root(
**kwargs,
) -> AsyncGenerator[Portal, None]:
# TODO: rename mailbox to `_root_maddr` when we finally
# add and impl libp2p multi-addrs?
host, port = _runtime_vars['_root_mailbox']
assert host is not None
async with _connect_chan(host, port) as chan:
async with open_portal(chan, **kwargs) as portal:
yield portal
async with (
_connect_chan(host, port) as chan,
open_portal(chan, **kwargs) as portal,
):
yield portal
def get_peer_by_name(
name: str,
# uuid: str|None = None,
) -> list[Channel]|None: # at least 1
'''
Scan for an existing connection (set) to a named actor
and return any channels from `Actor._peers`.
This is an optimization method over querying the registrar for
the same info.
'''
actor: Actor = current_actor()
to_scan: dict[tuple, list[Channel]] = actor._peers.copy()
pchan: Channel|None = actor._parent_chan
if pchan:
to_scan[pchan.uid].append(pchan)
for aid, chans in to_scan.items():
_, peer_name = aid
if name == peer_name:
if not chans:
log.warning(
f'No IPC chans for matching peer {peer_name}\n'
)
continue
return chans
return None
@acm
async def query_actor(
name: str,
arbiter_sockaddr: Optional[tuple[str, int]] = None,
regaddr: tuple[str, int]|None = None,
) -> AsyncGenerator[tuple[str, int], None]:
) -> AsyncGenerator[
tuple[str, int]|None,
None,
]:
'''
Simple address lookup for a given actor name.
Lookup a transport address (by actor name) via querying a registrar
listening @ `regaddr`.
Returns the (socket) address or ``None``.
Returns the transport protocol (socket) address or `None` if no
entry under that name exists.
'''
actor = current_actor()
async with get_arbiter(
*arbiter_sockaddr or actor._arb_addr
) as arb_portal:
actor: Actor = current_actor()
if (
name == 'registrar'
and actor.is_registrar
):
raise RuntimeError(
'The current actor IS the registry!?'
)
sockaddr = await arb_portal.run_from_ns(
maybe_peers: list[Channel]|None = get_peer_by_name(name)
if maybe_peers:
yield maybe_peers[0].raddr
return
reg_portal: Portal
regaddr: tuple[str, int] = regaddr or actor.reg_addrs[0]
async with get_registry(*regaddr) as reg_portal:
# TODO: return portals to all available actors - for now
# just the last one that registered
sockaddr: tuple[str, int] = await reg_portal.run_from_ns(
'self',
'find_actor',
name=name,
)
yield sockaddr
# TODO: return portals to all available actors - for now just
# the last one that registered
if name == 'arbiter' and actor.is_arbiter:
raise RuntimeError("The current actor is the arbiter")
yield sockaddr if sockaddr else None
@acm
async def maybe_open_portal(
addr: tuple[str, int],
name: str,
):
async with query_actor(
name=name,
regaddr=addr,
) as sockaddr:
pass
if sockaddr:
async with _connect_chan(*sockaddr) as chan:
async with open_portal(chan) as portal:
yield portal
else:
yield None
@acm
async def find_actor(
name: str,
arbiter_sockaddr: Tuple[str, int] = None
registry_addrs: list[tuple[str, int]]|None = None,
) -> AsyncGenerator[Optional[Portal], None]:
only_first: bool = True,
raise_on_none: bool = False,
) -> AsyncGenerator[
Portal | list[Portal] | None,
None,
]:
'''
Ask the arbiter to find actor(s) by name.
@ -114,39 +213,102 @@ async def find_actor(
known to the arbiter.
'''
async with query_actor(
name=name,
arbiter_sockaddr=arbiter_sockaddr,
) as sockaddr:
# optimization path, use any pre-existing peer channel
maybe_peers: list[Channel]|None = get_peer_by_name(name)
if maybe_peers and only_first:
async with open_portal(maybe_peers[0]) as peer_portal:
yield peer_portal
return
if sockaddr:
async with _connect_chan(*sockaddr) as chan:
async with open_portal(chan) as portal:
yield portal
else:
if not registry_addrs:
# XXX NOTE: make sure to dynamically read the value on
# every call since something may change it globally (eg.
# like in our discovery test suite)!
from . import _root
registry_addrs = (
_runtime_vars['_registry_addrs']
or
_root._default_lo_addrs
)
maybe_portals: list[
AsyncContextManager[tuple[str, int]]
] = list(
maybe_open_portal(
addr=addr,
name=name,
)
for addr in registry_addrs
)
portals: list[Portal]
async with gather_contexts(
mngrs=maybe_portals,
) as portals:
# log.runtime(
# 'Gathered portals:\n'
# f'{portals}'
# )
# NOTE: `gather_contexts()` will return a
# `tuple[None, None, ..., None]` if no contact
# can be made with any registrar at any of the
# N provided addrs!
if not any(portals):
if raise_on_none:
raise RuntimeError(
f'No actor "{name}" found registered @ {registry_addrs}'
)
yield None
return
portals: list[Portal] = list(portals)
if only_first:
yield portals[0]
else:
# TODO: currently this may return multiple portals
# given there are multi-homed or multiple registrars..
# SO, we probably need de-duplication logic?
yield portals
@acm
async def wait_for_actor(
name: str,
arbiter_sockaddr: Tuple[str, int] = None
registry_addr: tuple[str, int] | None = None,
) -> AsyncGenerator[Portal, None]:
"""Wait on an actor to register with the arbiter.
'''
Wait on at least one peer actor to register `name` with the
registrar, yield a `Portal` to the first registree.
A portal to the first registered actor is returned.
"""
actor = current_actor()
'''
actor: Actor = current_actor()
async with get_arbiter(
*arbiter_sockaddr or actor._arb_addr,
) as arb_portal:
sockaddrs = await arb_portal.run_from_ns(
# optimization path, use any pre-existing peer channel
maybe_peers: list[Channel]|None = get_peer_by_name(name)
if maybe_peers:
async with open_portal(maybe_peers[0]) as peer_portal:
yield peer_portal
return
regaddr: tuple[str, int] = (
registry_addr
or
actor.reg_addrs[0]
)
# TODO: use `.trionics.gather_contexts()` like
# above in `find_actor()` as well?
reg_portal: Portal
async with get_registry(*regaddr) as reg_portal:
sockaddrs = await reg_portal.run_from_ns(
'self',
'wait_for_actor',
name=name,
)
sockaddr = sockaddrs[-1]
# get latest registered addr by default?
# TODO: offer multi-portal yields in multi-homed case?
sockaddr: tuple[str, int] = sockaddrs[-1]
async with _connect_chan(*sockaddr) as chan:
async with open_portal(chan) as portal:

View File

@ -18,15 +18,32 @@
Sub-process entry points.
"""
from __future__ import annotations
from functools import partial
from typing import Tuple, Any
import signal
import multiprocessing as mp
import os
import textwrap
from typing import (
Any,
TYPE_CHECKING,
)
import trio # type: ignore
from .log import get_console_log, get_logger
from .log import (
get_console_log,
get_logger,
)
from . import _state
from .devx import _debug
from .to_asyncio import run_as_asyncio_guest
from ._runtime import (
async_main,
Actor,
)
if TYPE_CHECKING:
from ._spawn import SpawnMethodKey
log = get_logger(__name__)
@ -34,37 +51,40 @@ log = get_logger(__name__)
def _mp_main(
actor: 'Actor', # type: ignore
accept_addr: Tuple[str, int],
forkserver_info: Tuple[Any, Any, Any, Any, Any],
start_method: str,
parent_addr: Tuple[str, int] = None,
actor: Actor,
accept_addrs: list[tuple[str, int]],
forkserver_info: tuple[Any, Any, Any, Any, Any],
start_method: SpawnMethodKey,
parent_addr: tuple[str, int] | None = None,
infect_asyncio: bool = False,
) -> None:
'''
The routine called *after fork* which invokes a fresh ``trio.run``
The routine called *after fork* which invokes a fresh `trio.run()`
'''
actor._forkserver_info = forkserver_info
from ._spawn import try_set_start_method
spawn_ctx = try_set_start_method(start_method)
spawn_ctx: mp.context.BaseContext = try_set_start_method(start_method)
assert spawn_ctx
if actor.loglevel is not None:
log.info(
f"Setting loglevel for {actor.uid} to {actor.loglevel}")
f'Setting loglevel for {actor.uid} to {actor.loglevel}'
)
get_console_log(actor.loglevel)
assert spawn_ctx
# TODO: use scops headers like for `trio` below!
# (well after we libify it maybe..)
log.info(
f"Started new {spawn_ctx.current_process()} for {actor.uid}")
_state._current_actor = actor
log.debug(f"parent_addr is {parent_addr}")
f'Started new {spawn_ctx.current_process()} for {actor.uid}'
# f"parent_addr is {parent_addr}"
)
_state._current_actor: Actor = actor
trio_main = partial(
actor._async_main,
accept_addr,
async_main,
actor=actor,
accept_addrs=accept_addrs,
parent_addr=parent_addr
)
try:
@ -77,14 +97,116 @@ def _mp_main(
pass # handle it the same way trio does?
finally:
log.info(f"Actor {actor.uid} terminated")
log.info(
f'`mp`-subactor {actor.uid} exited'
)
# TODO: move this func to some kinda `.devx._conc_lang.py` eventually
# as we work out our multi-domain state-flow-syntax!
def nest_from_op(
input_op: str,
#
# ?TODO? an idea for a syntax to the state of concurrent systems
# as a "3-domain" (execution, scope, storage) model and using
# a minimal ascii/utf-8 operator-set.
#
# try not to take any of this seriously yet XD
#
# > is a "play operator" indicating (CPU bound)
# exec/work/ops required at the "lowest level computing"
#
# execution primitives (tasks, threads, actors..) denote their
# lifetime with '(' and ')' since parentheses normally are used
# in many langs to denote function calls.
#
# starting = (
# >( opening/starting; beginning of the thread-of-exec (toe?)
# (> opened/started, (finished spawning toe)
# |_<Task: blah blah..> repr of toe, in py these look like <objs>
#
# >) closing/exiting/stopping,
# )> closed/exited/stopped,
# |_<Task: blah blah..>
# [OR <), )< ?? ]
#
# ending = )
# >c) cancelling to close/exit
# c)> cancelled (caused close), OR?
# |_<Actor: ..>
# OR maybe "<c)" which better indicates the cancel being
# "delivered/returned" to LHS?
#
# >x) erroring to eventually exit
# x)> errored and terminated
# |_<Actor: ...>
#
# scopes: supers/nurseries, IPC-ctxs, sessions, perms, etc.
# >{ opening
# {> opened
# }> closed
# >} closing
#
# storage: like queues, shm-buffers, files, etc..
# >[ opening
# [> opened
# |_<FileObj: ..>
#
# >] closing
# ]> closed
# IPC ops: channels, transports, msging
# => req msg
# <= resp msg
# <=> 2-way streaming (of msgs)
# <- recv 1 msg
# -> send 1 msg
#
# TODO: still not sure on R/L-HS approach..?
# =>( send-req to exec start (task, actor, thread..)
# (<= recv-req to ^
#
# (<= recv-req ^
# <=( recv-resp opened remote exec primitive
# <=) recv-resp closed
#
# )<=c req to stop due to cancel
# c=>) req to stop due to cancel
#
# =>{ recv-req to open
# <={ send-status that it closed
tree_str: str,
# NOTE: so move back-from-the-left of the `input_op` by
# this amount.
back_from_op: int = 0,
) -> str:
'''
Depth-increment the input (presumably hierarchy/supervision)
"tree string" below the provided `input_op` execution
operator, so injecting a `"\n|_{input_op}\n"` and indenting the
`tree_str` to nest content aligned with the op's last char.
'''
return (
f'{input_op}\n'
+
textwrap.indent(
tree_str,
prefix=(
len(input_op)
-
(back_from_op + 1)
) * ' ',
)
)
def _trio_main(
actor: 'Actor', # type: ignore
actor: Actor,
*,
parent_addr: Tuple[str, int] = None,
parent_addr: tuple[str, int] | None = None,
infect_asyncio: bool = False,
) -> None:
@ -92,32 +214,73 @@ def _trio_main(
Entry point for a `trio_run_in_process` subactor.
'''
log.info(f"Started new trio process for {actor.uid}")
if actor.loglevel is not None:
log.info(
f"Setting loglevel for {actor.uid} to {actor.loglevel}")
get_console_log(actor.loglevel)
log.info(
f"Started {actor.uid}")
_debug.hide_runtime_frames()
_state._current_actor = actor
log.debug(f"parent_addr is {parent_addr}")
trio_main = partial(
actor._async_main,
async_main,
actor,
parent_addr=parent_addr
)
if actor.loglevel is not None:
get_console_log(actor.loglevel)
actor_info: str = (
f'|_{actor}\n'
f' uid: {actor.uid}\n'
f' pid: {os.getpid()}\n'
f' parent_addr: {parent_addr}\n'
f' loglevel: {actor.loglevel}\n'
)
log.info(
'Starting new `trio` subactor:\n'
+
nest_from_op(
input_op='>(', # see syntax ideas above
tree_str=actor_info,
back_from_op=2, # since "complete"
)
)
logmeth = log.info
exit_status: str = (
'Subactor exited\n'
+
nest_from_op(
input_op=')>', # like a "closed-to-play"-icon from super perspective
tree_str=actor_info,
back_from_op=1,
)
)
try:
if infect_asyncio:
actor._infected_aio = True
run_as_asyncio_guest(trio_main)
else:
trio.run(trio_main)
except KeyboardInterrupt:
log.warning(f"Actor {actor.uid} received KBI")
logmeth = log.cancel
exit_status: str = (
'Actor received KBI (aka an OS-cancel)\n'
+
nest_from_op(
input_op='c)>', # closed due to cancel (see above)
tree_str=actor_info,
)
)
except BaseException as err:
logmeth = log.error
exit_status: str = (
'Main actor task exited due to crash?\n'
+
nest_from_op(
input_op='x)>', # closed by error
tree_str=actor_info,
)
)
# NOTE: since we re-raise, a tb will already be shown on the
# console, thus we do NOT use `.exception()` above.
raise err
finally:
log.info(f"Actor {actor.uid} terminated")
logmeth(exit_status)
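
Reviewer note: the new `nest_from_op()` helper computes its indent as `len(input_op) - (back_from_op + 1)`, which is easy to misread. The sketch below copies the helper body from the hunk above so it runs without `tractor` installed, then calls it with two of the argument combinations used in `_trio_main()`. The `actor_info` string is a hypothetical stand-in, not real runtime output.

```python
import textwrap


def nest_from_op(input_op: str, tree_str: str, back_from_op: int = 0) -> str:
    # Same body as the helper in the diff above, copied here so the
    # sketch is self-contained.
    return (
        f'{input_op}\n'
        + textwrap.indent(
            tree_str,
            prefix=(len(input_op) - (back_from_op + 1)) * ' ',
        )
    )


# Hypothetical stand-in for the `actor_info` string built in `_trio_main()`.
actor_info = '|_<Actor: demo>\n  uid: demo\n'

# 'c)>' has 3 chars and back_from_op defaults to 0, so the tree is
# indented by 3 - 1 = 2 spaces and lines up under the op's last char.
cancelled = nest_from_op('c)>', actor_info)

# '>(' with back_from_op=2 gives 2 - 3 = -1; a negative repeat count
# yields an empty prefix, so the tree is not indented at all.
starting = nest_from_op('>(', actor_info, back_from_op=2)

print(cancelled)
print(starting)
```

One thing this makes visible: for the two-character ops, `back_from_op=1` and `back_from_op=2` both produce an unindented tree, since `str * n` is empty for any `n <= 0`.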

Some files were not shown because too many files have changed in this diff