tractor

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	a26f817ed1	Another loosie in the trioisms suite	2025-03-27 13:38:47 -04:00
Tyler Goodlet	e815dcd3c8	Use `collapse_eg()` in broadcaster suite Around the test embedded `trio.open_nursery()` calls as expected. Also tidy up the various nursery var names.	2025-03-27 13:38:47 -04:00
Tyler Goodlet	3ad558230a	Fix docs tests with yet another loosie-goosie So the KBI propagates up to the actor nursery scope and also avoid running any `examples/multihost/` subdir scripts.	2025-03-27 13:38:47 -04:00
Tyler Goodlet	22f405a707	Another couple loose-ifies for discovery and advanced fault suites	2025-03-27 13:38:47 -04:00
Tyler Goodlet	e5bcefb575	Add (masked) meta-debug-fixture for determining if `debug_mode` is set in harness..	2025-03-27 13:38:47 -04:00
Tyler Goodlet	8f7c022afe	Various test tweaks related to 3.13 egs Including changes like, - loose eg flagging in various test embedded `trio.open_nursery()`s. - changes to eg handling (like using `except*`). - added `debug_mode` integration to tests that needed some REPLin in order to figure out appropriate updates.	2025-03-27 13:38:47 -04:00
Tyler Goodlet	8f774f52b1	Another loose-egs flag in `test_child_manages_service_nursery`	2025-03-27 13:38:47 -04:00
Tyler Goodlet	b021772a1e	Mask ctlc borked REPL tests Namely the `tractor.pause_from_sync()` examples using both bg threads and `asyncio` which seem to go into bad states where SIGINT is ignored.. Deats, - add `maybe_expect_timeout()` cm to ensure the EOF hangs get `.xfail()`ed instead. - `@pytest.mark.ctlcs_bish` `test_pause_from_sync` and don't expect the greenback prompt msg. - also mark `test_sync_pause_from_aio_task`.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	03406e020c	Repair/update `stackscope` test Seems that on 3.13 it's not showing our script code in the output now? Gotta get an example for @oremanj to see what's up but really it'd be nice to just custom format stuff above `trio`'s runtime by def.. Anyway, update the `.devx._stackscope`, - log formatting to be a little more "sclangy" lookin. - change the per-actor "delimiter" lines style. - report the `signal.getsignal(SIGINT)` which i needed in the `sync_bp.py` with ctl-c causing a hang.. - mask the `_tree_dumped` duplicator log report as well as the "dumped fine" one. - add an example `pkill --signal SIGUSR1` cmdline. Tweak the test to cope with, - not showing our script lines now.. which i've commented in the `assert_before()` patts.. - to expect the newly formatted delimiter (ascii) lines to separate the root vs. hanger sub-actor sections.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	b0acc9ffe8	Add a mark to `pytest.xfail()` questionably conc py stuff (ur mam `.xfail()`s bish!)	2025-03-27 13:24:25 -04:00
Tyler Goodlet	fc325a621b	Be extra sure to re-raise EoCs from translator That is whenever `trio.EndOfChannel` is raised (presumably from the `._to_trio.receive()` call inside `LinkedTaskChannel.receive()`) we need to be extra certain that we let it bubble upward transparently DESPITE special exc-as-signal handling that is normally suppressed from the aio side; REPEAT we want to ALWAYS bubble any `trio_err == trio.EndOfChannel` in the `finally:` handler of `translate_aio_errors()` despite `chan._trio_to_raise == AsyncioTaskExited` such that the caller's iterable machinery will operate as normal when the inter-task stream is stopped (again, presumably by the aio side task terminating the inter-task stream). Main impl deats for this, - in the EoC handler block ensure we assign both `chan._trio_err` and the local `trio_err` as well as continue to re-raise. - add a case to the match block in the `finally:` handler which FOR SURE re-raises any `type(trio_err) is EndOfChannel`! Additionally fix a bad bug, - a ref bug where we were NOT using the `except BaseException as _trio_err` to assign to `chan._trio_err` (by accident was missing the leading `_`..) Unrelated impl tweak, - move all `maybe_raise_aio_side_err()` content back to inline with its parent func - makes it easier to use `tractor.pause()` mostly Bp - go back to trying to use `aio_task.set_exception(aio_taskc)` for now even though i'm pretty sure we're going to move to a try-fute-first style helper for this in the future. Adjust some tests to match/mk-them-green, - break from `aio_echo_server()` recv loop on `to_asyncio.TrioTaskExited` much like how you'd expect to (implicitly with a `for`) with a `trio.EndOfChannel`. - toss in a masked `value is None` pause point i needed for debugging inf looping caused by not re-raising EoCs per the main patch description. - add a debug-mode sized delay to root-infected test.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	d5ba9be3a9	More `debug_mode` test support, better nursery var names	2025-03-27 13:24:25 -04:00
Tyler Goodlet	639186aa37	Add per-side graceful-exit/cancel excs-as-signals Such that any combination of task terminations/exits can be explicitly handled and "dual side independent" crash cases re-raised in egs. The main error-or-exit impl changes include, - use of new per-side "signaling exceptions": - TrioTaskExited\|TrioCancelled for signalling aio. - AsyncioTaskExited\|AsyncioCancelled for signalling trio. - NOT overloading the `LinkedTaskChannel._trio/aio_err` fields for err-as-signal relay and instead add a new pair of `._trio/aio_to_raise` maybe-exc-attrs which allow each side's task to specify what it would want the other side to raise to signal its/a termination outcome: - `._trio_to_raise: AsyncioTaskExited\|AsyncioCancelled` to signal, \|_ the aio task having returned while the trio side was still reading from the `asyncio.Queue` or is just not `.done()`. \|_ the aio task being self or trio-request cancelled where a `asyncio.CancelledError` is raised and caught but NOT relayed as is back to trio; instead signal a "more explicit" exc type. - `._aio_to_raise: TrioTaskExited\|TrioCancelled` to signal, \|_ the trio task having returned while the aio side was still reading from the mem chan and indicating that the trio side might not care any more about future streamed values (like the `Stop/EndOfChannel` equivs for ipc `Context`s). \|_ when the trio task canceld we do a `asyncio.Future.set_exception(TrioTaskExited())` to indicate to the aio side verbosely that it should cancel due to the trio parent. - `_aio/trio_err` are now left to only capturing the actual per-side task excs for introspection / other side's handling logic. - supporting "graceful exits" depending on API in use from `translate_aio_errors()` such that if either side exits but the other side isn't expect to consume the final `return`ed value, we just exit silently, which required: - adding a `suppress_graceful_exits: bool` flag. 
- adjusting the `maybe_raise_aio_side_err()` logic to use that flag and suppress only on certain combos of `._trio_to_raise/._trio_err`. - prefer to raise `._trio_to_raise` when the aio-side is the src and vice versa. - filling out pedantic logging for cancellation cases indicating which side is the cause. - add a `LinkedTaskChannel._aio_result` modelled after our `Context._result` and a similar `.wait_for_result()` interface which allows maybe accessing the aio task's final return value if desired when using the `open_channel_from()` API. - rename `cancel_trio()` done handler -> `signal_trio_when_done()` Also some fairly major test suite updates, - add a `delay: int` producing fixture which delivers a much larger timeout whenever `debug_mode` is set so that the REPL can be used without a surrounding cancel firing. - add a new `test_aio_exits_early_relays_AsyncioTaskExited` including a paired `exit_early: bool` flag to `push_from_aio_task()`. - adjust `test_trio_closes_early_causes_aio_checkpoint_raise` to expect a `to_asyncio.TrioTaskExited`.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	182218a776	Another `is` fix..	2025-03-27 13:24:25 -04:00
Tyler Goodlet	6de17a3949	Unset `$PYTHON_COLORS` for test debugger suite.. Since obvi all our `pexpect` patterns aren't going to match with a heck-ton of terminal color escape sequences in the output XD	2025-03-27 13:24:25 -04:00
Tyler Goodlet	41a3297b9f	Tweak some test asserts to better `is` style	2025-03-27 13:24:25 -04:00
Tyler Goodlet	255db4b127	Save an MIA `breakpoint()`-restore test from prior!? It appears that during the reorg commit `a356233b47` this was intended to be moved (presumably where i have here) to `test_tooling` but was somehow just never pasted over XD Good thing this was caught while going through the remaining TODO bullets in #2 !! Also includes fixed relative `.conftest` imports!	2025-03-27 13:24:25 -04:00
Tyler Goodlet	66a7d660f6	Draft test-doc for "out-of-band" `asyncio.Task`.. Since there's no way to activate `greenback`'s portal in such cases, we should at least have a test verifying our very loud error about the inability to support this usage..	2025-03-27 13:24:25 -04:00
Tyler Goodlet	9b393338ca	Add a `tests/test_root_infect_asyncio` Might as well break apart the specific test set since there are some (minor) subtleties and the orig test mod is already getting pretty big XD Includes both the new "independent"-event-loops test as well as the std usage base case suite.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	4edf36a895	Impl a proto "unmasker" `@acm` alongside our test Such that the suite verifies the wip `maybe_raise_from_masking_exc()` will raise from a `trio.Cancelled.__context__` since I can't think of any reason a `Cancelled` should ever be raised in-place of a non-`Cancelled` XD Not sure what should be raised instead (or maybe just a `log.warning()` emitted?) but this starts a draft for refinement at the least. Use the new `@pytest.mark.parametrize` explicit tuple-of-params form with a `pytest.param` + `.mark.xfail()` for the default behaviour case.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	bfd1864180	Add a "raise-from-`finally:`" example test Since i wasted 2 days just to find an example of this inside an `@acm`, figured I better reproduce for the purposes of maybe implementing a warning sys (inside our wip proto `open_taskman()`) when a nursery detects a single `Cancelled` in an eg where the `.__context__` is set to some non-cancel error (which likely means a cancel-causing source exception was suppressed by accident). Left in a buncha commented code using `maybe_open_nursery()` which i thought might be part of the issue but didn't end up being required; will likely remove on a follow up refinement.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	3c8b1aa888	Add an inter-leaved-task error test Trying to replicate cases where errors are raised in both `trio` and `asyncio` tasks independently (at least in `.to_asyncio` API terms) with a new `test_trio_prestarted_task_bubbles` that generates 3 cases inside a `@acm` calls stack composing a `trio.Nursery` with a `to_asyncio.open_channel_from()` call where a set of `trio` tasks are started in a loop using `.start()` with various exc raising sequences, - the aio task raising before the last `trio` task spawns. - the aio task raising just after the last trio task spawns, but before it starts. - after the last trio task `.start()` call returns control to the parent - but (for now) did not error. TODO, still more cases to discover as i'm still fighting a `modden` bug of this sort atm.. Other, - tweak some other tests to have timeouts since some recent hangs were found.. - started mucking with py3.13 and thus adjustments for strict egs in some tests; full patchset to test suite likely coming soon!	2025-03-27 13:24:25 -04:00
Tyler Goodlet	a283d8c05a	Support and test infected-`asyncio`-mode for root Such that you can use, ```python tractor.to_asyncio.run_as_asyncio_guest( trio_main=_trio_main, ) ``` to bootstrap the root actor (and thus main parent process) to embed the actor-runtime into an `asyncio` loop. Prove it all works with a subactor-free version of the aio echo-server test suite B)	2025-03-27 13:24:25 -04:00
Tyler Goodlet	a58c1cad91	Change `tractor.breakpoint()` to new `.pause()` in test suite	2025-03-27 13:24:25 -04:00
Tyler Goodlet	e1d96099fc	Wrap `asyncio_bp.py` ex into test suite Ensuring we can at least use `breakpoint()` from an infected actor's `asyncio.Task` spawned via a `.to_asyncio` API. Also includes a little `tests/devx/` reorging, - start splitting out non-`tractor.pause()` tests into a new `test_pause_from_non_trio.py` for all the `.pause_from_sync()` use in bg-threaded or `asyncio` applications. - factor harness commonalities to the `devx/conftest` (namely the `do_ctlc()` masher). - mv `test_pause_from_sync` to the new non`-trio` mod. NOTE, the `ctlc=True` is still failing for `test_pause_from_asyncio_task` which is a user-happiness bug but not anything fundamentally broken - just need to handle the `asyncio` case in `.devx._debug.sigint_shield()`!	2025-03-27 13:24:25 -04:00
Tyler Goodlet	ccd60b0c6e	Add `breakpoint()` hook restoration example + test	2025-03-27 13:24:25 -04:00
Tyler Goodlet	00d1c8ea29	Fix multi-daemon debug test `break` signal.. It was expecting `AssertionError` as a proceed-in-test signal (by breaking from a continue loop), but `in_prompt_msg(raise_on_err=True)` was changed to raise `ValueError`; so instead just use as a predicate for the `break`. Also rework `in_prompt_msg()` to accept the `child: BaseSpawn` as input instead of `before: str` remove the casting boilerplate, and adjust all usage to match.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	5cdfee3bcf	Pass `infect_asyncio` setting via runtime-vars The reason for this "duplication" with the `--asyncio` CLI flag (passed to the child during spawn) is 2-fold: - allows verifying inside `Actor._from_parent()` that the `trio` runtime was started via `.start_guest_run()` as well as if the `Actor._infected_aio` spawn-entrypoint value has been set (by the `._entry.<spawn-backend>_main()` whenever `--asyncio` is passed) such that any mismatch can be signaled via an `InternalError`. - enables checking the `._state._runtime_vars['_is_infected_aio']` value directly (say from a non-actor/`trio`-thread) instead of calling `._state.current_actor(err_on_no_runtime=False)` in certain edge cases. Impl/testing deats: - add `._state._runtime_vars['_is_infected_aio'] = False` default. - raise `InternalError` on any `--asyncio`-flag-passed vs. `_runtime_vars`-value-relayed-from-parent inside `Actor._from_parent()` and include a `Runner.is_guest` assert for good measure B) - set and relay `infect_asyncio: bool` via runtime-vars to child in `ActorNursery.start_actor()`. - verify `actor.is_infected_aio()`, `actor._infected_aio` and `_state._runtime_vars['_is_infected_aio']` are all set in test suite's `asyncio_actor()` endpoint.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	64d506970a	Officially test proto-ed `stackscope` integration By re-purposing our `pexpect`-based console matching with a new `debugging/shield_hang_in_sub.py` example, this tests a few "hanging actor" conditions more formally: - that despite a hanging actor's task we can dump a `stackscope.extract()` tree on relay of `SIGUSR1`. - the actor tree will terminate despite a shielded forever-sleep by our "T-800" zombie reaper machinery activating and hard killing the underlying subprocess. Some test deats: - simulates the expect actions of a real user by manually using `os.kill()` to send both signals to the actor-tree program. - `pexpect`-matches against `log.devx()` emissions under normal `debug_mode == True` usage. - ensure we get the actual "T-800 deployed" `log.error()` msg and that the actor tree eventually terminates! Surrounding (re-org/impl/test-suite) changes: - allow disabling usage via a `maybe_enable_greenback: bool` to `open_root_actor()` but enable by def. - pretty up the actual `.devx()` content from `.devx._stackscope` including be extra pedantic about the conc-primitives for each signal event. - try to avoid double handles of `SIGUSR1` even though it seems the original (what i thought was a) problem was actually just double logging in the handler.. \|_ avoid double applying the handler func via `signal.signal()`, \|_ use a global to avoid double handle func calls and, \|_ a `threading.RLock` around handling. - move common fixtures and helper routines from `test_debugger` to `tests/devx/conftest.py` and import them for use in both test mods.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	de7b114303	Start a new `tests/devx/` tooling-subsuite-pkg	2025-03-27 13:24:25 -04:00
Tyler Goodlet	f195c5ec47	Move `mk_cmd()` to `._testing` Since we're going to need it more generally for `.devx` sub-sys tooling tests. Also, up the sync-pause ctl-c delay another 10ms..	2025-03-27 13:24:25 -04:00
Tyler Goodlet	92713af63e	Get multi-threaded sync-pausing fully workin! The final issue was making sure we do the same thing on ctl-c/SIGINT from the user. That is, if there's already a bg-thread in REPL, we `log.pdb()` about SIGINT shielding and re-draw the prompt; the same UX as normal actor-runtime-task behaviour. Reasons this wasn't workin.. and the fix: - `.pause_from_sync()` was overriding the local `repl` var with `None` delivered by (transitive) calls to `_pause(debug_func=None)`.. so remove all that and only assign it OAOO prior to thread-type case branching. - always call `DebugStatus.shield_sigint()` as needed from all requesting threads/tasks: - in `_pause_from_bg_root_thread()` BEFORE calling `._pause()` AND BEFORE yielding back to the bg-thread via `.started(out)` to ensure we're definitely overriding the handler in the `trio`-main-thread task before unblocking the requesting bg-thread. - from any requesting bg-thread in the root actor such that both its main-`trio`-thread scheduled task (as per above bullet) AND it are SIGINT shielded. - always call `.shield_sigint()` BEFORE any `greenback._await()` case don't entirely grok why yet, but it works)? - for `greenback._await()` case always set `bg_task` to the current one.. - tweaks to the `SIGINT` handler, now renamed `sigint_shield()` so as not to name-collide with the methods when editor-searching: - always try to `repr()` the REPL thread/task "owner" as well as the active `PdbREPL` instance. - add `.devx()` notes around the prompt flushing deats and comments for any root-actor-bg-thread edge cases. Related/supporting refinements: - add `get_lock()`/`get_debug_req()` factory funcs since the plan is to eventually implement both as `@singleton` instances per actor. - fix `acquire_debug_lock()`'s call-sig-bug for scheduling `request_root_stdio_lock()`.. - in `._pause()` only call `mk_pdb()` when `debug_func != None`. 
- add some todo/warning notes around the `cls.repl = None` in `DebugStatus.release()` `test_pause_from_sync()` tweaks: - don't use a `attach_patts.copy()`, since we always `break` on match. - do `pytest.fail()` on that ^ loop's fallthrough.. - pass `do_ctlc(child, patt=attach_key)` such that we always match the current thread's name with the ctl-c triggered `.pdb()` emission. - oh yeah, return the last `before: str` from `do_ctlc()`. - in the script, flip `abandon_on_cancel=True` since when `False` it seems to cause `trio.run()` to hang on exit from the last bg-thread case?!?	2025-03-27 13:24:25 -04:00
Tyler Goodlet	b057a1681c	Todo a test for sync-pausing from non-main-root-tasks	2025-03-27 13:24:25 -04:00
Tyler Goodlet	53409f2942	Demo-abandonment on shielded `trio`-side work Finally this reproduces the issue as it (originally?) exhibited inside `piker` where the `Actor.lifetime_stack` wasn't closed in cases where during `infected_aio`-actor cancellation/shutdown `trio` side tasks which are doing shielded (teardown) work are NOT being watched/waited on from the `aio_main()` task-closure inside `run_as_asyncio_guest()`! This is then the root cause of the guest-run being abandoned since if our `aio_main()` task-closure doesn't know it should allow the run to finish, it's going to call `loop.close()` eventually resulting in the `GeneratorExit` thrown into `trio._core._run.unrolled_run()`.. So, this extends the `test_sigint_closes_lifetime_stack()` suite to include cases for such shielded `trio`-task ops: - add a new `trio_side_is_shielded: bool` which will toggle whether to add a shielded 0.5s `trio.sleep()` loop to `manage_file()` which should outlive the `asyncio` event-loop shutdown sequence and result in an abandoned guest-run and thus a leaked file. - parametrize the existing suite with this case resulting in a total 16 test set B) This patch demonstrates the problem with our `aio_main()` task-closure impl via the now 4 failing tests, a fix is coming in a follow up commit!	2025-03-27 13:24:25 -04:00
Tyler Goodlet	7f00921be1	Lel, revert `AsyncioCancelled` inherit, module.. Turns out it somehow breaks our `to_asyncio` error relay since obvi `asyncio`'s runtime seems to specially handle it (prolly via `isinstance()` ?) and it caused our `test_aio_cancelled_from_aio_causes_trio_cancelled()` to hang.. Further, obvi `unpack_error()` won't be able to find the type def if not kept inside `._exceptions`.. So given all that, revert the change/move as well as: - tweak the aio-from-aio cancel test to timeout. - do `trio.sleep()` conc with any bg aio task by moving out nursery block. - add a `send_sigint_to: str` parameter to `test_sigint_closes_lifetime_stack()` such that we test the SIGINT being relayed to just the parent or the child.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	a9b3336318	Hack `asyncio` to not abandon a guest-mode run? Took me a while to figure out what the heck was going on but, turns out `asyncio` changed their SIGINT handling in 3.11 as per: https://docs.python.org/3/library/asyncio-runner.html#handling-keyboard-interruption I'm not entirely sure if it's the 3.11 changes or possibly wtv further updates were made in 3.12 but more or less due to the way our current main task was written the `trio` guest-run was getting abandoned on SIGINTs sent from the OS to the infected child proc.. Note that much of the bug and soln cases are layed out in very detailed comment-notes both in the new test and `run_as_asyncio_guest()`, right above the final "fix" lines. Add new `test_infected_aio.test_sigint_closes_lifetime_stack()` test suite which reliably triggers all abandonment issues with multiple cases of different parent behaviour post-sending-SIGINT-to-child: 1. briefly sleep then raise a KBI in the parent which was originally demonstrating the file leak not being cleaned up by `Actor.lifetime_stack.close()` and simulates a ctl-c from the console (relayed in tandem by the OS to the parent and child processes). 2. do `Context.wait_for_result()` on the child context which would hang and timeout since the actor runtime would never complete and thus never relay a `ContextCancelled`. 3. both with and without running a `asyncio` task in the `manage_file` child actor; originally it seemed that with an aio task scheduled in the child actor the guest-run abandonment always was the "loud" case where there seemed to be some actor teardown but with tbs from python failing to gracefully exit the `trio` runtime.. The (seemingly working) "fix" required 2 lines of code to be run inside a `asyncio.CancelledError` handler around the call to `await trio_done_fut`: - `Actor.cancel_soon()` which schedules the actor runtime to cancel on the next `trio` runner cycle and results in a "self cancellation" of the actor. 
- "pumping the `asyncio` event loop" with a non-0 `.sleep(0.1)` XD \|_ seems that a "shielded" pump with some actual `delay: float >= 0` did the trick to get `asyncio` to allow the `trio` runner/loop to fully complete its guest-run without abandonment. Other supporting changes: - move `._exceptions.AsyncioCancelled`, our renamed `asyncio.CancelledError` error-sub-type-wrapper, to `.to_asyncio` and make it derive from `CancelledError` so as to be sure when raised by our `asyncio` x-> `trio` exception relay machinery that `asyncio` is getting the specific type it expects during cancellation. - do "summary status" style logging in `run_as_asyncio_guest()` wherein we compile the eventual `startup_msg: str` emitted just before waiting on the `trio_done_fut`. - shield-wait with `out: Outcome = await asyncio.shield(trio_done_fut)` even though it seems to do nothing in the SIGINT handling case..(I presume it might help avoid abandonment in a `asyncio.Task.cancel()` case maybe?)	2025-03-27 13:24:25 -04:00
Tyler Goodlet	d1b4d4be52	Adjusts advanced fault tests to match new `TransportClosed` semantics	2025-03-24 14:04:52 -04:00
Tyler Goodlet	32f7742e53	Finally implement peer-lookup optimization.. There's been a todo for soo long for this XD Since all `Actor`'s store a set of `._peers` we can try a lookup on that table as a shortcut before pinging the registry Bo Impl deats: - add a new `._discovery.get_peer_by_name()` routine which attempts the `._peers` lookup by combining a copy of that `dict` + an entry added for `Actor._parent_chan` (since all subs have a parent and often the desired contact is just that connection). - change `.find_actor()` (for the `only_first == True` case), `.query_actor()` and `.wait_for_actor()` to call the new helper and deliver appropriate outputs if possible. Other, - deprecate `get_arbiter()` def and all usage in tests and examples. - drop lingering use of `arbiter_sockaddr` arg to various routines. - tweak the `Actor` doc str as well as some code fmting and a tweak to the `._stream_handler()`'s initial `con_status: str` logging value since the way it was could never be reached.. oh and `.warning()` on any new connections which already have a `_pre_chan: Channel` entry in `._peers` so we can start minimizing IPC duplications.	2025-03-24 14:04:52 -04:00
Tyler Goodlet	0332604044	(Re)type annot some tests - For the (still not finished) `test_caps_based_msging`, switch to using the new `PayloadMsg`. - add `testdir` fixture type.	2025-03-24 14:04:52 -04:00
Tyler Goodlet	c7f153c266	Update `MsgTypeError` content matching to latest	2025-03-24 14:04:52 -04:00
Tyler Goodlet	89c2137fc9	Update pld-rx limiting test(s) to use deco input The tests only use one input spec (conveniently) so there's not much to change in the logic, - only pass the `maybe_msg_spec` to the child-side decorator and obvi drop the surrounding `msgops.limit_plds()` block in the child. - tweak a few `MsgDec` asserts, mostly dropping the `msg._ops._def_any_spec` state checks since the child-side won't have any pre pld-spec state given the runtime now applies the `pld_spec` before running the task's func body. - also allowed dropping the `finally:` which did a similar check outside the `.limit_plds()` block.	2025-03-24 14:04:52 -04:00
Tyler Goodlet	f83e06d371	Use new `._debug._repl_fail_msg` inside `test_pause_from_sync`	2025-03-24 14:04:52 -04:00
Tyler Goodlet	2f1a97e73e	Catch `.pause_from_sync()` in root bg thread bugs! Originally discovered as while using `tractor.pause_from_sync()` from the `i3ipc` client running in a bg-thread that uses `asyncio` inside `modden`. Turns out we definitely aren't correctly handling `.pause_from_sync()` from the root actor when called from a `trio.to_thread.run_sync()` bg thread: - root-actor bg threads which can't `Lock._debug_lock.acquire()` since they aren't in `trio.Task`s. - even if scheduled via `.to_thread.run_sync(_debug._pause)` the acquirer won't be the task/thread which calls `Lock.release()` from `PdbREPL` hooks; this results in a RTE raised by `trio`.. - multiple threads will step on each other's stdio since cpython's GIL seems to ctx switch threads on every input from the user to the REPL loop.. Reproduce via reworking our example and test so that they catch and fail for all edge cases: - rework the `/examples/debugging/sync_bp.py` example to demonstrate the above issues, namely the stdio clobbering in the REPL when multiple threads and/or a subactor try to debug simultaneously. \|_ run one thread using a task nursery to ensure it runs conc with the nursery's parent task. \|_ ensure the bg threads run conc a subactor usage of `.pause_from_sync()`. \|_ gravely detail all the special cases inside a TODO comment. \|_ add some control flags to `sync_pause()` helper and don't use `breakpoint()` by default. - extend and adjust `test_debugger.test_pause_from_sync` to match (and thus currently fail) by ensuring exclusive `PdbREPL` attachment when the 2 bg root-actor threads are concurrently interacting alongside the subactor: \|_ should only see one of the `_pause_msg` logs at a time for either one of the threads or the subactor. \|_ ensure each attaches (in no particular order) before expecting the script to exit. Impl adjustments to `.devx._debug`: - drop `Lock.repl`, no longer used. 
- add `Lock._owned_by_root: bool` for the `.ctx_in_debug == None` root-actor-task active case. - always `log.exception()` for any `._debug_lock.release()` ownership RTE emitted by `trio`, like we used to.. - add special `Lock.release()` log message for the stale lock but `._owned_by_root == True` case; oh yeah and actually `log.devx(message)`.. - rename `Lock.acquire()` -> `.acquire_for_ctx()` since it's only ever used from subactor IPC usage; well that and for local root-task usage we should prolly add a `.acquire_from_root_task()`? - buncha `._pause()` impl improvements: \|_ type `._pause()`'s `debug_func` as a `partial` as well. \|_ offer `called_from_sync: bool` and `called_from_bg_thread: bool` for the special case handling when called from `.pause_from_sync()` \|_ only set `DebugStatus.repl/repl_task` when `debug_func != None` (OW ensure the `.repl_task` is not the current one). \|_ handle error logging even when `debug_func is None`.. \|_ lotsa detailed commentary around root-actor-bg-thread special cases. - when `._set_trace(hide_tb=False)` do `pdbp.set_trace(frame=currentframe())` so the `._debug` internal frames are always included. - by default always hide tracebacks for `.pause[_from_sync]()` internals. - improve `.pause_from_sync()` to avoid root-bg-thread crashes: \|_ pass new `called_from_xxx_` flags and ensure `DebugStatus.repl_task` is actually set to the `threading.current_thread()` when needed. \|_ manually call `Lock._debug_lock.acquire_nowait()` for the non-bg thread case. \|_ TODO: still need to implement the bg-thread case using a bg `trio.Task`-in-thread with an `trio.Event` set by thread REPL exit.	2025-03-24 14:04:52 -04:00
Tyler Goodlet	4bc7569981	Woops, set `post_mortem=False` by default again!	2025-03-24 14:04:52 -04:00
Tyler Goodlet	15a47dc4f7	Finally, officially support shielded REPL-ing! It's been a long time prepped and now finally implemented! Offer a `shield: bool` argument from our async `._debug` APIs: - `await tractor.pause(shield=True)`, - `await tractor.post_mortem(shield=True)` ^-These-^ can now be used inside cancelled `trio.CancelScope`s, something very handy when introspecting complex (distributed) system tear/shut-downs particularly under remote error or (inter-peer) cancellation conditions B) Thanks to previous prepping in a prior attempt and various patches from the rigorous rework of `.devx._debug` internals around typed msg specs, there ain't much that was needed! Impl deats - obvi passthrough `shield` from the public API endpoints (was already done from a prior attempt). - put ad-hoc internal `with trio.CancelScope(shield=shield):` around all checkpoints inside `._pause()` for both the root-process and subactor case branches. Add a fairly rigorous example, `examples/debugging/shielded_pause.py` with a wrapping `pexpect` test, `test_debugger.test_shield_pause()` and ensure it covers as many cases as i can think of offhand: - multiple `.pause()` entries in a loop despite parent scope cancellation in a subactor RPC task which itself spawns a sub-task. - a `trio.Nursery.parent_task` which raises, is handled and tries to enter an unshielded `.post_mortem()`, which of course internally raises `Cancelled` in a `._pause()` checkpoint, so we catch the `Cancelled` again and then debug the debugger's internal cancellation with specific checks for the particular raising checkpoint-LOC. - do ^- the latter -^ for both subactor and root cases to ensure we can debug `._pause()` itself when it tries to REPL engage from a cancelled task scope Bo	2025-03-24 14:04:52 -04:00
Tyler Goodlet	5bab7648e2	Add a `tractor.post_mortem()` API test + example Since turns out we didn't have a single example using that API Bo The test granular-ly checks all use cases: - `.post_mortem()` manual calls in both subactor and root. - ensuring built-in RPC crash handling activates after each manual one from ^. - drafted some call-stack frame checking that i commented out for now since we need to first do ANSI escape code removal due to the colorization that `pdbp` does by default. \|_ added a TODO with SO link on `assert_before()`. Also todo-staged a shielded-pause test to match with the already existing-but-needs-refinement example B)	2025-03-24 14:04:52 -04:00
Tyler Goodlet	d099466d21	Change `reraise` to `post_mortem: bool` in `maybe_expect_raises()`	2025-03-24 14:04:52 -04:00
Tyler Goodlet	4b843d6219	Ensure only a boxed traceback for MTE on parent side	2025-03-24 14:04:51 -04:00
Tyler Goodlet	fa2893cc87	Ensure ctx error-state matches the MTE scenario Namely checking that `Context._remote_error` is set to the raised MTE in the invalid started and return value cases since prior to the recent underlying changes to the `Context.result()` impl, it would not match. Further, - do asserts for non-MTE raising cases in both the parent and child. - add todos for testing ctx-outcomes for per-side-validation policies i anticipate supporting and implied msg-dialog race cases therein.	2025-03-24 14:04:51 -04:00
Tyler Goodlet	9dc7602f21	Fix `test_basic_payload_spec` bad msg matching Expecting `Started` or `Return` with respective bad `.pld` values depending on what type of failure is test parametrized. This makes the suite run green it seems B)	2025-03-24 14:04:51 -04:00
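
The "raise-from-`finally:`" and "unmasker" commits above (`bfd1864180`, `4edf36a895`) describe a teardown exception hiding the error that actually caused the exit, with the original still reachable via `.__context__`. A minimal stdlib-only sketch of that idea follows. It assumes a stand-in `FakeCancelled` type in place of `trio.Cancelled`, and the names `leaky_manager()` and `unmask()` are hypothetical; this is not the project's `maybe_raise_from_masking_exc()` implementation.

```python
import contextlib


class FakeCancelled(Exception):
    """Stand-in for `trio.Cancelled` so the sketch needs only the stdlib."""


@contextlib.contextmanager
def leaky_manager():
    # Hypothetical context manager whose teardown raises, which masks
    # whatever error caused the `with` body to exit.
    try:
        yield
    finally:
        raise FakeCancelled("raised during teardown")


@contextlib.contextmanager
def unmask(mask_type=FakeCancelled):
    # Hypothetical "unmasker": when a `mask_type` exception escapes and
    # its `__context__` is some other error, re-raise that original error.
    try:
        yield
    except mask_type as masking_exc:
        masked = masking_exc.__context__
        if masked is not None and not isinstance(masked, mask_type):
            raise masked from masking_exc
        raise


def run_masked():
    # Without the unmasker the caller only sees `FakeCancelled`.
    with leaky_manager():
        raise ValueError("the real cause")


def run_unmasked():
    # With the unmasker the caller sees the original `ValueError`.
    with unmask():
        with leaky_manager():
            raise ValueError("the real cause")
```

Whether the masked error should be re-raised or only logged is left open in the commit message; the sketch picks re-raising purely for illustration.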


463 Commits (hilevel_serman)
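
Commit `a9b3336318` above mentions shield-waiting on a done-future with `asyncio.shield()` and "pumping" the event loop with a non-zero sleep inside an `asyncio.CancelledError` handler. The sketch below shows only how those stdlib pieces interact. The future `done_fut` is a stand-in for `trio_done_fut`, the names `waiter()` and `main()` are hypothetical, and none of this is the actual `run_as_asyncio_guest()` code.

```python
import asyncio


async def main() -> str:
    loop = asyncio.get_running_loop()
    # Stand-in for the future that resolves when the guest-run finishes.
    done_fut: asyncio.Future = loop.create_future()

    async def waiter() -> str:
        try:
            # Cancelling this task cancels only the shield's outer wait,
            # not `done_fut` itself.
            return await asyncio.shield(done_fut)
        except asyncio.CancelledError:
            assert not done_fut.cancelled()
            # "Pump" the loop with a non-zero delay so pending callbacks
            # can run, then wait for the real result.
            await asyncio.sleep(0.1)
            return await done_fut

    task = asyncio.create_task(waiter())
    await asyncio.sleep(0)  # let `waiter()` reach the shielded await
    task.cancel()           # simulate a cancellation of the main task
    # Resolve the stand-in future shortly after the cancel request.
    loop.call_later(0.05, done_fut.set_result, "guest-run finished")
    return await task


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The commit message itself notes that the shield seemed to do nothing in the SIGINT case, so treat this only as an illustration of the `asyncio.Task.cancel()` case it speculates about.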