tractor

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	1f1a3f19d5	Fix multi-daemon debug test `break` signal.. It was expecting `AssertionError` as a proceed-in-test signal (by breaking from a continue loop), but `in_prompt_msg(raise_on_err=True)` was changed to raise `ValueError`; so instead just use as a predicate for the `break`. Also rework `in_prompt_msg()` to accept the `child: BaseSpawn` as input instead of `before: str` remove the casting boilerplate, and adjust all usage to match.	2024-07-12 15:57:41 -04:00
Tyler Goodlet	a628eabb30	Officially test proto-ed `stackscope` integration By re-purposing our `pexpect`-based console matching with a new `debugging/shield_hang_in_sub.py` example, this tests a few "hanging actor" conditions more formally: - that despite a hanging actor's task we can dump a `stackscope.extract()` tree on relay of `SIGUSR1`. - the actor tree will terminate despite a shielded forever-sleep by our "T-800" zombie reaper machinery activating and hard killing the underlying subprocess. Some test deats: - simulates the expected actions of a real user by manually using `os.kill()` to send both signals to the actor-tree program. - `pexpect`-matches against `log.devx()` emissions under normal `debug_mode == True` usage. - ensure we get the actual "T-800 deployed" `log.error()` msg and that the actor tree eventually terminates! Surrounding (re-org/impl/test-suite) changes: - allow disabling usage via a `maybe_enable_greenback: bool` to `open_root_actor()` but enable by def. - pretty up the actual `.devx()` content from `.devx._stackscope` including being extra pedantic about the conc-primitives for each signal event. - try to avoid double handles of `SIGUSR1` even though it seems the original (what i thought was a) problem was actually just double logging in the handler.. |_ avoid double applying the handler func via `signal.signal()`, |_ use a global to avoid double handle func calls and, |_ a `threading.RLock` around handling. - move common fixtures and helper routines from `test_debugger` to `tests/devx/conftest.py` and import them for use in both test mods.	2024-07-10 19:58:27 -04:00
Tyler Goodlet	fc95c6719f	Get multi-threaded sync-pausing fully workin! The final issue was making sure we do the same thing on ctl-c/SIGINT from the user. That is, if there's already a bg-thread in REPL, we `log.pdb()` about SIGINT shielding and re-draw the prompt; the same UX as normal actor-runtime-task behaviour. Reasons this wasn't workin.. and the fix: - `.pause_from_sync()` was overriding the local `repl` var with `None` delivered by (transitive) calls to `_pause(debug_func=None)`.. so remove all that and only assign it OAOO prior to thread-type case branching. - always call `DebugStatus.shield_sigint()` as needed from all requesting threads/tasks: - in `_pause_from_bg_root_thread()` BEFORE calling `._pause()` AND BEFORE yielding back to the bg-thread via `.started(out)` to ensure we're definitely overriding the handler in the `trio`-main-thread task before unblocking the requesting bg-thread. - from any requesting bg-thread in the root actor such that both its main-`trio`-thread scheduled task (as per above bullet) AND it are SIGINT shielded. - always call `.shield_sigint()` BEFORE any `greenback._await()` case (don't entirely grok why yet, but it works)? - for `greenback._await()` case always set `bg_task` to the current one.. - tweaks to the `SIGINT` handler, now renamed `sigint_shield()` so as not to name-collide with the methods when editor-searching: - always try to `repr()` the REPL thread/task "owner" as well as the active `PdbREPL` instance. - add `.devx()` notes around the prompt flushing deats and comments for any root-actor-bg-thread edge cases. Related/supporting refinements: - add `get_lock()`/`get_debug_req()` factory funcs since the plan is to eventually implement both as `@singleton` instances per actor. - fix `acquire_debug_lock()`'s call-sig-bug for scheduling `request_root_stdio_lock()`.. - in `._pause()` only call `mk_pdb()` when `debug_func != None`. - add some todo/warning notes around the `cls.repl = None` in `DebugStatus.release()` `test_pause_from_sync()` tweaks: - don't use an `attach_patts.copy()`, since we always `break` on match. - do `pytest.fail()` on that ^ loop's fallthrough.. - pass `do_ctlc(child, patt=attach_key)` such that we always match the current thread's name with the ctl-c triggered `.pdb()` emission. - oh yeah, return the last `before: str` from `do_ctlc()`. - in the script, flip `abandon_on_cancel=True` since when `False` it seems to cause `trio.run()` to hang on exit from the last bg-thread case?!?	2024-07-10 12:29:05 -04:00
Tyler Goodlet	31207f92ee	Finally implement peer-lookup optimization.. There's been a todo for soo long for this XD Since all `Actor`s store a set of `._peers` we can try a lookup on that table as a shortcut before pinging the registry Bo Impl deats: - add a new `._discovery.get_peer_by_name()` routine which attempts the `._peers` lookup by combining a copy of that `dict` + an entry added for `Actor._parent_chan` (since all subs have a parent and often the desired contact is just that connection). - change `.find_actor()` (for the `only_first == True` case), `.query_actor()` and `.wait_for_actor()` to call the new helper and deliver appropriate outputs if possible. Other, - deprecate `get_arbiter()` def and all usage in tests and examples. - drop lingering use of `arbiter_sockaddr` arg to various routines. - tweak the `Actor` doc str as well as some code fmting and a tweak to the `._stream_handler()`'s initial `con_status: str` logging value since the way it was could never be reached.. oh and `.warning()` on any new connections which already have a `_pre_chan: Channel` entry in `._peers` so we can start minimizing IPC duplications.	2024-07-04 19:40:11 -04:00
Tyler Goodlet	30d60379c1	Drop thread logging to make `log.pdb()` patts match in test	2024-06-07 22:35:59 -04:00
Tyler Goodlet	408a74784e	Catch `.pause_from_sync()` in root bg thread bugs! Originally discovered while using `tractor.pause_from_sync()` from the `i3ipc` client running in a bg-thread that uses `asyncio` inside `modden`. Turns out we definitely aren't correctly handling `.pause_from_sync()` from the root actor when called from a `trio.to_thread.run_sync()` bg thread: - root-actor bg threads which can't `Lock._debug_lock.acquire()` since they aren't in `trio.Task`s. - even if scheduled via `.to_thread.run_sync(_debug._pause)` the acquirer won't be the task/thread which calls `Lock.release()` from `PdbREPL` hooks; this results in a RTE raised by `trio`.. - multiple threads will step on each other's stdio since cpython's GIL seems to ctx switch threads on every input from the user to the REPL loop.. Reproduce via reworking our example and test so that they catch and fail for all edge cases: - rework the `/examples/debugging/sync_bp.py` example to demonstrate the above issues, namely the stdio clobbering in the REPL when multiple threads and/or a subactor try to debug simultaneously. |_ run one thread using a task nursery to ensure it runs conc with the nursery's parent task. |_ ensure the bg threads run conc with a subactor usage of `.pause_from_sync()`. |_ gravely detail all the special cases inside a TODO comment. |_ add some control flags to `sync_pause()` helper and don't use `breakpoint()` by default. - extend and adjust `test_debugger.test_pause_from_sync` to match (and thus currently fail) by ensuring exclusive `PdbREPL` attachment when the 2 bg root-actor threads are concurrently interacting alongside the subactor: |_ should only see one of the `_pause_msg` logs at a time for either one of the threads or the subactor. |_ ensure each attaches (in no particular order) before expecting the script to exit. Impl adjustments to `.devx._debug`: - drop `Lock.repl`, no longer used. - add `Lock._owned_by_root: bool` for the `.ctx_in_debug == None` root-actor-task active case. - always `log.exception()` for any `._debug_lock.release()` ownership RTE emitted by `trio`, like we used to.. - add special `Lock.release()` log message for the stale lock but `._owned_by_root == True` case; oh yeah and actually `log.devx(message)`.. - rename `Lock.acquire()` -> `.acquire_for_ctx()` since it's only ever used from subactor IPC usage; well that and for local root-task usage we should prolly add a `.acquire_from_root_task()`? - buncha `._pause()` impl improvements: |_ type `._pause()`'s `debug_func` as a `partial` as well. |_ offer `called_from_sync: bool` and `called_from_bg_thread: bool` for the special case handling when called from `.pause_from_sync()` |_ only set `DebugStatus.repl/repl_task` when `debug_func != None` (OW ensure the `.repl_task` is not the current one). |_ handle error logging even when `debug_func is None`.. |_ lotsa detailed commentary around root-actor-bg-thread special cases. - when `._set_trace(hide_tb=False)` do `pdbp.set_trace(frame=currentframe())` so the `._debug` internal frames are always included. - by default always hide tracebacks for `.pause[_from_sync]()` internals. - improve `.pause_from_sync()` to avoid root-bg-thread crashes: |_ pass new `called_from_xxx_` flags and ensure `DebugStatus.repl_task` is actually set to the `threading.current_thread()` when needed. |_ manually call `Lock._debug_lock.acquire_nowait()` for the non-bg thread case. |_ TODO: still need to implement the bg-thread case using a bg `trio.Task`-in-thread with a `trio.Event` set by thread REPL exit.	2024-06-06 16:56:30 -04:00
Tyler Goodlet	8ea0f08386	Finally, officially support shielded REPL-ing! It's been a long time prepped and now finally implemented! Offer a `shield: bool` argument from our async `._debug` APIs: - `await tractor.pause(shield=True)`, - `await tractor.post_mortem(shield=True)` ^-These-^ can now be used inside cancelled `trio.CancelScope`s, something very handy when introspecting complex (distributed) system tear/shut-downs particularly under remote error or (inter-peer) cancellation conditions B) Thanks to previous prepping in a prior attempt and various patches from the rigorous rework of `.devx._debug` internals around typed msg specs, there ain't much that was needed! Impl deats - obvi passthrough `shield` from the public API endpoints (was already done from a prior attempt). - put ad-hoc internal `with trio.CancelScope(shield=shield):` around all checkpoints inside `._pause()` for both the root-process and subactor case branches. Add a fairly rigorous example, `examples/debugging/shielded_pause.py` with a wrapping `pexpect` test, `test_debugger.test_shield_pause()` and ensure it covers as many cases as i can think of offhand: - multiple `.pause()` entries in a loop despite parent scope cancellation in a subactor RPC task which itself spawns a sub-task. - a `trio.Nursery.parent_task` which raises, is handled and tries to enter an unshielded `.post_mortem()`, which of course internally raises `Cancelled` in a `._pause()` checkpoint, so we catch the `Cancelled` again and then debug the debugger's internal cancellation with specific checks for the particular raising checkpoint-LOC. - do ^- the latter -^ for both subactor and root cases to ensure we can debug `._pause()` itself when it tries to REPL engage from a cancelled task scope Bo	2024-05-30 17:52:24 -04:00
Tyler Goodlet	2f854a3e86	Add a `tractor.post_mortem()` API test + example Since turns out we didn't have a single example using that API Bo The test granular-ly checks all use cases: - `.post_mortem()` manual calls in both subactor and root. - ensuring built-in RPC crash handling activates after each manual one from ^. - drafted some call-stack frame checking that i commented out for now since we need to first do ANSI escape code removal due to the colorization that `pdbp` does by default. |_ added a TODO with SO link on `assert_before()`. Also todo-staged a shielded-pause test to match with the already existing-but-needs-refinement example B)	2024-05-30 16:03:28 -04:00
Tyler Goodlet	27fd96729a	Tweaks to debugger examples Light stuff like comments, typing, and a couple API usage updates.	2024-05-28 09:22:59 -04:00
Tyler Goodlet	d2dee87b36	Modernize streaming example script - add typing, - apply multi-line call style, - use 'cancel' log level, - enable debug mode.	2024-05-09 16:51:51 -04:00
Tyler Goodlet	eca2c02f8b	Flip to `.pause()` in subactor bp example	2024-04-14 18:53:42 -04:00
Tyler Goodlet	0fcd424d57	Start a new `._testing.fault_simulation` Since I needed the `break_ipc()` helper from the `examples/advanced_faults/ipc_failure_during_stream.py` used in the `test_advanced_faults` suite, might as well move it into a pkg-wide importable module. Also changed the default break method to be `socket_close` which just calls `Stream.socket.close()` underneath in `trio`. Also tweak that example to not keep sending after the stream has been broken since with new `trio` that will raise `ClosedResourceError` and in the wrapping test we generally speaking want to see a hang and then cancel via simulated user sent SIGINT/ctl-c.	2024-04-03 10:19:50 -04:00
Tyler Goodlet	72b4dc1461	Provision for infected-`asyncio` debug mode support It's almost there, we're just missing the final translation code to get from an `asyncio` side task to be able to call `.devx._debug.wait_for_parent_stdin_hijack()` to do root actor TTY locking. Then we just need to ensure internals also do the right thing with `greenback()` for equivalent sync `breakpoint()` style pause points. Since i'm deferring this until later, tossing in some xfail tests to `test_infected_asyncio` with TODOs for the needed implementation as well as eventual test org. By "provision" it means we add: - `greenback` init block to `_run_asyncio_task()` when debug mode is enabled (but which will currently rte when `asyncio` is detected) using `.bestow_portal()` around the `asyncio.Task`. - a call to `_debug.maybe_init_greenback()` in the `run_as_asyncio_guest()` guest-mode entry point. - as part of `._debug.Lock.is_main_trio_thread()` whenever the async-lib is not 'trio' error lock the backend name (which is obvi `'asyncio'` in this use case).	2024-03-25 16:09:32 -04:00
Tyler Goodlet	4f863a6989	Refine and test `tractor.pause_from_sync()` Now supports use from any `trio` task, any sync thread started with `trio.to_thread.run_sync()` AND also via `breakpoint()` builtin API! The only bit missing now is support for `asyncio` tasks when in infected mode.. Bo `greenback` setup/API adjustments: - move `._rpc.maybe_import_gb()` to -> `devx._debug` and factor out the cached import checking into a sync func whilst placing the async `.ensure_portal()` bootstrapping into a new async `maybe_init_greenback()`. - use the new init-er func inside `open_root_actor()` with the output predicating whether we override the `breakpoint()` hook. core `devx._debug` implementation deatz: - make `mk_mpdb()` only return the `pdbp.Pdb` subtype instance since the sigint unshielding func is now accessible from the `Lock` singleton from anywhere. - add non-main thread support (at least for `trio.to_thread` use cases) to our `Lock` with a new `.is_trio_thread()` predicate that delegates directly to `trio`'s internal version. - do `Lock.is_trio_thread()` checks inside any methods which require special provisions when invoked from a non-main `trio` thread: - `.[un]shield_sigint()` methods since `signal.signal` usage is only allowed from cpython's main thread. - `.release()` since `trio.StrictFIFOLock` can only be called from a `trio` task. - rework `.pause_from_sync()` itself to directly call `._set_trace()` and don't bother with `greenback._await()` when we're already calling it from a `.to_thread.run_sync()` thread, oh and try to use the thread/task name when setting `Lock.local_task_in_debug`. - make it an RTE for now if you try to use `.pause_from_sync()` from any infected-`asyncio` task, but support is (hopefully) coming soon! For testing we add a new `test_debugger.py::test_pause_from_sync()` which includes a ctrl-c parametrization around the `examples/debugging/sync_bp.py` script which includes all currently supported/working usages: - `tractor.pause_from_sync()`. - via `breakpoint()` overload. - from a `trio.to_thread.run_sync()` spawn.	2024-03-22 19:58:25 -04:00
Tyler Goodlet	c04d77a3c9	First draft workin minus non-main-thread usage!	2024-03-20 19:13:13 -04:00
Tyler Goodlet	8ab5e08830	Adjust advanced faults test(s) for absorbed EoCs More or less just simplifies to not seeing the stream closure errors and instead expecting KBIs from the simulated user who 'ctl-cs after hang'. Toss in a little `stuff_hangin_ctlc()` to the script to wrap all that and always check stream closure before sending the final KBI.	2024-03-19 19:33:06 -04:00
Tyler Goodlet	9221c57234	Adjust all `RemoteActorError.type` using tests To instead use the new `.boxed_type` B)	2024-03-19 18:08:54 -04:00
Tyler Goodlet	71de56b09a	Drop now-deprecated deps on modern `trio`/Python - `trio_typing` is nearly obsolete since `trio >= 0.23` - `exceptiongroup` is built-in to python 3.11 - `async_generator` primitives have lived in `contextlib` for quite a while!	2024-03-13 18:41:24 -04:00
Tyler Goodlet	96992bcbb9	Add (back) a `tractor._testing` sub-pkg Since importing from our top level `conftest.py` is not scalable or as "future forward thinking" in terms of: - LoC-wise (it's only one file), - prevents "external" (aka non-test) example scripts from importing content easily, - seemingly(?) can't be used via abs-import if using a `[tool.pytest.ini_options]` in a `pyproject.toml` vs. a `pytest.ini`, see: https://docs.pytest.org/en/8.0.x/reference/customize.html#pyproject-toml => Go back to having an internal "testing" pkg like `trio` (kinda) does. Deats: - move generic top level helpers into pkg-mod including the new `expect_ctxc()` (which i needed in the advanced faults testing script). - move `@tractor_test` into `._testing.pytest` sub-mod. - adjust all the helper imports to be a `from tractor._testing import <..>` Rework `test_ipc_channel_break_during_stream()` and backing script: - make test(s) pull `debug_mode` from new fixture (which is now controlled manually from `--tpdb` flag) and drop the previous parametrized input. - update logic in ^ test for "which-side-fails" cases to better match recently updated/stricter cancel/failure semantics in terms of `ClosedResourceError` vs. `EndOfChannel` expectations. - handle `ExceptionGroup`s with expected embedded errors in test. - better pedantics around whether to expect a user simulated KBI. - for `examples/advanced_faults/ipc_failure_during_stream.py` script: - generalize ipc breakage in new `break_ipc()` with support for diff internal `trio` methods and a #TODO for future disti frameworks - only make one sub-actor task break and the other just stream. - use new `._testing.expect_ctxc()` around ctx block. - add a bit of exception handling with `print()`s around ctxc (unused except if 'msg' break method is set) and eoc cases. - don't break parent side ipc in loop any more than once after first break, checked via flag var. - add a `pre_close: bool` flag to control whether `MsgStream.aclose()` is called before any ipc breakage method. Still TODO: - drop `pytest.ini` and add the alt section to `pyproject.toml`. -> currently can't get `--rootdir=` opt to work.. not showing in console header. -> ^ also breaks on 'tests' `enable_modules` imports in subactors during discovery tests?	2024-03-13 09:09:08 -04:00
Tyler Goodlet	6b1ceee19f	Type out the full-fledged streaming ex.	2023-10-18 15:36:00 -04:00
Tyler Goodlet	ee87cf0e29	Add a debug-mode-breakpoint-causes-hang case! Only found this by luck more or less (while working on something in a client project) and it turns out we can actually get to (yet another) hang state where SIGINT will be ignored by the root actor on teardown.. I've added all the necessary logic flags to reproduce. We obviously need a follow up bug issue and a test suite to replicate! It appears as though the following are required based on very light tinkering: - infected asyncio mode active - debug mode active - the `trio` context must breakpoint before `.started()`-ing - the `asyncio` must not error	2023-06-21 14:07:31 -04:00
Tyler Goodlet	ebcb275cd8	Add (first-draft) infected-`asyncio` actor task uses debugger example	2023-06-21 14:07:31 -04:00
Tyler Goodlet	79622bbeea	Restore `breakpoint()` hook after runtime exits Previously we were leaking our (pdb++) override into the Python runtime which would always result in a runtime error whenever `breakpoint()` is called outside our runtime; after exit of the root actor. This explicitly restores any previous hook override (detected during startup) or deletes the hook and restores the environment if none existed prior. Also adds a new WIP debugging example script to ensure breakpointing works as normal after runtime close; this will be added to the test suite.	2023-05-15 00:47:29 -04:00
Tyler Goodlet	e34823aab4	Add parent vs. child cancels first cases	2023-01-29 14:55:02 -05:00
Tyler Goodlet	6c35ba2cb6	Add IPC breakage on both parent and child side With the new fancy `_pytest.pathlib.import_path()` we can do real parametrization of the example-script-module code and thus configure whether the child, parent, or both silently break the IPC connection. Parametrize the test for all the above mentioned cases as well as the case where the IPC never breaks but we still simulate the user hammering ctl-c / SIGINT to terminate the actor tree. Adjust expected errors based on each case and heavily document each of these.	2023-01-29 14:55:02 -05:00
Tyler Goodlet	3a0817ff55	Skip `advanced_faults/` subset in docs examples tests	2023-01-29 14:55:02 -05:00
Tyler Goodlet	7fddb4416b	Handle `mp` spawn method cases in test suite	2023-01-29 14:55:02 -05:00
Tyler Goodlet	4f8586a928	Wrap ex in new test, change dir helpers to use `pathlib.Path`	2023-01-29 14:55:02 -05:00
Tyler Goodlet	fb9ff45745	Move example to a new `advanced_faults` egs subset dir	2023-01-29 14:55:02 -05:00
Tyler Goodlet	36a83cb306	Refine example to drop IPC mid-stream Use a task nursery in the subactor to spawn tasks which cancel the IPC channel mid stream to simulate the most concurrent case we're likely to see. Make `main()` accept a `debug_mode: bool` for parametrization. Fill out detailed comments/docs on this example.	2023-01-29 14:55:02 -05:00
Tyler Goodlet	158569adae	Add WIP example of silent IPC breaks while streaming	2023-01-29 14:55:02 -05:00
Tyler Goodlet	c47575997a	Expand nested case to include error prop and breakpointing	2022-10-14 19:42:23 -04:00
Tyler Goodlet	c0dd5d7ffc	Adjust multi-daemon test to be more deterministic	2022-10-14 19:42:23 -04:00
Tyler Goodlet	10eeda2d2b	Use built-ins for all data-structure-type annotations	2022-09-15 23:41:28 -04:00
Tyler Goodlet	56c19093bb	Add basic module-not-found when opening a ctx eg.	2022-08-02 12:17:06 -04:00
Tyler Goodlet	2a61aa099b	Move pydantic-click hang example to new dir, skip in test suite	2022-08-02 12:16:58 -04:00
Tyler Goodlet	4fd924cfd2	Make example a subpkg for `python -m <mod>` testing	2022-07-27 11:40:02 -04:00
Tyler Goodlet	fe0fd1a1c1	Add example that triggers bug #302	2022-07-27 11:40:02 -04:00
Tyler Goodlet	dd23e78de1	Add back in async gen loop	2022-07-27 11:40:02 -04:00
Tyler Goodlet	bb732cefd0	Drop high log level in ctx example	2022-07-27 11:40:02 -04:00
Tyler Goodlet	21dccb2e79	A `.open_context()` example that causes a hang! Finally! I think this may be the root issue we've been seeing in production in a client project. No idea yet why this is happening but the fault-causing sequence seems to be: - `.open_context()` in a child actor - enter the debugger via `tractor.breakpoint()` - continue from that entry via `c` command in REPL - raise an error just after inside the context task's body Looking at logging it appears as though the child thinks it has the tty but no input is accepted on the REPL and a further `ctrl-c` results in some teardown but also a further hang where both parent and child become unresponsive..	2022-07-27 11:40:02 -04:00
Tyler Goodlet	99c4319940	Fix example name typo	2022-07-27 11:40:02 -04:00
Tyler Goodlet	42f9d10252	Add a pre-started breakpoint example	2022-07-27 11:40:02 -04:00
Tyler Goodlet	98de2fab31	Add context test that opens an inter-task-channel that errors	2022-07-14 16:13:12 -04:00
Tyler Goodlet	73d252e09e	Emphasize `asyncio` only with sleeps	2021-12-17 09:38:54 -05:00
Tyler Goodlet	b463841019	Add infected `asyncio` echo server example	2021-12-17 09:38:04 -05:00
Tyler Goodlet	949aa9c405	Lol. should probably push the example code...	2021-12-10 12:48:05 -05:00
Tyler Goodlet	7b9d410c4d	Adjust remaining examples and tests for non-backpressure default	2021-12-05 19:52:09 -05:00
Tyler Goodlet	546e1b2fa3	Drop unnecessary partial	2021-11-04 10:41:25 -04:00
Tyler Goodlet	4cbb8641de	Add an `open_actor_cluster()` usage example	2021-11-02 15:37:36 -04:00


94 Commits (1f1a3f19d50405f418398b70d36443fdb7654064)