tractor

jc211

tractor

Author	SHA1	Message	Date
Tyler Goodlet	110a023a03	Drop `asyncio_bp` loglevel setting by default	2024-07-29 17:53:52 -04:00
Tyler Goodlet	89127614d5	First draft, `asyncio`-task, sync-pausing Bo Mostly due to magic from @oremanj where we slap in a little bit of `.from_asyncio`-type stuff to run a `trio`-task from `asyncio.Task` code! I'm not gonna go into tooo too much detail but basically the primary thing needed was a way to (blocking-ly) invoke a `trio.lowlevel.Task` from an `asyncio` one (which we now have with a new `run_trio_task_in_future()` thanks to draft code from the aforementioned jefe) which we now invoke from a dedicated aio case-branch inside `.devx._debug.pause_from_sync()`. Further include a case inside `DebugStatus.release()` to handle using the same func to set the `repl_release: trio.Event` from the aio side when releasing the REPL on exit cmds. Prolly more refinements to come ;{o	2024-07-15 13:42:08 -04:00
Tyler Goodlet	1f1a3f19d5	Fix multi-daemon debug test `break` signal.. It was expecting `AssertionError` as a proceed-in-test signal (by breaking from a continue loop), but `in_prompt_msg(raise_on_err=True)` was changed to raise `ValueError`; so instead just use as a predicate for the `break`. Also rework `in_prompt_msg()` to accept the `child: BaseSpawn` as input instead of `before: str` remove the casting boilerplate, and adjust all usage to match.	2024-07-12 15:57:41 -04:00
Tyler Goodlet	a628eabb30	Officially test proto-ed `stackscope` integration By re-purposing our `pexpect`-based console matching with a new `debugging/shield_hang_in_sub.py` example, this tests a few "hanging actor" conditions more formally: - that despite a hanging actor's task we can dump a `stackscope.extract()` tree on relay of `SIGUSR1`. - the actor tree will terminate despite a shielded forever-sleep by our "T-800" zombie reaper machinery activating and hard killing the underlying subprocess. Some test deats: - simulates the expect actions of a real user by manually using `os.kill()` to send both signals to the actor-tree program. - `pexpect`-matches against `log.devx()` emissions under normal `debug_mode == True` usage. - ensure we get the actual "T-800 deployed" `log.error()` msg and that the actor tree eventually terminates! Surrounding (re-org/impl/test-suite) changes: - allow disabling usage via a `maybe_enable_greenback: bool` to `open_root_actor()` but enable by def. - pretty up the actual `.devx()` content from `.devx._stackscope` including be extra pedantic about the conc-primitives for each signal event. - try to avoid double handles of `SIGUSR1` even though it seems the original (what i thought was a) problem was actually just double logging in the handler.. \|_ avoid double applying the handler func via `signal.signal()`, \|_ use a global to avoid double handle func calls and, \|_ a `threading.RLock` around handling. - move common fixtures and helper routines from `test_debugger` to `tests/devx/conftest.py` and import them for use in both test mods.	2024-07-10 19:58:27 -04:00
Tyler Goodlet	fc95c6719f	Get multi-threaded sync-pausing fully workin! The final issue was making sure we do the same thing on ctl-c/SIGINT from the user. That is, if there's already a bg-thread in REPL, we `log.pdb()` about SIGINT shielding and re-draw the prompt; the same UX as normal actor-runtime-task behaviour. Reasons this wasn't workin.. and the fix: - `.pause_from_sync()` was overriding the local `repl` var with `None` delivered by (transitive) calls to `_pause(debug_func=None)`.. so remove all that and only assign it OAOO prior to thread-type case branching. - always call `DebugStatus.shield_sigint()` as needed from all requesting threads/tasks: - in `_pause_from_bg_root_thread()` BEFORE calling `._pause()` AND BEFORE yielding back to the bg-thread via `.started(out)` to ensure we're definitely overriding the handler in the `trio`-main-thread task before unblocking the requesting bg-thread. - from any requesting bg-thread in the root actor such that both its main-`trio`-thread scheduled task (as per above bullet) AND it are SIGINT shielded. - always call `.shield_sigint()` BEFORE any `greenback._await()` case don't entirely grok why yet, but it works)? - for `greenback._await()` case always set `bg_task` to the current one.. - tweaks to the `SIGINT` handler, now renamed `sigint_shield()` so as not to name-collide with the methods when editor-searching: - always try to `repr()` the REPL thread/task "owner" as well as the active `PdbREPL` instance. - add `.devx()` notes around the prompt flushing deats and comments for any root-actor-bg-thread edge cases. Related/supporting refinements: - add `get_lock()`/`get_debug_req()` factory funcs since the plan is to eventually implement both as `@singleton` instances per actor. - fix `acquire_debug_lock()`'s call-sig-bug for scheduling `request_root_stdio_lock()`.. - in `._pause()` only call `mk_pdb()` when `debug_func != None`. - add some todo/warning notes around the `cls.repl = None` in `DebugStatus.release()` `test_pause_from_sync()` tweaks: - don't use a `attach_patts.copy()`, since we always `break` on match. - do `pytest.fail()` on that ^ loop's fallthrough.. - pass `do_ctlc(child, patt=attach_key)` such that we always match the the current thread's name with the ctl-c triggered `.pdb()` emission. - oh yeah, return the last `before: str` from `do_ctlc()`. - in the script, flip `abandon_on_cancel=True` since when `False` it seems to cause `trio.run()` to hang on exit from the last bg-thread case?!?	2024-07-10 12:29:05 -04:00
Tyler Goodlet	30d60379c1	Drop thread logging to make `log.pdb()` patts match in test	2024-06-07 22:35:59 -04:00
Tyler Goodlet	408a74784e	Catch `.pause_from_sync()` in root bg thread bugs! Originally discovered as while using `tractor.pause_from_sync()` from the `i3ipc` client running in a bg-thread that uses `asyncio` inside `modden`. Turns out we definitely aren't correctly handling `.pause_from_sync()` from the root actor when called from a `trio.to_thread.run_sync()` bg thread: - root-actor bg threads which can't `Lock._debug_lock.acquire()` since they aren't in `trio.Task`s. - even if scheduled via `.to_thread.run_sync(_debug._pause)` the acquirer won't be the task/thread which calls `Lock.release()` from `PdbREPL` hooks; this results in a RTE raised by `trio`.. - multiple threads will step on each other's stdio since cpython's GIL seems to ctx switch threads on every input from the user to the REPL loop.. Reproduce via reworking our example and test so that they catch and fail for all edge cases: - rework the `/examples/debugging/sync_bp.py` example to demonstrate the above issues, namely the stdio clobbering in the REPL when multiple threads and/or a subactor try to debug simultaneously. \|_ run one thread using a task nursery to ensure it runs conc with the nursery's parent task. \|_ ensure the bg threads run conc a subactor usage of `.pause_from_sync()`. \|_ gravely detail all the special cases inside a TODO comment. \|_ add some control flags to `sync_pause()` helper and don't use `breakpoint()` by default. - extend and adjust `test_debugger.test_pause_from_sync` to match (and thus currently fail) by ensuring exclusive `PdbREPL` attachment when the 2 bg root-actor threads are concurrently interacting alongside the subactor: \|_ should only see one of the `_pause_msg` logs at a time for either one of the threads or the subactor. \|_ ensure each attaches (in no particular order) before expecting the script to exit. Impl adjustments to `.devx._debug`: - drop `Lock.repl`, no longer used. - add `Lock._owned_by_root: bool` for the `.ctx_in_debug == None` root-actor-task active case. - always `log.exception()` for any `._debug_lock.release()` ownership RTE emitted by `trio`, like we used to.. - add special `Lock.release()` log message for the stale lock but `._owned_by_root == True` case; oh yeah and actually `log.devx(message)`.. - rename `Lock.acquire()` -> `.acquire_for_ctx()` since it's only ever used from subactor IPC usage; well that and for local root-task usage we should prolly add a `.acquire_from_root_task()`? - buncha `._pause()` impl improvements: \|_ type `._pause()`'s `debug_func` as a `partial` as well. \|_ offer `called_from_sync: bool` and `called_from_bg_thread: bool` for the special case handling when called from `.pause_from_sync()` \|_ only set `DebugStatus.repl/repl_task` when `debug_func != None` (OW ensure the `.repl_task` is not the current one). \|_ handle error logging even when `debug_func is None`.. \|_ lotsa detailed commentary around root-actor-bg-thread special cases. - when `._set_trace(hide_tb=False)` do `pdbp.set_trace(frame=currentframe())` so the `._debug` internal frames are always included. - by default always hide tracebacks for `.pause[_from_sync]()` internals. - improve `.pause_from_sync()` to avoid root-bg-thread crashes: \|_ pass new `called_from_xxx_` flags and ensure `DebugStatus.repl_task` is actually set to the `threading.current_thread()` when needed. \|_ manually call `Lock._debug_lock.acquire_nowait()` for the non-bg thread case. \|_ TODO: still need to implement the bg-thread case using a bg `trio.Task`-in-thread with an `trio.Event` set by thread REPL exit.	2024-06-06 16:56:30 -04:00
Tyler Goodlet	8ea0f08386	Finally, officially support shielded REPL-ing! It's been a long time prepped and now finally implemented! Offer a `shield: bool` argument from our async `._debug` APIs: - `await tractor.pause(shield=True)`, - `await tractor.post_mortem(shield=True)` ^-These-^ can now be used inside cancelled `trio.CancelScope`s, something very handy when introspecting complex (distributed) system tear/shut-downs particularly under remote error or (inter-peer) cancellation conditions B) Thanks to previous prepping in a prior attempt and various patches from the rigorous rework of `.devx._debug` internals around typed msg specs, there ain't much that was needed! Impl deats - obvi passthrough `shield` from the public API endpoints (was already done from a prior attempt). - put ad-hoc internal `with trio.CancelScope(shield=shield):` around all checkpoints inside `._pause()` for both the root-process and subactor case branches. Add a fairly rigorous example, `examples/debugging/shielded_pause.py` with a wrapping `pexpect` test, `test_debugger.test_shield_pause()` and ensure it covers as many cases as i can think of offhand: - multiple `.pause()` entries in a loop despite parent scope cancellation in a subactor RPC task which itself spawns a sub-task. - a `trio.Nursery.parent_task` which raises, is handled and tries to enter and unshielded `.post_mortem()`, which of course internally raises `Cancelled` in a `._pause()` checkpoint, so we catch the `Cancelled` again and then debug the debugger's internal cancellation with specific checks for the particular raising checkpoint-LOC. - do ^- the latter -^ for both subactor and root cases to ensure we can debug `._pause()` itself when it tries to REPL engage from a cancelled task scope Bo	2024-05-30 17:52:24 -04:00
Tyler Goodlet	2f854a3e86	Add a `tractor.post_mortem()` API test + example Since turns out we didn't have a single example using that API Bo The test granular-ly checks all use cases: - `.post_mortem()` manual calls in both subactor and root. - ensuring built-in RPC crash handling activates after each manual one from ^. - drafted some call-stack frame checking that i commented out for now since we need to first do ANSI escape code removal due to the colorization that `pdbp` does by default. \|_ added a TODO with SO link on `assert_before()`. Also todo-staged a shielded-pause test to match with the already existing-but-needs-refinement example B)	2024-05-30 16:03:28 -04:00
Tyler Goodlet	27fd96729a	Tweaks to debugger examples Light stuff like comments, typing, and a couple API usage updates.	2024-05-28 09:22:59 -04:00
Tyler Goodlet	eca2c02f8b	Flip to `.pause()` in subactor bp example	2024-04-14 18:53:42 -04:00
Tyler Goodlet	72b4dc1461	Provision for infected-`asyncio` debug mode support It's almost there, we're just missing the final translation code to get from an `asyncio` side task to be able to call `.devx._debug..wait_for_parent_stdin_hijack()` to do root actor TTY locking. Then we just need to ensure internals also do the right thing with `greenback()` for equivalent sync `breakpoint()` style pause points. Since i'm deferring this until later, tossing in some xfail tests to `test_infected_asyncio` with TODOs for the needed implementation as well as eventual test org. By "provision" it means we add: - `greenback` init block to `_run_asyncio_task()` when debug mode is enabled (but which will currently rte when `asyncio` is detected) using `.bestow_portal()` around the `asyncio.Task`. - a call to `_debug.maybe_init_greenback()` in the `run_as_asyncio_guest()` guest-mode entry point. - as part of `._debug.Lock.is_main_trio_thread()` whenever the async-lib is not 'trio' error lock the backend name (which is obvi `'asyncio'` in this use case).	2024-03-25 16:09:32 -04:00
Tyler Goodlet	4f863a6989	Refine and test `tractor.pause_from_sync()` Now supports use from any `trio` task, any sync thread started with `trio.to_thread.run_sync()` AND also via `breakpoint()` builtin API! The only bit missing now is support for `asyncio` tasks when in infected mode.. Bo `greenback` setup/API adjustments: - move `._rpc.maybe_import_gb()` to -> `devx._debug` and factor out the cached import checking into a sync func whilst placing the async `.ensure_portal()` bootstrapping into a new async `maybe_init_greenback()`. - use the new init-er func inside `open_root_actor()` with the output predicating whether we override the `breakpoint()` hook. core `devx._debug` implementation deatz: - make `mk_mpdb()` only return the `pdp.Pdb` subtype instance since the sigint unshielding func is now accessible from the `Lock` singleton from anywhere. - add non-main thread support (at least for `trio.to_thread` use cases) to our `Lock` with a new `.is_trio_thread()` predicate that delegates directly to `trio`'s internal version. - do `Lock.is_trio_thread()` checks inside any methods which require special provisions when invoked from a non-main `trio` thread: - `.[un]shield_sigint()` methods since `signal.signal` usage is only allowed from cpython's main thread. - `.release()` since `trio.StrictFIFOLock` can only be called from a `trio` task. - rework `.pause_from_sync()` itself to directly call `._set_trace()` and don't bother with `greenback._await()` when we're already calling it from a `.to_thread.run_sync()` thread, oh and try to use the thread/task name when setting `Lock.local_task_in_debug`. - make it an RTE for now if you try to use `.pause_from_sync()` from any infected-`asyncio` task, but support is (hopefully) coming soon! For testing we add a new `test_debugger.py::test_pause_from_sync()` which includes a ctrl-c parametrization around the `examples/debugging/sync_bp.py` script which includes all currently supported/working usages: - `tractor.pause_from_sync()`. - via `breakpoint()` overload. - from a `trio.to_thread.run_sync()` spawn.	2024-03-22 19:58:25 -04:00
Tyler Goodlet	c04d77a3c9	First draft workin minus non-main-thread usage!	2024-03-20 19:13:13 -04:00
Tyler Goodlet	9221c57234	Adjust all `RemoteActorError.type` using tests To instead use the new `.boxed_type` B)	2024-03-19 18:08:54 -04:00
Tyler Goodlet	71de56b09a	Drop now-deprecated deps on modern `trio`/Python - `trio_typing` is nearly obsolete since `trio >= 0.23` - `exceptiongroup` is built-in to python 3.11 - `async_generator` primitives have lived in `contextlib` for quite a while!	2024-03-13 18:41:24 -04:00
Tyler Goodlet	ee87cf0e29	Add a debug-mode-breakpoint-causes-hang case! Only found this by luck more or less (while working on something in a client project) and it turns out we can actually get to (yet another) hang state where SIGINT will be ignored by the root actor on teardown.. I've added all the necessary logic flags to reproduce. We obviously need a follow up bug issue and a test suite to replicate! It appears as though the following are required based on very light tinkering: - infected asyncio mode active - debug mode active - the `trio` context must breakpoint before `.started()`-ing - the `asyncio` must not error	2023-06-21 14:07:31 -04:00
Tyler Goodlet	ebcb275cd8	Add (first-draft) infected-`asyncio` actor task uses debugger example	2023-06-21 14:07:31 -04:00
Tyler Goodlet	79622bbeea	Restore `breakpoint()` hook after runtime exits Previously we were leaking our (pdb++) override into the Python runtime which would always result in a runtime error whenever `breakpoint()` is called outside our runtime; after exit of the root actor . This explicitly restores any previous hook override (detected during startup) or deletes the hook and restores the environment if none existed prior. Also adds a new WIP debugging example script to ensure breakpointing works as normal after runtime close; this will be added to the test suite.	2023-05-15 00:47:29 -04:00
Tyler Goodlet	c47575997a	Expand nested case to include error prop and breakpointing	2022-10-14 19:42:23 -04:00
Tyler Goodlet	c0dd5d7ffc	Adjust multi-daemon test to be more deterministic	2022-10-14 19:42:23 -04:00
Tyler Goodlet	56c19093bb	Add basic module-not-found when opening a ctx eg.	2022-08-02 12:17:06 -04:00
Tyler Goodlet	dd23e78de1	Add back in async gen loop	2022-07-27 11:40:02 -04:00
Tyler Goodlet	bb732cefd0	Drop high log level in ctx example	2022-07-27 11:40:02 -04:00
Tyler Goodlet	21dccb2e79	A `.open_context()` example that causes a hang! Finally! I think this may be the root issue we've been seeing in production in a client project. No idea yet why this is happening but the fault-causing sequence seems to be: - `.open_context()` in a child actor - enter the debugger via `tractor.breakpoint()` - continue from that entry via `c` command in REPL - raise an error just after inside the context task's body Looking at logging it appears as though the child thinks it has the tty but no input is accepted on the REPL and a further `ctrl-c` results in some teardown but also a further hang where both parent and child become unresponsive..	2022-07-27 11:40:02 -04:00
Tyler Goodlet	99c4319940	Fix example name typo	2022-07-27 11:40:02 -04:00
Tyler Goodlet	42f9d10252	Add a pre-started breakpoint example	2022-07-27 11:40:02 -04:00
Tyler Goodlet	949aa9c405	Lol. should probably push the example code...	2021-12-10 12:48:05 -05:00
Tyler Goodlet	8ba10315c1	Fix type path to new `_supervise` mod	2021-10-23 15:54:40 -04:00
Tyler Goodlet	b14699d40b	Adjust debugger tests to expect depth > 1 crashes With the new fixes to the trio spawner we can expect that both root and depth > 1 nursery owning actors will now not clobber any children that are in debug (either via breakpoint or through crashing). The tests changed now include more checks which ensure the 2nd level parent-ish actors also bubble up through into `pdb` and don't kill any of their (crashed) children before they're done themselves debugging.	2021-10-14 13:39:46 -04:00
Tyler Goodlet	674fbbc6b3	Docs and comments tidying	2021-08-01 10:44:13 -04:00
Tyler Goodlet	13b76c9439	Add fast fail test using the context api	2021-07-31 12:46:40 -04:00
Tyler Goodlet	63bdddd0c9	Add debug example that causes pdb stdin clobbering	2021-07-31 12:46:40 -04:00
Tyler Goodlet	98a0594c26	Drop `tractor.run()` from all examples	2021-05-07 11:21:40 -04:00
Tyler Goodlet	5a5e6baad1	Update all examples to new streaming API	2021-04-28 12:23:08 -04:00
Tyler Goodlet	b7b2436bc1	Remove tractor run from some debug examples	2021-02-24 13:14:40 -05:00
Tyler Goodlet	5127effd88	Drop warning level logging assert(s)	2020-12-26 15:45:55 -05:00
Tyler Goodlet	a668f714d5	Allow passing function refs to `Portal.run()` This resolves and completes #69 allowing all RPC invocation APIs to pass function references directly instead of explicit `str` names for the target namespace and function (this is still done implicitly underneath). This brings us closer to `trio`'s task running API as well as acknowledges that any inter-host RPC system (and API) will likely need to be implemented on top of local RPC primitives anyway. Even if this ends up not being true we can always go to "function stubs" as part of our IAC protocol or, add a new method to do explicit namespace calls: `.run_from_module()` or whatever everyone votes on. Resolves #69 Further, this commit drops `Actor.statespace` from the entire system since a user can easily get this same functionality using module level variables. Fix docs to match all these changes (luckily mostly already done due to example scripts referencing).	2020-12-21 09:09:55 -05:00
Tyler Goodlet	0118589875	Add race case handling for mp backend	2020-12-12 13:30:14 -05:00
Tyler Goodlet	a8406c8626	Toss in another tests with daemon subactors	2020-10-15 23:16:56 -04:00
Tyler Goodlet	acb4cb0b2b	Add test showing issue with child in tty lock when cancelled	2020-10-07 06:08:31 -04:00
Tyler Goodlet	abf8bb2813	Add a deep nested error propagation test	2020-10-06 09:21:53 -04:00
Tyler Goodlet	371025947a	Add a multi-subactor test where the root errors	2020-10-05 11:58:58 -04:00
Tyler Goodlet	31c1a32d58	Add re-entrant root breakpoint test; demonstrates a bug..	2020-10-05 11:58:58 -04:00
Tyler Goodlet	e387e8b322	Add a multi-subactor test with nesting	2020-10-05 11:58:58 -04:00
Tyler Goodlet	73a32f7d9c	Add initial subactor debug tests	2020-10-05 11:58:58 -04:00
Tyler Goodlet	0a2a94fee0	Add initial root actor debugger tests	2020-10-05 11:58:58 -04:00

47 Commits (61cd95d883bbcbe1ca0dd6deb3829b29f29cfeff)