tractor

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	05b143d9ef	Big debugger rework, more tolerance for internal err-hangs Since i was running into them (internal errors) during lock request machinery dev and was getting all sorts of difficult to understand hangs whenever i intro-ed a bug to either side of the ipc ctx; this all while trying to get the msg-spec working for `Lock` requesting subactors.. Deats: - hideframes for `@acm`s and `trio.Event.wait()`, `Lock.release()`. - better detail out the `Lock.acquire/release()` impls - drop `Lock.remote_task_in_debug`, use new `.ctx_in_debug`. - add a `Lock.release(force: bool)`. - move most of what was `_acquire_debug_lock_from_root_task()` and some of the `lock_tty_for_child().__a[enter/exit]()` logic into `Lock.[acquire/release]()` including bunch more logging. - move `lock_tty_for_child()` up in the module to below `Lock`, with some rework: - drop `subactor_uid: tuple` arg since we can just use the `ctx`.. - add exception handler blocks for reporting internal (impl) errors and always force release the lock in such cases. - extend `DebugStatus` (prolly will rename to `DebugRequest` btw): - add `.req_ctx: Context` for subactor side. - add `.req_finished: trio.Event` to sub to signal request task exit. - extend `.shield_sigint()` doc-str. - add `.release()` to encaps all the state mgmt previously strewn about inside `._pause()`.. - use new `DebugStatus.release()` to replace all the duplication: - inside `PdbREPL.set_[continue/quit]()`. - inside `._pause()` for the subactor branch on internal repl-invocation error cases, - in the `_enter_repl_sync()` closure on error, - replace `apply_debug_codec()` -> `apply_debug_pldec()` in tandem with the new `PldRx` sub-sys which handles the new `__pld_spec__`. - add a new `pformat_cs()` helper orig to help debug cs stack a corruption; going to move to `.devx.pformat` obvi. - rename `wait_for_parent_stdin_hijack()` -> `request_root_stdio_lock()` with improvements: - better doc-str and add todos, - use `DebugStatus` more stringently to encaps all subactor req state. - error handling blocks for cancellation and straight up impl errors directly around the `.open_context()` block with the latter doing a `ctx.cancel()` to avoid hanging in the shielded `.req_cs` scope. - similar exc blocks for the func's overall body with explicit `log.exception()` reporting. - only set the new `DebugStatus.req_finished: trio.Event` in `finally`. - rename `mk_mpdb()` -> `mk_pdb()` and don't cal `.shield_sigint()` implicitly since the caller usage does matter for this. - factor out `any_connected_locker_child()` from the SIGINT handler. - rework SIGINT handler to better handle any stale-lock/hang cases: - use new `Lock.ctx_in_debug: Context` to detect subactor-in-debug. and use it to cancel any lock request instead of the lower level - use `problem: str` summary approach to log emissions. - rework `_pause()` given all of the above, stuff not yet mentioned: - don't take `shield: bool` input and proxy to `debug_func()` (for now). - drop `extra_frames_up_when_async: int` usage, expect `**debug_func_kwargs` to passthrough an `api_frame: Frametype` (more on this later). - lotsa asserts around the request ctx vs. task-in-debug ctx using new `current_ipc_ctx()`. - asserts around `DebugStatus` state. - rework and simplify the `debug_func` hooks, `_set_trace()`/`_post_mortem()`: - make them accept a non-optional `repl: PdbRepl` and `api_frame: FrameType` which should be used to set the current frame when the REPL engages. - always hide the hook frames. - always accept a `tb: TracebackType` to `_post_mortem()`. \|_ copy and re-impl what was the delegation to `pdbp.xpm()`/`pdbp.post_mortem()` and instead call the underlying `Pdb.interaction()` ourselves with a `caller_frame` and tb instance. - adjust the public `.pause()` impl: - accept optional `hide_tb` and `api_frame` inputs. - mask opening a cancel-scope for now (can cause `trio` stack corruption, see notes) and thus don't use the `shield` input other then to eventually passthrough to `_post_mortem()`? \|_ thus drop `task_status` support for now as well. \|_ pretty sure correct soln is a debug-nursery around `._invoke()`. - since no longer using `extra_frames_up_when_async` inside `debug_func()`s ensure all public apis pass a `api_frame`. - re-impl our `tractor.post_mortem()` to directly call into `._pause()` instead of binding in via `partial` and mk it take similar input as `.pause()`. - drop `Lock.release()` from `_maybe_enter_pm()`, expose and pass expected frame and tb. - use necessary changes from all the above within `maybe_wait_for_debugger()` and `acquire_debug_lock()`. Lel, sorry thought that would be shorter.. There's still a lot more re-org to do particularly with `DebugStatus` encapsulation but it's coming in follow up.	2024-05-08 11:44:55 -04:00
Tyler Goodlet	a73b24cf4a	First draft, sub-msg-spec for debugger `Lock` sys Since it's totes possible to have a spec applied that won't permit `str`s, might as well formalize a small msg set for subactors to request the tree-wide TTY `Lock`. BTW, I'm prolly not going into every single change here in this first WIP since there's still a variety of broken stuff mostly to do with races on the codec apply being done in a `trio.lowleve.RunVar`; it should be re-done with a `ContextVar` such that each task does NOT mutate the global setting.. New msg set and usage is simply: - `LockStatus` which is the reponse msg delivered from `lock_tty_for_child()` - `LockRelease` a one-off request msg from the subactor to drop the `Lock` from a `MsgStream.send()`. - use these msgs throughout the root and sub sides of the locking ctx funcs: `lock_tty_for_child()` & `wait_for_parent_stdin_hijack()` The codec is now applied in both the root and sub `Lock` request tasks: - for root inside `lock_tty_for_child()` before the `.started()`. - for subs, inside `wait_for_parent_stdin_hijack()` since we only want to affect the codec for the locking task. - (hence the need for ctx-var as mentioned above but currently this can cause races which will break against other app tasks competing for the codec setting). - add a `apply_debug_codec()` helper for use in both cases. - add more detailed logging to both the root and sub side of `Lock` requesting funcs including requiring that the sub-side task "uid" (a `tuple[str, int]` = (trio.Task.name, id(trio.Task)` be provided (more on this later). A main issue discovered while proto-testing all this was the ability of a sub to "double lock" (leading to self-deadlock) via an error in `wait_for_parent_stdin_hijack()` which, for ex., can happen in debug mode via crash handling of a `MsgTypeError` received from the root during a codec applied msg-spec race! Originally I was attempting to solve this by making the SIGINT override handler more resilient but this case is somewhat impossible to detect by an external root task other then checking for duplicate ownership via the new `subactor_task_uid`. => SO NOW, we always stick the current task uid in the `Lock._blocked: set` and raise an rte on a double request by the same remote task. Included is a variety of small refinements: - finally figured out how to mark a variety of `.__exit__()` frames with `pdbp.hideframe()` to actually hide them B) - add cls methods around managing `Lock._locking_task_cs` from root only. - re-org all the `Lock` attrs into those only used in root vs. subactors and proto-prep a new `DebugStatus` actor-singleton to be used in subs. - add a `Lock.repr()` to contextually print the current conc primitives. - rename our `Pdb`-subtype to `PdbREPL`. - rigor out the SIGINT handler a bit, originally to try and hack-solve the double-lock issue mentioned above, but now just with better logging and logic for most (all?) possible hang cases that should be hang-recoverable after enough ctrl-c mashing by the user.. well hopefully: - using `Lock.repr()` for both root and sub cases. - lots more `log.warn()`s and handler reversions on stale lock or cs detection. - factor `._pause()` impl a little better moving the actual repl entry to a new `_enter_repl_sync()` (originally for easier wrapping in the sub case with `apply_codec()`).	2024-04-16 13:25:19 -04:00
Tyler Goodlet	1f7f84fdfa	Mk debugger tests work for arbitrary pre-REPL format Since this was changed as part of overall project wide logging format updates, and i ended up changing the both the crash and pause `.pdb()` msgs to include some multi-line-ascii-"stuff", might as well make the pre-prompt checks in the test suite more flexible to match. As such, this exposes 2 new constants inside the `.devx._debug` mod: - `._pause_msg: str` for the pre `tractor.pause()` header emitted via `log.pdb()` and, - `._crash_msg: str` for the pre `._post_mortem()` equiv when handling errors in debug mode. Adjust the test suite to use these values and thus make us more capable to absorb changes in the future as well: - add a new `in_prompt_msg()` predicate, very similar to `assert_before()` but minus `assert`s which takes in a `parts: list[str]` to match in the pre-prompt stdout. - delegate to `in_prompt_msg()` in `assert_before()` since it was mostly duplicate minus `assert`. - adjust all previous `<patt> in before` asserts to instead use `in_prompt_msg()` with separated pre-prompt-header vs. actor-name `parts`. - use new `._pause/crash_msg` values in all such calls including any `assert_before()` cases.	2024-03-05 12:22:04 -05:00
Tyler Goodlet	bf0739c194	Add `stackscope` tree pprinter triggered by SIGUSR1 Can be optionally enabled via a new `enable_stack_on_sig()` which will swap in the SIGUSR1 handler. Much thanks to @oremanj for writing this amazing project, it's thus far helped me fix some very subtle hangs inside our new IPC-context cancellation machinery that would have otherwise taken much more manual pdb-ing and hair pulling XD Full credit for `dump_task_tree()` goes to the original project author with some minor tweaks as was handed to me via the trio-general matrix room B) Slight changes from orig version: - use a `log.pdb()` emission to pprint to console - toss in an ex sh CLI cmd to trigger the dump from another terminal using `kill` + `pgrep`.	2024-02-20 09:05:34 -05:00
Tyler Goodlet	e94f1261b5	Move `maybe_open_crash_handler()` CLI `--pdb`-driven wrapper to debug mod	2023-10-02 18:10:34 -04:00
Tyler Goodlet	fa9a9cfb1d	Kick off `.devx` subpkg for our dev tools B) Where `.devx` is "developer experience", a hopefully broad enough subpkg name for all the slick stuff planned to augment working on the actor runtime 💥 Move the `._debug` module into the new subpkg and adjust rest of core code base to reflect import path change. Also add a new `.devx._debug.open_crash_handler()` manager for wrapping any sync code outside a `trio.run()` which is handy for eventual CLI addons for popular frameworks like `click`/`typer`.	2023-09-28 14:14:50 -04:00

6 Commits (05b143d9ef874e420d154a56fce00044de1b43a2)