Commit Graph

84 Commits (326b258fd56c2468464b3173b3cac12b2d3f9cc0)

Author SHA1 Message Date
Tyler Goodlet 4569d11052 Move `.is_multi_cancelled()` to `.trioniics._beg`
Since it's for beg filtering, the current impl should be renamed anyway;
it's not just for filtering cancelled excs.

Deats,
- added a real doc string, links to official eg docs and fixed the
  return typing.
- adjust all internal imports to match.
2025-07-16 15:49:18 -04:00
Tyler Goodlet f5056cdd02 Mk `test_crash_handler_cms` suite go green
Turns out there were some subtle internal bugs discovered by the just
added `tests/devx/test_tooling::test_crash_handler_cms` suite.

So this fixes,

- a mis-ordering around `rt_repl_fixture :=` in the logic of
  `DebugStatus.maybe_enter_repl_fixture()`.

- `.devx.debug._post_mortem._post_mortem()` ensuring we now **always**
  call `DebugStatus.release()`, and thus unwind the exist-stack managing
  the `repl_fixture` exit/teardown, **even in the case** where
  `yield False` is delivered from the user-fixture-fn (meaning
  `dnter_repl=False`) thus triggering an early `return` (as is done in
  the new test suite).
2025-07-14 18:07:50 -04:00
Tyler Goodlet d000642462 Report `enable_stack_on_sig` on `stackscope` import failure 2025-07-14 13:15:07 -04:00
Tyler Goodlet 5b69975f81 Drop "
" from tail of `BoxedMaybeException.pformat()`
2025-07-14 00:00:13 -04:00
Tyler Goodlet 8d506796ec Re-impl as `DebugStatus.maybe_enter_repl_fixture()`
Dropping the `_maybe_open_repl_fixture()` approach and instead using
a `DebugStatus._fixture_stack = ExitStack()` which provides for much
simpler support around both sync and async pausing APIs thanks to only
invoking `repl_fixture.__exit__()` on actual `PdbREPL` interaction being
complete!

Deats,
- all `repl_fixture` detection logic still happens in one place (the new
  method) but we aren't limited to closing it via an immediate post REPL
  `.__exit__()` call which instead is triggered by,
- `DebugStatus.release()` which now calls `._fixture_stack.close()` and
  thus only invokes `repl_fixture.__exit__()` when user REPL-ing is
  **actually complete** an arbitrary amount of debugging time later.
- include the notes for `@acm` support above the new method, though not
  sure if they're as relevant any more?

Benefits,
- we can drop the previously added indent levels from
  `_enter_repl_sync()` and `_post_mortem()`.
- now we automatically have support for the `.pause_from_sync()` API
  since `_enter_repl_sync()` doesn't close the prior
  `_maybe_open_repl_fixture()` immediately when `debug_func=None`; the
  user's `__exit__()` is only ever called once `.release()` is.

Other,
- add big 'CASE' comments around the various blocks in
  `.pause_from_sync()`, i was having trouble figuring out which i was
  using from a `breakpoint()` in a dependent app..
2025-07-14 00:00:12 -04:00
Tyler Goodlet 02d03ce700 Always pass `repl: PdbREPL` as first param to fixture 2025-07-14 00:00:12 -04:00
Tyler Goodlet 116137d066 Reorg `.devx.debug` into sub-mods!
Which cleans out the pkg-mod to just the expected exports with (its
longstanding todo comment list) and thus a separation-of-concerns
and smaller mod-file sizes via the following new sub-mods:
- `._trace` for the `.pause()`/`breakpoint()`/`pdb.set_trace()`-style
  APIs including all sync-caller variants.
- `._post_mortem` to contain our async `.post_mortem()` and all other
  public crash handling APIs for use from sync callers.
- `._sync` for the high-level syncing helper-routines used throughout the
  runtime to avoid multi-proc TTY use collisions.

And also,
- remove `hide_runtime_frames()` since moved to `.devx._frame_stack`.
2025-07-14 00:00:12 -04:00
Tyler Goodlet 7f87b4e717 Mv `.hide_runtime_frames()` -> `.devx._frame_stack`
A much more relevant module for a call-stack-frame hider ;)
2025-07-14 00:00:12 -04:00
Tyler Goodlet e4758550f7 Start splitting into `devx.debug.` sub-mods
From what was originall the `.devx._debug` monolith module, since that
file was way out of ctl in terms of LoC!

New modules so far include,
- ._repl: our `pdb[p]` ext type/lowlevel-APIs and `mk_pdb()` factory.
- ._sigint: just our REPL-interaction shield-handler.
- ._tty_lock: containing all the root-actor TTY mutex machinery
  including the `Lock`/`DebugStatus` primitives/APIs as well as the
  inter-tree IPC context eps:
  * the server-side `lock_stdio_for_peer()` which pairs with the,
  * client-(subactor)-side `request_root_stdio_lock()` via the,
  * pld-msg-spec of `LockStatus/LockRelease`.
  AND the `any_connected_locker_child()` predicate.
2025-07-14 00:00:12 -04:00
Tyler Goodlet a7efbfdbc2 Add `_maybe_open_repl_fixture()`
Factoring the (basically duplicate) content from both use spots into
a common `@cm` which delivers a `bool` signalling whether the REPL
should be engaged. Fixes a lingering bug with `nullcontext()` calling
btw..
2025-07-14 00:00:12 -04:00
Tyler Goodlet 1c6660c497 Mk `.devx._debug` a sub-pkg `.devx.debug`
With plans for much factoring of the original module into sub-mods!
Adjust all imports and refs throughout to match.
2025-07-14 00:00:12 -04:00
Tyler Goodlet 202befa360 Add exc suppression to `open_crash_handler()`
By supporting a new optional param to `open_crash_handler()`,
`raise_on_exit: bool|Sequence[Type[BaseException]] = True` which
determines whether, after the REPL interaction completes, the handled
exception is raised upward. This is **very** handy for writing bits of
"debug-able but resilient code" as is the case in (many) dependent
projects/apps.

Impl,
- `raise_on_exit` can be a `bool` or (set) sequence of types which will
  always be raised.
- also add a `BoxedMaybeException.raise_on_exit` equiv which (for now)
  we check matches (in case down the road we want to offer dynamic ctls).
- rename both crash-handler cm's `tb_hide` -> `hide_tb`.
2025-07-14 00:00:12 -04:00
Tyler Goodlet c24708b273 Add initial `repl_fixture` support B)
It turns out to be fairly useful to allow hooking into a given actor's
entry-and-exit around `.devx._debug._pause/._post_mortem()` calls which
engage the `pdbp.Pdb` REPL (really our `._debug.PdbREPL` but yeah).

Some very handy use cases include,
- swapping out-of-band (config) state that may otherwise halt the
  user's app since the actor normally handles kb&mouse input, in thread,
  which means that the handler will be blocked while the REPL is in use.
- (remotely) reporting actor-runtime state for monitoring purposes
  around crashes or pauses in normal operation.
- allowing for crash-handling to be hard-disabled via
  `._state._runtime_vars` say for when you never want a debugger to be
  entered in a production instance where you're not-sure-if/don't-want
  per-actor `debug_mode: bool` settings to always be unset, say bc
  you're still debugging some edge cases that ow you'd normally want to
  REPL up.

Impl details,
- add a new optional `._state._runtime_vars['repl_fixture']` field which
  for now can be manually set; i saw no reason for a formal API yet
  since we want to convert the `dict` to a struct anyway (first).
- augment both `.devx._debug._pause()/._post_mortem()` with a new
  optional `repl_fixture: AbstractContextManager[bool]` kwarg which
  when provided is `with repl_fixture()` opened around the lowlevel
  REPL interaction calls; if the enter-result, an expected `bool`, is
  `False` then the interaction is hard-bypassed.
  * for the `._pause()` case the `@cm` is opened around the entire body
    of the embedded `_enter_repl_sync()` closure (for now) though
    ideally longer term this entire routine is factored to be a lot less
    "nested" Bp
  * in `_post_mortem()` the entire previous body is wrapped similarly
    and also now excepts an optional `boxed_maybe_exc: BoxedMaybeException`
    only passed in the `open_crash_handler()` caller case.
- when the new runtime-var is overridden, (only manually atm) it is used
  instead but only whenever the above `repl_fixture` kwarg is left null.
- add a `BoxedMaybeException.pformat() = __repr__()` which when
  a `.value: Exception` is set renders a more "objecty" repr of the exc.

Obviously tests for all this should be coming soon!
2025-07-14 00:00:12 -04:00
Tyler Goodlet 6cb361352c Use `.is_debug_mode()` for maybe-crash-handling
Such that the default is `None` and in the case where the caller *does
not* set the `pdb` arg to an explicit `bool` we instead determine it via
the output from `._state.is_debug_mode()` allowing for more "nonchalant"
usage throughout a (test) code base which passes the `debug_mode: bool`
as runtime config; allows delegation to the per-actor proc-global state.
2025-07-14 00:00:12 -04:00
Tyler Goodlet b09e35f3dc Mv in `modden.repr` content: some `reprlib`-utils
Since I'd like to use some `reprlib` formatting which `modden` already
implemented (and it's a main dependee project), figured I'd just bring
it all into `.devx.pformat` for now.
2025-07-13 23:33:47 -04:00
Tyler Goodlet fc57a4d639 Formally add a `nest_from_op()` for "sclang"-fmting
Moving it from where i (oddly) first wrote it up in `._entry` to a more
proper place with its pals in `.devx.pformat` ;p

Iface summary, what caller provides:
- `input_op`: a "sclang" chars-symbol to represent the conc "operation",
- `text`: the "entity" being *operated on*,
- `nest_prefix/indent`: what the ^ will be prefixed with.
- `prefix_op: bool` so that, when unset, the `input_op` is instead
   used as a suffix to the first line of `text`.
- `next_indent: int|None = None` such that when set (and not null) we
   use that exact ws-indent instead of calculating it from the
   `len(nest_prefix)` allowing for specifying a `0`-indent easily.
  - includes logic where we either get a non-zero value and
    apply it strictly to both the `nest_prefix` and `text`, OR we
    auto-calc it from any `nest_prefix`, NOT a conflation of both..
- `op_suffix: str = '\n'` for instead of assuming `f'{input_op}\n'`.
- `rm_from_first_ln: str` which allows removing chars from the
  first line of `text` after a `str.strip()` (handy for removing
  the '<Channel' first chevron from type-reprs).

There's also a huge comment-doc for "sclang" into the fn body which is
the terrible "primer" on this whole idea for the moment XD
2025-07-13 23:27:03 -04:00
Tyler Goodlet c2e7dc7407 Avoid silent `stackscope`-test fail due to dep
Oddly my env was borked bc a missing sub-dep (`typing-extensions`
apparently not added by `uv` for `stackscope`?) and then `stackscope`
was silently failing import and caused the shield-pause test to also
fail (since it couldn't match the expected `log.devx()` on console). The
import failure is not very explanatory due to the `log.warning()`;
change it to `.error()` level.

Also, explicitly import `_sync_pause_from_builtin` in
`examples/debugging/restore_builtin_breakpoint.py` to ensure the ref is
exported properly from `.devx.debug` (which it wasn't during dev of the
prior commit Bp).
2025-07-13 13:45:15 -04:00
Tyler Goodlet f67b0639b8 Move peer-tracking attrs from `Actor` -> `IPCServer`
Namely transferring the `Actor` peer-`Channel` tracking attrs,
- `._peers` which maps the uids to client channels (with duplicates
  apparently..)
- the `._peer_connected: dict[tuple[str, str], trio.Event]` child-peer
  syncing table mostly used by parent actors to wait on sub's to connect
  back during spawn.
- the `._no_more_peers = trio.Event()` level triggered state signal.

Further we move over with some minor reworks,
- `.wait_for_peer()` verbatim (adjusting all dependants).
- factor the no-more-peers shielded wait branch-block out of
  the end of `async_main()` into 2 new server meths,
  * `.has_peers()` with optional chan-connected checking flag.
  * `.wait_for_no_more_peers()` which *just* does the
    maybe-shielded `._no_more_peers.wait()`
2025-07-08 18:05:05 -04:00
Tyler Goodlet 0711576678 Passthrough `_pause()` kwargs from `_maybe_enter_pm()` 2025-07-08 18:05:05 -04:00
Tyler Goodlet 00d8a2a099 Improve `TransportClosed.__repr__()`, add `src_exc`
By borrowing from the implementation of `RemoteActorError.pformat()`
which is now factored into a new `.devx.pformat_exc()` and re-used for
both error types while maintaining the same func-sig. Obviously delegate
`RemoteActorError.pformat()` to the new helper accordingly and keeping
the prior `body` generation from `.devx.pformat_boxed_tb()` as before.

The new helper allows for,
- passing any of a `header|message|body: str` which are all combined in
  that order in the final output.
- getting the `exc.message` as the default `message` part.
- generating an objecty-looking "type-name" header to be rendered by
  default when `header` is not overridden.
- "first-line-of `message`" processing which we split-off and then
  re-inject as a `f'<{type(exc).__name__}( {first} )>'` top line header.
- an optional `tail: str = '>'` to "close the object"-look only added
  when `with_type_header: bool = True`.

Adjustments to `TransportClosed` around this include,
- replacing the init `cause` arg for a `src_exc` which is now always
  assigned to a same named instance var.
- displaying that new `.src_exc` in the `body: str` arg to the
  `.devx.pformat.pformat_exc()` call so you can always see the
  underlying (normally `trio`) source error.
- just make it inherit from `Exception` not `trio.BrokenResourceError`
  to avoid handlers catching `TransportClosed` as the former
  particularly in testing when we want to sometimes to distinguish them.
2025-07-08 18:05:05 -04:00
Tyler Goodlet acac605c37 Move `DebugRequestError` to `._exceptions` 2025-07-08 18:05:04 -04:00
Guillermo Rodriguez eceb292415 move tractor._ipc.py into tractor.ipc._chan.py 2025-07-08 12:57:28 -04:00
Tyler Goodlet 2d18e6a4be Match `maybe_open_crash_handler()` to non-maybe version
Such that it will deliver a `BoxedMaybeException` to the caller
regardless whether `pdb` is set, and proxy through all `**kwargs`.
2025-03-27 13:38:47 -04:00
Tyler Goodlet 8b4ed31d3b Handle egs on failed `request_root_stdio_lock()`
Namely when the subactor fails to lock the root, in which case we
try to be very verbose about how/what failed in logging as well
as ensure we cancel the employed IPC ctx.

Implement the outer `BaseException` handler to handle both styles,
- match on an eg (or the prior std cancel excs) only raising a lone
  sub-exc from for former.
- always `as _req_err:` and assign to a new func-global `req_err`
  to enable the above matching.

Other,
- raise `DebugStateError` on `status.subactor_uid != actor_uid`.
- fix a `_repl_fail_report` ref error due to making silly assumptions
  about the `_repl_fail_msg` global; now copy from global as default.
- various log-fmt and logic expression styling tweaks.
- ignore `trio.Cancelled` by default in `open_crash_handler()`.
2025-03-27 13:38:47 -04:00
Tyler Goodlet 03406e020c Repair/update `stackscope` test
Seems that on 3.13 it's not showing our script code in the output now?
Gotta get an example for @oremanj to see what's up but really it'd be
nice to just custom format stuff above `trio`'s runtime by def..

Anyway, update the `.devx._stackscope`,
- log formatting to be a little more "sclangy" lookin.
- change the per-actor "delimiter" lines style.
- report the `signal.getsignal(SIGINT)` which i needed in the
  `sync_bp.py` with ctl-c causing a hang..
- mask the `_tree_dumped` duplicator log report as well as the "dumped
  fine" one.
- add an example `pkill --signal SIGUSR1` cmdline.

Tweak the test to cope with,
- not showing our script lines now.. which i've commented in the
  `assert_before()` patts..
- to expect the newly formatted delimiter (ascii) lines to separate the
  root vs. hanger sub-actor sections.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 3345962253 Yield a boxed-maybe-error from `open_crash_handler()`
Along the lines of something like `pytest.raises()` where the handled
exception can be inspected from the `pdbp` REPL using its `.value` field
B)

This is super handy in particular for understanding
`BaseException[Group]`s without manually adding surrounding handler code
to assign the `except[*] Exception as exc_var:` particularly when trying
to understand multi-cancelled eg trees.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 2fd9c0044b Drop extra nl from boxed error fmt 2025-03-27 13:24:25 -04:00
Tyler Goodlet 79f4197d26 Raise explicitly on missing `greenback` portal
When `.pause_from_sync()` is called from an `asyncio.Task` which was
never bestowed a portal we want to be mega pedantic about it; indicate
that the task was NOT spawned from our `.to_asyncio` API and likely by
some out-of-our-control code (normally using
`asyncio.ensure_future()/.create_task()`). Though `greenback` already
errors on such usage, it's not always clear why no portal exists;
explaining the situation of a 3rd-party-bg-spawned-task should avoid
dev confusion for most cases.

Impl deats,
- distinguish between an actor in infected mode versus the actual caller
  of `.pause_from_sync()` being an `asyncio.Task` with more explicit
  `asyncio_task` and `is_infected_aio` vars.
- ONLY in the case of being both an infected-mode-actor AND detecting
  that the caller is an `asyncio.Task`, check `greenback.has_portal()`
  such that when not bestowed we presume the aforementioned
  3rd-party-bg-task case above and raise a new explicit RTE with
  a detailed explanatory message.
- add some masked draft code for handling the speical case of a root
  actor `asyncio.Task` caller which could (in theory) not actually
  require gb portal use since the `Lock` can be acquired directly
  without IPC.
 |_this will likely require factoring of various pause machinery funcs
   into a `_pause_from_root_task()` to mk the impl sane XD

Other,
- expose a new `debug_filter: Callable` which can be provided by the
  caller of `_maybe_enter_pm()` to predicate whether to enter the
  debugger REPL based on the caught `BaseException|BaseExceptionGroup`;
  this is handy for customizing the meaning of "graceful cancellations"
  so as to avoid crash handling on expected egs of more then
  `trioCancelled`.
|_ make the default as it was implemented: `not is_multi_cancelled(err)`
- pass-through a new `ignore: set[BaseException]` as
  `open_crash_handler(ignore_nested=ignore)` to allow for the same
  silent-cancellation-egs-swallowing as desired from outside the actor
  runtime.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 9af2a4e739 Add TODO for a tb frame "filterer" sys.. 2025-03-27 13:24:25 -04:00
Tyler Goodlet 61c5613943 Support custom `boxer_header: str` provided by `pformat_boxed_tb()` caller 2025-03-27 13:24:25 -04:00
Tyler Goodlet 5b29dd5d2b Expose a `_ctlc_ignore_header: str` for use in `sigint_shield()` 2025-03-27 13:24:25 -04:00
Tyler Goodlet bb60a6d623 Messy-teardown `DebugStatus` related fixes
Mostly fixing edge cases with `asyncio` and/or bg threads where the
`.repl_release: trio.Event` needs to be used from the main `trio`
thread OW confusing-but-valid teardown tracebacks can show under various
races.

Also improve,
- log reporting for such internal bugs to make them more obvious on
  console via `log.exception()`.
- only restore the SIGINT handler when runtime is (still) active.
- reporting when `tractor.pause(shield=True)` should be used and
  unhiding the internal frames from the tb in that case.
- for `pause_from_sync()` some deep fixes..
 |_add a `allow_no_runtime: bool = False` flag to allow
   **not** requiring the actor runtime to be active.
 |_fix the `greenback` case-branch to only trigger on `not
   is_trio_thread`.
 |_add a scope-global `repl_owner: Task|Thread|None = None` to
   avoid ref errors..
2025-03-27 13:24:25 -04:00
Tyler Goodlet 6ef06be6d0 More `.pause_from_sync()` in bg-threads "polish"
Various `try`/`except` blocks around external APIs that raise when not
running inside an `tractor` and/or some async framework (mostly to avoid
too-late/benign error tbs on certain classes of actor tree teardown):
- for the `log.pdb()` prompts emitted before REPL console entry.
- inside `DebugStatus.is_main_trio_thread()`'s call to `sniffio`.
- in `_post_mortem()` by catching `NoRuntime` when called from a thread
  still active after the `.open_root_actor()` has already exited.

Also,
- create a dedicated `DebugStateError` for raising instead of `assert`s
  when we have actual debug-request inconsistencies (as seem to be most
  likely with bg thread usage of `breakpoint()`).
- show the `open_crash_handler()` frame on `bdb.BdbQuit` (for now?)
2025-03-27 13:24:25 -04:00
Tyler Goodlet f8222356ce Hide `[maybe]_open_crash_handler()` frame by default 2025-03-27 13:24:25 -04:00
Tyler Goodlet 4b9d638be9 Use our `._post_mortem` from `open_crash_handler()`
Since it seems that `pdbp.xpm()` can sometimes lose the up-stack
traceback info/frames? Not sure why but ours seems to work just fine
from a `asyncio`-handler in `modden`'s use of `i3ipc` B)

Also call `DebugStatus.shield_sigint()` from `pause_from_sync()` in the
infected-`asyncio` case to get the same shielding behaviour as in all
other usage!
2025-03-27 13:24:25 -04:00
Tyler Goodlet 6b18fcd437 First draft, `asyncio`-task, sync-pausing Bo
Mostly due to magic from @oremanj where we slap in a little bit of
`.from_asyncio`-type stuff to run a `trio`-task from `asyncio.Task`
code!

I'm not gonna go into tooo too much detail but basically the primary
thing needed was a way to (blocking-ly) invoke a `trio.lowlevel.Task`
from an `asyncio` one (which we now have with a new
`run_trio_task_in_future()` thanks to draft code from the aforementioned
jefe) which we now invoke from a dedicated aio case-branch inside
`.devx._debug.pause_from_sync()`. Further include a case inside
`DebugStatus.release()` to handle using the same func to set the
`repl_release: trio.Event` from the aio side when releasing the REPL on
exit cmds.

Prolly more refinements to come ;{o
2025-03-27 13:24:25 -04:00
Tyler Goodlet 64d506970a Officially test proto-ed `stackscope` integration
By re-purposing our `pexpect`-based console matching with a new
`debugging/shield_hang_in_sub.py` example, this tests a few "hanging
actor" conditions more formally:

- that despite a hanging actor's task we can dump
  a `stackscope.extract()` tree on relay of `SIGUSR1`.
- the actor tree will terminate despite a shielded forever-sleep by our
  "T-800" zombie reaper machinery activating and hard killing the
  underlying subprocess.

Some test deats:
- simulates the expect actions of a real user by manually using
  `os.kill()` to send both signals to the actor-tree program.
- `pexpect`-matches against `log.devx()` emissions under normal
  `debug_mode == True` usage.
- ensure we get the actual "T-800 deployed" `log.error()` msg and
  that the actor tree eventually terminates!

Surrounding (re-org/impl/test-suite) changes:
- allow disabling usage via a `maybe_enable_greenback: bool` to
  `open_root_actor()` but enable by def.
- pretty up the actual `.devx()` content from `.devx._stackscope`
  including be extra pedantic about the conc-primitives for each signal
  event.
- try to avoid double handles of `SIGUSR1` even though it seems the
  original (what i thought was a) problem was actually just double
  logging in the handler..
  |_ avoid double applying the handler func via `signal.signal()`,
  |_ use a global to avoid double handle func calls and,
  |_ a `threading.RLock` around handling.
- move common fixtures and helper routines from `test_debugger` to
  `tests/devx/conftest.py` and import them for use in both test mods.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 92713af63e Get multi-threaded sync-pausing fully workin!
The final issue was making sure we do the same thing on ctl-c/SIGINT
from the user. That is, if there's already a bg-thread in REPL, we
`log.pdb()` about SIGINT shielding and re-draw the prompt; the same UX
as normal actor-runtime-task behaviour.

Reasons this wasn't workin.. and the fix:
- `.pause_from_sync()` was overriding the local `repl` var with `None`
  delivered by (transitive) calls to `_pause(debug_func=None)`.. so
  remove all that and only assign it OAOO prior to thread-type case
  branching.
- always call `DebugStatus.shield_sigint()` as needed from all requesting
  threads/tasks:
  - in `_pause_from_bg_root_thread()` BEFORE calling `._pause()` AND BEFORE
    yielding back to the bg-thread via `.started(out)` to ensure we're
    definitely overriding the handler in the `trio`-main-thread task
    before unblocking the requesting bg-thread.
  - from any requesting bg-thread in the root actor such that both its
    main-`trio`-thread scheduled task (as per above bullet) AND it are
    SIGINT shielded.
  - always call `.shield_sigint()` BEFORE any `greenback._await()` case
    don't entirely grok why yet, but it works)?
  - for `greenback._await()` case always set `bg_task` to the current one..
- tweaks to the `SIGINT` handler, now renamed `sigint_shield()` so as
  not to name-collide with the methods when editor-searching:
  - always try to `repr()` the REPL thread/task "owner" as well as the
    active `PdbREPL` instance.
  - add `.devx()` notes around the prompt flushing deats and comments
    for any root-actor-bg-thread edge cases.

Related/supporting refinements:
- add `get_lock()`/`get_debug_req()` factory funcs since the plan is to
  eventually implement both as `@singleton` instances per actor.
- fix `acquire_debug_lock()`'s call-sig-bug for scheduling
  `request_root_stdio_lock()`..
- in `._pause()` only call `mk_pdb()` when `debug_func != None`.
- add some todo/warning notes around the `cls.repl = None` in
  `DebugStatus.release()`

`test_pause_from_sync()` tweaks:
- don't use a `attach_patts.copy()`, since we always `break` on match.
- do `pytest.fail()` on that ^ loop's fallthrough..
- pass `do_ctlc(child, patt=attach_key)` such that we always match the
  the current thread's name with the ctl-c triggered `.pdb()` emission.
- oh yeah, return the last `before: str` from `do_ctlc()`.
- in the script, flip `abandon_on_cancel=True` since when `False` it
  seems to cause `trio.run()` to hang on exit from the last bg-thread
  case?!?
2025-03-27 13:24:25 -04:00
Tyler Goodlet 4a08d586cd Another tweak to REPL entry `.pdb()` headers 2025-03-27 13:24:25 -04:00
Tyler Goodlet 607e1dcf45 More failed REPL-lock-request refinements
In `lock_stdio_for_peer()` better internal-error handling/reporting:
- only `Lock._blocked.remove(ctx.cid)` if that same cid was added on
  entry to avoid needless key-errors.
- drop all `Lock.release(force: bool)` usage remnants.
- if `req_ctx.cancel()` fails mention it with `ctx_err.add_note()`.
- add more explicit internal-failed-request log messaging via a new
  `fail_reason: str`.
- use and use new `x)<=\n|_` annots in any failure logging.

Other cleanups/niceties:
- drop `force: bool` flag entirely from the `Lock.release()`.
- use more supervisor-op-annots in `.pdb()` logging
  with both `_pause/crash_msg: str` instead of double '|' lines when
  `.pdb()`-reported from `._set_trace()`/`._post_mortem()`.
2025-03-27 13:24:25 -04:00
Tyler Goodlet 8ff682440d Further formalize `greenback` integration
Since we more or less require it for `tractor.pause_from_sync()` this
refines enable toggles and their relay down the actor tree as well as
more explicit logging around init and activation.

Tweaks summary:
- `.info()` report the module if discovered during root boot.
- use a `._state._runtime_vars['use_greenback']: bool` activation flag
  inside `Actor._from_parent()` to determine if the sub should try to
  use it and set to `False` if mod-loading fails / not installed.
- expose `maybe_init_greenback()` from `.devx` sugpkg.
- comment out RTE in `._pause()` for now since we already have it in
  `.pause_from_sync()`.
- always `.exception()` on `maybe_init_greenback()` import errors to
  clarify the underlying failure deats.
- always explicitly report if `._state._runtime_vars['use_greenback']`
  was NOT set when `.pause_from_sync()` is called.

Other `._runtime.async_main()` adjustments:
- combine the "internal error call ur parents" message and the failed
  registry contact status into one new `err_report: str`.
- drop the final exception handler's call to
  `Actor.lifetime_stack.close()` since we're already doing it in the
  `finally:` block and the earlier call has no currently known benefit.
- only report on the `.lifetime_stack()` callbacks if any are detected
  as registered.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 097101f8d3 Port debug request ep to use `@context(pld_spec)`
Namely passing the `.__pld_spec__` directly to the
`lock_stdio_for_peer()` decorator B)

Also, allows dropping `apply_debug_pldec()` (which was a todo) and
removing a `lock_stdio_for_peer()` indent level.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 6a5d33b7ed Make big TODO: for `devx._debug` refinements
Hopefully would make grok-ing this fairly sophisticated sub-sys possible
for any up-and-coming `tractor` hacker

XD

A lot of internal API and re-org ideas I discovered/realized as part of
finishing the `__pld_spec__` and multi-threaded support. Particularly
better isolation between root-actor vs subactor task APIs and generally
less globally-state-ful stuff like `DebugStatus` and `Lock` method APIs
would likely make a lot of the hard to follow edge cases more clear?
2025-03-24 14:04:52 -04:00
Tyler Goodlet 31cc33c66c First proto: multi-threaded synced `pdb`-REPLs
Functionally working for multi-threaded (via cpython threads spawned
from `to_trio.to_thread.run_sync()`) alongside subactors, tested (for
now) only with threads started inside the root actor (which seemed to
have the most issues in terms of the impl and special cases..) using the
new `tractor.pause_from_sync()` API!

Main implementation changes to `.pause_from_sync()`
------ - ------
- from the root actor, we need to ensure bg thread case is handled
  *specially* since no IPC is used to request the TTY stdio mutex and
  `Lock` (API) usage is conducted entirely from a local task or thread;
  dedicated `Lock` usage for the root-actor already is branched inside
  `._pause()` and needs similar handling from a root bg-thread:
 |_for the special case of a root bg thread we need to
   `trio`-main-thread schedule a bg task inside a new
   `_pause_from_bg_root_thread()`. The new task needs to implement most
   of what was is handled inside `._pause()` manually, mostly because in
   this root-actor-bg-thread case we have 2 constraints:
   1. to enter `PdbREPL.interaction()` **from the bg thread** directly,
   2. the task that `Lock._debug_lock.acquire()`s has to be the same
      that calls `.release() (a `trio.FIFOLock` constraint)
 |_impl deats of this `_pause_from_bg_root_thread()` include:
   - (for now) calling `._pause()` to acquire the `Lock._debug_lock`.
   - setting its own `DebugStatus.repl_release`.
   - calling `.DebugStatus.shield_sigint()` to ensure the root's
     main thread  uses the right handler when the bg one is REPL-ing.
   - wait manually on the `.repl_release()` to be set by the thread's
     dedicated `PdbREPL` exit.
   - manually calling `Lock.release()` from the **same task** that
     acquired it.
- expect calls to `._pause()` to deliver a `tuple[Task, PdbREPL]` such
  that we always get the handle both to any newly created REPl instance
  and the (maybe) the scheduled bg task within which is runs.
- add a single `message: str` style to `log.devx()` based on branching
  style for logging.
- ensure both `DebugStatus.repl` and `.repl_task` are set **just
  before** calling `._set_trace()` to ensure the correct `Task|Thread`
  is set when the REPL is finally entered from sync code.
- add a wrapping caller `_sync_pause_from_builtin()` which passes in the
  new `called_from_builtin=True` to indicate `breakpoint()` caller
  usage, obvi pass in `api_frame`.

Changes to `._pause()` in support of ^
------ - ------
- `TaskStatus.started()` and return the `tuple[Task, PdbREPL]` to
  callers / starters.
- only call `DebugStatus.shield_sigint()` when no `repl` passed bc some
  callers (like bg threads) may need to apply it at some specific point
  themselves.
- tweak some asserts for the `debug_func == None` / non-`trio`-thread
  case.
- add a mod-level `_repl_fail_msg: str` to be used when there's an
  internal `._pause()` failure for testing, easier to pexpect match.
- more comprehensive logging for the root-actor branched case to
  (attempt to) indicate any of the 3 cases:
  - remote ctx from subactor has the `Lock`,
  - already existing root task or thread has it or,
  - some kinda stale `.locked()` situation where the root has the lock
    but we don't know why.
- for root usage, revert to always `await Lock._debug_lock.acquire()`-ing
  despite `called_from_sync` since `.pause_from_sync()` was reworked to
  instead handle the special bg thread case in the new
  `_pause_from_bg_root_thread()` task.
- always do `return _enter_repl_sync(debug_func)`.
- try to report any `repl_task: Task|Thread` set by the caller
  (particularly for the bg thread cases) as being the thread or task
  `._pause()` was called "on behalf of"

Changes to `DebugStatus`/`Lock` in support of ^
------ - ------
- only call `Lock.release()` from `DebugStatus.set_[quit/continue]()`
  when called from the main `trio` thread and always call
  `DebugStatus.release()` **after** to ensure `.repl_released()` is set
  **after** `._debug_lock.release()`.
- only call `.repl_release.set()` from `trio` thread otherwise use
  `.from_thread.run()`.
- much more refinements in `Lock.release()` for threading cases:
  - return `bool` to indicate whether lock was released by caller.
  - mask (in prep to drop) `_pause()` usage of
    `Lock.release.force=True)` since forcing a release can't ever avoid
    the RTE from `trio`.. same task **must** acquire/release.
  - don't allow usage from non-`trio`-main-threads, ever; there's no
    point since the same-task-needs-to-manage-`FIFOLock` constraint.
  - much more detailed logging using `message`-building-style for all
    caller (edge) cases.
   |_ use a `we_released: bool` to determine failed-to-release edge
      cases which can happen if called from bg threads, ensure we
      `log.exception()` on any incorrect usage resulting in  release
      failure.
   |_ complain loudly if the release fails and some other task/thread
      still holds the lock.
   |_ be explicit about "who" (which task or thread) the release is "on
      behalf of" by reading `DebugStatus.repl_task` since the caller
      isn't the REPL operator in many sync cases.
  - more or less drop `force` support, as mentioned above.
  - ensure we unset `._owned_by_root` if the caller is a root task.

Other misc
------ - ------
- rename `lock_tty_for_child()` -> `lock_stdio_for_peer()`.
- rejig `Lock.repr()` to show lock and event stats.
- stage `Lock.stats` and `.owner` methods in prep for doing a singleton
  instance and `@property`s.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 2f1a97e73e Catch `.pause_from_sync()` in root bg thread bugs!
Originally discovered as while using `tractor.pause_from_sync()`
from the `i3ipc` client running in a bg-thread that uses `asyncio`
inside `modden`.

Turns out we definitely aren't correctly handling `.pause_from_sync()`
from the root actor when called from a `trio.to_thread.run_sync()`
bg thread:
- root-actor bg threads which can't `Lock._debug_lock.acquire()` since
  they aren't in `trio.Task`s.
- even if scheduled via `.to_thread.run_sync(_debug._pause)` the
  acquirer won't be the task/thread which calls `Lock.release()` from
  `PdbREPL` hooks; this results in a RTE raised by `trio`..
- multiple threads will step on each other's stdio since cpython's GIL
  seems to ctx switch threads on every input from the user to the REPL
  loop..

Reproduce via reworking our example and test so that they catch and fail
for all edge cases:
- rework the `/examples/debugging/sync_bp.py` example to demonstrate the
  above issues, namely the stdio clobbering in the REPL when multiple
  threads and/or a subactor try to debug simultaneously.
  |_ run one thread using a task nursery to ensure it runs conc with the
     nursery's parent task.
  |_ ensure the bg threads run conc a subactor usage of
     `.pause_from_sync()`.
  |_ gravely detail all the special cases inside a TODO comment.
  |_ add some control flags to `sync_pause()` helper and don't use
     `breakpoint()` by default.
- extend and adjust `test_debugger.test_pause_from_sync` to match (and
  thus currently fail) by ensuring exclusive `PdbREPL` attachment when
  the 2 bg root-actor threads are concurrently interacting alongside the
  subactor:
  |_ should only see one of the `_pause_msg` logs at a time for either
     one of the threads or the subactor.
  |_ ensure each attaches (in no particular order) before expecting the
     script to exit.

Impl adjustments to `.devx._debug`:
- drop `Lock.repl`, no longer used.
- add `Lock._owned_by_root: bool` for the `.ctx_in_debug == None`
  root-actor-task active case.
- always `log.exception()` for any `._debug_lock.release()` ownership
  RTE emitted by `trio`, like we used to..
- add special `Lock.release()` log message for the stale lock but
  `._owned_by_root == True` case; oh yeah and actually
  `log.devx(message)`..
- rename `Lock.acquire()` -> `.acquire_for_ctx()` since it's only ever
  used from subactor IPC usage; well that and for local root-task
  usage we should prolly add a `.acquire_from_root_task()`?
- buncha `._pause()` impl improvements:
 |_ type `._pause()`'s `debug_func` as a `partial` as well.
 |_ offer `called_from_sync: bool` and `called_from_bg_thread: bool`
    for the special case handling when called from `.pause_from_sync()`
 |_ only set `DebugStatus.repl/repl_task` when `debug_func != None`
   (OW ensure the `.repl_task` is not the current one).
 |_ handle error logging even when `debug_func is None`..
 |_ lotsa detailed commentary around root-actor-bg-thread special cases.
- when `._set_trace(hide_tb=False)` do `pdbp.set_trace(frame=currentframe())`
  so the `._debug` internal frames are always included.
- by default always hide tracebacks for `.pause[_from_sync]()` internals.
- improve `.pause_from_sync()` to avoid root-bg-thread crashes:
 |_ pass new `called_from_xxx_` flags and ensure `DebugStatus.repl_task`
    is actually set to the `threading.current_thread()` when needed.
 |_ manually call `Lock._debug_lock.acquire_nowait()` for the non-bg
    thread case.
 |_ TODO: still need to implement the bg-thread case using a bg
    `trio.Task`-in-thread with an `trio.Event` set by thread REPL exit.
2025-03-24 14:04:52 -04:00
Tyler Goodlet 15a47dc4f7 Finally, officially support shielded REPL-ing!
It's been a long time prepped and now finally implemented!

Offer a `shield: bool` argument from our async `._debug` APIs:
- `await tractor.pause(shield=True)`,
- `await tractor.post_mortem(shield=True)`

^-These-^ can now be used inside cancelled `trio.CancelScope`s,
something very handy when introspecting complex (distributed) system
tear/shut-downs particularly under remote error or (inter-peer)
cancellation conditions B)

Thanks to previous prepping in a prior attempt and various patches from
the rigorous rework of `.devx._debug` internals around typed msg specs,
there ain't much that was needed!

Impl deats
- obvi passthrough `shield` from the public API endpoints (was already
  done from a prior attempt).
- put ad-hoc internal `with trio.CancelScope(shield=shield):` around all
  checkpoints inside `._pause()` for both the root-process and subactor
  case branches.

Add a fairly rigorous example, `examples/debugging/shielded_pause.py`
with a wrapping `pexpect` test, `test_debugger.test_shield_pause()` and
ensure it covers as many cases as i can think of offhand:

- multiple `.pause()` entries in a loop despite parent scope
  cancellation in a subactor RPC task which itself spawns a sub-task.
- a `trio.Nursery.parent_task` which raises, is handled and
  tries to enter and unshielded `.post_mortem()`, which of course
  internally raises `Cancelled` in a `._pause()` checkpoint, so we catch
  the `Cancelled` again and then debug the debugger's internal
  cancellation with specific checks for the particular raising
  checkpoint-LOC.
- do ^- the latter -^ for both subactor and root cases to ensure we
  can debug `._pause()` itself when it tries to REPL engage from
  a cancelled task scope Bo
2025-03-24 14:04:52 -04:00
Tyler Goodlet defe34dec2 Move runtime frame hiding into helper func
Call it `hide_runtime_frames()` and stick all the lines from the top of
the `._debug` mod in there along with a little `log.devx()` emission on
what gets hidden by default ;)

Other,
- fix ref-error where internal-error handler might trigger despite the
  debug `req_ctx` not yet having init-ed, such that we don't try to
  cancel or log about it when it never was fully created/initialize..
- fix assignment typo iniside `_set_trace()` for `task`.. lel
2025-03-24 14:04:51 -04:00
Tyler Goodlet e1857413a3 Resolve remaining debug-request race causing hangs
More or less by pedantically separating and managing root and subactor
request syncing events to always be managed by the locking IPC context
task-funcs:
- for the root's "child"-side, `lock_tty_for_child()` directly creates
  and sets a new `Lock.req_handler_finished` inside a `finally:`
- for the sub's "parent"-side, `request_root_stdio_lock()` does the same
  with a new `DebugStatus.req_finished` event and separates it from
  the `.repl_release` event (which indicates a "c" or "q" from user and
  thus exit of the REPL session) as well as sets a new `.req_task:
  trio.Task` to explicitly distinguish from the app-user-task that
  enters the REPL vs. the paired bg task used to request the global
  root's stdio mutex alongside it.
- apply the `__pld_spec__` on "child"-side of the ctx using the new
  `Portal.open_context(pld_spec)` parameter support; drops use of any
  `ContextVar` malarky used prior for `PldRx` mgmt.
- removing `Lock.no_remote_has_tty` since it was a nebulous name and
  from the prior "everything is in a `Lock`" design..

------ - ------

More rigorous impl to handle various edge cases in `._pause()`:
- rejig `_enter_repl_sync()` to wrap the `debug_func == None` case
  inside maybe-internal-error handler blocks.
- better logic for recurrent vs. multi-task contention for REPL entry in
  subactors, by guarding using `DebugStatus.req_task` and by now waiting
  on the new `DebugStatus.req_finished` for the multi-task contention
  case.
- even better internal error handling and reporting for when this code
  is hacked on and possibly broken ;p

------ - ------

Updates to `.pause_from_sync()` support:
- add optional `actor`, `task` kwargs to `_set_trace()` to allow
  compat with the new explicit `debug_func` calling in `._pause()` and
  pass a `threading.Thread` for `task` in the `.to_thread()` usage case.
- add an `except` block that tries to show the frame on any internal
  error.

------ - ------

Relatedly includes a buncha cleanups/simplifications somewhat in
prep for some coming refinements (around `DebugStatus`):
- use all the new attrs mentioned above as needed in the SIGINT shielder.
- wait on `Lock.req_handler_finished` in `maybe_wait_for_debugger()`.
- dropping a ton of masked legacy code left in during the recent reworks.
- better comments, like on the use of `Context._scope` for shielding on
  the "child"-side to avoid the need to manage yet another cs.
- add/change-to lotsa `log.devx()` level emissions for those infos which
  are handy while hacking on the debugger but not ideal/necessary to be
  user visible.
- obvi add lotsa follow up todo notes!
2025-03-24 14:04:51 -04:00
Tyler Goodlet 7656326484 Make `request_root_stdio_lock()` post-mortem-able
Finally got this working so that if/when an internal bug is introduced
to this request task-func, we can actually REPL-debug the lock request
task itself B)

As in, if the subactor's lock request task internally errors we,
- ensure the task always terminates (by calling `DebugStatus.release()`)
  and explicitly reports (via a `log.exception()`) the internal error.
- capture the error instance and set as a new `DebugStatus.req_err` and
  always check for it on final teardown - in which case we also,
 - ensure it's reraised from a new `DebugRequestError`.
 - unhide the stack frames for `_pause()`, `_enter_repl_sync()` so that
   the dev can upward inspect the `_pause()` call stack sanely.

Supporting internal impl changes,
- add `DebugStatus.cancel()` and `.req_err`.
- don't ever cancel the request task from
  `PdbREPL.set_[continue/quit]()` only when there's some internal error
  that would likely result in a hang and stale lock state with the root.
- only release the root's lock when the current ask is also the owner
  (avoids bad release errors).
- also show internal `._pause()`-related frames on any `repl_err`.

Other temp-dev-tweaks,
- make pld-dec change log msgs info level again while solving this
  final context-vars race stuff..
- drop the debug pld-dec instance match asserts for now since
  the problem is already caught (and now debug-able B) by an attr-error
  on the decoded-as-`dict` started msg, and instead add in
  a `log.exception()` trace to see which task is triggering the case
  where the debug `MsgDec` isn't set correctly vs. when we think it's
  being applied.
2025-03-24 14:04:51 -04:00
Tyler Goodlet 8bab8e8bde Always release debug request from `._post_mortem()`
Since obviously the thread is likely expected to halt and raise after
the REPL session exits; this was a regression from the prior impl. The
main reason for this is that otherwise the request task will never
unblock if the user steps through the crashed task using 'next' since
the `.do_next()` handler doesn't by default release the request since in
the `.pause()` case this would end the session too early.

Other,
- toss in draft `Pdb.user_exception()`, though doesn't seem to ever
  trigger?
- only release `Lock._debug_lock` when already locked.
2025-03-24 14:04:51 -04:00