Much like other recent changes attempt to detect runtime-bug-causing
crashes and only show the runtime-endpoint frame when present.
Adds a `ActorNursery._scope_error: BaseException|None` attr to aid with
detection. Also toss in some todo notes for removing and replacing the
`.run_in_actor()` method API.
Just inside `._invoke()` after the `ctx: Context` is retrieved.
Also try our best to *not hide* internal frames when a non-user-code
crash happens, normally either due to a runtime RPC EP bug or
a transport failure.
Kinda like a "runtime"-y level for `.pdb()` (which is more or less like
an `.info()` for our debugger subsys) which can be used to report
internals info for those hacking on `.devx` tools.
Also, inject only the *last* 6 digits of the `id(Task)` in
`pformat_task_uid()` output by default.
Since the state mgmt becomes quite messy with multiple sub-tasks inside
an IPC ctx, AND bc generally speaking the payload-type-spec should map
1-to-1 with the `Context`, it doesn't make a lot of sense to be using
`ContextVar`s to modify the `Context.pld_rx: PldRx` instance.
Instead, always allocate a full instance inside `mk_context()` with the
default `.pld_rx: PldRx` set to use the `msg._ops._def_any_pldec: MsgDec`
In support, simplify the `.msg._ops` impl and APIs:
- drop `_ctxvar_PldRx`, `_def_pld_rx` and `current_pldrx()`.
- rename `PldRx._pldec` -> `._pld_dec`.
- rename the unused `PldRx.apply_to_ipc()` -> `.wraps_ipc()`.
- add a required `PldRx._ctx: Context` attr since it is needed
internally in some meths and each pld-rx now maps to a specific ctx.
- modify all recv methods to accept a `ipc: Context|MsgStream` (instead
of a `ctx` arg) since both have a ref to the same `._rx_chan` and there
are only a couple spots (in `.dec_msg()`) where we need the `ctx`
explicitly (which can now be easily accessed via a new `MsgStream.ctx`
property, see below).
- always show the `.dec_msg()` frame in tbs if there's a reference error
when calling `_raise_from_unexpected_msg()` in the fallthrough case.
- implement `limit_plds()` as light wrapper around getting the
`current_ipc_ctx()` and mutating its `MsgDec` via
`Context.pld_rx.limit_plds()`.
- add a `maybe_limit_plds()` which just provides an `@acm` equivalent of
`limit_plds()` handy for composing in a `async with ():` style block
(avoiding additional indent levels in the body of async funcs).
Obvi extend the `Context` and `MsgStream` interfaces as needed
to match the above:
- add a `Context.pld_rx` pub prop.
- new private refs to `Context._started_msg: Started` and
a `._started_pld` (mostly for internal debugging / testing / logging)
and set inside `.open_context()` immediately after the syncing phase.
- a `Context.has_outcome() -> bool:` predicate which can be used to more
easily determine if the ctx errored or has a final result.
- pub props for `MsgStream.ctx: Context` and `.chan: Channel` providing
full `ipc`-arg compat with the `PldRx` method signatures.
Finally got this working so that if/when an internal bug is introduced
to this request task-func, we can actually REPL-debug the lock request
task itself B)
As in, if the subactor's lock request task internally errors we,
- ensure the task always terminates (by calling `DebugStatus.release()`)
and explicitly reports (via a `log.exception()`) the internal error.
- capture the error instance and set as a new `DebugStatus.req_err` and
always check for it on final teardown - in which case we also,
- ensure it's reraised from a new `DebugRequestError`.
- unhide the stack frames for `_pause()`, `_enter_repl_sync()` so that
the dev can upward inspect the `_pause()` call stack sanely.
Supporting internal impl changes,
- add `DebugStatus.cancel()` and `.req_err`.
- don't ever cancel the request task from
`PdbREPL.set_[continue/quit]()` only when there's some internal error
that would likely result in a hang and stale lock state with the root.
- only release the root's lock when the current ask is also the owner
(avoids bad release errors).
- also show internal `._pause()`-related frames on any `repl_err`.
Other temp-dev-tweaks,
- make pld-dec change log msgs info level again while solving this
final context-vars race stuff..
- drop the debug pld-dec instance match asserts for now since
the problem is already caught (and now debug-able B) by an attr-error
on the decoded-as-`dict` started msg, and instead add in
a `log.exception()` trace to see which task is triggering the case
where the debug `MsgDec` isn't set correctly vs. when we think it's
being applied.
Since obviously the thread is likely expected to halt and raise after
the REPL session exits; this was a regression from the prior impl. The
main reason for this is that otherwise the request task will never
unblock if the user steps through the crashed task using 'next' since
the `.do_next()` handler doesn't by default release the request since in
the `.pause()` case this would end the session too early.
Other,
- toss in draft `Pdb.user_exception()`, though doesn't seem to ever
trigger?
- only release `Lock._debug_lock` when already locked.
Mostly adjustments for the new pld-receiver semantics/shim-layer which
results more often in the direct delivery of `RemoteActorError`s from
IPC API primitives (like `Portal.result()`) instead of being embedded in
an `ExceptionGroup` bundled from an embedded nursery.
Tossed usage of the `debug_mode: bool` fixture to a couple problematic
tests while i was working on them.
Also includes detailed assertion updates to the inter-peer cancellation
suite in terms of,
- `Context.canceller` state correctly matching the true src actor when
expecting a ctxc.
- any rxed `ContextCancelled` should instance match the `Context._local/remote_error`
as should the `.msgdata` and `._ipc_msg`.
- start tossing in `__tracebackhide__`s to various eps which don't need
to show in tbs or in the pdb REPL.
- port final `._maybe_enter_pm()` to pass a `api_frame`.
- start comment-marking up some API eps with `@api_frame`
in prep for actually using the new frame-stack tracing.
Woops, due to a `None` test against the `._final_result`, any actual
final `None` result would be received but not acked as such causing
a spawning test to hang. Fix it by instead receiving and assigning both
a `._final_result_msg: PayloadMsg` and `._final_result_pld`.
NB: as mentioned in many recent comments surrounding this API layer,
really this whole `Portal`-has-final-result interface/semantics should
be entirely removed as should the `ActorNursery.run_in_actor()` API(s).
Instead it should all be replaced by a wrapping "high level" API
(`tractor.hilevel` ?) which combines a task nursery, `Portal.open_context()`
and underlying `Context` APIs + an `outcome.Outcome` to accomplish the
same "run a single task in a spawned actor and return it's result"; aka
a "one-shot-task-actor".
- inside the `Actor.cancel()`'s maybe-wait-on-debugger delay,
report the full debug request status and it's affiliated lock request
IPC ctx.
- use the new `.req_ctx.chan.uid` to do the local nursery lookup during
channel teardown handling.
- another couple log fmt tweaks.
Proto-ing a little suite of call-stack-frame annotation-for-scanning
sub-systems for the purposes of both,
- the `.devx._debug`er and its
traceback and frame introspection needs when entering the REPL,
- detailed trace-style logging such that we can explicitly report
on "which and where" `tractor`'s APIs are used in the "app" code.
Deats:
- change mod name obvi from `._code` and adjust client mod imports.
- using `wrapt` (for perf) implement a `@api_frame` annot decorator
which both stashes per-call-stack-frame instances of `CallerInfo` in
a table and marks the function such that API endpoints can be easily
found via runtime stack scanning despite any internal impl changes.
- add a global `_frame2callerinfo_cache: dict[FrameType, CallerInfo]`
table for providing the per func-frame info caching.
- Re-implement `CallerInfo` to require less (types of) inputs:
|_ `_api_func: Callable`, a ref to the (singleton) func def.
|_ `_api_frame: FrameType` taken from the `@api_frame` marked `tractor`-API
func's runtime call-stack, from which we can determine the
app code's `.caller_frame`.
|_`_caller_frames_up: int|None` allowing the specific `@api_frame` to
determine "how many frames up" the application / calling code is.
And, a better set of derived attrs:
|_`caller_frame: FrameType` which finds and caches the API-eps calling
frame.
|_`caller_frame: FrameType` which finds and caches the API-eps calling
- add a new attempt at "getting a method ref from its runtime frame"
with `get_ns_and_func_from_frame()` using a heuristic that the
`CodeType.co_qualname: str` should have a "." in it for methods.
- main issue is still that the func-ref lookup will require searching
for the method's instance type by name, and that name isn't
guaranteed to be defined in any particular ns..
|_rn we try to read it from the `FrameType.f_locals` but that is
going to obvi fail any time the method is called in a module where
it's type is not also defined/imported.
- returns both the ns and the func ref FYI.
- set `._state._ctxvar_Context` just after `StartAck` inside
`open_context_from_portal()` so that `current_ipc_ctx()` always
works on the 'parent' side.
- always set `.canceller` to any `MsgTypeError.src_uid` and otherwise to
any maybe-detected `.src_uid` (i.e. for RAEs).
- always set `.canceller` to us when we rx a ctxc which reports us as
its canceller; this is a sanity check on definite "self cancellation".
- adjust `._is_self_cancelled()` logic to only be `True` when
`._remote_error` is both a ctxc with a `.canceller` set to us AND
when `Context.canceller` is also set to us (since the change above)
as a little bit of extra rigor.
- fill-in/fix some `.repr_state` edge cases:
- merge self-vs.-peer ctxc cases to one block and distinguish via
nested `._is_self_cancelled()` check.
- set 'errored' for all exception matched cases despite `.canceller`.
- add pre-`Return` phase statuses:
|_'pre-started' and 'syncing-to-child' depending on side and when
`._stream` has not (yet) been set.
|_'streaming' and 'streaming-finished' depending on side when
`._stream` is set and whether it was stopped/closed.
- tweak drainage log-message to use "outcome" instead of "result".
- use new `.devx.pformat.pformat_cs()` inside `_maybe_cancel_and_set_remote_error()`
but, IFF the log level is at least 'cancel'.
Since i was running into them (internal errors) during lock request
machinery dev and was getting all sorts of difficult to understand hangs
whenever i intro-ed a bug to either side of the ipc ctx; this all while
trying to get the msg-spec working for `Lock` requesting subactors..
Deats:
- hideframes for `@acm`s and `trio.Event.wait()`, `Lock.release()`.
- better detail out the `Lock.acquire/release()` impls
- drop `Lock.remote_task_in_debug`, use new `.ctx_in_debug`.
- add a `Lock.release(force: bool)`.
- move most of what was `_acquire_debug_lock_from_root_task()` and some
of the `lock_tty_for_child().__a[enter/exit]()` logic into
`Lock.[acquire/release]()` including bunch more logging.
- move `lock_tty_for_child()` up in the module to below `Lock`, with
some rework:
- drop `subactor_uid: tuple` arg since we can just use the `ctx`..
- add exception handler blocks for reporting internal (impl) errors
and always force release the lock in such cases.
- extend `DebugStatus` (prolly will rename to `DebugRequest` btw):
- add `.req_ctx: Context` for subactor side.
- add `.req_finished: trio.Event` to sub to signal request task exit.
- extend `.shield_sigint()` doc-str.
- add `.release()` to encaps all the state mgmt previously strewn
about inside `._pause()`..
- use new `DebugStatus.release()` to replace all the duplication:
- inside `PdbREPL.set_[continue/quit]()`.
- inside `._pause()` for the subactor branch on internal
repl-invocation error cases,
- in the `_enter_repl_sync()` closure on error,
- replace `apply_debug_codec()` -> `apply_debug_pldec()` in tandem with
the new `PldRx` sub-sys which handles the new `__pld_spec__`.
- add a new `pformat_cs()` helper orig to help debug cs stack
a corruption; going to move to `.devx.pformat` obvi.
- rename `wait_for_parent_stdin_hijack()` -> `request_root_stdio_lock()`
with improvements:
- better doc-str and add todos,
- use `DebugStatus` more stringently to encaps all subactor req state.
- error handling blocks for cancellation and straight up impl errors
directly around the `.open_context()` block with the latter doing
a `ctx.cancel()` to avoid hanging in the shielded `.req_cs` scope.
- similar exc blocks for the func's overall body with explicit
`log.exception()` reporting.
- only set the new `DebugStatus.req_finished: trio.Event` in `finally`.
- rename `mk_mpdb()` -> `mk_pdb()` and don't cal `.shield_sigint()`
implicitly since the caller usage does matter for this.
- factor out `any_connected_locker_child()` from the SIGINT handler.
- rework SIGINT handler to better handle any stale-lock/hang cases:
- use new `Lock.ctx_in_debug: Context` to detect subactor-in-debug.
and use it to cancel any lock request instead of the lower level
- use `problem: str` summary approach to log emissions.
- rework `_pause()` given all of the above, stuff not yet mentioned:
- don't take `shield: bool` input and proxy to `debug_func()` (for now).
- drop `extra_frames_up_when_async: int` usage, expect
`**debug_func_kwargs` to passthrough an `api_frame: Frametype` (more
on this later).
- lotsa asserts around the request ctx vs. task-in-debug ctx using new
`current_ipc_ctx()`.
- asserts around `DebugStatus` state.
- rework and simplify the `debug_func` hooks,
`_set_trace()`/`_post_mortem()`:
- make them accept a non-optional `repl: PdbRepl` and `api_frame:
FrameType` which should be used to set the current frame when the
REPL engages.
- always hide the hook frames.
- always accept a `tb: TracebackType` to `_post_mortem()`.
|_ copy and re-impl what was the delegation to
`pdbp.xpm()`/`pdbp.post_mortem()` and instead call the
underlying `Pdb.interaction()` ourselves with a `caller_frame`
and tb instance.
- adjust the public `.pause()` impl:
- accept optional `hide_tb` and `api_frame` inputs.
- mask opening a cancel-scope for now (can cause `trio` stack
corruption, see notes) and thus don't use the `shield` input other
then to eventually passthrough to `_post_mortem()`?
|_ thus drop `task_status` support for now as well.
|_ pretty sure correct soln is a debug-nursery around `._invoke()`.
- since no longer using `extra_frames_up_when_async` inside
`debug_func()`s ensure all public apis pass a `api_frame`.
- re-impl our `tractor.post_mortem()` to directly call into `._pause()`
instead of binding in via `partial` and mk it take similar input as
`.pause()`.
- drop `Lock.release()` from `_maybe_enter_pm()`, expose and pass
expected frame and tb.
- use necessary changes from all the above within
`maybe_wait_for_debugger()` and `acquire_debug_lock()`.
Lel, sorry thought that would be shorter..
There's still a lot more re-org to do particularly with `DebugStatus`
encapsulation but it's coming in follow up.
Since we need to allow it (at the least) inside
`drain_until_final_msg()` for handling stream-phase termination races
where we don't want to have to handle a raised error from something like
`Context.result()`. Expose the passthrough option via
a `passthrough_non_pld_msgs: bool` kwarg.
Add comprehensive comment to `current_pldrx()`.
Expose it from `._state.current_ipc_ctx()` and set it inside
`._rpc._invoke()` for child and inside `Portal.open_context()` for
parent.
Still need to write a few more tests (particularly demonstrating usage
throughout multiple nested nurseries on each side) but this suffices as
a proto for testing with some debugger request-from-subactor stuff.
Other,
- use new `.devx.pformat.add_div()` for ctxc messages.
- add a block to always traceback dump on corrupted cs stacks.
- better handle non-RAEs exception output-formatting in context
termination summary log message.
- use a summary for `start_status` for msg logging in RPC loop.
Since we usually want them raised from some (internal) call to
`Context.maybe_raise()` and NOT directly from the drainage call, make it
possible via a new `raise_error: bool` to both `PldRx.recv_msg_w_pld()`
and `.dec_msg()`.
In support,
- rename `return_msg` -> `result_msg` since we expect to return
`Error`s.
- do a `result_msg` assign and `break` in the `case Error()`.
- add `**dec_msg_kwargs` passthrough for other `.dec_msg()` calling
methods.
Other,
- drop/aggregate todo-notes around the main loop's
`ctx._pld_rx.recv_msg_w_pld()` call.
- add (configurable) frame hiding to most payload receive meths.
Since `._code` is prolly gonna get renamed (to something "frame & stack
tools" related) and to give a bit better organization.
Also adds a new `add_div()` helper, factored out of ctxc message
creation in `._rpc._invoke()`, for adding a little "header line" divider
under a given `message: str` with a little math to center it.
For more sane manual calls as needed in logging purposes. Obvi remap
the dunder methods to it.
Other:
- drop `hide_tb: bool` from `unpack_error()`, shouldn't need it since
frame won't ever be part of any tb raised from returned error.
- add a `is_invalid_payload: bool` to `_raise_from_unexpected_msg()` to
be used from `PldRx` where we don't need to decode the IPC
msg, just the payload; make the error message reflect this case.
- drop commented `._portal._unwrap_msg()` since we've replaced it with
`PldRx`'s delegation to newer `._raise_from_unexpected_msg()`.
- hide the `Portal.result()` frame by default, again.
A better spot for the pretty-formatting of frame text (and thus tracebacks)
is in the new `.devx._code` module:
- move from `._exceptions` -> `.devx._code.pformat_boxed_tb()`.
- add new `pformat_caller_frame()` factored out the use case in
`._exceptions._mk_msg_type_err()` where we dump a stack trace
for bad `.send()` side IPC msgs.
Add some new pretty-format methods to `Context`:
- explicitly implement `.pformat()` and allow an `extra_fields: dict`
which can be used to inject additional fields (maybe eventually by
default) such as is now used inside
`._maybe_cancel_and_set_remote_error()` when reporting the internal
`._scope` state in cancel logging.
- add a new `.repr_state -> str` which provides a single string status
depending on the internal state of the IPC ctx in terms of the shuttle
protocol's "phase"; use it from `.pformat()` for the `|_state:`.
- set `.started(complain_no_parity=False)` now since we presume decoding
with `.pld: Raw` now with the new `PldRx` design.
- use new `msgops.current_pldrx()` in `mk_context()`.
Not sure it's **that** useful (yet) but in theory would allow avoiding
certain log level usage around transient RPC requests for discovery methods
(like `.register_actor()` and friends); can't hurt to be able to
introspect that last message for other future cases I'd imagine as well.
Adjust the calling code in `._runtime` to match; other spots are using
the `trio.Nursery.start()` schedule style and are fine as is.
Improve a bunch more log messages throughout a few mods mostly by going
to a "summary" single-emission style where possible/appropriate:
- in `._runtime` more "single summary" status style log emissions:
|_mk `Actor.load_modules()` render a single mod loaded summary.
|_use a summary `con_status: str` for `Actor._stream_handler()` conn
setup and an equiv (`con_teardown_status`) for connection teardowns.
|_similar thing in `Actor.wait_for_actor()`.
- generally more usage of `.msg.pretty_struct` apis throughout `._runtime`.
Add new task-scope oriented `PldRx.pld_spec` management API similar to
`.msg._codec.limit_msg_spec()`, but obvi built to process and filter
`MsgType.pld` values.
New API related changes include:
- new per-task singleton getter `msg._ops.current_pldrx()` which
delivers the current (global) payload receiver via a new
`_ctxvar_PldRx: ContextVar` configured with a default
`_def_any_pldec: MsgDec[Any]` decoder.
- a `PldRx.limit_plds()` which sets the decoder (`.type` underneath)
for the specific payload rx instance.
- `.msg._ops.limit_plds()` which obtains the current task-scoped `PldRx`
and applies the pld spec via a new `PldRx.limit_plds()`.
- rename `PldRx._msgdec` -> `._pldec`.
- add `.pld_dec` as pub attr for -^
Unrelated adjustments:
- use `.msg.pretty_struct.pformat()` where handy.
- always pass `expect_msg: MsgType`.
- add a `case Stop()` to `PldRx.dec_msg()` which will `log.warning()`
when a stop is received by no stream was open on this receiving side
since we rarely want that to raise since it's prolly just a runtime
race or mistake in user code.
Other:
Particularly when logging around `MsgTypeError`s.
Other:
- make `_raise_from_unexpected_msg()`'s `expect_msg` a non-default value
arg, must always be passed by caller.
- drop `'canceller'` from `_body_fields` ow it shows up twice for ctxc.
- use `.msg.pretty_struct.pformat()`.
- parameterize `RemoteActorError.reprol()` (repr-one-line method) to
show `RemoteActorError[<self.boxed_type_str>]( ..` to make obvi
the boxed remote error type.
- re-impl `.boxed_type_str` as `str`-casting the `.boxed_type` value
which is guaranteed to render non-`None`.
Basically exact same as that for `MsgCodec` with the `.spec` displayed
via a better (maybe multi-line) `.spec_str: str` generated from a common
new set of helper mod funcs factored out msg-codec meths:
- `mk_msgspec_table()` to gen a `MsgType` name -> msg table.
- `pformat_msgspec()` to `str`-ify said table values nicely.q
Also add a new `MsgCodec.msg_spec_str: str` prop which delegates to the
above for the same.
- tweaking logging to include more `MsgType` dumps on IPC faults.
- removing some commented cruft.
- comment formatting / cleanups / add-ons.
- more type annots.
- fill out some TODO content.