runtime_to_msgspec.. #7

Open
goodboy wants to merge 174 commits from runtime_to_msgspec into msg_codecs

Needs ton of refinement via rebasing, cherry-picking and syncing with upstream github PRs/master branch.

Mostly sticking this up for now to ref the typed-IPC-msging stuff from piker gitea work that we want to get landed on a feature branch.

I will be coming back to the massive cleaning of this ideally asap XD

Needs ton of refinement via rebasing, cherry-picking and syncing with upstream github PRs/master branch. Mostly sticking this up for now to ref the typed-IPC-msging stuff from `piker` gitea work that we want to get landed on a feature branch. I will be coming back to the massive cleaning of this ideally asap XD
goodboy added 174 commits 2025-02-18 21:53:32 +00:00
70ab60ce7c Flip default codec to our `Msg`-spec
Yes, this is "the switch" and will likely cause the test suite to bail
until a few more fixes some in.

Tweaked a couple `.msg` pkg exports:
- remove `__spec__` (used by modules) and change it to `__msg_types:
  lists[Msg]` as well as add a new `__msg_spec__: TypeAlias`, being the
  default `Any` paramed spec.
- tweak the naming of `msg.types` lists of runtime vs payload msgs to:
  `._runtime_msgs` and `._payload_msgs`.
- just build `__msg_types__` out of the above 2 lists.
0fcd424d57 Start a new `._testing.fault_simulation`
Since I needed the `break_ipc()` helper from the
`examples/advanced_faults/ipc_failure_during_stream.py` used in the
`test_advanced_faults` suite, might as well move it into a pkg-wide
importable module. Also changed the default break method to be
`socket_close` which just calls `Stream.socket.close()` underneath in
`trio`.

Also tweak that example to not keep sending after the stream has been
broken since with new `trio` that will raise `ClosedResourceError` and
in the wrapping test we generally speaking want to see a hang and then
cancel via simulated user sent SIGINT/ctl-c.
10c98946bd Extend codec test to for msg-spec parameterizing
Set a diff `Msg.pld` spec per test and then send multiple types to
a child actor making sure the child can only send certain types over
a stream and fails with validation or decode errors ow. The test is also
param-ed both with and without hooks demonstrating how a custom type,
`NamespacePath`, needs them for effective use. The subactor IPC context
child is passed a `expect_ipc_send: dict` which relays the values along
with their expected `.send()`-ability.

Deats on technical refinements:
------ - ------
- added a `iter_maybe_sends()` send-value-as-msg-auditor and predicate
  generator (literally) so as to be able to pre-determine if given the
  current codec and `send_values` which values are expected to be IPC
  transmittable.
- as per ^, the diff value-msgs are first round-tripped inside
  a `Started` msg using the configured codec in the parent/root actor
  before bothering with using IPC primitives + a subactor; this is how
  the `expect_ipc_send` table is generated initially.
- for serializing the specs (`Union[Type]`s as required by `msgspec`),
  added a pair of codec hooks: `enc/dec_type_union()` (that ideally we
  move into a `.msg` submod eventually) which code the type-values as
  a `list[str]` of names.
  - the `dec_` hook had to be modified to NOT raise an error when an
    invalid/unhandled value arrives, this is because we do NOT want the
    RPC msg handling loop to raise on the `async for msg in chan:` and
    instead prefer to ignore and warn (for now, but eventually respond
    with error msg - see notes in hook body) these msgs when sent during
    a streaming phase; `Context.started()` will however error on a bad
    input for the current msg-spec since it is part of the "cheap"
    dialog (again see notes in `._context`) wherein the `Started` msg
    is always roundtripped prior to `Channel.send()` to guarantee
    the child adheres to its own spec.
- tossed in lotsa `print()`s for console groking of the run progress.

Further notes on typed-msging breaking cancellation:
------ - ------
- turns out since the runtime's cancellation implementation, being done
  with `Actor.cancel()` methods and friends will actually break when
  a stringent spec is applied (eg. a single type-spec) since the return
  values from said methods are generally `bool`s..
- this means we do indeed need special handling of "runtime RPC method
  invocations" since ideally a user's msg-spec choices do not break core
  functionality on them XD
=> The obvi solution is to add a/some special sub-`Msg` types for such
  cases, possibly just a `RuntimeReturn(Return)` type that will always
  include a `.pld: bool` for these cancel methods such that their
  results are always handled without msg type errors.

More to come on a (hopefully) elegant solution to that last bit!
7f1c2b8ecf Add buncha notes on `Start` field for "params"
Such that the current `kwargs: dict` field can eventually be strictly
msg-typed (eventually directly from a `@context` def) using modern typed
python's hippest syntactical approach B)

Also proto a new `CancelAck(Return)` subtype msg for supporting msg-spec
agnostic `Actor.cancel_xx()` method calls in the runtime such that
a user can't break cancellation (and thus SC) by dynamically setting
a codec that doesn't allow `bool` results (as an eg. in this case).
Note that the msg isn't used yet in `._rpc` but that's a comin!
b1fd8b2ec3 Make `Context.started()` a type checked IPC send
As detailed in the surrounding notes, it's pretty advantageous to always
have the child context task ensure the first msg it relays back is
msg-type checked against the current spec and thus `MsgCodec`. Implement
the check via a simple codec-roundtrip of the `Started` msg such that
the `.pld` payload is always validated before transit. This ensures the
child will fail early and notify the parent before any streaming takes
place (i.e. the "nasty" dialog protocol phase).

The main motivation here is to avoid inter-actor task syncing bugs that
are hard(er) to recover from and/or such as if an invalid typed msg is
sent to the parent, who then ignores it (depending on config), and then
the child thinks the parent is in some presumed state while the parent
is still thinking a first msg has yet to arrive. Doing the stringent
check on the sender side (i.e. the child is sending the "first"
application msg via `.started()`) avoids/sidesteps dealing with such
syncing/coordinated-state problems by keeping the entire IPC dialog in
a "cheap" or "control" style transaction up until a stream is opened.

Iow, the parent task's `.open_context()` block entry can't occur until
the child side is definitely (as much as is possible with IPC msg type
checking) in a correct state spec wise. During any streaming phase in
the dialog the msg-type-checking is NOT done for performance (the
"nasty" protocol phase) and instead any type errors are relayed back
from the receiving side. I'm still unsure whether to take the same
approach on the `Return` msg, since at that point erroring early doesn't
benefit the parent task if/when a msg-type error occurs? Definitely more
to ponder and tinker out here..

Impl notes:
- a gotcha with the roundtrip-codec-ed msg is that it often won't match
  the input `value` bc in the `msgpack` case many native python
  sequence/collection types will map to a common array type due to the
  surjection that `msgpack`'s type-sys imposes.
  - so we can't assert that `started == rt_started` but it may be useful
    to at least report the diff of the type-reduced payload so that the
    caller can at least be notified how the input `value` might be
    better type-casted prior to call, for ex. pre-casting to `list`s.
- added a `._strict_started: bool` that could provide the stringent
  checking if desired in the future.
- on any validation error raise our `MsgTypeError` from it.
- ALSO change over the lingering `.send_yield()` deprecated meth body
  to use a `Yield()`.
4cfe4979ff Factor `MsgpackTCPStream` msg-type checks
Add both the `.send()` and `.recv()` handling blocks to a common
`_raise_msg_type_err()` which includes detailed error msg formatting:

- the `.recv()` side case does introspection of the `Msg` fields and
  attempting to report the exact (field type related) issue
- `.send()` side does some boxed-error style tb formatting like
  `RemoteActorError`.
- add a `strict_types: bool` to `.send()` to allow for just
  warning on bad inputs versus raising, but always raise from any
  `Encoder` type error.
b9a61ded0a Drop `None`-sentinel cancels RPC loop mechanism
Pretty sure we haven't *needed it* for a while, it was always generally
hazardous in terms of IPC msg types, AND it's definitely incompatible
with a dynamically applied typed msg spec: you can't just expect
a `None` to be willy nilly handled all the time XD

For now I'm masking out all the code and leaving very detailed
surrounding notes but am not removing it quite yet in case for strange
reason it is needed by some edge case (though I haven't found according
to the test suite).

Backstory:
------ - ------
Originally (i'm pretty sure anyway) it was added as a super naive
"remote cancellation" mechanism (back before there were specific `Actor`
methods for such things) that was mostly (only?) used before IPC
`Channel` closures to "more gracefully cancel" the connection's parented
RPC tasks. Since we now have explicit runtime-RPC endpoints for
conducting remote cancellation of both tasks and full actors, it should
really be removed anyway, because:
- a `None`-msg setinel is inconsistent with other RPC endpoint handling
  input patterns which (even prior to typed msging) had specific
  msg-value triggers.
- the IPC endpoint's (block) implementation should use
  `Actor.cancel_rpc_tasks(parent_chan=chan)` instead of a manual loop
  through a `Actor._rpc_tasks.copy()`..

Deats:
- mask the `Channel.send(None)` calls from both the `Actor._stream_handler()` tail
  as well as from the `._portal.open_portal()` was connected block.
- mask the msg loop endpoint block and toss in lotsa notes.

Unrelated tweaks:
- drop `Actor._debug_mode`; unused.
- make `Actor.cancel_server()` return a `bool`.
- use `.msg.pretty_struct.Struct.pformat()` to show any msg that is
  ignored (bc invalid) in `._push_result()`.
aca6503fcd Flatten out RPC loop with `match:`/`case:`
Mainly expanding out the runtime endpoints for cancellation to separate
cases and flattening them with the main RPC-request-invoke block, moving
the non-cancel runtime case (where we call `getattr(actor, funcname)`)
inside the main `Start` case (for now) which branches on `ns=="self"`.

Also, add a new IPC msg `class CancelAck(Return):` which is always
included in the default msg-spec such that runtime cancellation (and
eventually all) endpoints return that msg (instead of a `Return`) and
thus sidestep any currently applied `MsgCodec` such that the results
(`bool`s for most cancel methods) are never violating the current type
limit(s) on `Msg.pld`. To support this expose a new variable
`return_msg: Return|CancelAck` param from
`_invoke()`/`_invoke_non_context)()` and set it to `CancelAck` in the
appropriate endpoint case-blocks of the msg loop.

Clean out all the lingering legacy `chan.send(<dict-msg>)` commented
codez from the invoker funcs, with more cleaning likely to come B)
aea5abdd70 Use `object()` when checking for error field value
Since the field value could be `None` or some other type with
truthy-ness evaluating to `False`..
2f451ab9a3 Caps-msging test tweaks to get correct failures
These are likely temporary changes but still needed to actually see the
desired/correct failures (of which 5 of 6 tests are supposed to fail rn)
mostly to do with `Start` and `Return` msgs which are invalid under each
test's applied msg-spec.

Tweak set here:
- bit more `print()`s in root and sub for grokin test flow.
- never use `pytes.fail()` in subactor.. should know this by now XD
- comment out some bits that can't ever pass rn and make the underlying
  expected failues harder to grok:
  - the sub's child-side-of-ctx task doing sends should only fail
    for certain msg types like `Started` + `Return`, `Yield`s are
    processed receiver/parent side.
  - don't expect `sent` list to match predicate set for the same reason
    as last bullet.

The outstanding msg-type-semantic validation questions are:
- how to handle `.open_context()` with an input `kwargs` set that
  doesn't adhere to the currently applied msg-spec?
  - should the initial `@acm` entry fail before sending to the child
    side?
- where should received `MsgTypeError`s be raised, at the `MsgStream`
  `.receive()` or lower in the stack?
  - i'm thinking we should mk `MsgTypeError` derive from
    `RemoteActorError` and then have it be delivered as an error to the
    `Context`/`MsgStream` for per-ctx-task handling; would lead to more
    flexible/modular policy overrides in user code outside any defaults
    we provide.
b341146bd1 Rename `Actor._push_result()` -> `._deliver_ctx_payload()`
Better describes the internal RPC impl/latest-architecture with the msgs
delivered being those which either define a `.pld: PayloadT` that gets
passed up to user code, or the error-msg subset that similarly is raised
in a ctx-linked task.
cf48fdecfe Unify `MsgTypeError` as a `RemoteActorError` subtype
Since in the receive-side error case the source of the exception is the
sender side (normally causing a local `TypeError` at decode time), might
as well bundle the error in remote-capture-style using boxing semantics
around the causing local type error raised from the
`msgspec.msgpack.Decoder.decode()` and with a traceback packed from
`msgspec`-specific knowledge of any field-type spec matching failure.

Deats on new `MsgTypeError` interface:
- includes a `.msg_dict` to get access to any `Decoder.type`-applied
  load of the original (underlying and offending) IPC msg into
  a `dict` form using a vanilla decoder which is normally packed into
  the instance as a `._msg_dict`.
- a public getter to the "supposed offending msg" via `.payload_msg`
  which attempts to take the above `.msg_dict` and load it manually into
  the corresponding `.msg.types.MsgType` struct.
- a constructor `.from_decode()` to make it simple to build out error
  instances from a failed decode scope where the aforementioned
  `msgdict: dict` from the vanilla decode can be provided directly.
- ALSO, we now pack into `MsgTypeError` directly just like ctxc in
  `unpack_error()`

This also completes the while-standing todo for `RemoteActorError` to
contain a ref to the underlying `Error` msg as `._ipc_msg` with public
`@property` access that `defstruct()`-creates a pretty struct version
via `.ipc_msg`.

Internal tweaks for this include:
- `._ipc_msg` is the internal literal `Error`-msg instance if provided
  with `.ipc_msg` the dynamic wrapper as mentioned above.
- `.__init__()` now can still take variable `**extra_msgdata` (similar
  to the `dict`-msgdata as before) to maintain support for subtypes
  which are constructed manually (not only by `pack_error()`) and insert
  their own attrs which get placed in a `._extra_msgdata: dict` if no
  `ipc_msg: Error` is provided as input.
- the `.msgdata` is now a merge of any `._extra_msgdata` and
  a `dict`-casted form of any `._ipc_msg`.
- adjust all previous `.msgdata` field lookups to try equivalent field
  reads on `._ipc_msg: Error`.
- drop default single ws indent from `.tb_str` and do a failover lookup
  to `.msgdata` when `._ipc_msg is None` for the manually constructed
  subtype-instance case.
- add a new class attr `.extra_body_fields: list[str]` to allow subtypes
  to declare attrs they want shown in the `.__repr__()` output, eg.
  `ContextCancelled.canceller`, `StreamOverrun.sender` and
  `MsgTypeError.payload_msg`.
- ^-rework defaults pertaining to-^ with rename from
  `_msgdata_keys` -> `_ipcmsg_keys` with latter now just loading directly
  from the `Error` fields def and `_body_fields: list[str]` just taking
  that value and removing the not-so-useful-in-REPL or already shown
  (i.e. `.tb_str: str`) field names.
- add a new mod level `.pack_from_raise()` helper for auto-boxing RAE
  subtypes constructed manually into `Error`s which is normally how
  `StreamOverrun` and `MsgTypeError` get created in the runtime.
- in support of the above expose a `src_uid: tuple` override to
  `pack_error()` such that the runtime can provide any remote actor id
  when packing a locally-created yet remotely-caused RAE subtype.
- adjust all typing to expect `Error`s over `dict`-msgs.

Adjust some tests to match these changes:
- context and inter-peer-cancel tests to make their `.msgdata` related
  checks against the new `.ipc_msg` as well and `.tb_str` directly.
- toss in an extra sleep to `sleep_a_bit_then_cancel_peer()` to keep the
  'canceller' ctx child task cancelled by it's parent in the 'root' for
  the rte-raised-during-ctxc-handling case (apparently now it's
  returning too fast, cool?).
15549f7c26 Expose `MsgType` and extend `MsgCodec` API a bit
Make a new `MsgType: TypeAlias` for the union of all msg types such that
it can be used in annots throughout the code base; just make
`.msg.__msg_spec__` delegate to it.

Add some new codec methods:
- `pld_spec_str`: for the `str`-casted value of the payload spec,
  generally useful in logging content.
- `msg_spec_items()`: to render a `dict` of msg types to their
  `str()`-casted values with support for singling out a specific
  `MsgType`, type by input `msg` instance.
- `pformat_msg_spec()`: for rendering the (partial) `.msg_spec` as
  a formatted `str` useful in logging.

Oh right, add a `Error._msg_dict: dict` in support of the previous
commit (for `MsgTypeError` packing as RAEs) such that our error msg type
can house a non-type-spec decoded wire-bytes for error
reporting/analysis purposes.
a35c1d40ab Refine `MsgTypeError` handling to relay-up-on-`.recv()`
Such that `Channel.recv()` + `MsgpackTCPStream.recv()` originating
msg-type-errors are not raised at the IPC transport layer but instead
relayed up the runtime stack for eventual handling by user-app code via
the `Context`/`MsgStream` layer APIs.

This design choice leads to a substantial amount of flexibility and
modularity, and avoids `MsgTypeError` handling policies from being
coupled to a particular backend IPC transport layer:
- receive-side msg-type errors, as can be raised and handled in the
  `.open_stream()` "nasty" phase of a ctx, whilst being packed at the
  `MsgCodec`/transport layer (keeping the underlying src decode error
  coupled to the specific transport + interchange lib) and then relayed
  upward to app code for custom handling like a normal Error` msg.
- the policy options for handling such cases could be implemented as
  `@acm` wrappers around `.open_context()`/`.open_stream()` blocks (and
  their respective delivered primitives) OR just plain old async
  generators around `MsgStream.receive()` such that both built-in policy
  handling and custom user-app solutions can be swapped without touching
  any `tractor` internals or providing specialized "registry APIs".
  -> eg. the ignore and relay-invalid-msg-to-sender approach can be more
   easily implemented as embedded `try: except MsgTypeError:` blocks
   around `MsgStream.receive()` possibly applied as either of an
   injected wrapper type around a stream or an async gen that `async
   for`s from the stream.
- any performance based AOT-lang extensions used to implement a policy
  for handling recv-side errors space can avoid knowledge of the lower
  level IPC `Channel` (and-downward) primitives.
- `Context` consuming code can choose to let all msg-type-errs
  bubble and handle them manually (like any other remote `Error`
  shuttled exception).
- we can keep (as before) send-side msg type checks can be raised
  locally and cause offending senders to error and adjust before the
  streaming phase of an IPC ctx.

Impl (related) deats:
- obvi make `MsgpackTCPStream.recv()` yield up any `MsgTypeError`
  constructed by `_mk_msg_type_err()` such that the exception will
  eventually be relayed up to `._rpc.process_messages()` and from
  their delivered to the corresponding ctx-task.
- in support of ^, make `Channel.recv()` detect said mtes and use the
  new `pack_from_raise()` to inject the far end `Actor.uid` for the
  `Error.src_uid`.
- keep raising the send side equivalent (when strict enabled) errors
  inline immediately with no upward `Error` packing or relay.
- improve `_mk_msg_type_err()` cases handling with far more detailed
  `MsgTypeError` "message" contents pertaining to `msgspec` specific
  failure-fixing-tips and type-spec mismatch info:
  * use `.from_decode()` constructor in recv-side case to inject the
    non-spec decoded `msg_dict: dict` and use the new
    `MsgCodec.pld_spec_str: str` when clarifying the type discrepancy
    with the offending field.
  * on send-side, if we detect that an unsupported field type was
    described in the original `src_type_error`, AND there is no
    `msgpack.Encoder.enc_hook()` set, that the real issue is likely
    that the user needs to extend the codec to support the
    non-std/custom type with a hook and link to `msgspec` docs.
  * if one of a `src_type/validation_error` is provided, set that
    error as the `.__cause__` in the new mte.
8839bb06a3 Start tidying up `._context`, use `pack_from_raise()`
Mostly removing commented (and replaced) code blocks lingering from the
ctxc semantics work and new typed-msg-spec `MsgType`s handling AND use
the new `._exceptions.pack_from_raise()` helper to construct
`StreamOverrun` msgs.

Deaterz:
- clean out the drain loop now that it's implemented to handle our
  struct msg types including the `dict`-msg bits left in as
  fallback-reminders, any notes/todos better summarized at the top of
  their blocks, remove any `_final_result_is_set()` related duplicate/legacy
  tidbits.
- use a `case Error()` block in drain loop with fallthrough to `_:`
  always resulting in an rte raise.
- move "XXX" notes into the doc-string for `._deliver_msg()` as
  a "rules" section.
- use `match:` syntax for logging the `result_or_err: MsgType` outcome
  from the final `.result()` call inside `open_context_from_portal()`.
- generally speaking use `MsgType` type annotations throughout!
0dcaf5f3b2 TO-CHERRY: Error on `breakpoint()` without `debug_mode=True`?
Not sure if this is a good tactic (yet) but it at least covers us from
getting user's confused by `breakpoint()` usage causing REPL clobbering.
Always set an explicit rte raising breakpoint hook such that the user
realizes they can't use `.pause_from_sync()` without enabling debug
mode.

** CHERRY-PICK into `pause_from_sync_w_greenback` branch! **
7aaa2a61ec Add msg-from-dict constructor helper
Handy for re-constructing a struct-`MsgType` from a `dict` decoded from
wire-bytes wherein the msg failed to decode normally due to a field type
error but you'd still like to show the "potential" msg in struct form,
say inside a `MsgTypeError`'s meta data.

Supporting deats:
- add a `.msg.types.from_dict_msg()` to implement it (the helper).
- also a `.msg.types._msg_table: dict[str, MsgType]` for supporting this
  func ^ as well as providing just a general `MsgType`-by-`str`-name
  lookup.

Unrelated:
- Drop commented idea for still supporting `dict`-msg set via
  `enc/dec_hook()`s that would translate to/from `MsgType`s, but that
  would require a duplicate impl in the runtime.. so eff that XD
322e015d32 Add custom `MsgCodec.__repr__()`
Sure makes console grokability a lot better by showing only the
customizeable fields.

Further, clean up `mk_codec()` a bunch by removing the `ipc_msg_spec`
param since we don't plan to support another msg-set (for now) which
allows cleaning out a buncha logic that was mostly just a source of
bugs..

Also,
- add temporary `log.info()` around codec application.
- throw in some sanity `assert`s to `limit_msg_spec()`.
- add but mask out the `extend_msg_spec()` idea since it seems `msgspec`
  won't allow `Decoder.type` extensions when using a custom `dec_hook()`
  for some extension type.. (not sure what approach to take here yet).
eec240a70a Tweak some `pformat_boxed_tb()` indent inputs
- add some `tb_str: str` indent-prefix args for diff indent levels for the
body vs. the surrounding "ascii box".
- ^-use it-^ from `RemoteActorError.__repr()__` obvi.
- use new `msg.types.from_dict_msg()` in impl of
  `MsgTypeError.payload_msg`, handy for showing what the message "would
  have looked like in `Struct` form" had it not failed it's type
  constraints.
3fb3608879 Extend recv-side `MsgTypeError` default message
Display the new `MsgCodec.pld_spec_str` and format the incorrect field
value to be placed entirely (txt block wise) right of the "type annot"
part of the line:

Iow if you had a bad `dict` value where something else should be it'd
look something like this:

<Started(
 |_pld: NamespacePath = {'cid': '3e0ca00c-7d32-4d2a-a0c2-ac2e12453871',
                         'locked': True,
                         'msg_type': 'LockStatus',
                         'subactor_uid': ['sub', 'af7ccb69-1dab-491f-84f7-2ec42c32d137']}
df548257ad IPC ctx refinements around `MsgTypeError` awareness
Add a bit of special handling for msg-type-errors with a dedicated
log-msg detailing which `.side: str` is the sender/causer and avoiding
a `._scope.cancel()` call in such cases since the local task might be
written to handle and tolerate the badly (typed) IPC msg.

As part of ^, change the ctx task-pair "side" semantics from "caller" ->
"callee" to be "parent" -> "child" which better matches the
cross-process SC-linked-task supervision hierarchy, and
`trio.Nursery.parent_task`; in `trio` the task that opens a nursery is
also named the "parent".

Impl deats / fixes around the `.side` semantics:
- ensure that `._portal: Portal` is set ASAP after
  `Actor.start_remote_task()` such that if the `Started` transaction
  fails, the parent-vs.-child sides are still denoted correctly (since
  `._portal` being set is the predicate for that).
- add a helper func `Context.peer_side(side: str) -> str:` which inverts
  from "child" to "parent" and vice versa, useful for logging info.

Other tweaks:
- make `_drain_to_final_msg()` return a tuple of a maybe-`Return` and
  the list of other `pre_result_drained: list[MsgType]` such that we
  don't ever have to warn about the return msg getting captured as
  a pre-"result" msg.
- Add some strictness flags to `.started()` which allow for toggling
  whether to error or warn log about mismatching roundtripped `Started`
  msgs prior to IPC transit.
2d22713806 Add `from_dict_msg(user_pretty: bool)` flag
Allows for optionally (and dynamically) constructing the "expected"
`MsgType` from a `dict` into a `pretty_struct.Struct`, mostly for
logging usage.
2edfed75eb Add `MsgTypeError.expected_msg_type`
Which matches with renaming `.payload_msg` -> `.expected_msg` which is
the value we attempt to construct from a vanilla-msgppack
decode-to-`dict` and then construct manually into a `MsgType` using
`.msg.types.from_dict_msg()`. Add a todo to use new `use_pretty` flag
which currently conflicts with `._exceptions.pformat_boxed_type()`
prefix formatting..
38a6483859 Use `_raise_from_no_key_in_msg(allow_msgs)`
Instead of `allow_msg_keys` since we've fully flipped over to
struct-types for msgs in the runtime.

- drop the loop from `MsgStream.receive_nowait()` since
  `Yield/Return.pld` getting will handle both (instead of a loop of
  `dict`-key reads).
60aa16adf6 Pass a `use_greenback: bool` runtime var to subs
Such that the top level `maybe_enable_greenback` from
`open_root_actor()` can toggle the entire actor tree's usage.
Read the rtv in `._rpc` tasks and only enable if set.

Also, rigor up the `._rpc.process_messages()` loop to handle `Error()`
and `case _:` separately such that we now raise an explicit rte for
unknown / invalid msgs. Use "parent" / "child" for side descriptions in
loop comments and put a fat comment before the `StartAck` in `_invoke()`.
3869e91b19 More msg-spec tests tidying
- Drop `test_msg_spec_xor_pld_spec()` since we no longer support
  `ipc_msg_spec` arg to `mk_codec()`.
- Expect `MsgTypeError`s around `.open_context()` calls when
  `add_codec_hooks == False`.
- toss in some `.pause()` points in the subactor ctx body whilst hacking
  out a `.pld` protocol for debug mode TTY locking.
d4155396bf Relay `SIGUSR1` to subactors for `stackscope` tracing
Since obvi we don't want to just only see the trace in the root most of
the time ;)

Currently the sig keeps firing twice in the root though, and i'm not
sure why yet..
a73b24cf4a First draft, sub-msg-spec for debugger `Lock` sys
Since it's totes possible to have a spec applied that won't permit
`str`s, might as well formalize a small msg set for subactors to request
the tree-wide TTY `Lock`.

BTW, I'm prolly not going into every single change here in this first
WIP since there's still a variety of broken stuff mostly to do with
races on the codec apply being done in a `trio.lowleve.RunVar`; it
should be re-done with a `ContextVar` such that each task does NOT
mutate the global setting..

New msg set and usage is simply:
- `LockStatus` which is the reponse msg delivered from `lock_tty_for_child()`
- `LockRelease` a one-off request msg from the subactor to drop the
  `Lock` from a `MsgStream.send()`.
- use these msgs throughout the root and sub sides of the locking
  ctx funcs: `lock_tty_for_child()` & `wait_for_parent_stdin_hijack()`

The codec is now applied in both the root and sub `Lock` request tasks:
- for root inside `lock_tty_for_child()` before the `.started()`.
- for subs, inside `wait_for_parent_stdin_hijack()` since we only want
  to affect the codec *for the locking task*.
  - (hence the need for ctx-var as mentioned above but currently this
    can cause races which will break against other app tasks competing
    for the codec setting).
- add a `apply_debug_codec()` helper for use in both cases.
- add more detailed logging to both the root and sub side of `Lock`
  requesting funcs including requiring that the sub-side task "uid" (a
  `tuple[str, int]` = (trio.Task.name, id(trio.Task)` be provided (more
  on this later).

A main issue discovered while proto-testing all this was the ability of
a sub to "double lock" (leading to self-deadlock) via an error in
`wait_for_parent_stdin_hijack()` which, for ex., can happen in debug
mode via crash handling of a `MsgTypeError` received from the root
during a codec applied msg-spec race! Originally I was attempting to
solve this by making the SIGINT override handler more resilient but this
case is somewhat impossible to detect by an external root task other
then checking for duplicate ownership via the new `subactor_task_uid`.
=> SO NOW, we always stick the current task uid in the
   `Lock._blocked: set` and raise an rte on a double request by the same
   remote task.

Included is a variety of small refinements:
- finally figured out how to mark a variety of `.__exit__()` frames with
  `pdbp.hideframe()` to actually hide them B)
- add cls methods around managing `Lock._locking_task_cs` from root only.
- re-org all the `Lock` attrs into those only used in root vs. subactors
  and proto-prep a new `DebugStatus` actor-singleton to be used in subs.
- add a `Lock.repr()` to contextually print the current conc primitives.
- rename our `Pdb`-subtype to `PdbREPL`.
- rigor out the SIGINT handler a bit, originally to try and hack-solve
  the double-lock issue mentioned above, but now just with better
  logging and logic for most (all?) possible hang cases that should be
  hang-recoverable after enough ctrl-c mashing by the user.. well
  hopefully:
  - using `Lock.repr()` for both root and sub cases.
  - lots more `log.warn()`s and handler reversions on stale lock or cs
    detection.
- factor `._pause()` impl a little better moving the actual repl entry
  to a new `_enter_repl_sync()` (originally for easier wrapping in the
  sub case with `apply_codec()`).
77a15ebf19 Use `DebugStatus` around subactor lock requests
Breaks out all the (sub)actor local conc primitives from `Lock` (which
is now only used in and by the root actor) such that there's an explicit
distinction between a task that's "consuming" the `Lock` (remotely) vs.
the root-side service tasks which do the actual acquire on behalf of the
requesters.

`DebugStatus` changeover deats:
------ - ------
- move all the actor-local vars over `DebugStatus` including:
  - move `_trio_handler` and `_orig_sigint_handler`
  - `local_task_in_debug` now `repl_task`
  - `_debugger_request_cs` now `req_cs`
  - `local_pdb_complete` now `repl_release`
- drop all ^ fields from `Lock.repr()` obvi..
- move over the `.[un]shield_sigint()` and
  `.is_main_trio_thread()` methods.
- add some new attrs/meths:
  - `DebugStatus.repl` for the currently running `Pdb` in-actor
    singleton.
  - `.repr()` for pprint of state (like `Lock`).
- Note: that even when a root-actor task is in REPL, the `DebugStatus`
  is still used for certain actor-local state mgmt, such as SIGINT
  handler shielding.
- obvi change all lock-requester code bits to now use a `DebugStatus` in
  their local actor-state instead of `Lock`, i.e. change usage from
  `Lock` in `._runtime` and `._root`.
- use new `Lock.get_locking_task_cs()` API in when checking for
  sub-in-debug from `._runtime.Actor._stream_handler()`.

Unrelated to topic-at-hand tweaks:
------ - ------
- drop the commented bits about hiding `@[a]cm` stack frames from
  `_debug.pause()` and simplify to only one block with the `shield`
  passthrough since we already solved the issue with cancel-scopes using
  `@pdbp.hideframe` B)
  - this includes all the extra logging about the extra frame for the
    user (good thing i put in that wasted effort back then eh..)
- put the `try/except BaseException` with `log.exception()` around the
  whole of `._pause()` to ensure we don't miss in-func errors which can
  cause hangs..
- allow passing in `portal: Portal` to
  `Actor.start_remote_task()` such that `Portal` task spawning methods
  are always denoted correctly in terms of `Context.side`.
- lotsa logging tweaks, decreasing a bit of noise from `.runtime()`s.
7372404d76 `NamespacePath._mk_fqnp()` handle `__mod__` for methods
Need to use `__self__.__mod__` in the method case i guess..
5439060cd3 Start a `devx._code` mod
Starting with a little sub-sys for tracing caller frames by marking them
with a dunder var (`__runtimeframe__` by default) and then scanning for
that frame such that code that is *calling* our APIs can be reported
easily in logging / tracing output.

New APIs:
- `find_caller_info()` which does the scan and delivers a,
- `CallerInfo` which (attempts) to expose both the runtime frame-info
  and frame of the caller func along with `NamespacePath` properties.

Probably going to re-implement the dunder var bit as a decorator later
so we can bind in the literal func-object ref instead of trying to look
it up with `get_class_from_frame()`, since it's kinda hacky/non-general
and def doesn't work for closure funcs..
d51be2a36a Proto in new `Context` refinements
As per some newly added features and APIs:

- pass `portal: Portal` to `Actor.start_remote_task()` from
  `open_context_from_portal()` marking `Portal.open_context()` as
  always being the "parent" task side.

- add caller tracing via `.devx._code.CallerInfo/.find_caller_info()`
  called in `mk_context()` and (for now) a `__runtimeframe__: int = 2`
  inside `open_context_from_portal()` such that any enter-er of
  `Portal.open_context()` will be reported.

- pass in a new `._caller_info` attr which is used in 2 new meths:
  - `.repr_caller: str` for showing the name of the app-code-func.
  - `.repr_api: str` for showing the API ep, which for now we just
    hardcode to `Portal.open_context()` since ow its gonna show the mod
    func name `open_context_from_portal()`.
  - use those new props ^ in the `._deliver_msg()` flow body log msg
    content for much clearer msg-flow tracing Bo

- add `Context._cancel_on_msgerr: bool` to toggle whether
  a delivered `MsgTypeError` should trigger a `._scope.cancel()` call.
  - also (temporarily) add separate `.cancel()` emissions for both cases
    as i work through hacking out the maybe `MsgType.pld: Raw` support.
dd6a4d49d8 Go back to `ContextVar` for codec mgmt
Turns out we do want per-task inheritance particularly if there's to be
per `Context` dynamic mutation of the spec; we don't want mutation in
some task to affect any parent/global setting.

Turns out since we use a common "feeder task" in the rpc loop, we need to
offer a per `Context` payload decoder sys anyway in order to enable
per-task controls for inter-actor multi-task-ctx scenarios.
0df7d557db Move `MsgTypeError` maker func to `._exceptions`
Since it's going to be used from the IPC primitive APIs
(`Context`/`MsgStream`) for similarly handling payload type spec
validation errors and bc it's really not well situation in the IPC
module XD

Summary of (impl) tweaks:
- obvi move `_mk_msg_type_err()` and import and use it in `._ipc`; ends
  up avoiding a lot of ad-hoc imports we had from `._exceptions` anyway!
- mask out "new codec" runtime log emission from `MsgpackTCPStream`.
- allow passing a (coming in next commit) `codec: MsgDec` (message
  decoder) which supports the same required `.pld_spec_str: str` attr.
- for send side logging use existing `MsgCodec..pformat_msg_spec()`.
- rename `_raise_from_no_key_in_msg()` to the now more appropriate
  `_raise_from_unexpected_msg()`, but leaving alias for now.
a51632ffa6 Add a `MsgDec` for receive-only decoding
In prep for a "payload receiver" abstraction that will wrap
`MsgType.pld`-IO delivery from `Context` and `MsgStream`, adds a small
`msgspec.msgpack.Decoder` shim which delegates an API similar to
`MsgCodec` and is offered via a `.msg._codec.mk_dec()` factory.

Detalles:
- move over the TODOs/comments from `.msg.types.Start` to to
  `MsgDec.spec` since it's probably the ideal spot to start thinking
  about it from a consumer code PoV.
- move codec reversion assert and log emit into `finally:` block.
- flip default `.types._tractor_codec = mk_codec_ipc_pld(ipc_pld_spec=Raw)`
  in prep for always doing payload-delayed decodes.
- make `MsgCodec._dec` private with public property getter.
- change `CancelAck` to NOT derive from `Return` so it's mutex in
  `match/case:` handling.
5eb9144921 First draft "payload receiver in a new `.msg._ops`
As per much tinkering, re-designs and preceding rubber-ducking via many
"commit msg novelas", **finally** this adds the (hopefully) final
missing layer for typed msg safety: `tractor.msg._ops.PldRx`

(or `PayloadReceiver`? haven't decided how verbose to go..)

Design justification summary:
      ------ - ------
- need a way to be as-close-as-possible to the `tractor`-application
  such that when `MsgType.pld: PayloadT` validation takes place, it is
  straightforward and obvious how user code can decide to handle any
  resulting `MsgTypeError`.
- there should be a common and optional-yet-modular way to modify
  **how** data delivered via IPC (possibly embedded as user defined,
  type-constrained `.pld: msgspec.Struct`s) can be handled and processed
  during fault conditions and/or IPC "msg attacks".
- support for nested type constraints within a `MsgType.pld` field
  should be simple to define, implement and understand at runtime.
- a layer between the app-level IPC primitive APIs
  (`Context`/`MsgStream`) and application-task code (consumer code of
  those APIs) should be easily customized and prove-to-be-as-such
  through demonstrably rigorous internal (sub-sys) use!
  -> eg. via seemless runtime RPC eps support like `Actor.cancel()`
  -> by correctly implementing our `.devx._debug.Lock` REPL TTY mgmt
    dialog prot, via a dead simple payload-as-ctl-msg-spec.

There are some fairly detailed doc strings included so I won't duplicate
that content, the majority of the work here is actually somewhat of
a factoring of many similar blocks that are doing more or less the same
`msg = await Context._rx_chan.receive()` with boilerplate for
`Error`/`Stop` handling via `_raise_from_no_key_in_msg()`. The new
`PldRx` basically provides a shim layer for this common "receive msg,
decode its payload, yield it up to the consuming app task" by pairing
the RPC feeder mem-chan with a msg-payload decoder and expecting IPC API
internals to use **one** API instead of re-implementing the same pattern
all over the place XD

`PldRx` breakdown
 ------ - ------
- for now only expects a `._msgdec: MsgDec` which allows for
  override-able `MsgType.pld` validation and most obviously used in
  the impl of `.dec_msg()`, the decode message method.
- provides multiple mem-chan receive options including:
 |_ `.recv_pld()` which does the e2e operation of receiving a payload
    item.
 |_ a sync `.recv_pld_nowait()` version.
 |_ a `.recv_msg_w_pld()` which optionally allows retreiving both the
    shuttling `MsgType` as well as it's `.pld` body for use cases where
    info on both is important (eg. draining a `MsgStream`).

Dirty internal changeover/implementation deatz:
             ------ - ------
- obvi move over all the IPC "primitives" that previously had the duplicate recv-n-yield
  logic:
 - `MsgStream.receive[_nowait]()` delegating instead to the equivalent
   `PldRx.recv_pld[_nowait]()`.
 - add `Context._pld_rx: PldRx`, created and passed in by
   `mk_context()`; use it for the `.started()` -> `first: Started`
   retrieval inside `open_context_from_portal()`.
 - all the relevant `Portal` invocation methods: `.result()`,
   `.run_from_ns()`, `.run()`; also allows for dropping `_unwrap_msg()`
   and `.Portal_return_once()` outright Bo
- rename `Context.ctx._recv_chan` -> `._rx_chan`.
- add detailed `Context._scope` info for logging whether or not it's
  cancelled inside `_maybe_cancel_and_set_remote_error()`.
- move `._context._drain_to_final_msg()` -> `._ops.drain_to_final_msg()`
  since it's really not necessarily ctx specific per say, and it does
  kinda fit with "msg operations" more abstractly ;)
18e97a8f9a Use `Context._stream` in `_raise_from_unexpected_msg()`
Instead of expecting it to be passed in (as it was prior), when
determining if a `Stop` msg is a valid end-of-channel signal use the
`ctx._stream: MsgStream|None` attr which **must** be set by any stream
opening API; either of:
- `Context.open_stream()`
- `Portal.open_stream_from()`

Adjust the case block logic to match with fallthrough from any EoC to
a closed error if necessary. Change the `_type: str` to match the
failing IPC-prim name in the tail case we raise a `MessagingError`.

Other:
- move `.sender: tuple` uid attr up to `RemoteActorError` since `Error`
  optionally defines it as a field and for boxed `StreamOverrun`s (an
  ignore case we check for in the runtime during cancellation) we want
  it readable from the boxing rae.
- drop still unused `InternalActorError`.
6b30c86eca Try out `msgspec` encode-buffer optimization
As per the reco:
https://jcristharif.com/msgspec/perf-tips.html#reusing-an-output-buffe

BUT, seems to cause this error in `pikerd`..

`BufferError: Existing exports of data: object cannot be re-sized`

Soo no idea? Maybe there's a tweak needed that we can glean from
tests/examples in the `msgspec` repo?

Disabling for now.
188ff0e0e5 Another `._rpc` mod passthrough
- tweaking logging to include more `MsgType` dumps on IPC faults.
- removing some commented cruft.
- comment formatting / cleanups / add-ons.
- more type annots.
- fill out some TODO content.
08fcd3fb03 Mk `.msg.pretty_struct.Struct.pformat()` a mod func
More along the lines of `msgspec.struct` and also far more useful
internally for pprinting `MsgTypes`. Of course add method aliases.
c383978402 Add more useful `MsgDec.__repr__()`
Basically exact same as that for `MsgCodec` with the `.spec` displayed
via a better (maybe multi-line) `.spec_str: str` generated from a common
new set of helper mod funcs factored out msg-codec meths:
- `mk_msgspec_table()` to gen a `MsgType` name -> msg table.
- `pformat_msgspec()` to `str`-ify said table values nicely.q

Also add a new `MsgCodec.msg_spec_str: str` prop which delegates to the
above for the same.
a5a0e6854b Use new `Msg[Co]Dec` repr meths in `._exceptions`
Particularly when logging around `MsgTypeError`s.

Other:
- make `_raise_from_unexpected_msg()`'s `expect_msg` a non-default value
  arg, must always be passed by caller.
- drop `'canceller'` from `_body_fields` ow it shows up twice for ctxc.
- use `.msg.pretty_struct.pformat()`.
- parameterize `RemoteActorError.reprol()` (repr-one-line method) to
  show `RemoteActorError[<self.boxed_type_str>]( ..` to make obvi
  the boxed remote error type.
- re-impl `.boxed_type_str` as `str`-casting the `.boxed_type` value
  which is guaranteed to render non-`None`.
a3429268ea First draft payload-spec limit API
Add new task-scope oriented `PldRx.pld_spec` management API similar to
`.msg._codec.limit_msg_spec()`, but obvi built to process and filter
`MsgType.pld` values.

New API related changes include:
- new per-task singleton getter `msg._ops.current_pldrx()` which
  delivers the current (global) payload receiver via a new
  `_ctxvar_PldRx: ContextVar` configured with a default
  `_def_any_pldec: MsgDec[Any]` decoder.
- a `PldRx.limit_plds()` which sets the decoder (`.type` underneath)
  for the specific payload rx instance.
- `.msg._ops.limit_plds()` which obtains the current task-scoped `PldRx`
  and applies the pld spec via a new `PldRx.limit_plds()`.
- rename `PldRx._msgdec` -> `._pldec`.
- add `.pld_dec` as pub attr for -^

Unrelated adjustments:
- use `.msg.pretty_struct.pformat()` where handy.
- always pass `expect_msg: MsgType`.
- add a `case Stop()` to `PldRx.dec_msg()` which will `log.warning()`
  when a stop is received by no stream was open on this receiving side
  since we rarely want that to raise since it's prolly just a runtime
  race or mistake in user code.

Other:
40c972f0ec Mk `process_messages()` return last msg; summary logging
Not sure it's **that** useful (yet) but in theory would allow avoiding
certain log level usage around transient RPC requests for discovery methods
(like `.register_actor()` and friends); can't hurt to be able to
introspect that last message for other future cases I'd imagine as well.
Adjust the calling code in `._runtime` to match; other spots are using
the `trio.Nursery.start()` schedule style and are fine as is.

Improve a bunch more log messages throughout a few mods mostly by going
to a "summary" single-emission style where possible/appropriate:
- in `._runtime` more "single summary" status style log emissions:
 |_mk `Actor.load_modules()` render a single mod loaded summary.
 |_use a summary `con_status: str` for `Actor._stream_handler()` conn
   setup and an equiv (`con_teardown_status`) for connection teardowns.
 |_similar thing in `Actor.wait_for_actor()`.
- generally more usage of `.msg.pretty_struct` apis throughout `._runtime`.
88a0e90f82 Reorg frames pformatters, add `Context.repr_state`
A better spot for the pretty-formatting of frame text (and thus tracebacks)
is in the new `.devx._code` module:
- move from `._exceptions` -> `.devx._code.pformat_boxed_tb()`.
- add new `pformat_caller_frame()` factored out the use case in
  `._exceptions._mk_msg_type_err()` where we dump a stack trace
  for bad `.send()` side IPC msgs.

Add some new pretty-format methods to `Context`:
- explicitly implement `.pformat()` and allow an `extra_fields: dict`
  which can be used to inject additional fields (maybe eventually by
  default) such as is now used inside
  `._maybe_cancel_and_set_remote_error()` when reporting the internal
  `._scope` state in cancel logging.
- add a new `.repr_state -> str` which provides a single string status
  depending on the internal state of the IPC ctx in terms of the shuttle
  protocol's "phase"; use it from `.pformat()` for the `|_state:`.
- set `.started(complain_no_parity=False)` now since we presume decoding
  with `.pld: Raw` now with the new `PldRx` design.
- use new `msgops.current_pldrx()` in `mk_context()`.
544ff5ab4c Change to `RemoteActorError.pformat()`
For more sane manual calls as needed in logging purposes. Obvi remap
the dunder methods to it.

Other:
- drop `hide_tb: bool` from `unpack_error()`, shouldn't need it since
  frame won't ever be part of any tb raised from returned error.
- add a `is_invalid_payload: bool` to `_raise_from_unexpected_msg()` to
  be used from `PldRx` where we don't need to decode the IPC
  msg, just the payload; make the error message reflect this case.
- drop commented `._portal._unwrap_msg()` since we've replaced it with
  `PldRx`'s delegation to newer `._raise_from_unexpected_msg()`.
- hide the `Portal.result()` frame by default, again.
523c24eb72 Move pformatters into new `.devx.pformat`
Since `._code` is prolly gonna get renamed (to something "frame & stack
tools" related) and to give a bit better organization.

Also adds a new `add_div()` helper, factored out of ctxc message
creation in `._rpc._invoke()`, for adding a little "header line" divider
under a given `message: str` with a little math to center it.
8ffa6a5e68 "Icons" in `._entry`'s subactor `.info()` messages
Add a little `>` or `X` supervision icon indicating the spawning or
termination of each sub-actor respectively.
b278164f83 Mk `drain_to_final_msg()` never raise from `Error`
Since we usually want them raised from some (internal) call to
`Context.maybe_raise()` and NOT directly from the drainage call, make it
possible via a new `raise_error: bool` to both `PldRx.recv_msg_w_pld()`
and `.dec_msg()`.

In support,
- rename `return_msg` -> `result_msg` since we expect to return
  `Error`s.
- do a `result_msg` assign and `break` in the `case Error()`.
- add `**dec_msg_kwargs` passthrough for other `.dec_msg()` calling
  methods.

Other,
- drop/aggregate todo-notes around the main loop's
  `ctx._pld_rx.recv_msg_w_pld()` call.
- add (configurable) frame hiding to most payload receive meths.
fbc21a1dec Add a "current IPC `Context`" `ContextVar`
Expose it from `._state.current_ipc_ctx()` and set it inside
`._rpc._invoke()` for child and inside `Portal.open_context()` for
parent.

Still need to write a few more tests (particularly demonstrating usage
throughout multiple nested nurseries on each side) but this suffices as
a proto for testing with some debugger request-from-subactor stuff.

Other,
- use new `.devx.pformat.add_div()` for ctxc messages.
- add a block to always traceback dump on corrupted cs stacks.
- better handle non-RAEs exception output-formatting in context
  termination summary log message.
- use a summary for `start_status` for msg logging in RPC loop.
a354732a9e Allow `Stop` passthrough from `PldRx.recv_msg_w_pld()`
Since we need to allow it (at the least) inside
`drain_until_final_msg()` for handling stream-phase termination races
where we don't want to have to handle a raised error from something like
`Context.result()`. Expose the passthrough option via
a `passthrough_non_pld_msgs: bool` kwarg.

Add comprehensive comment to `current_pldrx()`.
05b143d9ef Big debugger rework, more tolerance for internal err-hangs
Since i was running into them (internal errors) during lock request
machinery dev and was getting all sorts of difficult to understand hangs
whenever i intro-ed a bug to either side of the ipc ctx; this all while
trying to get the msg-spec working for `Lock` requesting subactors..

Deats:
- hideframes for `@acm`s and `trio.Event.wait()`, `Lock.release()`.
- better detail out the `Lock.acquire/release()` impls
- drop `Lock.remote_task_in_debug`, use new `.ctx_in_debug`.
- add a `Lock.release(force: bool)`.
- move most of what was `_acquire_debug_lock_from_root_task()` and some
  of the `lock_tty_for_child().__a[enter/exit]()` logic into
  `Lock.[acquire/release]()`  including bunch more logging.
- move `lock_tty_for_child()` up in the module to below `Lock`, with
  some rework:
  - drop `subactor_uid: tuple` arg since we can just use the `ctx`..
  - add exception handler blocks for reporting internal (impl) errors
    and always force release the lock in such cases.
- extend `DebugStatus` (prolly will rename to `DebugRequest` btw):
  - add `.req_ctx: Context` for subactor side.
  - add `.req_finished: trio.Event` to sub to signal request task exit.
  - extend `.shield_sigint()` doc-str.
  - add `.release()` to encaps all the state mgmt previously strewn
    about inside `._pause()`..
- use new `DebugStatus.release()` to replace all the duplication:
  - inside `PdbREPL.set_[continue/quit]()`.
  - inside `._pause()` for the subactor branch on internal
    repl-invocation error cases,
  - in the `_enter_repl_sync()` closure on error,
- replace `apply_debug_codec()` -> `apply_debug_pldec()` in tandem with
  the new `PldRx` sub-sys  which handles the new `__pld_spec__`.
- add a new `pformat_cs()` helper orig to help debug cs stack
  a corruption; going to move to `.devx.pformat` obvi.
- rename `wait_for_parent_stdin_hijack()` -> `request_root_stdio_lock()`
  with improvements:
  - better doc-str and add todos,
  - use `DebugStatus` more stringently to encaps all subactor req state.
  - error handling blocks for cancellation and straight up impl errors
    directly around the `.open_context()` block with the latter doing
    a `ctx.cancel()` to avoid hanging in the shielded `.req_cs` scope.
  - similar exc blocks for the func's overall body with explicit
    `log.exception()` reporting.
  - only set the new `DebugStatus.req_finished: trio.Event` in `finally`.
- rename `mk_mpdb()` -> `mk_pdb()` and don't cal `.shield_sigint()`
  implicitly since the caller usage does matter for this.
- factor out `any_connected_locker_child()` from the SIGINT handler.
- rework SIGINT handler to better handle any stale-lock/hang cases:
  - use new `Lock.ctx_in_debug: Context` to detect subactor-in-debug.
    and use it to cancel any lock request instead of the lower level
  - use `problem: str` summary approach to log emissions.
- rework `_pause()` given all of the above, stuff not yet mentioned:
  - don't take `shield: bool` input and proxy to `debug_func()` (for now).
  - drop `extra_frames_up_when_async: int` usage, expect
    `**debug_func_kwargs` to passthrough an `api_frame: Frametype` (more
    on this later).
  - lotsa asserts around the request ctx vs. task-in-debug ctx using new
    `current_ipc_ctx()`.
  - asserts around `DebugStatus` state.
- rework and simplify the `debug_func` hooks,
  `_set_trace()`/`_post_mortem()`:
  - make them accept a non-optional `repl: PdbRepl` and `api_frame:
    FrameType` which should be used to set the current frame when the
    REPL engages.
  - always hide the hook frames.
  - always accept a `tb: TracebackType` to `_post_mortem()`.
   |_ copy and re-impl what was the delegation to
     `pdbp.xpm()`/`pdbp.post_mortem()` and instead call the
     underlying `Pdb.interaction()` ourselves with a `caller_frame`
     and tb instance.
- adjust the public `.pause()` impl:
  - accept optional `hide_tb` and `api_frame` inputs.
  - mask opening a cancel-scope for now (can cause `trio` stack
    corruption, see notes) and thus don't use the `shield` input other
    then to eventually passthrough to `_post_mortem()`?
   |_ thus drop `task_status` support for now as well.
   |_ pretty sure correct soln is a debug-nursery around `._invoke()`.
- since no longer using `extra_frames_up_when_async` inside
  `debug_func()`s ensure all public apis pass a `api_frame`.
- re-impl our `tractor.post_mortem()` to directly call into `._pause()`
  instead of binding in via `partial` and mk it take similar input as
  `.pause()`.
- drop `Lock.release()` from `_maybe_enter_pm()`, expose and pass
  expected frame and tb.
- use necessary changes from all the above within
  `maybe_wait_for_debugger()` and `acquire_debug_lock()`.

Lel, sorry thought that would be shorter..
There's still a lot more re-org to do particularly with `DebugStatus`
encapsulation but it's coming in follow up.
343b7c9712 Even moar bitty `Context` refinements
- set `._state._ctxvar_Context` just after `StartAck` inside
  `open_context_from_portal()` so that `current_ipc_ctx()` always
  works on the 'parent' side.
- always set `.canceller` to any `MsgTypeError.src_uid` and otherwise to
  any maybe-detected `.src_uid` (i.e. for RAEs).
- always set `.canceller` to us when we rx a ctxc which reports us as
  its canceller; this is a sanity check on definite "self cancellation".
- adjust `._is_self_cancelled()` logic to only be `True` when
  `._remote_error` is both a ctxc with a `.canceller` set to us AND
  when `Context.canceller` is also set to us (since the change above)
  as a little bit of extra rigor.
- fill-in/fix some `.repr_state` edge cases:
  - merge self-vs.-peer ctxc cases to one block and distinguish via
    nested `._is_self_cancelled()` check.
  - set 'errored' for all exception matched cases despite `.canceller`.
  - add pre-`Return` phase statuses:
   |_'pre-started' and 'syncing-to-child' depending on side and when
     `._stream` has not (yet) been set.
   |_'streaming' and 'streaming-finished' depending on side when
     `._stream` is set and whether it was stopped/closed.
- tweak drainage log-message to use "outcome" instead of "result".
- use new `.devx.pformat.pformat_cs()` inside `_maybe_cancel_and_set_remote_error()`
  but, IFF the log level is at least 'cancel'.
6690968236 Rework and first draft of `.devx._frame_stack.py`
Proto-ing a little suite of call-stack-frame annotation-for-scanning
sub-systems for the purposes of both,
- the `.devx._debug`er and its
  traceback and frame introspection needs when entering the REPL,
- detailed trace-style logging such that we can explicitly report
  on "which and where" `tractor`'s APIs are used in the "app" code.

Deats:
- change mod name obvi from `._code` and adjust client mod imports.
- using `wrapt` (for perf) implement a `@api_frame` annot decorator
  which both stashes per-call-stack-frame instances of `CallerInfo` in
  a table and marks the function such that API endpoints can be easily
  found via runtime stack scanning despite any internal impl changes.
- add a global `_frame2callerinfo_cache: dict[FrameType, CallerInfo]`
  table for providing the per func-frame info caching.
- Re-implement `CallerInfo` to require less (types of) inputs:
  |_ `_api_func: Callable`, a ref to the (singleton) func def.
  |_ `_api_frame: FrameType` taken from the `@api_frame` marked `tractor`-API
     func's runtime call-stack, from which we can determine the
     app code's `.caller_frame`.
  |_`_caller_frames_up: int|None` allowing the specific `@api_frame` to
    determine "how many frames up" the application / calling code is.
  And, a better set of derived attrs:
  |_`caller_frame: FrameType` which finds and caches the API-eps calling
    frame.
  |_`caller_frame: FrameType` which finds and caches the API-eps calling
- add a new attempt at "getting a method ref from its runtime frame"
  with `get_ns_and_func_from_frame()` using a heuristic that the
  `CodeType.co_qualname: str` should have a "." in it for methods.
  - main issue is still that the func-ref lookup will require searching
    for the method's instance type by name, and that name isn't
    guaranteed to be defined in any particular ns..
   |_rn we try to read it from the `FrameType.f_locals` but that is
     going to obvi fail any time the method is called in a module where
     it's type is not also defined/imported.
  - returns both the ns and the func ref FYI.
f85314ecab Adjust `._runtime` to report `DebugStatus.req_ctx`
- inside the `Actor.cancel()`'s maybe-wait-on-debugger delay,
  report the full debug request status and it's affiliated lock request
  IPC ctx.
- use the new `.req_ctx.chan.uid` to do the local nursery lookup during
  channel teardown handling.
- another couple log fmt tweaks.
d6ca4771ce Use `.recv_msg_w_pld()` for final `Portal.result()`
Woops, due to a `None` test against the `._final_result`, any actual
final `None` result would be received but not acked as such causing
a spawning test to hang. Fix it by instead receiving and assigning both
a `._final_result_msg: PayloadMsg` and `._final_result_pld`.

NB: as mentioned in many recent comments surrounding this API layer,
really this whole `Portal`-has-final-result interface/semantics should
be entirely removed as should the `ActorNursery.run_in_actor()` API(s).
Instead it should all be replaced by a wrapping "high level" API
(`tractor.hilevel` ?) which combines a task nursery, `Portal.open_context()`
and underlying `Context` APIs + an `outcome.Outcome` to accomplish the
same "run a single task in a spawned actor and return it's result"; aka
a "one-shot-task-actor".
fc075e96c6 Hide some API frames, port to new `._debug` apis
- start tossing in `__tracebackhide__`s to various eps which don't need
  to show in tbs or in the pdb REPL.
- port final `._maybe_enter_pm()` to pass a `api_frame`.
- start comment-marking up some API eps with `@api_frame`
  in prep for actually using the new frame-stack tracing.
5cb0cc0f0b Update tests for `PldRx` and `Context` changes
Mostly adjustments for the new pld-receiver semantics/shim-layer which
results more often in the direct delivery of `RemoteActorError`s from
IPC API primitives (like `Portal.result()`) instead of being embedded in
an `ExceptionGroup` bundled from an embedded nursery.

Tossed usage of the `debug_mode: bool` fixture to a couple problematic
tests while i was working on them.

Also includes detailed assertion updates to the inter-peer cancellation
suite in terms of,
- `Context.canceller` state correctly matching the true src actor when
  expecting a ctxc.
- any rxed `ContextCancelled` should instance match the `Context._local/remote_error`
  as should the `.msgdata` and `._ipc_msg`.
d2dee87b36 Modernize streaming example script
- add typing,
- apply multi-line call style,
- use 'cancel' log level,
- enable debug mode.
31de5f6648 Always release debug request from `._post_mortem()`
Since obviously the thread is likely expected to halt and raise after
the REPL session exits; this was a regression from the prior impl. The
main reason for this is that otherwise the request task will never
unblock if the user steps through the crashed task using 'next' since
the `.do_next()` handler doesn't by default release the request since in
the `.pause()` case this would end the session too early.

Other,
- toss in draft `Pdb.user_exception()`, though doesn't seem to ever
  trigger?
- only release `Lock._debug_lock` when already locked.
b23780c102 Make `request_root_stdio_lock()` post-mortem-able
Finally got this working so that if/when an internal bug is introduced
to this request task-func, we can actually REPL-debug the lock request
task itself B)

As in, if the subactor's lock request task internally errors we,
- ensure the task always terminates (by calling `DebugStatus.release()`)
  and explicitly reports (via a `log.exception()`) the internal error.
- capture the error instance and set as a new `DebugStatus.req_err` and
  always check for it on final teardown - in which case we also,
 - ensure it's reraised from a new `DebugRequestError`.
 - unhide the stack frames for `_pause()`, `_enter_repl_sync()` so that
   the dev can upward inspect the `_pause()` call stack sanely.

Supporting internal impl changes,
- add `DebugStatus.cancel()` and `.req_err`.
- don't ever cancel the request task from
  `PdbREPL.set_[continue/quit]()` only when there's some internal error
  that would likely result in a hang and stale lock state with the root.
- only release the root's lock when the current ask is also the owner
  (avoids bad release errors).
- also show internal `._pause()`-related frames on any `repl_err`.

Other temp-dev-tweaks,
- make pld-dec change log msgs info level again while solving this
  final context-vars race stuff..
- drop the debug pld-dec instance match asserts for now since
  the problem is already caught (and now debug-able B) by an attr-error
  on the decoded-as-`dict` started msg, and instead add in
  a `log.exception()` trace to see which task is triggering the case
  where the debug `MsgDec` isn't set correctly vs. when we think it's
  being applied.
262a0e36c6 Allocate a `PldRx` per `Context`, new pld-spec API
Since the state mgmt becomes quite messy with multiple sub-tasks inside
an IPC ctx, AND bc generally speaking the payload-type-spec should map
1-to-1 with the `Context`, it doesn't make a lot of sense to be using
`ContextVar`s to modify the `Context.pld_rx: PldRx` instance.

Instead, always allocate a full instance inside `mk_context()` with the
default `.pld_rx: PldRx` set to use the `msg._ops._def_any_pldec: MsgDec`

In support, simplify the `.msg._ops` impl and APIs:
- drop `_ctxvar_PldRx`, `_def_pld_rx` and `current_pldrx()`.
- rename `PldRx._pldec` -> `._pld_dec`.
- rename the unused `PldRx.apply_to_ipc()` -> `.wraps_ipc()`.
- add a required `PldRx._ctx: Context` attr since it is needed
  internally in some meths and each pld-rx now maps to a specific ctx.
- modify all recv methods to accept a `ipc: Context|MsgStream` (instead
  of a `ctx` arg) since both have a ref to the same `._rx_chan` and there
  are only a couple spots (in `.dec_msg()`) where we need the `ctx`
  explicitly (which can now be easily accessed via a new `MsgStream.ctx`
  property, see below).
- always show the `.dec_msg()` frame in tbs if there's a reference error
  when calling `_raise_from_unexpected_msg()` in the fallthrough case.
- implement `limit_plds()` as light wrapper around getting the
  `current_ipc_ctx()` and mutating its `MsgDec` via
  `Context.pld_rx.limit_plds()`.
- add a `maybe_limit_plds()` which just provides an `@acm` equivalent of
  `limit_plds()` handy for composing in a `async with ():` style block
  (avoiding additional indent levels in the body of async funcs).

Obvi extend the `Context` and `MsgStream` interfaces as needed
to match the above:
- add a `Context.pld_rx` pub prop.
- new private refs to `Context._started_msg: Started` and
  a `._started_pld` (mostly for internal debugging / testing / logging)
  and set inside `.open_context()` immediately after the syncing phase.
- a `Context.has_outcome() -> bool:` predicate which can be used to more
  easily determine if the ctx errored or has a final result.
- pub props for `MsgStream.ctx: Context` and `.chan: Channel` providing
  full `ipc`-arg compat with the `PldRx` method signatures.
30afcd2b6b Adjust `Portal` usage of `Context.pld_rx`
Pass the new `ipc` arg and try to show api frames when an unexpected
internal error is detected.
e78fdf2f69 Make `log.devx()` level below `.pdb()`
Kinda like a "runtime"-y level for `.pdb()` (which is more or less like
an `.info()` for our debugger subsys) which can be used to report
internals info for those hacking on `.devx` tools.

Also, inject only the *last* 6 digits of the `id(Task)` in
`pformat_task_uid()` output by default.
4ef77bb64f Set `_ctxvar_Context` for child-side RPC tasks
Just inside `._invoke()` after the `ctx: Context` is retrieved.

Also try our best to *not hide* internal frames when a non-user-code
crash happens, normally either due to a runtime RPC EP bug or
a transport failure.
fde62c72be Show runtime nursery frames on internal errors
Much like other recent changes attempt to detect runtime-bug-causing
crashes and only show the runtime-endpoint frame when present.

Adds a `ActorNursery._scope_error: BaseException|None` attr to aid with
detection. Also toss in some todo notes for removing and replacing the
`.run_in_actor()` method API.
b22f7dcae0 Resolve remaining debug-request race causing hangs
More or less by pedantically separating and managing root and subactor
request syncing events to always be managed by the locking IPC context
task-funcs:
- for the root's "child"-side, `lock_tty_for_child()` directly creates
  and sets a new `Lock.req_handler_finished` inside a `finally:`
- for the sub's "parent"-side, `request_root_stdio_lock()` does the same
  with a new `DebugStatus.req_finished` event and separates it from
  the `.repl_release` event (which indicates a "c" or "q" from user and
  thus exit of the REPL session) as well as sets a new `.req_task:
  trio.Task` to explicitly distinguish from the app-user-task that
  enters the REPL vs. the paired bg task used to request the global
  root's stdio mutex alongside it.
- apply the `__pld_spec__` on "child"-side of the ctx using the new
  `Portal.open_context(pld_spec)` parameter support; drops use of any
  `ContextVar` malarky used prior for `PldRx` mgmt.
- removing `Lock.no_remote_has_tty` since it was a nebulous name and
  from the prior "everything is in a `Lock`" design..

------ - ------

More rigorous impl to handle various edge cases in `._pause()`:
- rejig `_enter_repl_sync()` to wrap the `debug_func == None` case
  inside maybe-internal-error handler blocks.
- better logic for recurrent vs. multi-task contention for REPL entry in
  subactors, by guarding using `DebugStatus.req_task` and by now waiting
  on the new `DebugStatus.req_finished` for the multi-task contention
  case.
- even better internal error handling and reporting for when this code
  is hacked on and possibly broken ;p

------ - ------

Updates to `.pause_from_sync()` support:
- add optional `actor`, `task` kwargs to `_set_trace()` to allow
  compat with the new explicit `debug_func` calling in `._pause()` and
  pass a `threading.Thread` for `task` in the `.to_thread()` usage case.
- add an `except` block that tries to show the frame on any internal
  error.

------ - ------

Relatedly includes a buncha cleanups/simplifications somewhat in
prep for some coming refinements (around `DebugStatus`):
- use all the new attrs mentioned above as needed in the SIGINT shielder.
- wait on `Lock.req_handler_finished` in `maybe_wait_for_debugger()`.
- dropping a ton of masked legacy code left in during the recent reworks.
- better comments, like on the use of `Context._scope` for shielding on
  the "child"-side to avoid the need to manage yet another cs.
- add/change-to lotsa `log.devx()` level emissions for those infos which
  are handy while hacking on the debugger but not ideal/necessary to be
  user visible.
- obvi add lotsa follow up todo notes!
3538ccd799 Better context aware `RemoteActorError.pformat()`
Such that when displaying with `.__str__()` we do not show the type
header (style) since normally python's raising machinery already prints
the type path like `'tractor._exceptions.RemoteActorError:'`, so doing
it 2x is a bit ugly ;p

In support,
- include `.relay_uid` in `RemoteActorError.extra_body_fields`.
- offer a `with_type_header: bool` to `.pformat()` and only put the
  opening type path and closing `')>'` tail line when `True`.
- add `.is_inception() -> bool:` for an easy way to determine if the
  error is multi-hop relayed.
- only repr the `'|_relay_uid=<uid>'` field when an error is an inception.
- tweak the invalid-payload case in `_mk_msg_type_err()` to explicitly
  state in the `message` how the `any_pld` value does not match the `MsgDec.pld_spec`
  by decoding the invalid `.pld` with an any-dec.
- allow `_mk_msg_type_err(**mte_kwargs)` passthrough.
- pass `boxed_type=cls` inside `MsgTypeError.from_decode()`.
d15e73557a Move runtime frame hiding into helper func
Call it `hide_runtime_frames()` and stick all the lines from the top of
the `._debug` mod in there along with a little `log.devx()` emission on
what gets hidden by default ;)

Other,
- fix ref-error where internal-error handler might trigger despite the
  debug `req_ctx` not yet having init-ed, such that we don't try to
  cancel or log about it when it never was fully created/initialize..
- fix assignment typo iniside `_set_trace()` for `task`.. lel
702dfe47d5 Update debugger tests to expect new pformatting
Mostly the result of the `RemoteActorError.pformat()` and our
new `_pause/crash_msg: str`s which include the `trio.Task.__repr__()`
in the `log.pdb()` message.

Obvi use the `in_prompt_msg()` to accomplish where not used prior.

ToDo later:
-[ ] still some outstanding questions on how detailed inceptions
   should look, eg. in `test_multi_nested_subactors_error_through_nurseries()`
  |_maybe we should be more pedantic at checking `.src_uid` vs.
    `.relay_uid` fields?
-[ ] staged a placeholder test for verifying correct call-stack frame on
   crash handler REPL entry.
-[ ] also need a test to verify that you can't pause from an already paused actor task
   such as can happen if you try to step through runtime code that has
   a recurrent entry to `._debug.pause()`.
c6f599b1be Call `.devx._debug.hide_runtime_frames()` by default
From both `open_root_actor()` and `._entry._trio_main()`.

Other `breakpoint()`-from-sync-func fixes:
- properly disable the default hook using `"0"` XD
- offer a `hide_tb: bool` from `open_root_actor()`.
- disable hiding the `._trio_main()` frame, bc pretty sure it doesn't
  help anyone (either way) when REPL-ing/tb-ing from a subactor..?
9ce958cb4a Add debug check-n-wait inside `._spawn.soft_kill()`
And IFF the `await wait_func(proc)` is cancelled such that we avoid
clobbering some subactor that might be REPL-ing even though its parent
actor is in the midst of (gracefully) cancelling it.
e4ec6b7b0c Even smarter `RemoteActorError.pformat()`-ing
Related to the prior patch, re the new `with_type_header: bool`:
- in the `with_type_header == True` use case make sure we keep the first
  `._message: str` line non-indented since it'll show just after the
  header-line's type path with ':'.
- when `False` drop the `)>` `repr()`-instance style as well so that we
  just get the ascii boxed traceback as though it's the error
  message-`str` not the `repr()` of the error obj.

Other,
- hide `pack_from_raise()` call frame since it'll show in debug mode
  crash handling..
- mk `MsgTypeError.from_decode()` explicitly accept and proxy an
  optional `ipc_msg` and change `msgdict` to also be optional, only
  reading out the `**extra_msgdata` when provided.
- expose a `_mk_msg_type_err(src_err_msg: Error|None = None,)` for
  callers who which to inject a `._ipc_msg: Msgtype` to the MTE.
  |_ add a note how we can't use it due to a causality-dilemma when pld
     validating `Started` on the send side..
c2cc12e14f Add basic payload-spec test suite
Starts with some very basic cases:
- verify both subactor-as-child-ctx-task send side validation (failures)
  as well as relay and raise on root-parent-side-task.
- wrap failure expectation cases that bubble out of `@acm`s with
  a `maybe_expect_raises()` equiv wrapper with an embedded timeout.
- add `Return` cases including invalid by `str` and valid by a `None`.

Still ToDo:
- commit impl changes to make the bulk of this suite pass.
- adjust how `MsgTypeError`s format the local (`.started()`) send side
  `.tb_str` such that we don't do a "boxed" error prior to
  `pack_error()` being called normally prior to `Error` transit.
42ba855d1b More correct/explicit `.started()` send-side validation
In the sense that we handle it as a special case that exposed
through to `RxPld.dec_msg()` with a new `is_started_send_side: bool`.

(Non-ideal) `Context.started()` impl deats:
- only do send-side pld-spec validation when a new `validate_pld_spec`
  is set (by default it's not).
- call `self.pld_rx.dec_msg(is_started_send_side=True)` to validate the
  payload field from the just codec-ed `Started` msg's `msg_bytes` by
  passing the `roundtripped` msg (with it's `.pld: Raw`) directly.
- add a `hide_tb: bool` param and proxy it to the `.dec_msg()` call.

(Non-ideal) `PldRx.dec_msg()` impl deats:
- for now we're packing the MTE inside an `Error` via a manual call to
  `pack_error()` and then setting that as the `msg` passed to
  `_raise_from_unexpected_msg()` (though really we should just raise
  inline?).
- manually set the `MsgTypeError._ipc_msg` to the above..

Other,
- more comprehensive `Context` type doc string.
- various `hide_tb: bool` kwarg additions through `._ops.PldRx` meths.
- proto a `.msg._ops.validate_payload_msg()` helper planned to get the
  logic from this version of `.started()`'s send-side validation so as
  to be useful more generally elsewhere.. (like for raising back
  `Return` values on the child side?).

Warning: this commit may have been made out of order from required
changes to `._exceptions` which will come in a follow up!
eee4c61b51 Add `MsgTypeError` "bad msg" capture
Such that if caught by user code and/or the runtime we can introspect
the original msg which caused the type error. Previously this was kinda
half-baked with a `.msg_dict` which was delivered from an `Any`-decode
of the shuttle msg in `_mk_msg_type_err()` but now this more explicitly
refines the API and supports both `PayloadMsg`-instance or the msg-dict
style injection:
- allow passing either of `bad_msg: PayloadMsg|None` or
  `bad_msg_as_dict: dict|None` to `MsgTypeError.from_decode()`.
- expose public props for both ^ whilst dropping prior `.msgdict`.
- rework `.from_decode()` to explicitly accept `**extra_msgdata: dict`
  |_ only overriding it from any `bad_msg_as_dict` if the keys are found in
    `_ipcmsg_keys`, **except** for `_bad_msg` when `bad_msg` is passed.
  |_ drop `.ipc_msg` passthrough.
  |_ drop `msgdict` input.
- adjust `.cid` to only pull from the `.bad_msg` if set.

Related fixes/adjustments:
- `pack_from_raise()` should pull `boxed_type_str` from
  `boxed_type.__name__`, not the `type()` of it.. also add a
  `hide_tb: bool` flag.
- don't include `_msg_dict` and `_bad_msg` in the `_body_fields` set.
- allow more granular boxed traceback-str controls:
  |_ allow passing a `tb_str: str` explicitly in which case we use it
    verbatim and presume caller knows what they're doing.
  |_ when not provided, use the more explicit
    `traceback.format_exception(exc)` since the error instance is
    a required input (we still fail back to the old `.format_exc()` call
    if for some reason the caller passes `None`; but that should be
    a bug right?).
  |_ if a `tb: TracebackType` and a `tb_str` is passed, concat them.
- in `RemoteActorError.pformat()` don't indent the `._message` part used
  for the `body` when `with_type_header == False`.
- update `_mk_msg_type_err()` to use `bad_msg`/`bad_msg_as_dict`
  appropriately and drop passing `ipc_msg`.
27fd96729a Tweaks to debugger examples
Light stuff like comments, typing, and a couple API usage updates.
582144830f Parameterize the `return_msg_type` in `._invoke()`
Since we also handle a runtime-specific `CancelAck`, allow the
caller-scheduler to pass in the expected return-type msg per the RPC msg
endpoint loop.
7ac730e326 Drop `msg.types.Msg` for new replacement types
The `TypeAlias` for the msg type-group is now `MsgType` and any user
touching shuttle messages can now be typed as `PayloadMsg`.

Relatedly, add MTE specific `Error._bad_msg[_as_dict]` fields which are
handy for introspection of remote decode failures.
f7fd8278af Fix `test_basic_payload_spec` bad msg matching
Expecting `Started` or `Return` with respective bad `.pld` values
depending on what type of failure is test parametrized.

This makes the suite run green it seems B)
6c2efc96dc Factor `.started()` validation into `.msg._ops`
Filling out the helper `validate_payload_msg()` staged in a prior commit
and adjusting all imports to match.

Also add a `raise_mte: bool` flag for potential usage where the caller
wants to handle the MTE instance themselves.
59ca256183 Set remote errors in `_raise_from_unexpected_msg()`
By calling `Context._maybe_cancel_and_set_remote_error(exc)` on any
unpacked `Error` msg; provides for `Context.maybe_error` consistency to
match all other error delivery cases.
a1b124b62b Raise send-side MTEs inline in `PldRx.dec_msg()`
So when `is_started_send_side is True` we raise the newly created
`MsgTypeError` (MTE) directly instead of doing all the `Error`-msg pack
and unpack to raise stuff via `_raise_from_unexpected_msg()` since the
raise should happen send side anyway and so doesn't emulate any remote
fault like in a bad `Return` or `Started` without send-side pld-spec
validation.

Oh, and proxy-through the `hide_tb: bool` input from `.drain_to_final_msg()`
to `.recv_msg_w_pld()`.
2db03444f7 Don't (noisly) log about runtime cancel RPC tasks
Since in the case of the `Actor._cancel_task()` related runtime eps we
actually don't EVER register them in `Actor._rpc_tasks`.. logging about
them is just needless noise, though maybe we should track them in a diff
table; something like a `._runtime_rpc_tasks`?

Drop the cancel-request-for-stale-RPC-task (`KeyError` case in
`Actor._cancel_task()`) log-emit level in to `.runtime()`; it's
generally not useful info other then for granular race condition eval
when hacking the runtime.
6a4ee461f5 Raise remote errors rxed during `Context` child-sync
More specifically, if `.open_context()` is cancelled when awaiting the
first `Context.started()` during the child task sync phase, check to see
if it was due to `._scope.cancel_called` and raise any remote error via
`.maybe_raise()` instead the `trio.Cancelled` like in every other
remote-error handling case. Ensure we set `._scope[_nursery]` only after
the `Started` has arrived and audited.
4fa71cc01c Ensure ctx error-state matches the MTE scenario
Namely checking that `Context._remote_error` is set to the raised MTE
in the invalid started and return value cases since prior to the recent
underlying changes to the `Context.result()` impl, it would not match.

Further,
- do asserts for non-MTE raising cases in both the parent and child.
- add todos for testing ctx-outcomes for per-side-validation policies
  i anticipate supporting and implied msg-dialog race cases therein.
1db5d4def2 Add `Error.message: str`
Allows passing a custom error msg other then the traceback-str over the
wire. Make `.tb_str` optional (in the blank `''` sense) since it's
treated that way thus far in `._exceptions.pack_error()`.
0e8c60ee4a Better RAE `.pformat()`-ing for send-side MTEs
Send-side `MsgTypeError`s actually shouldn't have any "boxed" traceback
per say since they're raised in the transmitting actor's local task env
and we (normally) don't want the ascii decoration added around the
error's `._message: str`, that is not until the exc is `pack_error()`-ed
before transit. As such, the presentation of an embedded traceback (and
its ascii box) gets bypassed when only a `._message: str` is set (as we
now do for pld-spec failures in `_mk_msg_type_err()`).

Further this tweaks the `.pformat()` output to include the `._message`
part to look like `<RemoteActorError( <._message> ) ..` instead of
jamming it implicitly to the end of the embedded `.tb_str` (as was done
implicitly by `unpack_error()`) and also adds better handling for the
`with_type_header == False` case including forcing that case when we
detect that the currently handled exc is the RAE in `.pformat()`.
Toss in a lengthier doc-str explaining it all.

Surrounding/supporting changes,
- better `unpack_error()` message which just briefly reports the remote
  task's error type.
- add public `.message: str` prop.
- always set a `._extra_msgdata: dict` since some MTE props rely on it.
- handle `.boxed_type == None` for `.boxed_type_str`.
- maybe pack any detected input or `exc.message` in `pack_error()`.
- comment cruft cleanup in `_mk_msg_type_err()`.
bbb4d4e52c Add `from_src_exc: BaseException` to maybe raisers
That is as a control to `Context._maybe_raise_remote_err()` such that
if set to anything other then the default (`False` value), we do
`raise remote_error from from_src_exc` such that caller can choose to
suppress or override the `.__cause__` tb.

Also tidy up and old masked TODO regarding calling `.maybe_raise()`
after the caller exits from the `yield` in `.open_context()`..
993281882b Pass `boxed_type` from `_mk_msg_type_err()`
Such that we're boxing the interchanged lib's specific error
`msgspec.ValidationError` in this case) type much like how
a `ContextCancelled[trio.Cancelled]` is composed; allows for seemless
multi-backend-codec support later as well B)

Pass `ctx.maybe_raise(from_src_exc=src_err)` where needed in a couple
spots; as `None` in the send-side `Started` MTE case to avoid showing
the `._scope1.cancel_called` result in the traceback from the
`.open_context()` child-sync phase.
2f854a3e86 Add a `tractor.post_mortem()` API test + example
Since turns out we didn't have a single example using that API Bo

The test granular-ly checks all use cases:
- `.post_mortem()` manual calls in both subactor and root.
- ensuring built-in RPC crash handling activates after each manual one
  from ^.
- drafted some call-stack frame checking that i commented out for now
  since we need to first do ANSI escape code removal due to the
  colorization that `pdbp` does by default.
  |_ added a TODO with SO link on `assert_before()`.

Also todo-staged a shielded-pause test to match with the already
existing-but-needs-refinement example B)
13ea500a44 Rename `PldRx.dec_msg()` -> `.decode_pld()`
Keep the old alias, but i think it's better form to use longer names for
internal public APIs and this name better reflects the functionality:
decoding and returning a `PayloadMsg.pld` field.
8ea0f08386 Finally, officially support shielded REPL-ing!
It's been a long time prepped and now finally implemented!

Offer a `shield: bool` argument from our async `._debug` APIs:
- `await tractor.pause(shield=True)`,
- `await tractor.post_mortem(shield=True)`

^-These-^ can now be used inside cancelled `trio.CancelScope`s,
something very handy when introspecting complex (distributed) system
tear/shut-downs particularly under remote error or (inter-peer)
cancellation conditions B)

Thanks to previous prepping in a prior attempt and various patches from
the rigorous rework of `.devx._debug` internals around typed msg specs,
there ain't much that was needed!

Impl deats
- obvi passthrough `shield` from the public API endpoints (was already
  done from a prior attempt).
- put ad-hoc internal `with trio.CancelScope(shield=shield):` around all
  checkpoints inside `._pause()` for both the root-process and subactor
  case branches.

Add a fairly rigorous example, `examples/debugging/shielded_pause.py`
with a wrapping `pexpect` test, `test_debugger.test_shield_pause()` and
ensure it covers as many cases as i can think of offhand:

- multiple `.pause()` entries in a loop despite parent scope
  cancellation in a subactor RPC task which itself spawns a sub-task.
- a `trio.Nursery.parent_task` which raises, is handled and
  tries to enter and unshielded `.post_mortem()`, which of course
  internally raises `Cancelled` in a `._pause()` checkpoint, so we catch
  the `Cancelled` again and then debug the debugger's internal
  cancellation with specific checks for the particular raising
  checkpoint-LOC.
- do ^- the latter -^ for both subactor and root cases to ensure we
  can debug `._pause()` itself when it tries to REPL engage from
  a cancelled task scope Bo
4a270f85ca Drop sub-decoder proto-cruft from `.msg._codec`
It ended up getting necessarily implemented as the `PldRx` though at
a different layer and won't be needed as part of `MsgCodec` most likely,
though this original idea did provide the source of inspiration for how
things work now!

Also Move the commented TODO proto for a codec hook factory from
`.types` to `._codec` where it prolly better fits and update some msg
related todo/questions.
21f633a900 Use `Context` repr APIs for RPC outcome logs
Delegate to the new `.repr_state: str` and adjust log level based on
error vs. cancel vs. result.
f0342d6ae3 Move `Context.open_stream()` impl to `._streaming`
Exactly like how it's organized for `Portal.open_context()`, put the
main streaming API `@acm` with the `MsgStream` code and bind the method
to the new module func.

Other,
- rename `Context.result()` -> `.wait_for_result()` to better match the
  blocking semantics and rebind `.result()` as deprecated.
- add doc-str for `Context.maybe_raise()`.
408a74784e Catch `.pause_from_sync()` in root bg thread bugs!
Originally discovered as while using `tractor.pause_from_sync()`
from the `i3ipc` client running in a bg-thread that uses `asyncio`
inside `modden`.

Turns out we definitely aren't correctly handling `.pause_from_sync()`
from the root actor when called from a `trio.to_thread.run_sync()`
bg thread:
- root-actor bg threads which can't `Lock._debug_lock.acquire()` since
  they aren't in `trio.Task`s.
- even if scheduled via `.to_thread.run_sync(_debug._pause)` the
  acquirer won't be the task/thread which calls `Lock.release()` from
  `PdbREPL` hooks; this results in a RTE raised by `trio`..
- multiple threads will step on each other's stdio since cpython's GIL
  seems to ctx switch threads on every input from the user to the REPL
  loop..

Reproduce via reworking our example and test so that they catch and fail
for all edge cases:
- rework the `/examples/debugging/sync_bp.py` example to demonstrate the
  above issues, namely the stdio clobbering in the REPL when multiple
  threads and/or a subactor try to debug simultaneously.
  |_ run one thread using a task nursery to ensure it runs conc with the
     nursery's parent task.
  |_ ensure the bg threads run conc a subactor usage of
     `.pause_from_sync()`.
  |_ gravely detail all the special cases inside a TODO comment.
  |_ add some control flags to `sync_pause()` helper and don't use
     `breakpoint()` by default.
- extend and adjust `test_debugger.test_pause_from_sync` to match (and
  thus currently fail) by ensuring exclusive `PdbREPL` attachment when
  the 2 bg root-actor threads are concurrently interacting alongside the
  subactor:
  |_ should only see one of the `_pause_msg` logs at a time for either
     one of the threads or the subactor.
  |_ ensure each attaches (in no particular order) before expecting the
     script to exit.

Impl adjustments to `.devx._debug`:
- drop `Lock.repl`, no longer used.
- add `Lock._owned_by_root: bool` for the `.ctx_in_debug == None`
  root-actor-task active case.
- always `log.exception()` for any `._debug_lock.release()` ownership
  RTE emitted by `trio`, like we used to..
- add special `Lock.release()` log message for the stale lock but
  `._owned_by_root == True` case; oh yeah and actually
  `log.devx(message)`..
- rename `Lock.acquire()` -> `.acquire_for_ctx()` since it's only ever
  used from subactor IPC usage; well that and for local root-task
  usage we should prolly add a `.acquire_from_root_task()`?
- buncha `._pause()` impl improvements:
 |_ type `._pause()`'s `debug_func` as a `partial` as well.
 |_ offer `called_from_sync: bool` and `called_from_bg_thread: bool`
    for the special case handling when called from `.pause_from_sync()`
 |_ only set `DebugStatus.repl/repl_task` when `debug_func != None`
   (OW ensure the `.repl_task` is not the current one).
 |_ handle error logging even when `debug_func is None`..
 |_ lotsa detailed commentary around root-actor-bg-thread special cases.
- when `._set_trace(hide_tb=False)` do `pdbp.set_trace(frame=currentframe())`
  so the `._debug` internal frames are always included.
- by default always hide tracebacks for `.pause[_from_sync]()` internals.
- improve `.pause_from_sync()` to avoid root-bg-thread crashes:
 |_ pass new `called_from_xxx_` flags and ensure `DebugStatus.repl_task`
    is actually set to the `threading.current_thread()` when needed.
 |_ manually call `Lock._debug_lock.acquire_nowait()` for the non-bg
    thread case.
 |_ TODO: still need to implement the bg-thread case using a bg
    `trio.Task`-in-thread with an `trio.Event` set by thread REPL exit.
6534a363a5 First proto: multi-threaded synced `pdb`-REPLs
Functionally working for multi-threaded (via cpython threads spawned
from `to_trio.to_thread.run_sync()`) alongside subactors, tested (for
now) only with threads started inside the root actor (which seemed to
have the most issues in terms of the impl and special cases..) using the
new `tractor.pause_from_sync()` API!

Main implementation changes to `.pause_from_sync()`
------ - ------
- from the root actor, we need to ensure bg thread case is handled
  *specially* since no IPC is used to request the TTY stdio mutex and
  `Lock` (API) usage is conducted entirely from a local task or thread;
  dedicated `Lock` usage for the root-actor already is branched inside
  `._pause()` and needs similar handling from a root bg-thread:
 |_for the special case of a root bg thread we need to
   `trio`-main-thread schedule a bg task inside a new
   `_pause_from_bg_root_thread()`. The new task needs to implement most
   of what was is handled inside `._pause()` manually, mostly because in
   this root-actor-bg-thread case we have 2 constraints:
   1. to enter `PdbREPL.interaction()` **from the bg thread** directly,
   2. the task that `Lock._debug_lock.acquire()`s has to be the same
      that calls `.release() (a `trio.FIFOLock` constraint)
 |_impl deats of this `_pause_from_bg_root_thread()` include:
   - (for now) calling `._pause()` to acquire the `Lock._debug_lock`.
   - setting its own `DebugStatus.repl_release`.
   - calling `.DebugStatus.shield_sigint()` to ensure the root's
     main thread  uses the right handler when the bg one is REPL-ing.
   - wait manually on the `.repl_release()` to be set by the thread's
     dedicated `PdbREPL` exit.
   - manually calling `Lock.release()` from the **same task** that
     acquired it.
- expect calls to `._pause()` to deliver a `tuple[Task, PdbREPL]` such
  that we always get the handle both to any newly created REPl instance
  and the (maybe) the scheduled bg task within which is runs.
- add a single `message: str` style to `log.devx()` based on branching
  style for logging.
- ensure both `DebugStatus.repl` and `.repl_task` are set **just
  before** calling `._set_trace()` to ensure the correct `Task|Thread`
  is set when the REPL is finally entered from sync code.
- add a wrapping caller `_sync_pause_from_builtin()` which passes in the
  new `called_from_builtin=True` to indicate `breakpoint()` caller
  usage, obvi pass in `api_frame`.

Changes to `._pause()` in support of ^
------ - ------
- `TaskStatus.started()` and return the `tuple[Task, PdbREPL]` to
  callers / starters.
- only call `DebugStatus.shield_sigint()` when no `repl` passed bc some
  callers (like bg threads) may need to apply it at some specific point
  themselves.
- tweak some asserts for the `debug_func == None` / non-`trio`-thread
  case.
- add a mod-level `_repl_fail_msg: str` to be used when there's an
  internal `._pause()` failure for testing, easier to pexpect match.
- more comprehensive logging for the root-actor branched case to
  (attempt to) indicate any of the 3 cases:
  - remote ctx from subactor has the `Lock`,
  - already existing root task or thread has it or,
  - some kinda stale `.locked()` situation where the root has the lock
    but we don't know why.
- for root usage, revert to always `await Lock._debug_lock.acquire()`-ing
  despite `called_from_sync` since `.pause_from_sync()` was reworked to
  instead handle the special bg thread case in the new
  `_pause_from_bg_root_thread()` task.
- always do `return _enter_repl_sync(debug_func)`.
- try to report any `repl_task: Task|Thread` set by the caller
  (particularly for the bg thread cases) as being the thread or task
  `._pause()` was called "on behalf of"

Changes to `DebugStatus`/`Lock` in support of ^
------ - ------
- only call `Lock.release()` from `DebugStatus.set_[quit/continue]()`
  when called from the main `trio` thread and always call
  `DebugStatus.release()` **after** to ensure `.repl_released()` is set
  **after** `._debug_lock.release()`.
- only call `.repl_release.set()` from `trio` thread otherwise use
  `.from_thread.run()`.
- much more refinements in `Lock.release()` for threading cases:
  - return `bool` to indicate whether lock was released by caller.
  - mask (in prep to drop) `_pause()` usage of
    `Lock.release.force=True)` since forcing a release can't ever avoid
    the RTE from `trio`.. same task **must** acquire/release.
  - don't allow usage from non-`trio`-main-threads, ever; there's no
    point since the same-task-needs-to-manage-`FIFOLock` constraint.
  - much more detailed logging using `message`-building-style for all
    caller (edge) cases.
   |_ use a `we_released: bool` to determine failed-to-release edge
      cases which can happen if called from bg threads, ensure we
      `log.exception()` on any incorrect usage resulting in  release
      failure.
   |_ complain loudly if the release fails and some other task/thread
      still holds the lock.
   |_ be explicit about "who" (which task or thread) the release is "on
      behalf of" by reading `DebugStatus.repl_task` since the caller
      isn't the REPL operator in many sync cases.
  - more or less drop `force` support, as mentioned above.
  - ensure we unset `._owned_by_root` if the caller is a root task.

Other misc
------ - ------
- rename `lock_tty_for_child()` -> `lock_stdio_for_peer()`.
- rejig `Lock.repr()` to show lock and event stats.
- stage `Lock.stats` and `.owner` methods in prep for doing a singleton
  instance and `@property`s.
43a8cf4be1 Make big TODO: for `devx._debug` refinements
Hopefully would make grok-ing this fairly sophisticated sub-sys possible
for any up-and-coming `tractor` hacker

XD

A lot of internal API and re-org ideas I discovered/realized as part of
finishing the `__pld_spec__` and multi-threaded support. Particularly
better isolation between root-actor vs subactor task APIs and generally
less globally-state-ful stuff like `DebugStatus` and `Lock` method APIs
would likely make a lot of the hard to follow edge cases more clear?
d528e7ab4d Add `@context(pld_spec=<TypeAlias>)` TODO list
Longer run we don't want `tractor` app devs having to call
`msg._ops.limit_plds()` from every child endpoint.. so this starts
a list of decorator API ideas and obviously ties in with an ideal final
API design that will come with py3.13 and typed funcs. Obviously this
is directly fueled by,

- https://github.com/goodboy/tractor/issues/365

Other,
- type with direct `trio.lowlevel.Task` import.
- use `log.exception()` to show tbs for all error-terminations in
  `.open_context()` (for now) and always explicitly mention the `.side`.
e6d4ec43b9 Log tbs from non-RAE `._invoke()`-RPC-task errors
`RemoteActorError`s show this by default in their `.__repr__()`, and we
obvi capture and embed the src traceback in an `Error` msg prior to
transit, but for logging it's also handy to see the tb of any set
`Context._remote_error` on console especially when trying to decipher
remote error details at their origin actor. Also improve the log message
description using `ctx.repr_state` and show any `ctx.outcome`.
5449bd5673 Offer a `@context(pld_spec=<TypeAlias>)` API
Instead of the WIP/prototyped `Portal.open_context()` offering
a `pld_spec` input arg, this changes to a proper decorator API for
specifying the "payload spec" on `@context` endpoints.

The impl change details actually cover 2-birds:
- monkey patch decorated functions with a new
  `._tractor_context_meta: dict[str, Any]` and insert any provided input
  `@context` kwargs: `_pld_spec`, `enc_hook`, `enc_hook`.
- use `inspect.get_annotations()` to scan for a `func` arg
  type-annotated with `tractor.Context` and use the name of that arg as
  the RPC task-side injected `Context`, thus injecting the needed arg
  by type instead of by name (a longstanding TODO); raise a type-error
  when not found.
- pull the `pld_spec` from the `._tractor_context_meta` attr both in the
  `.open_context()` parent-side and child-side `._invoke()`-cation of
  the RPC task and use the `msg._ops.maybe_limit_plds()` API to apply it
  internally in the runtime for each case.
a0ee0cc713 Port debug request ep to use `@context(pld_spec)`
Namely passing the `.__pld_spec__` directly to the
`lock_stdio_for_peer()` decorator B)

Also, allows dropping `apply_debug_pldec()` (which was a todo) and
removing a `lock_stdio_for_peer()` indent level.
affc210033 Update pld-rx limiting test(s) to use deco input
The tests only use one input spec (conveniently) so there's not much to
change in the logic,
- only pass the `maybe_msg_spec` to the child-side decorator and obvi
  drop the surrounding `msgops.limit_plds()` block in the child.
- tweak a few `MsgDec` asserts, mostly dropping the
  `msg._ops._def_any_spec` state checks since the child-side won't have
  any pre pld-spec state given the runtime now applies the `pld_spec`
  before running the task's func body.
  - also allowed dropping the `finally:` which did a similar check
    outside the `.limit_plds()` block.
8477919fc9 Don't pass `ipc_msg` for send side MTEs
Just pass `_bad_msg` such that it get's injected to `.msgdata` since
with a send-side `MsgTypeError` we don't have a remote `._ipc_msg:
Error` per say to include.
711f639fc5 Break `_mk_msg_type_err()` into recv/send side funcs
Name them `_mk_send_mte()`/`_mk_recv_mte()` and change the runtime to
call each appropriately depending on location/usage.

Also add some dynamic call-frame "unhide" blocks such that when we
expect raised MTE from the aboves calls but we get a different
unexpected error from the runtime, we ensure the call stack downward is
shown in tbs/pdb.
|_ ideally in the longer run we come up with a fancier dynamic sys for
   this, prolly something in `.devx._frame_stack`?
9292d73b40 Avoid actor-nursery-exit warns on registrees
Since a local-actor-nursery-parented subactor might also use the root as
its registry, we need to avoid warning when short lived IPC `Channel`
connections establish and then disconnect (quickly, bc the apparently
the subactor isn't re-using an already cached parente-peer<->child conn
as you'd expect efficiency..) since such cases currently considered
normal operation of our super shoddy/naive "discovery sys" XD

As such, (un)guard the whole local-actor-nursery OR channel-draining
waiting blocks with the additional `or Actor._cancel_called` branch
since really we should also be waiting on the parent nurse to exit (at
least, for sure and always) when the local `Actor` indeed has been
"globally" cancelled-called. Further add separate timeout warnings for
channel-draining vs. local-actor-nursery-exit waiting since they are
technically orthogonal cases (at least, afaik).

Also,
- adjust the `Actor._stream_handler()` connection status log-emit to
  `.runtime()`, especially to reduce noise around the aforementioned
  ephemeral registree connection-requests.
- if we do wait on a local actor-nurse to exit, report its `._children`
  table (which should help figure out going forward how useful the
  warning is, if at all).
This pull request can be merged automatically.
You are not authorized to merge this pull request.
You can also view command line instructions.

Step 1:

From your project repository, check out a new branch and test the changes.
git checkout -b runtime_to_msgspec msg_codecs
git pull origin runtime_to_msgspec

Step 2:

Merge the changes and update on Gitea.
git checkout msg_codecs
git merge --no-ff runtime_to_msgspec
git push origin msg_codecs
Sign in to join this conversation.
No reviewers
No Label
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: goodboy/tractor#7
There is no content yet.