forked from goodboy/tractor
1
0
Fork 0
Commit Graph

13 Commits (af013912acbb5aabadae26d168ae5c45599c9f71)

Author SHA1 Message Date
Tyler Goodlet cf48fdecfe Unify `MsgTypeError` as a `RemoteActorError` subtype
Since in the receive-side error case the source of the exception is the
sender side (normally causing a local `TypeError` at decode time), might
as well bundle the error in remote-capture-style using boxing semantics
around the causing local type error raised from the
`msgspec.msgpack.Decoder.decode()` and with a traceback packed from
`msgspec`-specific knowledge of any field-type spec matching failure.

Deats on new `MsgTypeError` interface:
- includes a `.msg_dict` to get access to any `Decoder.type`-applied
  load of the original (underlying and offending) IPC msg into
  a `dict` form using a vanilla decoder which is normally packed into
  the instance as a `._msg_dict`.
- a public getter to the "supposed offending msg" via `.payload_msg`
  which attempts to take the above `.msg_dict` and load it manually into
  the corresponding `.msg.types.MsgType` struct.
- a constructor `.from_decode()` to make it simple to build out error
  instances from a failed decode scope where the aforementioned
  `msgdict: dict` from the vanilla decode can be provided directly.
- ALSO, we now pack into `MsgTypeError` directly just like ctxc in
  `unpack_error()`

This also completes the while-standing todo for `RemoteActorError` to
contain a ref to the underlying `Error` msg as `._ipc_msg` with public
`@property` access that `defstruct()`-creates a pretty struct version
via `.ipc_msg`.

Internal tweaks for this include:
- `._ipc_msg` is the internal literal `Error`-msg instance if provided
  with `.ipc_msg` the dynamic wrapper as mentioned above.
- `.__init__()` now can still take variable `**extra_msgdata` (similar
  to the `dict`-msgdata as before) to maintain support for subtypes
  which are constructed manually (not only by `pack_error()`) and insert
  their own attrs which get placed in a `._extra_msgdata: dict` if no
  `ipc_msg: Error` is provided as input.
- the `.msgdata` is now a merge of any `._extra_msgdata` and
  a `dict`-casted form of any `._ipc_msg`.
- adjust all previous `.msgdata` field lookups to try equivalent field
  reads on `._ipc_msg: Error`.
- drop default single ws indent from `.tb_str` and do a failover lookup
  to `.msgdata` when `._ipc_msg is None` for the manually constructed
  subtype-instance case.
- add a new class attr `.extra_body_fields: list[str]` to allow subtypes
  to declare attrs they want shown in the `.__repr__()` output, eg.
  `ContextCancelled.canceller`, `StreamOverrun.sender` and
  `MsgTypeError.payload_msg`.
- ^-rework defaults pertaining to-^ with rename from
  `_msgdata_keys` -> `_ipcmsg_keys` with latter now just loading directly
  from the `Error` fields def and `_body_fields: list[str]` just taking
  that value and removing the not-so-useful-in-REPL or already shown
  (i.e. `.tb_str: str`) field names.
- add a new mod level `.pack_from_raise()` helper for auto-boxing RAE
  subtypes constructed manually into `Error`s which is normally how
  `StreamOverrun` and `MsgTypeError` get created in the runtime.
- in support of the above expose a `src_uid: tuple` override to
  `pack_error()` such that the runtime can provide any remote actor id
  when packing a locally-created yet remotely-caused RAE subtype.
- adjust all typing to expect `Error`s over `dict`-msgs.

Adjust some tests to match these changes:
- context and inter-peer-cancel tests to make their `.msgdata` related
  checks against the new `.ipc_msg` as well and `.tb_str` directly.
- toss in an extra sleep to `sleep_a_bit_then_cancel_peer()` to keep the
  'canceller' ctx child task cancelled by it's parent in the 'root' for
  the rte-raised-during-ctxc-handling case (apparently now it's
  returning too fast, cool?).
2024-04-09 10:07:10 -04:00
Tyler Goodlet d5e5174d97 Extend inter-peer cancel tests for "inceptions"
Use new `RemoteActorError` fields in various assertions particularly
ensuring that an RTE relayed through the spawner from the little_bro
shows up at the client with the right number of entries in the
`.relay_path` and that the error is raised in the client as desired in
the original use case from `modden`'s remote spawn spawn request API
(which was kinda the whole original motivation to finally get all this
multi-actor error relay stuff workin).

Case extensions:
- RTE relayed from little_bro through spawner to client when
  `raise_sub_spawn_error_after` is set; in this case test should raise
  the relayed and RAE boxed RTE right up to the `trio.run()`.
  -> ensure the `rae.src_uid`, `.relay_uid` are set correctly.
  -> ensure ctx cancels are no acked.
- use `expect_ctxc()` around root's `tell_little_bro()` usage.
- do `debug_mode` assertions when enabled by test harness in each actor
  layer.
- obvi use new `.src_type`/`.boxed_type` for final error propagation
  assertions.
2024-03-20 10:29:40 -04:00
Tyler Goodlet 7458f99733 Add a `._state._runtime_vars['_registry_addrs']`
Such that it's set to whatever `Actor.reg_addrs: list[tuple]` is during
the actor's init-after-spawn guaranteeing each actor has at least the
registry infos from its parent. Ensure we read this if defined over
`_root._default_lo_addrs` in `._discovery` routines, namely
`.find_actor()` since it's the one API normally used without expecting
the runtime's `current_actor()` to be up.

Update the latest inter-peer cancellation test to use the `reg_addr`
fixture (and thus test this new runtime-vars value via `find_actor()`
usage) since it was failing if run *after* the infected `asyncio` suite
due to registry contact failure.
2024-03-08 15:34:20 -05:00
Tyler Goodlet 2e797ef7ee Update ctx test suites to stricter semantics
Including mostly tweaking asserts on relayed `ContextCancelled`s and
the new pub ctx properties: `.outcome`, `.maybe_error`, etc. as it
pertains to graceful (absorbed) remote cancellation vs. loud ctxc cases
expected to be raised by any `Portal.cancel_actor()` style teardown.

Start checking a variety internals like `._remote/local_error`,
`._is_self_cancelled()`, `._is_final_result_set()`, `._cancel_msg`
where applicable.

Also factor out the new `expect_ctxc()` checker to our `conftest.py` for
use in other suites.
2024-03-07 21:26:57 -05:00
Tyler Goodlet 7ae9b5319b Tweak inter-peer `._scope` state asserts
We don't expect `._scope.cancelled_caught` to be set really ever on
inter-peer cancellation since no ctx is ever cancelling itself, a peer
cancels some other and then bubbles back to all other peers.

Also add `ids: lambda` for `error_during_ctxerr_handling` param to
`test_peer_canceller()`
2024-03-06 16:09:38 -05:00
Tyler Goodlet 9e3f41a5b1 Tweak inter-peer tests for new/refined semantics
Buncha subtle details changed mostly to do with when `Context.cancel()`
gets called on "real" remote errors vs. (peer requested) cancellation
and then local side handling of `ContextCancelled`.

Specific changes to make tests pass:
- due to raciness with `sleeper_ctx.result()` raising the ctxc locally
  vs. the child-peers receiving similar ctxcs themselves (and then
  erroring and propagating back to the root parent), we might not see
  `._remote_error` set during the sub-ctx loops (except for the sleeper
  itself obvi).
- do not expect `.cancel_called`/`.cancel_caught` to be set on any
  sub-ctx since currently `Context.cancel()` is only called non-shielded
  and thus is not in invoked when `._scope.cancel()` is called as part
  of each root-side ctx ref/block handling the inter-peer ctxc.
- do not expect `Context._scope.cancelled_caught` to be set in most cases
  (even the sleeper)

TODO Outstanding adjustments not fixed yet:
-[ ] `_scope.cancelled_caught` checks outside the `.open_context()`
  blocks.
2024-03-06 10:13:41 -05:00
Tyler Goodlet 3ed309f019 Add test for `modden` sub-spawner-server hangs on cancel
As per a lot of the recent refinements to `Context` cancellation, add
a new test case to replicate the original hang-on-cancel found with
`modden` when using a client actor to spawn a subactor in some other
tree where despite `Context.cancel()` being called the requesting client
would hang on the opened context with the server.

The specific scenario added here is to have,
- root actor spawns 2 children: a client and a spawn server.
- the spawn server opens with a spawn-request serve loop and begins to
  wait for the client.
- client spawns and connects to the sibling spawn server, requests to
  spawn a sub-actor, the "little bro", connects to it then does some
  echo streaming, cancels the request with it's sibling (the spawn
  server) which should in turn cancel the root's-grandchild and result
  in a cancel-ack back to the client's `.open_context()`.
- root ensures that it can also connect to the grandchild (little bro),
  do the same echo streaming, then ensure everything tears down
  correctly after cancelling all the children.

More refinements to come here obvi in the specific cancellation
semantics and possibly causes.

Also tweaks the other tests in suite to use the new `Context` properties
recently introduced and similarly updated in the previous patch to the
ctx-semantics suite.
2024-02-29 15:45:55 -05:00
Tyler Goodlet 6c9bc627d8 Make ctx tests support `debug_mode: bool` fixture
Such that with `--tpdb` passed (sub)actors will engage the `pdbp` REPL
automatically and so that we can use the new `stackscope` support when
complex cases hang Bo

Also,
- simplified some type-annots (ns paths),
- doc-ed an inter-peer test func with some ascii msg flows,
- added a bottom #TODO for replicating the scenario i hit in `modden`
  where a separate client actor-tree was hanging on cancelling a `bigd`
  sub-workspace..
2024-02-20 15:14:58 -05:00
Tyler Goodlet d651f3d8e9 Tons of interpeer test cleanup
Drop all the nested `@acm` blocks and defunct comments from initial
validations. Add some todos for cases that are still unclear such as
whether the caller / streamer should have `.cancelled_caught == True` in
it's teardown.
2023-10-25 15:21:41 -04:00
Tyler Goodlet ef0cfc4b20 Get inter-peer suite passing with all `Context` state checks!
Definitely needs some cleaning and refinement but this gets us to stage
1 of being pretty frickin correct i'd say 💃
2023-10-23 18:24:23 -04:00
Tyler Goodlet ca3f7a1b6b Add a first serious inter-peer remote cancel suite
Tests that appropriate `Context` exit state, the relay of
a `ContextCancelled` error and its `.canceller: tuple[str, str]` value
are set when an inter-peer cancellation happens via an "out of band"
request method (in this case using `Portal.cancel_actor()` and that
cancellation is propagated "horizontally" to other peers. Verify that
any such cancellation scenario which also experiences an "error during
`ContextCancelled` handling" DOES NOT result in that further error being
suppressed and that the user's exception bubbles out of the
`Context.open_context()` block(s) appropriately!

Likely more tests to come as well as some factoring of the teardown
state checks where possible.

Pertains to serious testing the major work landing in #357
2023-10-18 13:59:08 -04:00
Tyler Goodlet c4cd573b26 Drop pause line from ctx cancel handler block in test 2023-10-07 18:51:59 -04:00
Tyler Goodlet 78c0d2b234 Start inter-peer cancellation test mod
Move over relevant test from the "context semantics" test module which
was already verifying peer-caused-`ContextCancelled.canceller: tuple`
error info and propagation during an inter-peer cancellation scenario.

Also begin a more general set of inter-peer cancellation tests starting
with the simplest case where when a peer is cancelled the parent should
NOT get an "muted" `trio.Cancelled` and instead
a `tractor.ContextCancelled` with a `.canceller: tuple` which points to
the sibling actor which requested the peer cancel.
2023-10-06 15:44:26 -04:00