1742 Commits (6628fa00d965c66e7e0732ae6aea0b96c265bb42)

Author SHA1 Message Date
Tyler Goodlet b28df738fe Drop extra "\n" when logging actor nursery errors 2025-03-14 21:49:15 -04:00
Tyler Goodlet 5fa040c7db Add `NamespacePath._ns` todo for `self:<ns.meth>` support 2025-03-14 21:49:15 -04:00
Tyler Goodlet 27b750e907 Emit warning on any `ContextCancelled.canceller == None` 2025-03-14 21:49:15 -04:00
Tyler Goodlet 96150600fb Make ctx tests support `debug_mode: bool` fixture
Such that with `--tpdb` passed (sub)actors will engage the `pdbp` REPL
automatically and so that we can use the new `stackscope` support when
complex cases hang Bo

Also,
- simplified some type-annots (ns paths),
- doc-ed an inter-peer test func with some ascii msg flows,
- added a bottom #TODO for replicating the scenario i hit in `modden`
  where a separate client actor-tree was hanging on cancelling a `bigd`
  sub-workspace..
2025-03-14 21:49:15 -04:00
Tyler Goodlet 338ea5529c .log: more multi-line styling 2025-03-14 16:41:08 -04:00
Tyler Goodlet 6bc67338cf Better subproc supervisor logging, todo for #320
Given i just similarly revamped a buncha `._runtime` log msg formatting,
might as well do something similar inside the spawning machinery such
that grokking teardown sequences of each supervising task is much more
sane XD

Mostly this includes doing similar `'<field>: <value>\n'` multi-line
formatting when reporting various subproc supervision steps as well as
showing a detailed `trio.Process.__repr__()` as appropriate.

Also adds a detailed #TODO according to the needs of #320 for which
we're going to need some internal mechanism for intermediary parent
actors to determine if a given debug tty locker (sub-actor) is one of
*their* (transitive) children and thus stall the normal
cancellation/teardown sequence until that locker is complete.
2025-03-14 16:41:06 -04:00
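
The `'<field>: <value>\n'` multi-line style described above can be illustrated with a minimal sketch. The helper name `fmt_fields()` and the field names are hypothetical and are not taken from the project's logging code.

```python
def fmt_fields(header: str, **fields: object) -> str:
    """Render a header line followed by one '<field>: <value>' line per item."""
    lines = [header]
    for name, value in fields.items():
        # repr() keeps tuples, None, etc. unambiguous in console output
        lines.append(f"{name}: {value!r}")
    return "\n".join(lines) + "\n"


# Hypothetical supervision step report
report = fmt_fields(
    "Cancelling subproc",
    uid=("worker", "1234"),
    returncode=None,
)
print(report, end="")
```

One field per line makes it easier to scan teardown sequences in console logs than a single dense line.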
Tyler Goodlet fd20004757 _supervise: iter nice expanded multi-line `._children` tups with typing 2025-03-14 16:34:17 -04:00
Tyler Goodlet ddc2e5f0f8 WIP: solved the modden client hang.. 2025-03-14 16:34:10 -04:00
Tyler Goodlet 4b0aa5e379 Baboso! fix `chan.send(None)` indent.. 2025-03-14 15:49:37 -04:00
Tyler Goodlet 6a303358df Improved log msg formatting in core
As part of solving some final edge cases to do with inter-peer remote
cancellation (particularly a remote cancel from a separate actor
tree-client hanging on the request side in `modden`..) I needed less
dense, more line-delimited log msg formats when understanding ipc
channel and context cancels from console logging; this adds a ton of
that to:
- `._invoke()` which now does,
  - better formatting of `Context`-task info as multi-line
    `'<field>: <value>\n'` messages,
  - use of `trio.Task` (from `.lowlevel.current_task()`) for full
    rpc-func namespace-path info,
  - better "msg flow annotations" with `<=` for understanding
    `ContextCancelled` flow.
- `Actor._stream_handler()` wherein we break down IPC peers reporting
  better as multi-line `|_<Channel>` log msgs instead of all jammed on
  one line..
- `._ipc.Channel.send()` use `pformat()` for repr of packet.

Also tweak some optional deps imports for debug mode:
- add `maybe_import_gb()` for attempting to import `greenback`.
- maybe enable `stackscope` tree pprinter on `SIGUSR1` if installed.

Add a further stale-debugger-lock guard before removal:
- read the `._debug.Lock.global_actor_in_debug: tuple` uid and possibly
  `maybe_wait_for_debugger()` when the child-user is known to have
  a live process in our tree.
- only cancel `Lock._root_local_task_cs_in_debug: CancelScope` when
  the disconnected channel maps to the `Lock.global_actor_in_debug`,
  though not sure this is correct yet?

Started adding missing type annots in sections that were modified.
2025-03-14 15:49:36 -04:00
Tyler Goodlet c85757aee1 Let `pack_error()` take a msg injected `cid: str|None` 2025-03-14 15:31:16 -04:00
Tyler Goodlet 9fc9b10b53 Add `StreamOverrun.sender: tuple` for better handling
Since it's generally useful to know who is the cause of an overrun (say
bc you want your system to then adjust the writer side to slow tf down)
might as well pack an extra `.sender: tuple[str, str]` actor uid field
which can be relayed through `RemoteActorError` boxing. Add an extra
case for the exc-type to `unpack_error()` to match B)
2025-03-14 14:14:54 -04:00
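
A minimal sketch of the idea, assuming a dict-based error msg: the overrun error carries a `.sender` uid that survives a pack/unpack round trip, so the receiving side can decide which writer to slow down. `OverrunSketch`, `pack()`, `unpack()` and `on_overrun()` are hypothetical stand-ins, not the library's `StreamOverrun`, `pack_error()` or `unpack_error()`.

```python
class OverrunSketch(Exception):
    """Hypothetical overrun error carrying the uid of the offending sender."""

    def __init__(self, message, sender=None):
        super().__init__(message)
        self.sender = sender


def pack(err):
    # Serialize the error, including the optional sender uid, to a plain dict
    return {
        "type": type(err).__name__,
        "message": str(err),
        "sender": getattr(err, "sender", None),
    }


def unpack(msg):
    # Extra case for the overrun type so the sender uid is restored
    if msg["type"] == "OverrunSketch":
        return OverrunSketch(msg["message"], sender=tuple(msg["sender"]))
    return RuntimeError(msg["message"])


delays = {}


def on_overrun(err):
    # Hypothetical policy: add a small send delay for the offending writer
    delays[err.sender] = delays.get(err.sender, 0.0) + 0.1


relayed = unpack(pack(OverrunSketch("buffer full", sender=("writer", "abc"))))
on_overrun(relayed)
```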
Tyler Goodlet a86275996c Offer `unpack_error(hid_tb: bool)` for `pdbp` REPL config 2025-03-14 14:14:54 -04:00
Tyler Goodlet b5431c0343 Never mask original `KeyError` in portal-error unwrapper, for now? 2025-03-14 14:14:54 -04:00
Tyler Goodlet cdee6f9354 Try allowing multi-pops of `_Cache.locks` for now? 2025-03-14 14:14:53 -04:00
Tyler Goodlet a2f1bcc23f Use `import <blah> as blah` over `__all__` in `.trionics` 2025-03-14 14:14:53 -04:00
Tyler Goodlet 4aa89bf391 Bump timeout on resource cache test a bitty bit. 2025-03-14 14:14:53 -04:00
Tyler Goodlet 45e9cb4d09 `_root`: drop unused `typing` import 2025-03-14 14:14:53 -04:00
Tyler Goodlet 27c5ffe5a7 Move missing-key-in-msg raiser to `._exceptions`
Since we use basically the exact same set of logic in
`Portal.open_context()` when expecting the first `'started'` msg, factor
and generalize `._streaming._raise_from_no_yield_msg()` into a new
`._exceptions._raise_from_no_key_in_msg()` (as per the lingering todo)
which obvi requires a more generalized / optional signature including
a caller specific `log` obj. Obvi call the new func from all the other
modules X)
2025-03-14 14:14:50 -04:00
Tyler Goodlet 914efd80eb Fmt repr as multi-line style call 2025-03-14 14:14:11 -04:00
Tyler Goodlet 2d2d1ca1c4 Drop unused walrus assign of `re` 2025-03-14 14:14:11 -04:00
Tyler Goodlet 74aa5aa9cd `StackLevelAdapter._log(stacklevel: int)` for custom levels..
Apparently (and i don't know if this was always broken [i feel like no?]
or is a recent change to stdlib's `logging` stuff) we need to increment the
`stacklevel` input by one for our custom level methods now? Without this
you're going to see the path to the method's-callstack-frame on every
emission instead of to the caller's. I first noticed this when debugging
the workspace layer spawning in `modden.bigd` and then verified it in
other depended projects..

I guess we should add some tests for this as well XD
2025-03-14 14:14:11 -04:00
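
The effect can be reproduced with the stdlib alone. This is a sketch under assumptions: the `CANCEL` level number, `log_cancel()` and `ListHandler` are made up for illustration, and it uses a plain `logging.Logger` rather than the project's `StackLevelAdapter`. Adapter-based wrappers add frames of their own, so the offset they need may differ between Python versions.

```python
import logging

CANCEL = 16  # hypothetical custom level number
logging.addLevelName(CANCEL, "CANCEL")


class ListHandler(logging.Handler):
    """Collect records in memory so caller info can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def log_cancel(logger, msg, stacklevel=1):
    # Bump stacklevel by one so the record points at *our* caller
    # instead of at this wrapper's own frame.
    logger.log(CANCEL, msg, stacklevel=stacklevel + 1)


def do_work(logger):
    log_cancel(logger, "task cancelled")


logger = logging.getLogger("stacklevel_demo")
logger.setLevel(CANCEL)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

do_work(logger)
print(handler.records[0].funcName)
```

Without the `+ 1`, the record's `funcName` would be `log_cancel`, which mirrors the symptom described in the commit message.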
Tyler Goodlet 44e386dd99 ._child: remove some unused imports.. 2025-03-14 13:56:25 -04:00
Tyler Goodlet 13fbcc723f Guarding for IPC failures in `._runtime._invoke()`
Took me longer than i wanted to figure out the source of
a failed-response to a remote-cancellation (in this case in `modden`
where a client was cancelling a workspace layer.. but disconnects before
receiving the ack msg) that was triggering an IPC error when sending the
error msg for the cancellation of an `Actor._cancel_task()`, but since
this (non-rpc) `._invoke()` task was trying to send to a now
disconnected canceller it was resulting in a `BrokenPipeError` (or similar)
error.

Now, we except for such IPC errors and only raise them when,
1. the transport `Channel` is for sure up (bc ow what's the point of
   trying to send an error on the thing that caused it..)
2. it's definitely for handling an RPC task

Similarly if the entire main invoke `try:` excepts,
- we only hide the call-stack frame from the debugger (with
  `__tracebackhide__: bool`) if it's an RPC task that has a connected
  channel since we always want to see the frame when debugging internal
  task or IPC failures.
- we don't bother trying to send errors to the context caller (actor)
  when it's a non-RPC request since failures on actor-runtime-internal
  tasks shouldn't really ever be reported remotely, only maybe raised
  locally.

Also some other tidying,
- this properly corrects for the self-cancel case where an RPC context
  is cancelled due to a local (runtime) task calling a method like
  `Actor.cancel_soon()`. We now set our own `.uid` as the
  `ContextCancelled.canceller` value so that other-end tasks know that
  the cancellation was due to a self-cancellation by the actor itself.
  We still need to properly test for this though!
- add a more detailed module doc-str.
- more explicit imports for `trio` core types throughout.
2025-03-14 13:56:23 -04:00
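
The two re-raise conditions listed above can be sketched without any IPC machinery. `FakeChannel` and `try_ship_error()` are hypothetical names used only for this illustration; the real logic lives in `._runtime._invoke()` and may differ in detail.

```python
class FakeChannel:
    """Hypothetical stand-in for an IPC channel."""

    def __init__(self, connected, broken=False):
        self._connected = connected
        self._broken = broken
        self.sent = []

    def connected(self):
        return self._connected

    def send(self, msg):
        if self._broken or not self._connected:
            raise BrokenPipeError("peer went away")
        self.sent.append(msg)


def try_ship_error(chan, err, is_rpc):
    """Try to send an error msg; return True if it was delivered.

    A transport failure is only re-raised when the channel still
    claims to be up AND the failing task was an RPC task.
    """
    try:
        chan.send({"error": repr(err)})
        return True
    except (BrokenPipeError, ConnectionResetError):
        if chan.connected() and is_rpc:
            raise
        # Canceller already disconnected, or a runtime-internal task:
        # nothing useful to report remotely.
        return False


ok_chan = FakeChannel(connected=True)
delivered = try_ship_error(ok_chan, ValueError("boom"), is_rpc=True)
dropped = try_ship_error(FakeChannel(connected=False), ValueError("boom"), is_rpc=False)
```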
Tyler Goodlet 315f0fc7eb More thorough hard kill doc strings 2025-03-14 13:48:35 -04:00
Tyler Goodlet fea111e882 Tons of interpeer test cleanup
Drop all the nested `@acm` blocks and defunct comments from initial
validations. Add some todos for cases that are still unclear such as
whether the caller / streamer should have `.cancelled_caught == True` in
its teardown.
2025-03-14 13:44:09 -04:00
Tyler Goodlet a1bf4db1e3 Get inter-peer suite passing with all `Context` state checks!
Definitely needs some cleaning and refinement but this gets us to stage
1 of being pretty frickin correct i'd say 💃
2025-03-14 13:44:09 -04:00
Tyler Goodlet bac9523ecf Adjust test details where `Context.cancel()` is called
We can now make asserts on `.cancelled_caught` and `_remote_error` vs.
`_local_error`. Expect a runtime error when `Context.open_stream()` is
called AFTER `.cancel()` and the remote `ContextCancelled` hasn't
arrived (yet). Adjust to `'itself'` string in self-cancel case.
2025-03-14 13:44:09 -04:00
Tyler Goodlet abe31e9e2c Fix `Context.result()` call to be in runtime scope 2025-03-14 13:44:09 -04:00
Tyler Goodlet 0222180c11 Tweak `Channel._cancel_called` comment 2025-03-14 13:44:09 -04:00
Tyler Goodlet 7d5fda4485 Be ultra-correct in `Portal.open_context()`
This took way too long to get right but hopefully will give us grok-able
and correct context exit semantics going forward B)

The main fixes were:
- always shielding the `MsgStream.aclose()` call on teardown to avoid
  bubbling a `Cancelled`.
- properly absorbing any `ContextCancelled` in cases due to "self
  cancellation" using the new `Context.canceller` in the logic.
- capturing any error raised by the `Context.result()` call in the
  "normal exit, result received" case and setting it as the
  `Context._local_error` so that self-cancels can be easily measured via
  `Context.cancelled_caught` in the same way as remote-error caused
  cancellations.
- extremely detailed comments around all of the cancellation-error cases
  to avoid ever getting confused about the control flow in the future XD
2025-03-14 13:44:08 -04:00
Tyler Goodlet f5fcd8ca2e Be mega-pedantic with `ContextCancelled` semantics
As part of extremely detailed inter-peer-actor testing, add much more
granular `Context` cancellation state tracking via the following (new)
fields:
- `.canceller: tuple[str, str]` the uuid of the actor responsible for
  the cancellation condition - always set by
  `Context._maybe_cancel_and_set_remote_error()` and replaces
  `._cancelled_remote` and `.cancel_called_remote`. If set, this value
  should normally always match a value from some `ContextCancelled`
  raised or caught by one side of the context.
- `._local_error` which is always set to the locally raised (and caller
  or callee task's scope-internal) error which caused any
  eventual cancellation/error condition and thus any closure of the
  context's per-task-side-`trio.Nursery`.
- `.cancelled_caught: bool` is now always `True` whenever the local task
  catches (or "silently absorbs") a `ContextCancelled` (a `ctxc`) that
  indeed originated from one of the context's linked tasks or any other
  context which raised its own `ctxc` in the current `.open_context()` scope.
  => whenever there is a case that no `ContextCancelled` was raised
  **in** the `.open_context().__aexit__()` (eg. `ctx.result()` called
  after a call to `ctx.cancel()`), we still consider the context as
  having "caught a cancellation" since the `ctxc` was indeed silently
  handled by the cancel requester; all other error cases are already
  represented by mirroring the state of the `._scope: trio.CancelScope`
  => IOW there should be **no case** where an error is **not raised** in
  the context's scope and `.cancelled_caught: bool == False`, i.e. no
  case where `._scope.cancelled_caught == False and ._local_error is not
  None`!
- always raise any `ctxc` from `.open_stream()` if `._cancel_called ==
  True` - if the cancellation request has not already resulted in
  a `._remote_error: ContextCancelled` we raise a `RuntimeError` to
  indicate improper usage to the guilty side's task code.
- make `._maybe_raise_remote_err()` a sync func and don't raise
  any `ctxc` which is matched against a `.canceller` determined to
  be the current actor, aka a "self cancel", and always set the
  `._local_error` to any such `ctxc`.
- `.side: str` taken from inside `.cancel()` and unused as of now since
  it might be better re-written as a similar `.is_opener() -> bool`?
- drop unused `._started_received: bool`..
- TONS and TONS of detailed comments/docs to attempt to explain all the
  possible cancellation/exit cases and how they should exhibit as either
  silent closes or raises from the `Context` API!

Adjust the `._runtime._invoke()` code to match:
- use `ctx._maybe_raise_remote_err()` in `._invoke()`.
- adjust to new `.canceller` property.
- more type hints.
- better `log.cancel()` msging around self-cancels vs. peer-cancels.
- always set the `._local_error: BaseException` for the "callee" task
  just like `Portal.open_context()` now will do B)

Prior we were raising any `Context._remote_error` directly and doing
(more or less) the same `ContextCancelled` "absorbing" logic (well
kinda) in block; instead delegate to the method
2025-03-14 13:42:55 -04:00
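
The "self cancel" rule for `._maybe_raise_remote_err()` can be modelled with a small, heavily simplified sketch. `SketchContextCancelled` and `CtxState` are hypothetical stand-ins and only cover the `.canceller` comparison and the `._local_error` assignment, not the real `Context` type or its scope handling.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class SketchContextCancelled(Exception):
    """Simplified stand-in for a cancellation error with a canceller uid."""

    def __init__(self, canceller: Tuple[str, str]):
        super().__init__(f"cancelled by {canceller}")
        self.canceller = canceller


@dataclass
class CtxState:
    our_uid: Tuple[str, str]
    remote_error: Optional[BaseException] = None
    local_error: Optional[BaseException] = None

    @property
    def canceller(self) -> Optional[Tuple[str, str]]:
        err = self.remote_error
        if isinstance(err, SketchContextCancelled):
            return err.canceller
        return None

    def maybe_raise_remote_err(self) -> None:
        err = self.remote_error
        if err is None:
            return
        if self.canceller == self.our_uid:
            # Self cancel: absorb silently but record it locally
            self.local_error = err
            return
        raise err


me = ("me", "1")
self_cancelled = CtxState(our_uid=me, remote_error=SketchContextCancelled(me))
self_cancelled.maybe_raise_remote_err()  # absorbed, no raise

peer_cancelled = CtxState(
    our_uid=me,
    remote_error=SketchContextCancelled(("peer", "2")),
)
```

Calling `peer_cancelled.maybe_raise_remote_err()` raises, since the canceller is a different actor.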
Tyler Goodlet 04217f319a Raise a `MessagingError` from the src error on msging edge cases 2025-03-14 13:42:15 -04:00
Tyler Goodlet 8cb8390201 Move `MessagingError` into `._exceptions` set 2025-03-14 13:42:15 -04:00
Tyler Goodlet 5035617adf Dump `.msgdata` in `RemoteActorError.__repr__()` 2025-03-14 13:42:15 -04:00
Tyler Goodlet 715348c5c2 Port all tests to new `reg_addr` fixture name 2025-03-14 13:42:15 -04:00
Tyler Goodlet fdf0c43bfa Type out the full-fledged streaming ex. 2025-03-14 13:40:19 -04:00
Tyler Goodlet f895c96600 Add masked super timeout line to `do_hard_kill()` for would-be runtime hackers 2025-03-14 13:40:19 -04:00
Tyler Goodlet ca1a1476bb Add a first serious inter-peer remote cancel suite
Tests that appropriate `Context` exit state, the relay of
a `ContextCancelled` error and its `.canceller: tuple[str, str]` value
are set when an inter-peer cancellation happens via an "out of band"
request method (in this case using `Portal.cancel_actor()`) and that
cancellation is propagated "horizontally" to other peers. Verify that
any such cancellation scenario which also experiences an "error during
`ContextCancelled` handling" DOES NOT result in that further error being
suppressed and that the user's exception bubbles out of the
`Context.open_context()` block(s) appropriately!

Likely more tests to come as well as some factoring of the teardown
state checks where possible.

Pertains to serious testing of the major work landing in #357
2025-03-14 13:40:19 -04:00
Tyler Goodlet a7c36a9cbe Tidy/clarify another `._runtime` comment 2025-03-14 13:40:19 -04:00
Tyler Goodlet 22e4b324b1 Get mega-pedantic in `Portal.open_context()`
Specifically in the `.__aexit__()` phase to ensure remote,
runtime-internal, and locally raised error-during-cancelled-handling
exceptions are NEVER masked by a local `ContextCancelled` or any
exception group of `trio.Cancelled`s.

Also adds a ton of details to doc strings including extreme detail
surrounding the `ContextCancelled` raising cases and their processing
inside `.open_context()`'s exception handler blocks.

Details, details:
- internal rename `err`/`_err` stuff to just be `scope_err` since it's
  effectively the error bubbled up from the context's surrounding (and
  cross-actor) "scope".
- always shield `._recv_chan.aclose()` to avoid any `Cancelled` from
  masking the `scope_err` with a runtime related `trio.Cancelled`.
- explicitly catch the specific set of `scope_err: BaseException` that
  we can reasonably expect to handle instead of the catch-all parent
  type including exception groups, cancels and KBIs.
2025-03-14 13:40:18 -04:00
Tyler Goodlet 89ed8b67ff Drop `msg` kwarg from `Context.cancel()`
Well first off, turns out it's never used and generally speaking
doesn't seem to help much with "runtime hacking/debugging"; why would
we need to "fabricate" a msg when `.cancel()` is called to self-cancel?

Also (and since `._maybe_cancel_and_set_remote_error()` now takes an
`error: BaseException` as input and thus expects error-msg unpacking
prior to being called), we now manually set `Context._cancel_msg: dict`
just prior to any remote error assignment - so any case where we would
have fabbed a "cancel msg" near calling `.cancel()`, just do the manual
assign.

In this vein some other subtle changes:
- obviously don't set `._cancel_msg` in `.cancel()` since it's no longer
  an input.
- generally do walrus-style `error := unpack_error()` before applying
  and setting remote error-msg state.
- always raise any `._remote_error` in `.result()` instead of returning
  the exception instance and check before AND after the underlying mem
  chan read.
- add notes/todos around `raise self._remote_error from None` masking of
  (runtime) errors in `._maybe_raise_remote_err()` and use it inside
  `.result()` since we had the inverse duplicate logic there anyway..

Further, this adds and extends a ton of (internal) interface docs and
details comments around the `Context` API including many subtleties
pertaining to calling `._maybe_cancel_and_set_remote_error()`.
2025-03-14 13:37:55 -04:00
Tyler Goodlet 11bbf15817 `._exceptions`: typing and error unpacking updates
Bump type annotations to 3.10+ style throughout module as well as fill
out doc strings a bit. Inside `unpack_error()` pop any `error_dict: dict`
and,
- return `None` early if not found,
- versus pass directly as `**error_dict` to the error constructor
  instead of a double field read.
2025-03-14 13:36:16 -04:00
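
A rough sketch of that unpacking flow, assuming the error payload sits under an `'error'` key and carries a `tb_str` field. `RemoteErrorSketch` and `unpack_error_sketch()` are hypothetical and do not reproduce the real `unpack_error()` signature.

```python
class RemoteErrorSketch(Exception):
    """Hypothetical boxed remote error that keeps the extra msg fields."""

    def __init__(self, message, **msgdata):
        super().__init__(message)
        self.msgdata = msgdata


def unpack_error_sketch(msg):
    # Assumption: unlike a real `pop`, read without mutating the caller's msg
    error_dict = msg.get("error")
    if error_dict is None:
        # Not an error msg: return early
        return None

    fields = dict(error_dict)
    tb_str = fields.pop("tb_str", "")
    # Pass the remaining fields straight through as keyword args
    # instead of reading each one a second time.
    return RemoteErrorSketch(tb_str, **fields)


not_an_error = unpack_error_sketch({"yield": 1})
boxed = unpack_error_sketch(
    {"error": {"tb_str": "boom", "src_uid": ("worker", "1234")}}
)
```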
Tyler Goodlet a18663213a Add comments around diff between `C/context` refs 2025-03-14 13:36:16 -04:00
Tyler Goodlet d4d09b6071 Factor non-yield stream msg processing into helper
Since both `MsgStream.receive()` and `.receive_nowait()` need the same
raising logic when a non-stream msg arrives (so that maybe an
appropriate IPC translated error can be raised) move the `KeyError`
handler code into a new `._streaming._raise_from_no_yield_msg()` func
and call it from both methods to make the error-interface-raising
symmetrical across both methods.
2025-03-14 13:36:16 -04:00
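
The refactor pattern reads roughly like the following sketch, which uses `asyncio.Queue` purely as a stand-in transport. `TinyStream`, `StreamEnded` and this version of `raise_from_no_yield_msg()` are hypothetical; the msg keys `'yield'` and `'stop'` are assumptions for illustration.

```python
import asyncio


class StreamEnded(Exception):
    """Hypothetical error raised when a 'stop' msg arrives."""


def raise_from_no_yield_msg(msg, src_err):
    # Shared translation of a non-stream msg into an appropriate error
    if "stop" in msg:
        raise StreamEnded("sender closed the stream") from src_err
    raise RuntimeError(f"unexpected msg: {msg!r}") from src_err


class TinyStream:
    def __init__(self, queue):
        self._queue = queue

    async def receive(self):
        msg = await self._queue.get()
        try:
            return msg["yield"]
        except KeyError as kerr:
            raise_from_no_yield_msg(msg, kerr)

    def receive_nowait(self):
        msg = self._queue.get_nowait()
        try:
            return msg["yield"]
        except KeyError as kerr:
            raise_from_no_yield_msg(msg, kerr)


async def main():
    queue = asyncio.Queue()
    for msg in ({"yield": 1}, {"yield": 2}, {"stop": True}, {"stop": True}):
        queue.put_nowait(msg)
    stream = TinyStream(queue)

    got = [await stream.receive(), stream.receive_nowait()]
    causes = []
    for attempt in ("receive", "receive_nowait"):
        try:
            if attempt == "receive":
                await stream.receive()
            else:
                stream.receive_nowait()
        except StreamEnded as err:
            causes.append(type(err.__cause__).__name__)
    return got, causes


got, causes = asyncio.run(main())
```

Both methods surface the same error type with the original `KeyError` chained as the cause, which is the symmetry the commit is after.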
Tyler Goodlet 6d10f0c516 Always raise remote (cancelled) error if set
Previously we weren't raising a remote error if the local scope was
cancelled during a call to `Context.result()` which is problematic if
the caller WAS NOT the requester for said remote cancellation; in that
case we still want a `ContextCancelled` raised with the `.canceller:
str` set to the cancelling actor uid.

Further fix a naming bug where the (seemingly older) `._remote_err` was
being set to such an error instead of `._remote_error` XD
2025-03-14 13:36:16 -04:00
Tyler Goodlet fa9b57bae0 Write more comprehensive `Portal.cancel_actor()` doc str 2025-03-14 13:36:16 -04:00
Tyler Goodlet 81776a6238 Drop pause line from ctx cancel handler block in test 2025-03-14 13:36:16 -04:00
Tyler Goodlet 144d1f4d94 Msg-ified `ContextCancelled`s sub-error type should always be just, its type.. 2025-03-14 13:36:16 -04:00
Tyler Goodlet 51fdf3524c Start inter-peer cancellation test mod
Move over relevant test from the "context semantics" test module which
was already verifying peer-caused-`ContextCancelled.canceller: tuple`
error info and propagation during an inter-peer cancellation scenario.

Also begin a more general set of inter-peer cancellation tests starting
with the simplest case where when a peer is cancelled the parent should
NOT get a "muted" `trio.Cancelled` and instead
a `tractor.ContextCancelled` with a `.canceller: tuple` which points to
the sibling actor which requested the peer cancel.
2025-03-14 13:36:16 -04:00