Commit Graph

1017 Commits (aecebd0562a801698bc9586fdbd7a103aae977e2)

Author SHA1 Message Date
Tyler Goodlet a87df3009f Drop now-deprecated deps on modern `trio`/Python
- `trio_typing` is nearly obsolete since `trio >= 0.23`
- `exceptiongroup` is built-in to python 3.11
- `async_generator` primitives have lived in `contextlib` for quite
  a while!
2025-03-16 16:06:24 -04:00
Tyler Goodlet 85825cdd76 Add `.trionics._broadcast` todos for py 3.12 2025-03-16 15:52:55 -04:00
Tyler Goodlet a5bc113fde Start a `._rpc` module
Since `._runtime` was getting pretty long (> 2k LOC) and much of the RPC
low-level machinery is fairly isolated to a handful of task-funcs, it
makes sense to re-org the RPC task scheduling and driving msg loop to
its own code space.

The move includes:
- `process_messages()` which is the main IPC business logic.
- `try_ship_error_to_remote()` helper, to box local errors for the wire.
- `_invoke()`, the core task scheduler entrypoing used in the msg loop.
- `_invoke_non_context()`, holds impls for non-`@context` task starts.
- `_errors_relayed_via_ipc()` which does all error catch-n-boxing for
   wire-msg shipment using `try_ship_error_to_remote()` internally.

Also inside `._runtime` improve some `Actor` methods docs.
2025-03-16 15:52:53 -04:00
Tyler Goodlet 4f7823cf55 Move `Portal.open_context()` impl to `._context`
Finally, since normally you need the content from `._context.Context`
and surroundings in order to effectively grok `Portal.open_context()`
anyways, might as well move the impl to the ctx module as
`open_context_from_portal()` and just bind it on the `Portal` class def.

Associated/required tweaks:
- avoid circ import on `.devx` by only import
  `.maybe_wait_for_debugger()` when debug mode is set.
- drop `async_generator` usage, not sure why this hadn't already been
  changed to `contextlib`?
- use `@acm` alias throughout `._portal`
2025-03-16 15:32:13 -04:00
Tyler Goodlet 544cb40533 Attempt at better internal traceback hiding
Previously i was trying to approach this using lots of
`__tracebackhide__`'s in various internal funcs but since it's not
exactly straight forward to do this inside core deps like `trio` and the
stdlib, it makes a bit more sense to optionally catch and re-raise
certain classes of errors from their originals using `raise from` syntax
as per:
https://docs.python.org/3/library/exceptions.html#exception-context

Deats:
- litter `._context` methods with `__tracebackhide__`/`hide_tb` which
  were previously being shown but that don't need to be to application
  code now that cancel semantics testing is finished up.
- i originally did the same but later commented it all out in `._ipc`
  since error catch and re-raise instead in higher level layers
  (above the transport) seems to be a much saner approach.
- add catch-n-reraise-from in `MsgStream.send()`/.`receive()` to avoid
  seeing the depths of `trio` and/or our `._ipc` layers on comms errors.

Further this patch adds some refactoring to use the
same remote-error shipper routine from both the actor-core in the RPC
invoker:
- rename it as `try_ship_error_to_remote()` and call it from
  `._invoke()` as well as it's prior usage.
- make it optionally accept `cid: str` a `remote_descr: str` and of
  course a `hide_tb: bool`.

Other misc tweaks:
- add some todo notes around `Actor.load_modules()` debug hooking.
- tweak the zombie reaper log msg and timeout value ;)
2025-03-16 15:30:08 -04:00
Tyler Goodlet 389b305d3b Add (back) a `tractor._testing` sub-pkg
Since importing from our top level `conftest.py` is not scaleable
or as "future forward thinking" in terms of:
- LoC-wise (it's only one file),
- prevents "external" (aka non-test) example scripts from importing
  content easily,
- seemingly(?) can't be used via abs-import if using
  a `[tool.pytest.ini_options]` in a `pyproject.toml` vs.
  a `pytest.ini`, see:
  https://docs.pytest.org/en/8.0.x/reference/customize.html#pyproject-toml)

=> Go back to having an internal "testing" pkg like `trio` (kinda) does.

Deats:
- move generic top level helpers into pkg-mod including the new
  `expect_ctxc()` (which i needed in the advanced faults testing script.
- move `@tractor_test` into `._testing.pytest` sub-mod.
- adjust all the helper imports to be a `from tractor._testing import <..>`

Rework `test_ipc_channel_break_during_stream()` and backing script:
- make test(s) pull `debug_mode` from new fixture (which is now
  controlled manually from `--tpdb` flag) and drop the previous
  parametrized input.
- update logic in ^ test for "which-side-fails" cases to better match
  recently updated/stricter cancel/failure semantics in terms of
  `ClosedResouruceError` vs. `EndOfChannel` expectations.
- handle `ExceptionGroup`s with expected embedded errors in test.
- better pendantics around whether to expect a user simulated KBI.
- for `examples/advanced_faults/ipc_failure_during_stream.py` script:
  - generalize ipc breakage in new `break_ipc()` with support for diff
    internal `trio` methods and a #TODO for future disti frameworks
  - only make one sub-actor task break and the other just stream.
  - use new `._testing.expect_ctxc()` around ctx block.
  - add a bit of exception handling with `print()`s around ctxc (unused
    except if 'msg' break method is set) and eoc cases.
  - don't break parent side ipc in loop any more then once
    after first break, checked via flag var.
  - add a `pre_close: bool` flag to control whether
    `MsgStreama.aclose()` is called *before* any ipc breakage method.

Still TODO:
- drop `pytest.ini` and add the alt section to `pyproject.py`.
 -> currently can't get `--rootdir=` opt to work.. not showing in
   console header.
 -> ^ also breaks on 'tests' `enable_modules` imports in subactors
   during discovery tests?
2025-03-16 15:28:28 -04:00
Tyler Goodlet 1975b92dba Add `an: ActorNursery` var placeholder for final log msg 2025-03-16 15:22:01 -04:00
Tyler Goodlet cbaf4fc05b Add a open-ctx-with-self test
Found exactly why trying this won't work when playing around with
opening workspaces in `modden` using a `Portal.open_context()` back to
the 'bigd' root actor: the RPC machinery only registers one entry in
`Actor._contexts` which will get overwritten by each task's side and
then experience race-based IPC msging errors (eg. rxing `{'started': _}`
on the callee side..). Instead make opening a ctx back to the self-actor
a runtime error describing it as an invalid op.

To match:
- add a new test `test_ctx_with_self_actor()` to the context semantics
  suite.
- tried out adding a new `side: str` to the `Actor.get_context()` (and
  callers) but ran into not being able to determine the value from in
  `._push_result()` where it's needed to figure out which side to push
  to.. So, just leaving the commented arg (passing) in the runtime core
  for now in case we can come back to trying to make it work, tho i'm
  thinking it's not the right hack anyway XD
2025-03-16 15:19:51 -04:00
Tyler Goodlet 68a3969585 Let `MsgStream.receive_nowait()` take in msg key list
Call it `allow_msg_keys: list[str] = ['yield']` and set it to accept
`['yield', 'return']` from the drain loop in `.aclose()`. Only pass the
last key error to `_raise_from_no_key_in_msg()` in the fall-through
case.

Somehow this seems to prevent all the intermittent test failures i was
seeing in local runs including when running the entire suite all in
sequence; i ain't complaining B)
2025-03-16 14:01:50 -04:00
Tyler Goodlet cf68e075c9 Unify some log msgs in `.to_asyncio`
Much like similar recent changes throughout the core, build out `msg:
str` depending on error cases and emit with `.cancel()` level as
appropes. Also mute (via level) some duplication in the cancel case
inside `_run_asyncio_task()` for console noise reduction.
2025-03-16 14:01:50 -04:00
Tyler Goodlet f730749dc9 Assign `ctx._local_error` ASAP from `.open_context()`
Such that `.outcome` related fields render nicely asap for logging
withing `Portal.open_context()` itself.
2025-03-16 14:01:50 -04:00
Tyler Goodlet c8775dee41 Tweak `Context.repr_outcome()` for KBIs
Since apparently `str(KeyboardInterrupt()) == ''`? So instead add little
`<str> or repr(merr)` expressions throughout to avoid blank strings
rendering if various `repr()`/`.__str__()` outputs..
2025-03-16 14:01:50 -04:00
Tyler Goodlet fd2391539e Support a `._state.last_actor()` getter
Not sure if it's really that useful other then for reporting errors from
`current_actor()` but at least it alerts `tractor` devs and/or users
when the runtime has already terminated vs. hasn't been started
yet/correctly.

Set the `._last_actor_terminated: tuple` in the root's final block which
allows testing for an already terminated tree which is the case where
`._state._current_actor == None` and the last is set.
2025-03-16 14:01:50 -04:00
Tyler Goodlet 8e3a2a9297 Make `Actor._cancel_task(requesting_uid: tuple)` required arg 2025-03-16 14:01:50 -04:00
Tyler Goodlet f90ca0668b Woops, fix one last `ctx._cancelled_caught` in drain loop 2025-03-16 14:01:50 -04:00
Tyler Goodlet 7b1528abed (Event) more pedantic `.cancel_acked: bool` def
Changes the condition logic to be more strict and moves it to a private
`._is_self_cancelled() -> bool` predicate which can be used elsewhere
(instead of having almost similar duplicate checks all over the
place..) and allows taking in a specific `remote_error` just for
verification purposes (like for tests).

Main strictness distinctions are now:
- obvi that `.cancel_called` is set (this filters any
  `Portal.cancel_actor()` or other out-of-band RPC),
- the received `ContextCancelled` **must** have its `.canceller` set to
  this side's `Actor.uid` (indicating we are the requester).
- `.src_actor_uid` **must** be the same as the `.chan.uid` (so the error
  must have originated from the opposite side's task.
- `ContextCancelled.canceller` should be already set to the `.chan.uid`
  indicating we received the msg via the runtime calling
  `._deliver_msg()` -> `_maybe_cancel_and_set_remote_error()` which
  ensures the error is specifically destined for this ctx-task exactly
  the same as how `Actor._cancel_task()` sets it from an input
  `requesting_uid` arg.

In support of the above adjust some impl deats:
- add `Context._actor: Actor` which is set once in `mk_context()` to
  avoid issues (particularly in testing) where `current_actor()` raises
  after the root actor / runtime is already exited. Use `._actor.uid` in
  both `.cancel_acked` (obvi) and '_maybe_cancel_and_set_remote_error()`
  when deciding whether to call `._scope.cancel()`.
- always cast `.canceller` to `tuple` if not null.
- delegate `.cancel_acked` directly to new private predicate (obvi).
- always set `._canceller` from any `RemoteActorError.src_actor_uid` or
  failing over to the `.chan.uid` when a non-remote error (tho that
  shouldn't ever happen right?).
- more extensive doc-string for `.cancel()` detailing the new strictness
  rules about whether an eventual `.cancel_acked` might be set.

Also tossed in even more logging format tweaks by adding a
`type_only: bool` to `.repr_outcome()` as desired for simpler output in
the `state: <outcome-repr-here>` and `.repr_rpc()` sections of the
`.__str__()`.
2025-03-16 14:01:50 -04:00
Tyler Goodlet c5228e7be5 Set `._cancel_msg` to RPC `{cmd: 'self._cancel_task', ..}` msg
Like how we set `Context._cancel_msg` in `._deliver_msg()` (in
which case normally it's an `{'error': ..}` msg), do the same when any
RPC task is remotely cancelled via `Actor._cancel_task` where that task
doesn't yet have a cancel msg set yet.

This makes is much easier to distinguish between ctx cancellations due
to some remote error vs. Explicit remote requests via any of
`Actor.cancel()`, `Portal.cancel_actor()` or `Context.cancel()`.
2025-03-16 14:01:50 -04:00
Tyler Goodlet ffed35e263 `._entry`: use same msg info in start/terminate log 2025-03-16 14:01:50 -04:00
Tyler Goodlet 885ba04908 Tweak `._portal` log content to use `Context.repr_outcome()` 2025-03-16 14:01:50 -04:00
Tyler Goodlet 4fb34772e7 Mega-refactor on `._invoke()` targeting `@context`s
Since eventually we want to implement all other RPC "func types" as
contexts underneath this starts the rework to move all the other cases
into a separate func not only to simplify the main `._invoke()` body but
also as a reminder of the intention to do it XD

Details of re-factor:
- add a new `._invoke_non_context()` which just moves all the old blocks
  for non-context handling to a single def.
- factor what was basically just the `finally:` block handler (doing all
  the task bookkeeping) into a new `@acm`: `_errors_relayed_via_ipc()`
  with that content packed into the post-`yield` (also with a `hide_tb:
  bool` flag added of course).
  * include a `debug_kbis: bool` for when needed.
- since the `@context` block is the only type left in the main
  `_invoke()` body, de-dent it so it's more grok-able B)

Obviously this patch also includes a few improvements regarding
context-cancellation-semantics (for the `context` RPC case) on the
callee side in order to match previous changes to the `Context` api:
- always setting any ctxc as the `Context._local_error`.
- using the new convenience `.maybe_raise()` topically (for now).
- avoiding any previous reliance on `Context.cancelled_caught` for
  anything public of meaning.

Further included is more logging content updates:
- being pedantic in `.cancel()` msgs about whether termination is caused
  by error or ctxc.
- optional `._invoke()` traceback hiding via a `hide_tb: bool`.
- simpler log headers throughout instead leveraging new `.__repr__()` on
  primitives.
- buncha `<= <actor-uid>` sent some message emissions.
- simplified handshake statuses reporting.

Other subsys api changes we need to match:
- change to `Channel.transport`.
- avoiding any `local_nursery: ActorNursery` waiting when the
  `._implicit_runtime_started` is set.

And yes, lotsa more comments for #TODOs dawg.. since there's always
somethin!
2025-03-16 14:01:48 -04:00
Tyler Goodlet 1c9589cfc4 Avoid `ctx.cancel()` after ctxc rxed in `.open_context()`
In the case where the callee side delivers us a ctxc with `.canceller`
set we can presume that remote cancellation already has taken place and
thus we don't need to do the normal call-`Context.cancel()`-on-error
step. Further, in the case where we do call it also handle any
`trio.CloseResourceError` gracefully with a `.warning()`.

Also, originally I had added a post-`yield`-maybe-raise to attempt
handling any remote ctxc the same as for the local case (i.e. raised
from `yield` line) wherein if we get a remote ctxc the same handler
branch-path would trigger, thus avoiding different behaviour in that
case. I ended up masking it out (but can't member why.. ) as it seems
the normal `.result()` call and its internal handling gets the same
behaviour? I've left in the heavily commented code in case it ends up
being the better way to go; likely making the move to having a single
code in both cases is better even if it is just a matter of deciding
whether to swallow the ctxc or not in the `.cancel_acked` case.

Further teensie improvements:
- obvi improve/simplify log msg contents as in prior patches.
- use the new `maybe_wait_for_debugger(header_msg: str)` if/when waiting
  to exit in debug mode.
- another `hide_tb: bool` frame hider flag.
- rando type-annot updates of course :)
2025-03-15 00:08:13 -04:00
Tyler Goodlet 910c07db06 Deep `Context` refinements
Spanning from the pub API, to instance `repr()` customization (for
logging/REPL content), to the impl details around the notion of a "final
outcome" and surrounding IPC msg draining mechanics during teardown.

A few API and field updates:

- new `.cancel_acked: bool` to replace what we were mostly using
  `.cancelled_caught: bool` for but, for purposes of better mapping the
  semantics of remote cancellation of parallel executing tasks; it's set
  only when `.cancel_called` is set and a ctxc arrives with
  a `.canceller` field set to the current actor uid indicating we
  requested and received acknowledgement from the other side's task
  that is cancelled gracefully.

- strongly document and delegate (and prolly eventually remove as a pub
  attr) the `.cancelled_caught` property entirely to the underlying
  `._scope: trio.CancelScope`; the `trio` semantics don't really map
  well to the "parallel with IPC msging"  case in the sense that for
  us it breaks the concept of the ctx/scope closure having "caught"
  something instead of having "received" a msg that the other side has
  "acknowledged" (i.e. which for us is the completion of cancellation).

- new `.__repr__()`/`.__str__()` format that tries to tersely yet
  comprehensively as possible display everything you need to know about
  the 3 main layers of an SC-linked-IPC-context:
  * ipc: the transport + runtime layers net-addressing and prot info.
  * rpc: the specific linked caller-callee task signature details
    including task and msg-stream instances.
  * state: current execution and final outcome state of the task pair.
  * a teensie extra `.repr_rpc` for a condensed rpc signature.

- new `.dst_maddr` to get a `libp2p` style "multi-address" (though right
  now it's just showing the transport layers so maybe we should move to
  to our `Channel`?)

- new public instance-var fields supporting more granular remote
  cancellation/result/error state:
  * `.maybe_error: Exception|None` for any final (remote) error/ctxc
    which computes logic on the values of `._remote_error`/`._local_error`
    to determine the "final error" (if any) on termination.
  * `.outcome` to the final error or result (or `None` if un-terminated)
  * `.repr_outcome()` for a console/logging friendly version of the
    final result or error as needed for the `.__str__()`.

- new private interface bits to support all of ^:
  * a new "no result yet" sentinel value, `Unresolved`, using a module
    level class singleton that `._result` is set too (instead of
    `id(self)`) to both determine if and present when no final result
    from the callee has-yet-been/was delivered (ever).
    => really we should get rid of `.result()` and change it to
    `.wait_for_result()` (or something)u
  * `_final_result_is_set()` predicate to avoid waiting for an already
    delivered result.
  * `._maybe_raise()` proto-impl that we should use to replace all the
    `if re:` blocks it can XD
  * new `._stream: MsgStream|None` for when a stream is opened to aid
    with the state repr mentioned above.

Tweaks to the termination drain loop `_drain_to_final_msg()`:

- obviously (obvi) use all the changes above when determining whether or
  not a "final outcome" has arrived and thus breaking from the loop ;)
  * like the `.outcome` `.maybe_error`  and `._final_ctx_is_set()` in
    the `while` pred expression.

- drop the `_recv_chan.receive_nowait()` + guard logic since it seems
  with all the surrounding (and coming soon) changes to
  `Portal.open_context()` using all the new API stuff (mentioned in
  first bullet set above) we never hit the case of inf-block?

Oh right and obviously a ton of (hopefully improved) logging msg content
changes, commented code removal and detailed comment-docs strewn about!
2025-03-15 00:08:13 -04:00
Tyler Goodlet d8d206b93f Make stream draining status logs `.debug()` level 2025-03-15 00:08:11 -04:00
Tyler Goodlet fb55784798 Add `._implicit_runtime_started` mark, better logs
After some deep logging improvements to many parts of `._runtime`,
I realized a silly detail where we are always waiting on any opened
`local_nursery: ActorNursery` to signal exit from
`Actor._stream_handler()` even in the case of being an implicitly opened
root actor (`open_root_actor()` wasn't called by user/app code) via
`._supervise.open_nursery()`..

So, to address this add a `ActorNursery._implicit_runtime_started: bool`
that can be set and then checked to avoid doing the unnecessary
`.exited.wait()` (and any subsequent warn logging on an exit timeout) in
that special but most common case XD

Matching with other subsys log format refinements, improve readability
and simplicity of the actor-nursery supervisory log msgs, including:
- simplify and/or remove any content that more or less duplicates msg
  content found in emissions from lower-level primitives and sub-systems
  (like `._runtime`, `_context`, `_portal` etc.).
- add a specific `._open_and_supervise_one_cancels_all_nursery()`
  handler block for `ContextCancelled` to log with `.cancel()` level
  noting that the case is a "remote cancellation".
- put the nursery-exit and actor-tree shutdown status into a single msg
  in the `implicit_runtime` case.
2025-03-15 00:06:15 -04:00
Tyler Goodlet 1bc858cd00 Spawn naming and log format tweaks
- rename `.soft_wait()` -> `.soft_kill()`
- rename `.do_hard_kill()` -> `.hard_kill()`
- adjust any `trio.Process.__repr__()` log msg contents to have the
  little tree branch prefix: `'|_'`
2025-03-15 00:06:15 -04:00
Tyler Goodlet 04aea5c4db Add field-first subproca `.info()` to `._entry` 2025-03-15 00:06:13 -04:00
Tyler Goodlet 7bb44e6930 Add "fancier" remote-error `.__repr__()`-ing
Our remote error box types `RemoteActorError`, `ContextCancelled` and
`StreamOverrun` needed a console display makeover particularly for
logging content and `repr()` in higher level primitives like `Context`.

This adds a more "dramatic" str-representation to showcase the
underlying boxed traceback content more sensationally (via ascii-art
emphasis) as well as support a more terse `.reprol()` (representation
for one-line) format that can be used for types that track remote
errors/cancels like with `Context._remote_error`.

Impl deats:
- change `RemoteActorError.__repr__()` formatting to show (sub-type
  specific) `.msgdata` fields in a multi-line format (similar to our new
  `.msg.types.Struct` style) followed by some ascii accented delimiter
  lines to emphasize any `.msgdata["tb_str"]` packed by the remote
- for rme and subtypes allow picking the specifically relevant fields
  via a type defined `.reprol_fields: list[str]` and pick for each
  subtype:
   |_ `RemoteActorError.src_actor_uid`
   |_ `ContextCancelled.canceller`
   |_ `StreamOverrun.sender`

- add `.reprol()` to show a `repr()`-on-one-line formatted string that
  can be used by other multi-line-field-`repr()` styled composite types
  as needed in (high level) logging info.
- toss in some mod level `_body_fields: list[str]` for summary of such
  fields (if needed).
- add some new rae (remote-actor-error) props:
  - `.type` around a newly named `.boxed_type`
  - `.type_str: str`
  - `.tb_str: str`
2025-03-15 00:05:31 -04:00
Tyler Goodlet 2cc712cd81 Fix `Channel.__repr__()` safety, renames to `._transport`
Hit a reallly weird bug in the `._runtime` IPC msg handling loop where
it seems that by `str.format()`-ing a `Channel` before initializing it
would put the `._MsgTransport._agen()` in an already started state
causing an irrecoverable core startup failure..

I presume it's something to do with delegating to the
`MsgpackTCPStream.__repr__()` and, something something.. the
`.set_msg_transport(stream)` getting called to too early such that
`.msgstream.__init__()` is called thus init-ing the `._agen()` before
necessary? I'm sure there's a design lesson to be learned in here
somewhere XD

This was discovered while trying to add more "fancy" logging throughout
said core for the purposes of cobbling together an init attempt at
libp2p style multi-address representations for our IPC primitives. Thus
I also tinker here with adding some new fields to `MsgpackTCPStream`:
- `layer_key`: int = 4
- `name_key`: str = 'tcp'
- `codec_key`: str = 'msgpack'

Anyway, just changed it so that if `.msgstream` ain't set then we just
return a little "null repr" `str` value thinger.

Also renames `Channel.msgstream` internally to `._transport` with
appropriate pub `@property`s added such that everything else won't break
;p

Also drops `Optional` typing vis-a-vi modern union syntax B)
2025-03-15 00:05:31 -04:00
Tyler Goodlet c421f7e722 Make `NamespacePath` kinda support methods..
Obviously we can't deterministic-ally call `.load_ref()` (since you'd
have to point to an `id()` or something and presume a particular
py-runtime + virt-mem space for it to exist?) but it at least helps with
the `str` formatting for logging purposes (like `._cancel_rpc_tasks()`)
when `repr`-ing ctxs and their specific "rpc signatures".

Maybe in the future getting this working at least for singleton types
per process (like `Actor` XD ) will be a thing we can support and make
some sense of.. Bo
2025-03-15 00:05:31 -04:00
Tyler Goodlet 1c217ef36f Add #TODO for generating func-sig type-annots as `str` for pprinting 2025-03-14 22:49:38 -04:00
Tyler Goodlet d7f2f51f7f Bring in pretty-ified `msgspec.Struct` extension
Originally designed and used throughout `piker`, the subtype adds some
handy pprinting and field diffing extras often handy when viewing struct
types in logging or REPL console interfaces B)

Obvi this rejigs the `tractor.msg` mod into a sub-pkg and moves the
existing namespace obj-pointer stuff into a new `.msg.ptr` sub mod.
2025-03-14 22:49:21 -04:00
Tyler Goodlet a97b45d90b WIP final impl of ctx-cancellation-semantics 2025-03-14 22:18:31 -04:00
Tyler Goodlet a388d3185b Few more log msg tweaks in runtime 2025-03-14 22:18:31 -04:00
Tyler Goodlet 4d0df1bb4a Call `actor.cancel(None)` from root to avoid mismatch with (any future) meth sig changes 2025-03-14 22:18:31 -04:00
Tyler Goodlet 1be296c725 Add note that maybe `Context._eoc` should be set by caller? 2025-03-14 22:18:31 -04:00
Tyler Goodlet 9420ea0c14 Tweak `Actor` cancel method signatures
Besides improving a bunch more log msg contents similarly as before this
changes the cancel method signatures slightly with different arg names:

for `.cancel()`:
- instead of `requesting_uid: str` take in a `req_chan: Channel`
  since we can always just read its `.uid: tuple` for logging and
  further we can then offer the `chan=None` case indicating a
  "self cancel" (since there's no "requesting channel").
- the semantics of "requesting" here better indicate that the IPC connection
  is an IPC peer and further (eventually) will allow permission checking
  against given peers for cancellation requests.
- when `chan==None` we also define a meth-internal `requester_type: str`
  differently for logging content :)
- add much more detailed `.cancel()` content around the requester, its
  type, and any debugger related locking steps.

for `._cancel_task()`:
- change the `chan` arg to `parent_chan: Channel` since "parent"
  correctly indicates that the channel is the parent of the locally
  spawned rpc task to cancel; in fact no other chan should be able to
  cancel tasks parented/spawned by other channels obvi!
- also add more extensive meth-internal `.cancel()` logging with a #TODO
  around showing only the "relevant/lasest" `Context` state vars in such
  logging content.

for `.cancel_rpc_tasks()`:
- shorten `requesting_uid` -> `req_uid`.
- add `parent_chan: Channel` to be similar as above in `._cancel_task()`
  (since it's internally delegated to anyway) which replaces the prior
  `only_chan` and use it to filter to only tasks spawned by this channel
  (thus as their "parent") as before.
- instead of `if tasks:` to enter, invert and `return` early on
  `if not tasks`, for less indentation B)
- add WIP str-repr format (for `.cancel()` emissions) to show
  a multi-address (maddr) + task func (via the new `Context._nsf`) and
  report all cancel task targets with it a "tree"; include #TODO to
  finalize and implement some utils for all this!

To match ensure we adjust `process_messages()` self/`Actor` cancel
handling blocks to provide the new `kwargs` (now with `dict`-merge
syntax) to `._invoke()`.
2025-03-14 22:18:29 -04:00
Tyler Goodlet 51a3f1bef4 Add `pformat()` of `ActorNursery._children` to logging
Such that you see the children entries prior to exit instead of the
prior somewhat detail/use-less logging. Also, rename all `anursery` vars
to just `an` as is the convention in most examples.
2025-03-14 22:16:37 -04:00
Tyler Goodlet ca1b8e0224 Set any `._eoc` to the err in `_raise_from_no_key_in_msg()`
Since that's what we're now doing in `MsgStream._eoc` internal
assignments (coming in future patch), do the same in this exception
re-raise-helper and include more extensive doc string detailing all
the msg-type-to-raised-error cases. Also expose a `hide_tb: bool` like
we have already in `unpack_error()`.
2025-03-14 22:13:14 -04:00
Tyler Goodlet e403d63eb7 Better logging for cancel requests in IPC msg loop
As similarly improved in other parts of the runtime, adds much more
pedantic (`.cancel()`) logging content to indicate the src of remote
cancellation request particularly for `Actor.cancel()` and
`._cancel_task()` cases prior to `._invoke()` task scheduling. Also add
detailed case comments and much more info to the
"request-to-cancel-already-terminated-RPC-task" log emission to include
the `Channel` and `Context.cid` deats.

This helped me find the src of a race condition causing a test to fail
where a callee ctx task was returning a result *before* an expected
`ctx.cancel()` request arrived B). Adding much more pedantic
`.cancel()` msg contents around the requester's deats should ensure
these cases are much easier to detect going forward!

Also, simplify the `._invoke()` final result/error log msg to only put
*one of either* the final error or returned result above the `Context`
pprint.
2025-03-14 22:13:12 -04:00
Tyler Goodlet 3c385c6949 Use `NamespacePath` in `Context` mgmt internals
The only case where we can't is in `Portal.run_from_ns()` usage (since we
pass a path with `self:<Actor.meth>`) and because `.to_tuple()`
internally uses `.load_ref()` which will of course fail on such a path..

So or now impl as,
- mk `Actor.start_remote_task()` take a `nsf: NamespacePath` but also
  offer a `load_nsf: bool = False` such that by default we bypass ref
  loading (maybe this is fine for perf long run as well?) for the
  `Actor`/'self:'` case mentioned above.
- mk `.get_context()` take an instance `nsf` obvi.

More logging msg format tweaks:
- change msg-flow related content to show the `Context._nsf`, which,
  right, is coming follow up commit..
- bunch more `.runtime()` format updates to show `msg: dict` contents
  and internal primitives with trailing `'\n'` for easier reading.
- report import loading `stackscope` in subactors.
2025-03-14 22:11:57 -04:00
Tyler Goodlet b28df738fe Drop extra "
" when logging actor nursery errors
2025-03-14 21:49:15 -04:00
Tyler Goodlet 5fa040c7db Add `NamespacePath._ns` todo for `self:<ns.meth>` support 2025-03-14 21:49:15 -04:00
Tyler Goodlet 27b750e907 Emit warning on any `ContextCancelled.canceller == None` 2025-03-14 21:49:15 -04:00
Tyler Goodlet 338ea5529c .log: more multi-line styling 2025-03-14 16:41:08 -04:00
Tyler Goodlet 6bc67338cf Better subproc supervisor logging, todo for #320
Given i just similarly revamped a buncha `._runtime` log msg formatting,
might as well do something similar inside the spawning machinery such
that groking teardown sequences of each supervising task is much more
sane XD

Mostly this includes doing similar `'<field>: <value>\n'` multi-line
formatting when reporting various subproc supervision steps as well as
showing a detailed `trio.Process.__repr__()` as appropriate.

Also adds a detailed #TODO according to the needs of #320 for which
we're going to need some internal mechanism for intermediary parent
actors to determine if a given debug tty locker (sub-actor) is one of
*their* (transitive) children and thus stall the normal
cancellation/teardown sequence until that locker is complete.
2025-03-14 16:41:06 -04:00
Tyler Goodlet fd20004757 _supervise: iter nice expanded multi-line `._children` tups with typing 2025-03-14 16:34:17 -04:00
Tyler Goodlet ddc2e5f0f8 WIP: solved the modden client hang.. 2025-03-14 16:34:10 -04:00
Tyler Goodlet 4b0aa5e379 Baboso! fix `chan.send(None)` indent.. 2025-03-14 15:49:37 -04:00
Tyler Goodlet 6a303358df Improved log msg formatting in core
As part of solving some final edge cases todo with inter-peer remote
cancellation (particularly a remote cancel from a separate actor
tree-client hanging on the request side in `modden`..) I needed less
dense, more line-delimited log msg formats when understanding ipc
channel and context cancels from console logging; this adds a ton of
that to:
- `._invoke()` which now does,
  - better formatting of `Context`-task info as multi-line
    `'<field>: <value>\n'` messages,
  - use of `trio.Task` (from `.lowlevel.current_task()` for full
    rpc-func namespace-path info,
  - better "msg flow annotations" with `<=` for understanding
    `ContextCancelled` flow.
- `Actor._stream_handler()` where in we break down IPC peers reporting
  better as multi-line `|_<Channel>` log msgs instead of all jammed on
  one line..
- `._ipc.Channel.send()` use `pformat()` for repr of packet.

Also tweak some optional deps imports for debug mode:
- add `maybe_import_gb()` for attempting to import `greenback`.
- maybe enable `stackscope` tree pprinter on `SIGUSR1` if installed.

Add a further stale-debugger-lock guard before removal:
- read the `._debug.Lock.global_actor_in_debug: tuple` uid and possibly
  `maybe_wait_for_debugger()` when the child-user is known to have
  a live process in our tree.
- only cancel `Lock._root_local_task_cs_in_debug: CancelScope` when
  the disconnected channel maps to the `Lock.global_actor_in_debug`,
  though not sure this is correct yet?

Started adding missing type annots in sections that were modified.
2025-03-14 15:49:36 -04:00
Tyler Goodlet c85757aee1 Let `pack_error()` take a msg injected `cid: str|None` 2025-03-14 15:31:16 -04:00
Tyler Goodlet 9fc9b10b53 Add `StreamOverrun.sender: tuple` for better handling
Since it's generally useful to know who is the cause of an overrun (say
bc you want your system to then adjust the writer side to slow tf down)
might as well pack an extra `.sender: tuple[str, str]` actor uid field
which can be relayed through `RemoteActorError` boxing. Add an extra
case for the exc-type to `unpack_error()` to match B)
2025-03-14 14:14:54 -04:00
Tyler Goodlet a86275996c Offer `unpack_error(hid_tb: bool)` for `pdbp` REPL config 2025-03-14 14:14:54 -04:00
Tyler Goodlet b5431c0343 Never mask original `KeyError` in portal-error unwrapper, for now? 2025-03-14 14:14:54 -04:00
Tyler Goodlet cdee6f9354 Try allowing multi-pops of `_Cache.locks` for now? 2025-03-14 14:14:53 -04:00
Tyler Goodlet a2f1bcc23f Use `import <blah> as blah` over `__all__` in `.trionics` 2025-03-14 14:14:53 -04:00
Tyler Goodlet 45e9cb4d09 `_root`: drop unused `typing` import 2025-03-14 14:14:53 -04:00
Tyler Goodlet 27c5ffe5a7 Move missing-key-in-msg raiser to `._exceptions`
Since we use basically the exact same set of logic in
`Portal.open_context()` when expecting the first `'started'` msg factor
and generalize `._streaming._raise_from_no_yield_msg()` into a new
`._exceptions._raise_from_no_key_in_msg()` (as per the lingering todo)
which obvi requires a more generalized / optional signature including
a caller specific `log` obj. Obvi call the new func from all the other
modules X)
2025-03-14 14:14:50 -04:00
Tyler Goodlet 914efd80eb Fmt repr as multi-line style call 2025-03-14 14:14:11 -04:00
Tyler Goodlet 2d2d1ca1c4 Drop unused walrus assign of `re` 2025-03-14 14:14:11 -04:00
Tyler Goodlet 74aa5aa9cd `StackLevelAdapter._log(stacklevel: int)` for custom levels..
Apparently (and i don't know if this was always broken [i feel like no?]
or is a recent change to stdlib's `logging` stuff) we need increment the
`stacklevel` input by one for our custom level methods now? Without this
you're going to see the path to the method's-callstack-frame on every
emission instead of to the caller's. I first noticed this when debugging
the workspace layer spawning in `modden.bigd` and then verified it in
other depended projects..

I guess we should add some tests for this as well XD
2025-03-14 14:14:11 -04:00
Tyler Goodlet 44e386dd99 ._child: remove some unused imports.. 2025-03-14 13:56:25 -04:00
Tyler Goodlet 13fbcc723f Guarding for IPC failures in `._runtime._invoke()`
Took me longer then i wanted to figure out the source of
a failed-response to a remote-cancellation (in this case in `modden`
where a client was cancelling a workspace layer.. but disconnects before
receiving the ack msg) that was triggering an IPC error when sending the
error msg for the cancellation of a `Actor._cancel_task()`, but since
this (non-rpc) `._invoke()` task was trying to send to a now
disconnected canceller it was resulting in a `BrokenPipeError` (or similar)
error.

Now, we except for such IPC errors and only raise them when,
1. the transport `Channel` is for sure up (bc ow what's the point of
   trying to send an error on the thing that caused it..)
2. it's definitely for handling an RPC task

Similarly if the entire main invoke `try:` excepts,
- we only hide the call-stack frame from the debugger (with
  `__tracebackhide__: bool`) if it's an RPC task that has a connected
  channel since we always want to see the frame when debugging internal
  task or IPC failures.
- we don't bother trying to send errors to the context caller (actor)
  when it's a non-RPC request since failures on actor-runtime-internal
  tasks shouldn't really ever be reported remotely, only maybe raised
  locally.

Also some other tidying,
- this properly corrects for the self-cancel case where an RPC context
  is cancelled due to a local (runtime) task calling a method like
  `Actor.cancel_soon()`. We now set our own `.uid` as the
  `ContextCancelled.canceller` value so that other-end tasks know that
  the cancellation was due to a self-cancellation by the actor itself.
  We still need to properly test for this though!
- add a more detailed module doc-str.
- more explicit imports for `trio` core types throughout.
2025-03-14 13:56:23 -04:00
Tyler Goodlet 315f0fc7eb More thurough hard kill doc strings 2025-03-14 13:48:35 -04:00
Tyler Goodlet 0222180c11 Tweak `Channel._cancel_called` comment 2025-03-14 13:44:09 -04:00
Tyler Goodlet 7d5fda4485 Be ultra-correct in `Portal.open_context()`
This took way too long to get right but hopefully will give us grok-able
and correct context exit semantics going forward B)

The main fixes were:
- always shielding the `MsgStream.aclose()` call on teardown to avoid
  bubbling a `Cancelled`.
- properly absorbing any `ContextCancelled` in cases due to "self
  cancellation" using the new `Context.canceller` in the logic.
- capturing any error raised by the `Context.result()` call in the
  "normal exit, result received" case and setting it as the
  `Context._local_error` so that self-cancels can be easily measured via
  `Context.cancelled_caught` in same way as remote-error caused
  cancellations.
- extremely detailed comments around all of the cancellation-error cases
  to avoid ever getting confused about the control flow in the future XD
2025-03-14 13:44:08 -04:00
Tyler Goodlet f5fcd8ca2e Be mega-pedantic with `ContextCancelled` semantics
As part of extremely detailed inter-peer-actor testing, add much more
granular `Context` cancellation state tracking via the following (new)
fields:
- `.canceller: tuple[str, str]` the uuid of the actor responsible for
  the cancellation condition - always set by
  `Context._maybe_cancel_and_set_remote_error()` and replaces
  `._cancelled_remote` and `.cancel_called_remote`. If set, this value
  should normally always match a value from some `ContextCancelled`
  raised or caught by one side of the context.
- `._local_error` which is always set to the locally raised (and caller
  or callee task's scope-internal) error which caused any
  eventual cancellation/error condition and thus any closure of the
  context's per-task-side-`trio.Nursery`.
- `.cancelled_caught: bool` is now always `True` whenever the local task
  catches (or "silently absorbs") a `ContextCancelled` (a `ctxc`) that
  indeed originated from one of the context's linked tasks or any other
  context which raised its own `ctxc` in the current `.open_context()` scope.
  => whenever there is a case that no `ContextCancelled` was raised
  **in** the `.open_context().__aexit__()` (eg. `ctx.result()` called
  after a call `ctx.cancel()`), we still consider the context's as
  having "caught a cancellation" since the `ctxc` was indeed silently
  handled by the cancel requester; all other error cases are already
  represented by mirroring the state of the `._scope: trio.CancelScope`
  => IOW there should be **no case** where an error is **not raised** in
  the context's scope and `.cancelled_caught: bool == False`, i.e. no
  case where `._scope.cancelled_caught == False and ._local_error is not
  None`!
- always raise any `ctxc` from `.open_stream()` if `._cancel_called ==
  True` - if the cancellation request has not already resulted in
  a `._remote_error: ContextCancelled` we raise a `RuntimeError` to
  indicate improper usage to the guilty side's task code.
- make `._maybe_raise_remote_err()` a sync func and don't raise
  any `ctxc` which is matched against a `.canceller` determined to
  be the current actor, aka a "self cancel", and always set the
  `._local_error` to any such `ctxc`.
- `.side: str` taken from inside `.cancel()` and unused as of now since
  it might be better re-written as a similar `.is_opener() -> bool`?
- drop unused `._started_received: bool`..
- TONS and TONS of detailed comments/docs to attempt to explain all the
  possible cancellation/exit cases and how they should exhibit as either
  silent closes or raises from the `Context` API!

Adjust the `._runtime._invoke()` code to match:
- use `ctx._maybe_raise_remote_err()` in `._invoke()`.
- adjust to new `.canceller` property.
- more type hints.
- better `log.cancel()` msging around self-cancels vs. peer-cancels.
- always set the `._local_error: BaseException` for the "callee" task
  just like `Portal.open_context()` now will do B)

Prior we were raising any `Context._remote_error` directly and doing
(more or less) the same `ContextCancelled` "absorbing" logic (well
kinda) in block; instead delegate to the method
2025-03-14 13:42:55 -04:00
Tyler Goodlet 04217f319a Raise a `MessagingError` from the src error on msging edge cases 2025-03-14 13:42:15 -04:00
Tyler Goodlet 8cb8390201 Move `MessagingError` into `._exceptions` set 2025-03-14 13:42:15 -04:00
Tyler Goodlet 5035617adf Dump `.msgdata` in `RemoteActorError.__repr__()` 2025-03-14 13:42:15 -04:00
Tyler Goodlet f895c96600 Add masked super timeout line to `do_hard_kill()` for would-be runtime hackers 2025-03-14 13:40:19 -04:00
Tyler Goodlet a7c36a9cbe Tidy/clarify another `._runtime` comment 2025-03-14 13:40:19 -04:00
Tyler Goodlet 22e4b324b1 Get mega-pedantic in `Portal.open_context()`
Specifically in the `.__aexit__()` phase to ensure remote,
runtime-internal, and locally raised error-during-cancelled-handling
exceptions are NEVER masked by a local `ContextCancelled` or any
exception group of `trio.Cancelled`s.

Also adds a ton of details to doc strings including extreme detail
surrounding the `ContextCancelled` raising cases and their processing
inside `.open_context()`'s exception handler blocks.

Details, details:
- internal rename `err`/`_err` stuff to just be `scope_err` since it's
  effectively the error bubbled up from the context's surrounding (and
  cross-actor) "scope".
- always shield `._recv_chan.aclose()` to avoid any `Cancelled` from
  masking the `scope_err` with a runtime related `trio.Cancelled`.
- explicitly catch the specific set of `scope_err: BaseException` that
  we can reasonably expect to handle instead of the catch-all parent
  type including exception groups, cancels and KBIs.
2025-03-14 13:40:18 -04:00
Tyler Goodlet 89ed8b67ff Drop `msg` kwarg from `Context.cancel()`
Well first off, turns out it's never used and generally speaking
doesn't seem to help much with "runtime hacking/debugging"; why would
we need to "fabricate" a msg when `.cancel()` is called to self-cancel?

Also (and since `._maybe_cancel_and_set_remote_error()` now takes an
`error: BaseException` as input and thus expects error-msg unpacking
prior to being called), we now manually set `Context._cancel_msg: dict`
just prior to any remote error assignment - so any case where we would
have fabbed a "cancel msg" near calling `.cancel()`, just do the manual
assign.

In this vein some other subtle changes:
- obviously don't set `._cancel_msg` in `.cancel()` since it's no longer
  an input.
- generally do walrus-style `error := unpack_error()` before applying
  and setting remote error-msg state.
- always raise any `._remote_error` in `.result()` instead of returning
  the exception instance and check before AND after the underlying mem
  chan read.
- add notes/todos around `raise self._remote_error from None` masking of
  (runtime) errors in `._maybe_raise_remote_err()` and use it inside
  `.result()` since we had the inverse duplicate logic there anyway..

Further, this adds and extends a ton of (internal) interface docs and
details comments around the `Context` API including many subtleties
pertaining to calling `._maybe_cancel_and_set_remote_error()`.
2025-03-14 13:37:55 -04:00
Tyler Goodlet 11bbf15817 `._exceptions`: typing and error unpacking updates
Bump type annotations to 3.10+ style throughout module as well as fill
out doc strings a bit. Inside `unpack_error()` pop any `error_dict: dict`
and,
- return `None` early if not found,
- versus pass directly as `**error_dict` to the error constructor
  instead of a double field read.
2025-03-14 13:36:16 -04:00
Tyler Goodlet a18663213a Add comments around diff between `C/context` refs 2025-03-14 13:36:16 -04:00
Tyler Goodlet d4d09b6071 Factor non-yield stream msg processing into helper
Since both `MsgStream.receive()` and `.receive_nowait()` need the same
raising logic when a non-stream msg arrives (so that maybe an
appropriate IPC translated error can be raised) move the `KeyError`
handler code into a new `._streaming._raise_from_no_yield_msg()` func
and call it from both methods to make the error-interface-raising
symmetrical across both methods.
2025-03-14 13:36:16 -04:00
Tyler Goodlet 6d10f0c516 Always raise remote (cancelled) error if set
Previously we weren't raising a remote error if the local scope was
cancelled during a call to `Context.result()` which is problematic if
the caller WAS NOT the requester for said remote cancellation; in that
case we still want a `ContextCancelled` raised with the `.canceller:
str` set to the cancelling actor uid.

Further fix a naming bug where the (seemingly older) `._remote_err` was
being set to such an error instead of `._remote_error` XD
2025-03-14 13:36:16 -04:00
Tyler Goodlet fa9b57bae0 Write more comprehensive `Portal.cancel_actor()` doc str 2025-03-14 13:36:16 -04:00
Tyler Goodlet 144d1f4d94 Msg-ified `ContextCancelled`s sub-error type should always be just, its type.. 2025-03-14 13:36:16 -04:00
Tyler Goodlet cff69d07fe Mk `gather_contexts()` support `@acm`s yielding `None`
We were using a `all(<yielded values>)` condition which obviously won't
work if the batched managers yield any non-truthy value. So instead see
the `unwrapped: dict` with the `id(mngrs)` and only unblock once all
values have been filled in to be something that is not that value.
2025-03-14 13:36:16 -04:00
Tyler Goodlet ee94d6d62c Teensie tidy up on actor doc string 2025-03-14 13:36:16 -04:00
Tyler Goodlet 89b84ed6c0 Make `NamespacePath` work on object refs
Detect if the input ref is a non-func (like an `object` instance) in
which case grab its type name using `type()`. Wrap all the name-getting
into a new `_mk_fqpn()` static meth: gets the "fully qualified path
name" and returns path and name in tuple; port other methds to use it.
Refine and update the docs B)
2025-03-14 13:36:16 -04:00
Tyler Goodlet f33f689f34 .log: more correct handling for `get_logger(__name__)` usage 2025-03-14 13:36:16 -04:00
Tyler Goodlet 7507e269ec Just import `mp` top level in `._spawn` 2023-06-14 15:32:15 -04:00
Tyler Goodlet 17ae449160 Tidy up `typing` imports in broadcaster mod 2023-06-14 15:31:52 -04:00
Tyler Goodlet 6495688730 Drop `Optional` style from runtime mod 2023-05-25 16:00:05 -04:00
Tyler Goodlet a0276f41c2 Remote cancellation runtime-internal vars renames
- `Context._cancel_called_remote` -> `._cancelled_remote` since "called"
  implies the cancellation was "requested" when it could be due to
  another error and the actor uid is the value - only set once the far
  end task scope is terminated due to either error or cancel, which has
  nothing to do with *what* caused the cancellation.
- `Actor._cancel_called_remote` -> `._cancel_called_by_remote` which
  emphasizes that this variable is **only set** IFF some remote actor
  **requested that** this actor's runtime be cancelled via
  `Actor.cancel()`.
2023-05-19 14:31:55 -04:00
Tyler Goodlet ead9e418de Expose `allow_overruns` to `Portal.open_context()`
Turns out you can get a case where you might be opening multiple
ctx-streams concurrently and during the context opening phase you block
for all contexts to open, but then when you eventually start opening
streams some slow to start context has caused the others become in an
overrun state.. so we need to let the caller control whether that's an
error ;)

This also needs a test!
2023-05-15 10:00:45 -04:00
Tyler Goodlet 60791ed546 Oof, fix remaining `Actor.cancel()` in `Actor._from_parent()` 2023-05-15 10:00:45 -04:00
Tyler Goodlet 7293b82bcc Tweak doc string 2023-05-15 10:00:45 -04:00
Tyler Goodlet 20d75ff934 Move move context code into new `._context` mod 2023-05-15 10:00:45 -04:00
Tyler Goodlet 04e4397a8f Ignore drainer-task nursery RTE during context exit 2023-05-15 10:00:45 -04:00
Tyler Goodlet 968f13f9ef Set `Context._scope_nursery` on callee side too
Because obviously we probably want to support `allow_overruns` on the
remote callee side as well XD

Only found the bugs fixed in this patch this thanks to writing a much
more exhaustive test set for overrun cases B)
2023-05-15 10:00:45 -04:00
Tyler Goodlet f9911c22a4 Seriously cover all overrun cases
This actually caught further runtime bugs so it's gud i tried..
Add overrun-ignore enabled / disabled cases and error catching for all
of them. More or less this should cover every possible outcome when
it comes to setting `allow_overruns: bool` i hope XD
2023-05-15 10:00:45 -04:00
Tyler Goodlet 6db656fecf Flip allocate log msgs to debug 2023-05-15 10:00:45 -04:00
Tyler Goodlet c72026091e Remote `Context` cancellation semantics rework B)
This adds remote cancellation semantics to our `tractor.Context`
machinery to more closely match that of `trio.CancelScope` but
with operational differences to handle the nature of parallel tasks interoperating
across multiple memory boundaries:

- if an actor task cancels some context it has opened via
  `Context.cancel()`, the remote (scope linked) task will be cancelled
  using the normal `CancelScope` semantics of `trio` meaning the remote
  cancel scope surrounding the far side task is cancelled and
  `trio.Cancelled`s are expected to be raised in that scope as per
  normal `trio` operation, and in the case where no error is raised
  in that remote scope, a `ContextCancelled` error is raised inside the
  runtime machinery and relayed back to the opener/caller side of the
  context.
- if any actor task cancels a full remote actor runtime using
  `Portal.cancel_actor()` the same semantics as above apply except every
  other remote actor task which also has an open context with the actor
  which was cancelled will also be sent a `ContextCancelled` **but**
  with the `.canceller` field set to the uid of the original cancel
  requesting actor.

This changeset also includes a more "proper" solution to the issue of
"allowing overruns" during streaming without attempting to implement any
form of IPC streaming backpressure. Implementing task-granularity
backpressure cross-process turns out to be more or less impossible
without augmenting out streaming protocol (likely at the cost of
performance). Further allowing overruns requires special care since
any blocking of the runtime RPC msg loop task effectively can block
control msgs such as cancels and stream terminations.

The implementation details per abstraction layer are as follows.

._streaming.Context:
- add a new contructor factor func `mk_context()` which provides
  a strictly private init-er whilst allowing us to not have to define
  an `.__init__()` on the type def.
- add public `.cancel_called` and `.cancel_called_remote` properties.
- general rename of what was the internal `._backpressure` var to
  `._allow_overruns: bool`.
- move the old contents of `Actor._push_result()` into a new
  `._deliver_msg()` allowing for better encapsulation of per-ctx
  msg handling.
 - always check for received 'error' msgs and process them with the new
   `_maybe_cancel_and_set_remote_error()` **before** any msg delivery to
   the local task, thus guaranteeing error and cancellation handling
   despite any overflow handling.
- add a new `._drain_overflows()` task-method for use with new
  `._allow_overruns: bool = True` mode.
 - add back a `._scope_nursery: trio.Nursery` (allocated in
   `Portal.open_context()`) who's sole purpose is to spawn a single task
   which runs the above method; anything else is an error.
 - augment `._deliver_msg()` to start a task and run the above method
   when operating in no overrun mode; the task queues overflow msgs and
   attempts to send them to the underlying mem chan using a blocking
   `.send()` call.
 - on context exit, any existing "drainer task" will be cancelled and
   remaining overflow queued msgs are discarded with a warning.
- rename `._error` -> `_remote_error` and set it in a new method
  `_maybe_cancel_and_set_remote_error()` which is called before
  processing
- adjust `.result()` to always call `._maybe_raise_remote_err()` at its
  start such that whenever a `ContextCancelled` arrives we do logic for
  whether or not to immediately raise that error or ignore it due to the
  current actor being the one who requested the cancel, by checking the
  error's `.canceller` field.
 - set the default value of `._result` to be `id(Context()` thus avoiding
   conflict with any `.result()` actually being `False`..

._runtime.Actor:
- augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to
  take a `requesting_uid: tuple` indicating the source actor of every
  cancellation request.
- pass through the new `Context._allow_overruns` through `.get_context()`
- call the new `Context._deliver_msg()` from `._push_result()` (since
  the factoring out that method's contents).

._runtime._invoke:
- `TastStatus.started()` back a `Context` (unless an error is raised)
  instead of the cancel scope to make it easy to set/get state on that
  context for the purposes of cancellation and remote error relay.
- always raise any remote error via `Context._maybe_raise_remote_err()`
  before doing any `ContextCancelled` logic.
- assign any `Context._cancel_called_remote` set by the `requesting_uid`
  cancel methods (mentioned above) to the `ContextCancelled.canceller`.

._runtime.process_messages:
- always pass a `requesting_uid: tuple` to `Actor.cancel()` and
  `._cancel_task` to that any corresponding `ContextCancelled.canceller`
  can be set inside `._invoke()`.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 90e41016b9 Only tuplize `.canceller` if non-`None` 2023-05-15 10:00:45 -04:00
Tyler Goodlet f54c415060 Move `NoRuntime` import inside `current_actor()` to avoid cycle 2023-05-15 10:00:45 -04:00
Tyler Goodlet 67f82c6ebd Add new remote error introspection attrs
To handle both remote cancellation this adds `ContextCanceled.canceller:
tuple` the uid of the cancel requesting actor and is expected to be set
by the runtime when servicing any remote cancel request. This makes it
possible for `ContextCancelled` receivers to know whether "their actor
runtime" is the source of the cancellation.

Also add an explicit `RemoteActor.src_actor_uid` which better formalizes
the notion of "which remote actor" the error originated from.

Both of these new attrs are expected to be packed in the `.msgdata` when
the errors are loaded locally.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 220b244508 Log waiter task cancelling msg as cancel-level 2023-05-15 10:00:45 -04:00
Tyler Goodlet 831790377b Assign `RemoteActorError` boxed error type for context cancelleds 2023-05-15 10:00:45 -04:00
Tyler Goodlet e80e0a551f Change a bunch of log levels to cancel, including any `ContextCancelled` handling 2023-05-15 10:00:45 -04:00
Tyler Goodlet b3f9251eda Add some log-level method doc-strings 2023-05-15 10:00:45 -04:00
Tyler Goodlet 903537ce04 Tweak context doc str 2023-05-15 10:00:45 -04:00
Tyler Goodlet d75343106b More single doc-strs in discovery mod 2023-05-15 10:00:45 -04:00
Tyler Goodlet cfb2bc0fee Enable `Context` backpressure by default; avoid startup race-crashes? 2023-05-15 10:00:45 -04:00
Tyler Goodlet 1c3893a383 Drop commented `pdbpp` import logic 2023-05-15 09:01:55 -04:00
Tyler Goodlet 79622bbeea Restore `breakpoint()` hook after runtime exits
Previously we were leaking our (pdb++) override into the Python runtime
which would always result in a runtime error whenever `breakpoint()` is
called outside our runtime; after exit of the root actor . This
explicitly restores any previous hook override (detected during startup)
or deletes the hook and restores the environment if none existed prior.

Also adds a new WIP debugging example script to ensure breakpointing
works as normal after runtime close; this will be added to the test
suite.
2023-05-15 00:47:29 -04:00
Tyler Goodlet 95535b2226 Some more 3.10+ optional type sigs 2023-05-15 00:47:29 -04:00
Tyler Goodlet ae4ff5dc8d pdbp: adding typing to config settings vars 2023-05-14 22:38:46 -04:00
Tyler Goodlet 705538398f `pdbp`: turn off line truncating by default, fixes terminal resizing stuff 2023-05-14 22:38:16 -04:00
Tyler Goodlet 86aef5238d Hide actor nursery exit frame 2023-05-14 21:24:26 -04:00
Tyler Goodlet cc82447db6 First try: switch debug machinery over to `pdbp` B) 2023-05-14 21:24:26 -04:00
Tyler Goodlet 23cffbd940 Use multiline import for debug mod 2023-05-14 21:24:26 -04:00
Tyler Goodlet f667d16d66 Copy the now deprecated `trio.Process.aclose()`
Move it into our `_spawn.do_hard_kill()` since we do indeed rely on
the particular process killing sequence on "soft kill" failure cases.
2023-05-14 19:31:50 -04:00
Tyler Goodlet 24a062341e Just call `trio.Process.aclose()` directly for now? 2023-04-02 14:34:41 -04:00
Tyler Goodlet 8637778739 Expose `raise_on_lag: bool` flag through factory 2023-01-30 12:18:23 -05:00
Tyler Goodlet 47166e45f0 Be explicit with passthrough kwargs (there's so few) 2023-01-29 17:31:21 -05:00
Tyler Goodlet 4ce2dcd12b Switch back to raising `Lagged` by default
Makes the broadcast test suite not hang xD, and is our expected default
behaviour. Also removes a ton of commented legacy cruft from before the
refactor to remove the `.receive()` recursion and fixes some typing.

Oh right, and in the case where there's only one subscriber left we warn
log about it since in theory we could actually entirely unwind the
bcaster back to the original underlying, though not sure if that's sane
or works for some use cases (like wanting to have some other subscriber
get added dynamically later).
2023-01-29 15:03:34 -05:00
Tyler Goodlet 80f983818f Ignore monkey patched `.send()` type annot 2023-01-29 15:03:34 -05:00
Tyler Goodlet 6ba29f8d56 Recurse and get the last value when in warn mode 2023-01-29 15:03:34 -05:00
Tyler Goodlet 2707a0e971 Add `._raise_on_lag` flag to disable `Lag` raising 2023-01-29 15:03:34 -05:00
Tyler Goodlet 9f9907271b Merge `ReceiveMsgStream` and `MsgStream`
Since one-way streaming can be accomplished by just *not* sending on one
side (and/or thus wrapping such usage in a more restrictive API), we
just drop the recv-only parent type. The only method different was
`MsgStream.send()`, now merged in. Further in usage of `.subscribe()`
we monkey patch the underlying stream's `.send()` onto the delivered
broadcast receiver so that subscriber tasks can two-way stream as though
using the stream directly.

This allows us to more definitively drop `tractor.open_stream_from()` in
the longer run if we so choose as well; note currently this will
potentially create an issue if a caller tries to `.send()` on such a one
way stream.
2023-01-29 15:03:34 -05:00
Tyler Goodlet c2367c1c5e Better `trio`-ize `BroadcastReceiver` internals
Driven by a bug found in `piker` where we'd get an inf recursion error
due to `BroadcastReceiver.receive()` being called when consumer tasks
are awoken but no value is ready to `.nowait_receive()`.

This new rework takes an approach closer to the interface and internals
of `trio.MemoryReceiveChannel` particularly in terms of,

- implementing a `BroadcastReceiver.receive_nowait()` and using it
  within the async `.receive()`.
- failing over to an internal `._receive_from_underlying()` when the
  `_nowait()` call raises `trio.WouldBlock`.
- adding `BroadcastState.statistics()` for debugging and testing
  dropping recursion from `.receive()`.
2023-01-29 15:03:34 -05:00
Tyler Goodlet 13c9eadc8f Move result log msg up and drop else block 2023-01-29 14:55:02 -05:00
Tyler Goodlet aa4871b13d Call `MsgStream.aclose()` in `Context.open_stream.__aexit__()`
We weren't doing this originally I *think* just because of the path
dependent nature of the way the code was developed (originally being
mega pedantic about one-way vs. bidirectional streams) but, it doesn't
seem like there's any issue just calling the stream's `.aclose()`; also
have the benefit of just being less code and logic checks B)
2023-01-29 14:55:02 -05:00
Tyler Goodlet 556f4626db Tweak warning msg for still-alive-after-cancelled actor 2023-01-29 14:55:02 -05:00
Tyler Goodlet df01294bb2 Show more functiony syntax in ctx-cancelled log msgs 2023-01-29 14:55:02 -05:00
Tyler Goodlet ddf3d0d1b3 Show tracebacks for un-shipped/propagated errors 2023-01-29 14:55:02 -05:00
Tyler Goodlet 97d5f7233b Fix uid2nursery lookup table type annot 2023-01-29 14:55:02 -05:00
Tyler Goodlet d27c081a15 Ensure arbiter sockaddr type before usage 2023-01-29 14:55:02 -05:00
Tyler Goodlet a4874a3227 Always set the `parent_exit: trio.Event` on exit 2023-01-29 14:55:02 -05:00
Tyler Goodlet de04bbb2bb Don't raise on a broken IPC-context when sending stop msg 2023-01-29 14:55:02 -05:00
Tyler Goodlet 4f977189c0 Handle broken mem chan on `Actor._push_result()`
When backpressure is used and a feeder mem chan breaks during msg
delivery (usually because the IPC allocating task already terminated)
instead of raising we simply warn as we do for the non-backpressure
case.

Also, add a proper `Actor.is_arbiter` test inside `._invoke()` to avoid
doing an arbiter-registry lookup if the current actor **is** the
registrar.
2023-01-29 14:55:02 -05:00
Tyler Goodlet 121a8cc891 Drop `Optional` usage from root mod 2023-01-26 16:00:08 -05:00
Tyler Goodlet c54b8ca4ba Begin deprecation of `arbiter_addr` -> `registry_addr` 2023-01-26 16:00:08 -05:00
Tyler Goodlet 5b8a87d0f6 Slightly better `xonsh` check hack, fix typing 2023-01-26 15:48:15 -05:00
Tyler Goodlet 2e278ceb74 Add a super hacky check for `xonsh`, smh.. 2023-01-26 15:26:43 -05:00
Tyler Goodlet dba8118553 Always attempt prompt redraw on ctl-c in REPL
The stdlib has all sorts of muckery with ignoring SIGINT in the
`Pdb._cmdloop()` but here we just override all that since we don't trust
their decisions about cancellation handling whatsoever. Adds
a `Lock.repl: MultiActorPdb` attr which is set by any task which
acquires root TTY lock indicating (via actor global state) that the
current actor is using the debugger REPL and can be expected to re-draw
the prompt on SIGINT. Further we mask out log messages from any actor
who also has the `shield_sigint_handler()` enabled to avoid logging
noise when debugging.
2023-01-26 12:44:13 -05:00
Tyler Goodlet fca2e7c10e Simplify closed abruptly log msg 2023-01-26 12:44:13 -05:00
Tyler Goodlet 5ed62c5c54 Add note about intermediary-actor in debug issue 2023-01-26 12:44:13 -05:00
Tyler Goodlet 6c8cacc9d1 Adjust all default is `None` annots (per new `mypy`) 2022-12-12 13:18:22 -05:00
Tyler Goodlet 38326e8c15 Avoid error on context double pops 2022-12-11 23:46:33 -05:00
Tyler Goodlet b5192cca8e Always greedily `list`-cast`mngrs` input sequence 2022-12-11 23:20:58 -05:00
Tyler Goodlet c606be8c64 Passthrough runtime kwargs from `open_actor_cluster()` 2022-12-11 19:56:08 -05:00
Tyler Goodlet f2641c8964 Avoid "task never called `.started()`" runtime erros when cancelling 2022-10-14 19:42:23 -04:00
Tyler Goodlet f39414ce12 Drop error-repacking for `.run_in_actor()`s block
If we pack the nursery parent task's error into the `errors` table
directly in the handler, we don't need to specially handle packing that
same error into any exception group raised while handling sub-actor
cancellation; drops some ugly indentation ;)
2022-10-14 19:42:23 -04:00
Tyler Goodlet e298b70edf Drop added `.pdp()` level msgs used duringn dev 2022-10-14 19:42:23 -04:00
Tyler Goodlet 38f9d35dee Fix errors table type annot 2022-10-14 19:42:23 -04:00
Tyler Goodlet 88448f7281 Fix handler type annot 2022-10-14 19:42:23 -04:00
Tyler Goodlet 0956d5f461 Restore the `trio` SIGINT handler, cancel root lock tasks on no-peers
Pretty sure this is the final touch to alleviate all our debug lock
headaches! Instead of trying to revert to the "last" handler (as `pdb`
does internally in the stdlib) we always just revert to the handler
`trio` registers during startup. Further this seems to allow cancelling
the root-side locking task if it's detected as stale IFF we only do this
when the root actor is in a "no more IPC peers" state.

Deatz:
- (always) set `._debug.Lock._trio_handler` as the `trio` version, not
  some last used handler to make sure we're getting the ctrl-c handling
  we want when not in debug mode.
- assign the trio handler in `open_root_actor()`
  `._runtime._async_main()` to be sure it's applied in subactors as well
  as the root.
- only do debug lock blocking and root-side-locking-task cancels when
  a "no peers" condition is detected in the root actor: i.e. no IPC
  channels are detected by the root meaning it's impossible any actor
  has a sane lock-state ongoing for debug mode.
2022-10-14 18:18:01 -04:00
Tyler Goodlet 33f2234baf Hide some stack layers the user doesn't really need to see 2022-10-14 18:18:01 -04:00
Tyler Goodlet 7521bded3d Pack error from the parent task into the actor nursery 2022-10-14 18:16:51 -04:00
Tyler Goodlet 50fe098e06 First pass, swap `MultiError` for `BaseExceptionGroup` 2022-10-14 18:16:51 -04:00
Tyler Goodlet 98056f6ed7 Move logging context map into `log.py` module 2022-10-12 12:46:20 -04:00
Tyler Goodlet b81b6be98a Drop extra log msgs, some old commented code 2022-10-12 12:35:35 -04:00
Tyler Goodlet fb721f36ef Support debug-lock blocking, use on no-more IPC
This is a lingering debugger locking race case we needed to handle:

- child crashes acquires TTY lock in root and attaches to `pdb`
- child IPC goes down such that all channels to the root are broken
  / non-functional.
- root is stuck thinking the child is still in debug even though it
  can't be contacted and the child actor machinery hasn't been
  cancelled by its parent.
- root get's stuck in deadlock with child since it won't send a cancel
  request until the child is finished debugging, but the child can't
  unlock the debugger bc IPC is down.

To avoid this scenario add debug lock blocking list via
`._debug.Lock._blocked: set[tuple]` which holds actor uids for any actor
that is detected by the root as having no transport channel connections
with said root (of which at least one should exist if this sub-actor at
some point acquired the debug lock). The root consequently checks this
list for any actor that tries to (re)acquire the lock and blocks with
a `ContextCancelled`. When a debug condition is tested in
`._runtime._invoke` the context's `._enter_debugger_on_cancel` which
is set to `False` if the actor is on the block list in which case the
post-mortem entry is skipped.

Further this adds a root-locking-task side cancel scope to
`Lock._root_local_task_cs_in_debug` which can be cancelled by the root
runtime when a stale lock is detected after all IPC channels for the
actor have been torn down. NOTE: right now we're NOT doing this since it
seems to cause test failures likely due because it may cause pre-mature
cancellation and maybe needs a bit more experimenting?
2022-10-11 20:00:05 -04:00
Tyler Goodlet 734d8dd663 Move `trio` scope outside first inter-task-chan receive 2022-10-11 20:00:05 -04:00
Tyler Goodlet 1c480e6c92 Add `Context` cancel message and debug toggle flag
In the case of a callee-side context cancelling itself it can be handy
to let the caller-side task know (even if through logging) that the
cancel was due to some known reason. Make `.cancel()` accept such
a message on the callee side and have it included in the
`._runtime._invoke()` raised `ContextCancelled` emission.

Also add a `Context._trigger_debugger_on_cancel: bool` flag which can be
set to `False` to avoid the debugger post-mortem crash mode from
engaging on cross-context tasks which cancel themselves for a known
reason (as is needed for blocked tasks in the debug TTY-lock machinery).
2022-10-11 20:00:05 -04:00
Tyler Goodlet 44b59f3338 Go back to a `global` single-ton nursery per actor
Turns out the lifetime mgmt of separate nurseries per delegate manager
is tricky; a new nursery can't be naively allocated on cache-misses since
it may get closed by some early terminating task instead of by the "last
using" consumer task. In theory if we allocate using the same logic as
that used for the last-task-triggers-exit then this should work?

For now just go back to a single global nursery per `_Cache` which still
avoids use of the internal actor service nursery.
2022-10-09 21:27:23 -04:00
Tyler Goodlet 7a719ac2a7 Use one nursery per unique manager (signature)
Instead of sticking all `trionics.maybe_open_context()` tasks inside the
actor's (root) service nursery, open a unique one per manager function
instance (id).

Further, accept a callable for the `key` such that a user can have
more flexible control on the caching logic and move the
`maybe_open_nursery()` helper out of the portal mod and into this
trionics "managers" module.
2022-10-09 21:27:23 -04:00
Tyler Goodlet d24fae8381 'Rename mp spawn methods to have a `'mp_'` prefix' 2022-10-09 17:54:55 -04:00
Tyler Goodlet 5ab98513b7 Move `@tractor_test` into `conftest.py` 2022-10-09 17:14:20 -04:00
Tyler Goodlet 90f4912580 Organize process spawning into lookup table
Instead of the logic branching create a table `._spawn._methods`
which is used to lookup the desired backend framework (in this case
still only one of `multiprocessing` or `trio`) and make the top level
`.new_proc()` do the lookup and any common logic. Use a `typing.Literal`
to define the lookup table's key set.

Repair and ignore a bunch of type-annot related stuff todo with `mypy`
updates and backend-specific process typing.
2022-10-09 16:51:21 -04:00
Tyler Goodlet 15047341bd Ignore forserver override attrs with `mypy` 2022-10-09 16:14:11 -04:00
Tyler Goodlet e609183242 Expose lifetime stack as class attr, add base test suite 2022-09-15 23:50:15 -04:00
Tyler Goodlet 10eeda2d2b Use built-ins for all data-structure-type annotations 2022-09-15 23:41:28 -04:00
Tyler Goodlet ad19bf2cf1 Remove `tractor.run()` once and for all
It's been deprecated for a while now and all docs and tests have been
changed.

Closes #183
2022-09-15 23:41:28 -04:00
Tyler Goodlet 9aef03772a Expose `Actor` at pkg level, adjust debug type annots 2022-09-15 23:41:28 -04:00
Tyler Goodlet 7548dba8f2 Change to new doc string style 2022-09-15 23:41:28 -04:00
Tyler Goodlet 208d56af2c Make `async_main()` a module func 2022-09-15 23:41:28 -04:00
Tyler Goodlet a3a5bc267e Make `process_messages()` a mod func 2022-09-15 23:41:28 -04:00
Tyler Goodlet d4084b2032 Rename our core module to `_runtime` 2022-09-15 23:41:28 -04:00
Tyler Goodlet bafd10a260 Make `maybe_open_context()` re-entrant safe, use per factory locks 2022-09-15 19:02:02 -04:00
Tyler Goodlet 5ad540c417 Add debug complete event `None`-guard for when already reset 2022-09-15 19:02:02 -04:00
Tyler Goodlet 8f1fe2376a Simplify all hooks to a common `Lock.release()` 2022-08-02 18:14:05 -04:00
Tyler Goodlet 650313dfef Drop legacy handler blocks factored into `_acquire_debug_lock()` 2022-08-02 12:50:27 -04:00
Tyler Goodlet e4006da6f4 Drop `pdbpp` bug notes, add follow up issue #320 note 2022-08-02 12:48:40 -04:00
Tyler Goodlet 7f6169a050 Drop legacy commented/todo remote debug helper block 2022-08-02 12:43:14 -04:00
Tyler Goodlet 02c3b9a672 Put `pygments` back to default 2022-08-02 12:17:34 -04:00
Tyler Goodlet c5c7a9027c Line len lint and drop rpc log msg level again 2022-08-02 12:17:34 -04:00
Tyler Goodlet 937ed99e39 Factor sigint overriding into lock methods 2022-08-02 12:17:28 -04:00
Tyler Goodlet 91f034a136 Move all module vars into a `Lock` type 2022-08-02 12:17:28 -04:00
Tyler Goodlet 6f01c78122 Disable `pygments` highlighting on ctlc tests 2022-08-02 12:17:28 -04:00
Tyler Goodlet c0cd99e374 Timeout on arbiter ping, avoid TCP SYN hangs in CI? 2022-08-02 12:17:28 -04:00
Tyler Goodlet b01daa5319 Factor lock-state release logic into helper
The common logic to both remove our custom SIGINT handler as well
as signal the actor global event that pdb is complete. Call this
whenever we exit a post mortem call and thus any time some rpc task
get's debugged inside `._actor._invoke()`.

Further, we have to manually print the REPL prompt on 3.9 for some wack
reason, so stick a version guard in the sigint handler for that..
2022-08-02 12:17:28 -04:00
Tyler Goodlet bd362a05f0 Run release hook around `next` repl commands as well 2022-08-02 12:17:28 -04:00
Tyler Goodlet b21f2e16ad Always consider the debugger when exiting contexts
When in an uncertain teardown state and in debug mode a context can be
popped from actor runtime before a child finished debugging (the case
when the parent is tearing down but the child hasn't closed/completed
its tty lock IPC exit phase) and the child sends the "stop" message to
unlock the debugger but it's ignored bc the parent has already dropped
the ctx. Instead we call `._debug.maybe_wait_for_deugger()` before these
context removals to avoid the root getting stuck thinking the lock was
never released.

Further, add special `Actor._cancel_task()` handling code inside
`_invoke()` which continues to execute the method despite the IPC
channel to the caller being broken and thus avoiding potential hangs due
to a target (child) actor task remaining alive.
2022-08-02 12:17:28 -04:00
Tyler Goodlet ba7b355d9c Add note about default behaviour of `fancycompleter` 2022-08-02 12:17:28 -04:00
Tyler Goodlet ef8dc0204c Just drop all longlisting for now and leave comments 2022-08-02 12:17:28 -04:00
Tyler Goodlet a101971027 Go back to original longlist code 2022-08-02 12:17:28 -04:00
Tyler Goodlet 835836123b Just don't call longlist on 3.10+ for now 2022-08-02 12:17:28 -04:00
Tyler Goodlet b9eb601265 General typing fixes for `mypy` 2022-08-02 12:17:27 -04:00
Tyler Goodlet 4dcc21234e Only call `.poll()` if a method on the spawn backend 2022-08-02 12:17:27 -04:00
Tyler Goodlet 8b9f342eef Port to new `.lowlevel.open_process()` API 2022-08-02 12:17:27 -04:00
Tyler Goodlet a90ca4b384 Call longlist normally when on py < 3.10 2022-08-02 12:17:06 -04:00
Tyler Goodlet d0dcd55f47 Only report disconnected actors if proc is still alive? 2022-08-02 12:17:06 -04:00
Tyler Goodlet 519f4c300b I dunno, seems like `breakpoint()` needs this? 2022-08-02 12:17:06 -04:00
Tyler Goodlet ff3f5959e9 Always enable debug level logging if mode enabled 2022-08-02 12:16:58 -04:00
Tyler Goodlet abb00531d3 Add help msg for non `__main__` modules as well 2022-08-02 12:16:58 -04:00
Tyler Goodlet 18c525d2f1 Hack around double long list print issue..
See https://github.com/pdbpp/pdbpp/issues/496
2022-08-02 12:16:58 -04:00
Tyler Goodlet e2453fd3da Add spaces before values in log msg 2022-08-02 12:16:58 -04:00
Tyler Goodlet b29def8b5d Add runtime level msg around channel draining 2022-08-02 12:16:58 -04:00
Tyler Goodlet f07e9dbb2f Always undo SIGINT overrides, cancel detached children
Ensure that even when `pdb` resumption methods are called during a crash
where `trio`'s runtime has already terminated (eg. `Event.set()` will
raise) we always revert our sigint handler to the original. Further
inside the handler if we hit a case where a child is in debug and
(thinks it) has the global pdb lock, if it has no IPC connection to
a parent, simply presume tty sync-coordination is now lost and cancel
the child immediately.
2022-08-02 12:16:49 -04:00
Tyler Goodlet c7035be2fc Tolerate double `.remove()`s of stream on portal teardowns 2022-07-27 11:40:02 -04:00
Tyler Goodlet deaca7d6cc Always propagate SIGINT when no locking peer found
A hopefully significant fix here is to always avoid suppressing a SIGINT
when the root actor can not detect an active IPC connections (via
a connected channel) to the supposed debug lock holding actor. In that
case it is most likely that the actor has either terminated or has lost
its connection for debugger control and there is no way the root can
verify the lock is in use; thus we choose to allow KBI cancellation.

Drop the (by comment) `try`-`finally` block in
`_hijoack_stdin_for_child()` around the `_acquire_debug_lock()` call
since all that logic should now be handled internal to that locking
manager. Try to catch a weird error around the `.do_longlist()` method
call that seems to sometimes break on py3.10 and latest `pdbpp`.
2022-07-27 11:40:02 -04:00
Tyler Goodlet d47d0e7c37 Always call pdb hook even if tty locking fails 2022-07-27 11:40:02 -04:00
Tyler Goodlet 0062c96a3c Log cancels with appropriate level 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4be13b7387 Just warn on IPC breaks 2022-07-27 11:40:02 -04:00
Tyler Goodlet 7bb5addd4c Only warn on `trio.BrokenResourceError`s from `_invoke()` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 89b44f8163 Pre-declare disconnected flag 2022-07-27 11:40:02 -04:00
Tyler Goodlet 2819b6a5b2 Avoid attr error XD 2022-07-27 11:40:02 -04:00
Tyler Goodlet f2671ed026 Type annot updates 2022-07-27 11:40:02 -04:00
Tyler Goodlet 41924c86a6 Drop uneeded backframe traceback hide annotation 2022-07-27 11:40:02 -04:00
Tyler Goodlet 206c7c0720 Make `Actor._process_messages()` report disconnects
The method now returns a `bool` which flags whether the transport died
to the caller and allows for reporting a disconnect in the
channel-transport handler task. This is something a user will normally
want to know about on the caller side especially after seeing
a traceback from the peer (if in tree) on console.
2022-07-27 11:40:02 -04:00
Tyler Goodlet bf0ac3116c Only cancel/get-result from a ctx if transport is up
There's no point in sending a cancel message to the remote linked task
and especially no reason to block waiting on a result from that task if
the transport layer is detected to be disconnected. We expect that the
transport shouldn't go down at the layer of the message loop
(reconnection logic should be handled in the transport layer itself) so
if we detect the channel is not connected we don't bother requesting
cancels nor waiting on a final result message.

Why?

- if the connection goes down in error the caller side won't have a way
  to know "how long" it should block to wait for a cancel ack or result
  and causes a potential hang that may require an additional ctrl-c from
  the user especially if using the debugger or if the traceback is not
  seen on console.
- obviously there's no point in waiting for messages when there's no
  transport to deliver them XD

Further, add some more detailed cancel logging detailing the task and
actor ids.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 74b819a857 Typing fixes, simplify `_set_trace()` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 8892204c84 Add notes around py3.10 stdlib bug from `pdb++`
There's a bug that's triggered in the stdlib without latest `pdb++`
installed; add a note for that.

Further inside `wait_for_parent_stdin_hijack()` don't `.started()` until
the interactor stream has been opened to avoid races when debugging this
`._debug.py` module (at the least) since we usually don't want the
spawning (parent) task to resume until we know for sure the tty lock has
been acquired. Also, drop the random checkpoint we had inside
`_breakpoint()`, not sure it was actually adding anything useful since
we're (mostly) carefully shielded throughout this func.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 8f4bbf1cbf Add and use a pdb instance factory 2022-07-27 11:40:02 -04:00
Tyler Goodlet aea8f63bae Drop all the `@cm.__exit__()` override attempts..
None of it worked (you still will see `.__exit__()` frames on debugger
entry - you'd think this would have been solved by now but, shrug) so
instead wrap the debugger entry-point in a `try:` and put the SIGINT
handler restoration inside `MultiActorPdb` teardown hooks.

This seems to restore the UX as it was prior but with also giving the
desired SIGINT override handler behaviour.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 7964a9f6f8 Try overriding `_GeneratorContextManager.__exit__()`; didn't work..
Using either of `@pdb.hideframe` or `__tracebackhide__` on stdlib
methods doesn't seem to work either.. This all seems to have something
to do with async generator usage I think ?
2022-07-27 11:40:02 -04:00
Tyler Goodlet e5195264a1 Handle a context cancel? Might be a noop 2022-07-27 11:40:02 -04:00
Tyler Goodlet 345573e602 Make `mypy` happy 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4e60c17375 Refine the handler for child vs. root cases
This gets very close to avoiding any possible hangs to do with tty
locking and SIGINT handling minus a special case that will be detailed
below.

Summary of implementation changes:

- convert `_mk_pdb()` -> `with _open_pdb() as pdb:` which implicitly
  handles the `bdb.BdbQuit` case such that debugger teardown hooks are
  always called.
- rename the handler to `shield_sigint()` and handle a variety of new
  cases:
  * the root is in debug but hasn't been cancelled -> call
    `Actor.cancel_soon()`
  * the root is in debug but *has* been called (`Actor.cancel_soon()`
    already called) -> raise KBI
  * a child is in debug *and* has a task locking the debugger -> ignore
    SIGINT in child *and* the root actor.
- if the debugger instance is provided to the handler at acquire time,
  on SIGINT handling completion re-print the last pdb++ REPL output so
  that the user realizes they are still actively in debug.
- ignore the unlock case where a race condition of "no task" holding the
  lock causes the `RuntimeError` normally associated with the "wrong
  task" doing so (not sure if this is a `trio` bug?).
- change debug logs to runtime level.

Unhandled case(s):

- a child is maybe in debug mode but does not itself have any task using
  the debugger.
    * ToDo: we need a way to decide what to do with
      "intermediate" child actors who themselves either are not in
      `debug_mode=True` but have children who *are* such that a SIGINT
      won't cause cancellation of that child-as-parent-of-another-child
      **iff** any of their children are in in debug mode.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 6b7b58346f (facepalm) Reraise `BdbQuit` and discard ownerless lock releases 2022-07-27 11:40:02 -04:00
Tyler Goodlet 3cac323421 Add WIP while-debugger-active SIGINT ignore handler 2022-07-27 11:40:02 -04:00
goodboy 4902e184e9
Merge pull request #318 from goodboy/aio_error_propagation
Add context test that opens an inter-task-channel that errors
2022-07-15 12:42:19 -04:00
Tyler Goodlet 05790a20c1 Slight lint fixes 2022-07-15 11:18:48 -04:00
Tyler Goodlet f0d78e1a6e Use local task ref, fixes `mypy` 2022-07-15 10:39:49 -04:00
Tyler Goodlet 0906559ed9 Drop manual stack construction, fix attr typo 2022-07-14 20:43:17 -04:00
Tyler Goodlet 38d03858d7 Fix `asyncio`-task-sync and error propagation
This fixes an previously undetected bug where if an
`.open_channel_from()` spawned task errored the error would not be
propagated to the `trio` side and instead would fail silently with
a console log error. What was most odd is that it only seems easy to
trigger when you put a slight task sleep before the error is raised
(:eyeroll:). This patch adds a few things to address this and just in
general improve iter-task lifetime syncing:

- add `LinkedTaskChannel._trio_exited: bool` a flag set from the `trio`
  side when the channel block exits.
- add a `wait_on_aio_task: bool` flag to `translate_aio_errors` which
  toggles whether to wait the `asyncio` task termination event on exit.
- cancel the `asyncio` task if the trio side has ended, when
  `._trio_exited == True`.
- always close the `trio` mem channel when the task exits such that
  the `asyncio` side can error on any next `.send()` call.
2022-07-14 16:35:41 -04:00
Tyler Goodlet 41983edc43 Use `str` | `bytes` union for typing msg dump 2022-07-12 11:59:11 -04:00
Tyler Goodlet 5168700fbf Tolerate non-decode-able bytes 2022-07-12 11:55:55 -04:00
Tyler Goodlet 673c4a8c66 Decode bytes prior to log msg 2022-07-12 11:55:55 -04:00
Tyler Goodlet 932b841176 Allow up to 4 `msgpsec` decode failures 2022-07-12 11:55:55 -04:00
Tyler Goodlet f594f1bdda Handle a connection reset on `msgspec` transport 2022-07-12 11:55:55 -04:00
Tyler Goodlet 4e7ab54452 Appease `mypy` 2022-07-12 11:22:30 -04:00
Tyler Goodlet f94b7cd991 Drop `msgpack` lib and use `msgspec` for transport 2022-07-12 10:37:13 -04:00
Tyler Goodlet 8901272854 Fix typing 2022-04-13 08:20:53 -04:00
Tyler Goodlet 80897a8f2b Add `tractor.query_actor()` an addr looker-upper
Sometimes it's handy to just have a non-`Portal` yielding way
to figure out if a "service" actor is up, so add this discovery
helper for that. We'll prolly just leave it undocumented for
now until we figure out a longer-term/better discovery system.
2022-04-13 07:50:42 -04:00
Tyler Goodlet f3606d5bd8 Type fixes 2022-04-12 11:48:32 -04:00
Tyler Goodlet c322a193f2 Make `LinkedTaskChannel` trio-task-broadcastable with `.subscribe()` 2022-04-12 11:42:44 -04:00
Tyler Goodlet 46963c2e63 Don't handle `GeneratorExit` on `asyncio` tasks 2022-04-12 11:42:44 -04:00
Tyler Goodlet 9b77b8c9ee Add more explicit `asyncio` task error logging
When an `asyncio` side task errors or is cancelled we now explicitly
report the traceback and task name if possible as well as the source
reason for the error (some come from the `trio` side).

Further, properly set any `trio` side exception (after unwrapping it
from the `outcome.Error`) on the future that runs the `trio` guest run.
2022-04-12 11:42:44 -04:00
Tyler Goodlet c30cece37a Fix one missing import/ref 2022-02-17 13:03:37 -05:00
Tyler Goodlet 509082c935 Port to new `msgspec` error type 2022-02-17 11:55:26 -05:00
Tyler Goodlet 75bb1added Avoid importing mp for as long as possible 2022-02-17 11:55:26 -05:00
Tyler Goodlet 76a0492028 Fix type annot 2022-02-15 08:52:04 -05:00
Tyler Goodlet 4eab4a0213 Type fix 2022-02-15 08:51:25 -05:00
Tyler Goodlet 0edc6a26bc Go back to strict map keys 2022-02-15 08:48:43 -05:00