Commit Graph

1451 Commits (ec4b72259d015651746d87e51b06db02f3d72243)

Author SHA1 Message Date
Tyler Goodlet ec4b72259d Drop brackpressure usage from fan out tests 2023-04-14 16:35:25 -04:00
Tyler Goodlet e97ed377b0 Remote `Context` cancellation semantics rework B)
This adds remote cancellation semantics to our `tractor.Context`
machinery to more closely match that of `trio.CancelScope` but
with operational differences to handle the nature of parallel tasks interoperating
across multiple memory boundaries:

- if an actor task cancels some context it has opened via
  `Context.cancel()`, the remote (scope linked) task will be cancelled
  using the normal `CancelScope` semantics of `trio` meaning the remote
  cancel scope surrounding the far side task is cancelled and
  `trio.Cancelled`s are expected to be raised in that scope as per
  normal `trio` operation, and in the case where no error is raised
  in that remote scope, a `ContextCancelled` error is raised inside the
  runtime machinery and relayed back to the opener/caller side of the
  context.
- if any actor task cancels a full remote actor runtime using
  `Portal.cancel_actor()` the same semantics as above apply except every
  other remote actor task which also has an open context with the actor
  which was cancelled will also be sent a `ContextCancelled` **but**
  with the `.canceller` field set to the uid of the original cancel
  requesting actor.

This changeset also includes a more "proper" solution to the issue of
"allowing overruns" during streaming without attempting to implement any
form of IPC streaming backpressure. Implementing task-granularity
backpressure cross-process turns out to be more or less impossible
without augmenting out streaming protocol (likely at the cost of
performance). Further allowing overruns requires special care since
any blocking of the runtime RPC msg loop task effectively can block
control msgs such as cancels and stream terminations.

The implementation details per abstraction layer are as follows.

._streaming.Context:
- add a new contructor factor func `mk_context()` which provides
  a strictly private init-er whilst allowing us to not have to define
  an `.__init__()` on the type def.
- add public `.cancel_called` and `.cancel_called_remote` properties.
- general rename of what was the internal `._backpressure` var to
  `._allow_overruns: bool`.
- move the old contents of `Actor._push_result()` into a new
  `._deliver_msg()` allowing for better encapsulation of per-ctx
  msg handling.
 - always check for received 'error' msgs and process them with the new
   `_maybe_cancel_and_set_remote_error()` **before** any msg delivery to
   the local task, thus guaranteeing error and cancellation handling
   despite any overflow handling.
- add a new `._drain_overflows()` task-method for use with new
  `._allow_overruns: bool = True` mode.
 - add back a `._scope_nursery: trio.Nursery` (allocated in
   `Portal.open_context()`) who's sole purpose is to spawn a single task
   which runs the above method; anything else is an error.
 - augment `._deliver_msg()` to start a task and run the above method
   when operating in no overrun mode; the task queues overflow msgs and
   attempts to send them to the underlying mem chan using a blocking
   `.send()` call.
 - on context exit, any existing "drainer task" will be cancelled and
   remaining overflow queued msgs are discarded with a warning.
- rename `._error` -> `_remote_error` and set it in a new method
  `_maybe_cancel_and_set_remote_error()` which is called before
  processing
- adjust `.result()` to always call `._maybe_raise_remote_err()` at its
  start such that whenever a `ContextCancelled` arrives we do logic for
  whether or not to immediately raise that error or ignore it due to the
  current actor being the one who requested the cancel, by checking the
  error's `.canceller` field.
 - set the default value of `._result` to be `id(Context()` thus avoiding
   conflict with any `.result()` actually being `False`..

._runtime.Actor:
- augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to
  take a `requesting_uid: tuple` indicating the source actor of every
  cancellation request.
- pass through the new `Context._allow_overruns` through `.get_context()`
- call the new `Context._deliver_msg()` from `._push_result()` (since
  the factoring out that method's contents).

._runtime._invoke:
- `TastStatus.started()` back a `Context` (unless an error is raised)
  instead of the cancel scope to make it easy to set/get state on that
  context for the purposes of cancellation and remote error relay.
- always raise any remote error via `Context._maybe_raise_remote_err()`
  before doing any `ContextCancelled` logic.
- assign any `Context._cancel_called_remote` set by the `requesting_uid`
  cancel methods (mentioned above) to the `ContextCancelled.canceller`.

._runtime.process_messages:
- always pass a `requesting_uid: tuple` to `Actor.cancel()` and
  `._cancel_task` to that any corresponding `ContextCancelled.canceller`
  can be set inside `._invoke()`.
2023-04-14 16:35:25 -04:00
Tyler Goodlet 1ec30577de Only tuplize `.canceller` if non-`None` 2023-04-14 16:35:25 -04:00
Tyler Goodlet 0e81350a42 Move `NoRuntime` import inside `current_actor()` to avoid cycle 2023-04-14 16:35:25 -04:00
Tyler Goodlet ac51cf07b9 Augment test cases for callee-returns-result early
Turns out stuff was totally broken in these cases because we're either
closing the underlying mem chan too early or not handling the
"allow_overruns" mode's cancellation correctly..
2023-04-14 16:35:25 -04:00
Tyler Goodlet e16e7ca82a Add new remote error introspection attrs
To handle both remote cancellation this adds `ContextCanceled.canceller:
tuple` the uid of the cancel requesting actor and is expected to be set
by the runtime when servicing any remote cancel request. This makes it
possible for `ContextCancelled` receivers to know whether "their actor
runtime" is the source of the cancellation.

Also add an explicit `RemoteActor.src_actor_uid` which better formalizes
the notion of "which remote actor" the error originated from.

Both of these new attrs are expected to be packed in the `.msgdata` when
the errors are loaded locally.
2023-04-14 16:35:25 -04:00
Tyler Goodlet 7dd5d8d1f8 Add new set of context cancellation tests
These will verify new changes to the runtime/messaging core which allows
us to adopt an "ignore cancel if requested by us" style handling of
`ContextCancelled` more like how `trio` does with
`trio.Nursery.cancel_scope.cancel()`. We now expect
a `ContextCancelled.canceller: tuple` which is set to the actor uid of
the actor which requested the cancellation which eventually resulted in
the remote error-msg.

Also adds some experimental tweaks to the "backpressure" test which it
turns out is very problematic in coordination with context cancellation
since blocking on the feed mem chan to some task will block the ipc msg
loop and thus handling of cancellation.. More to come to both the test
and core to address this hopefully since right now this test is failing.
2023-04-14 16:35:25 -04:00
Tyler Goodlet 8913829511 Log waiter task cancelling msg as cancel-level 2023-04-14 16:35:25 -04:00
Tyler Goodlet 60bff71cd3 Assign `RemoteActorError` boxed error type for context cancelleds 2023-04-14 16:35:25 -04:00
Tyler Goodlet 29a1171142 Change a bunch of log levels to cancel, including any `ContextCancelled` handling 2023-04-14 16:35:25 -04:00
Tyler Goodlet b6c7f423f0 Add some log-level method doc-strings 2023-04-14 16:35:25 -04:00
Tyler Goodlet 4db87d3c43 Tweak context doc str 2023-04-14 16:35:25 -04:00
Tyler Goodlet 5f1e83e741 More single doc-strs in discovery mod 2023-04-14 16:35:25 -04:00
Tyler Goodlet ea2cc9ec75 Enable `Context` backpressure by default; avoid startup race-crashes? 2023-04-14 16:35:25 -04:00
goodboy 649c5e7504
Merge pull request #343 from goodboy/breceiver_internals
Avoid inf recursion in `BroadcastReceiver.receive()`
2023-01-30 14:01:13 -05:00
Tyler Goodlet 203f95615c Add nooz 2023-01-30 12:42:26 -05:00
Tyler Goodlet efb8bec828 Add a basic no-raise-on lag test 2023-01-30 12:26:07 -05:00
Tyler Goodlet 8637778739 Expose `raise_on_lag: bool` flag through factory 2023-01-30 12:18:23 -05:00
Tyler Goodlet 47166e45f0 Be explicit with passthrough kwargs (there's so few) 2023-01-29 17:31:21 -05:00
Tyler Goodlet 4ce2dcd12b Switch back to raising `Lagged` by default
Makes the broadcast test suite not hang xD, and is our expected default
behaviour. Also removes a ton of commented legacy cruft from before the
refactor to remove the `.receive()` recursion and fixes some typing.

Oh right, and in the case where there's only one subscriber left we warn
log about it since in theory we could actually entirely unwind the
bcaster back to the original underlying, though not sure if that's sane
or works for some use cases (like wanting to have some other subscriber
get added dynamically later).
2023-01-29 15:03:34 -05:00
Tyler Goodlet 80f983818f Ignore monkey patched `.send()` type annot 2023-01-29 15:03:34 -05:00
Tyler Goodlet 6ba29f8d56 Recurse and get the last value when in warn mode 2023-01-29 15:03:34 -05:00
Tyler Goodlet 2707a0e971 Add `._raise_on_lag` flag to disable `Lag` raising 2023-01-29 15:03:34 -05:00
Tyler Goodlet c8efcdd0d3 Drop `ReceiveMsgStream` from test suite 2023-01-29 15:03:34 -05:00
Tyler Goodlet 9f9907271b Merge `ReceiveMsgStream` and `MsgStream`
Since one-way streaming can be accomplished by just *not* sending on one
side (and/or thus wrapping such usage in a more restrictive API), we
just drop the recv-only parent type. The only method different was
`MsgStream.send()`, now merged in. Further in usage of `.subscribe()`
we monkey patch the underlying stream's `.send()` onto the delivered
broadcast receiver so that subscriber tasks can two-way stream as though
using the stream directly.

This allows us to more definitively drop `tractor.open_stream_from()` in
the longer run if we so choose as well; note currently this will
potentially create an issue if a caller tries to `.send()` on such a one
way stream.
2023-01-29 15:03:34 -05:00
Tyler Goodlet c2367c1c5e Better `trio`-ize `BroadcastReceiver` internals
Driven by a bug found in `piker` where we'd get an inf recursion error
due to `BroadcastReceiver.receive()` being called when consumer tasks
are awoken but no value is ready to `.nowait_receive()`.

This new rework takes an approach closer to the interface and internals
of `trio.MemoryReceiveChannel` particularly in terms of,

- implementing a `BroadcastReceiver.receive_nowait()` and using it
  within the async `.receive()`.
- failing over to an internal `._receive_from_underlying()` when the
  `_nowait()` call raises `trio.WouldBlock`.
- adding `BroadcastState.statistics()` for debugging and testing
  dropping recursion from `.receive()`.
2023-01-29 15:03:34 -05:00
goodboy a777217674
Merge pull request #346 from goodboy/ipc_failure_while_streaming
Ipc failure while streaming
2023-01-29 15:02:54 -05:00
Tyler Goodlet 13c9eadc8f Move result log msg up and drop else block 2023-01-29 14:55:02 -05:00
Tyler Goodlet af6c325072 Bump up legacy streaming timeout a smidgen 2023-01-29 14:55:02 -05:00
Tyler Goodlet 195d2f0ed4 Add nooz 2023-01-29 14:55:02 -05:00
Tyler Goodlet aa4871b13d Call `MsgStream.aclose()` in `Context.open_stream.__aexit__()`
We weren't doing this originally I *think* just because of the path
dependent nature of the way the code was developed (originally being
mega pedantic about one-way vs. bidirectional streams) but, it doesn't
seem like there's any issue just calling the stream's `.aclose()`; also
have the benefit of just being less code and logic checks B)
2023-01-29 14:55:02 -05:00
Tyler Goodlet 556f4626db Tweak warning msg for still-alive-after-cancelled actor 2023-01-29 14:55:02 -05:00
Tyler Goodlet 3967c0ed9e Add a simplified zombie lord specific process reaping test 2023-01-29 14:55:02 -05:00
Tyler Goodlet e34823aab4 Add parent vs. child cancels first cases 2023-01-29 14:55:02 -05:00
Tyler Goodlet 6c35ba2cb6 Add IPC breakage on both parent and child side
With the new fancy `_pytest.pathlib.import_path()` we can do real
parametrization of the example-script-module code and thus configure
whether the child, parent, or both silently break the IPC connection.

Parametrize the test for all the above mentioned cases as well as the
case where the IPC never breaks but we still simulate the user hammering
ctl-c / SIGINT to terminate the actor tree. Adjust expected errors based
on each case and heavily document each of these.
2023-01-29 14:55:02 -05:00
Tyler Goodlet 3a0817ff55 Skip `advanced_faults/` subset in docs examples tests 2023-01-29 14:55:02 -05:00
Tyler Goodlet 7fddb4416b Handle `mp` spawn method cases in test suite 2023-01-29 14:55:02 -05:00
Tyler Goodlet 1d92f2552a Adjust other examples tests to expect `pathlib` objects 2023-01-29 14:55:02 -05:00
Tyler Goodlet 4f8586a928 Wrap ex in new test, change dir helpers to use `pathlib.Path` 2023-01-29 14:55:02 -05:00
Tyler Goodlet fb9ff45745 Move example to a new `advanced_faults` egs subset dir 2023-01-29 14:55:02 -05:00
Tyler Goodlet 36a83cb306 Refine example to drop IPC mid-stream
Use a task nursery in the subactor to spawn tasks which cancel the IPC
channel mid stream to simulate the most concurrent case we're likely to
see. Make `main()` accept a `debug_mode: bool` for parametrization. Fill
out detailed comments/docs on this example.
2023-01-29 14:55:02 -05:00
Tyler Goodlet 7394a187e0 Name one-way streaming (con generators) what it is 2023-01-29 14:55:02 -05:00
Tyler Goodlet df01294bb2 Show more functiony syntax in ctx-cancelled log msgs 2023-01-29 14:55:02 -05:00
Tyler Goodlet ddf3d0d1b3 Show tracebacks for un-shipped/propagated errors 2023-01-29 14:55:02 -05:00
Tyler Goodlet 158569adae Add WIP example of silent IPC breaks while streaming 2023-01-29 14:55:02 -05:00
Tyler Goodlet 97d5f7233b Fix uid2nursery lookup table type annot 2023-01-29 14:55:02 -05:00
Tyler Goodlet d27c081a15 Ensure arbiter sockaddr type before usage 2023-01-29 14:55:02 -05:00
Tyler Goodlet a4874a3227 Always set the `parent_exit: trio.Event` on exit 2023-01-29 14:55:02 -05:00
Tyler Goodlet de04bbb2bb Don't raise on a broken IPC-context when sending stop msg 2023-01-29 14:55:02 -05:00
Tyler Goodlet 4f977189c0 Handle broken mem chan on `Actor._push_result()`
When backpressure is used and a feeder mem chan breaks during msg
delivery (usually because the IPC allocating task already terminated)
instead of raising we simply warn as we do for the non-backpressure
case.

Also, add a proper `Actor.is_arbiter` test inside `._invoke()` to avoid
doing an arbiter-registry lookup if the current actor **is** the
registrar.
2023-01-29 14:55:02 -05:00