forked from goodboy/tractor
1
0
Fork 0
Commit Graph

1578 Commits (28ea8e787aef8ab7ff99b34d6ab7cf19306b6b98)

Author SHA1 Message Date
Tyler Goodlet 968f13f9ef Set `Context._scope_nursery` on callee side too
Because obviously we probably want to support `allow_overruns` on the
remote callee side as well XD

Only found the bugs fixed in this patch this thanks to writing a much
more exhaustive test set for overrun cases B)
2023-05-15 10:00:45 -04:00
Tyler Goodlet f9911c22a4 Seriously cover all overrun cases
This actually caught further runtime bugs so it's gud i tried..
Add overrun-ignore enabled / disabled cases and error catching for all
of them. More or less this should cover every possible outcome when
it comes to setting `allow_overruns: bool` i hope XD
2023-05-15 10:00:45 -04:00
Tyler Goodlet 63adf73b4b Adjust aio test for silent cancellation by parent 2023-05-15 10:00:45 -04:00
Tyler Goodlet f1e9c0be93 Fix cluster test to use `allow_overruns` 2023-05-15 10:00:45 -04:00
Tyler Goodlet 6db656fecf Flip allocate log msgs to debug 2023-05-15 10:00:45 -04:00
Tyler Goodlet 6994d2026d Drop brackpressure usage from fan out tests 2023-05-15 10:00:45 -04:00
Tyler Goodlet c72026091e Remote `Context` cancellation semantics rework B)
This adds remote cancellation semantics to our `tractor.Context`
machinery to more closely match that of `trio.CancelScope` but
with operational differences to handle the nature of parallel tasks interoperating
across multiple memory boundaries:

- if an actor task cancels some context it has opened via
  `Context.cancel()`, the remote (scope linked) task will be cancelled
  using the normal `CancelScope` semantics of `trio` meaning the remote
  cancel scope surrounding the far side task is cancelled and
  `trio.Cancelled`s are expected to be raised in that scope as per
  normal `trio` operation, and in the case where no error is raised
  in that remote scope, a `ContextCancelled` error is raised inside the
  runtime machinery and relayed back to the opener/caller side of the
  context.
- if any actor task cancels a full remote actor runtime using
  `Portal.cancel_actor()` the same semantics as above apply except every
  other remote actor task which also has an open context with the actor
  which was cancelled will also be sent a `ContextCancelled` **but**
  with the `.canceller` field set to the uid of the original cancel
  requesting actor.

This changeset also includes a more "proper" solution to the issue of
"allowing overruns" during streaming without attempting to implement any
form of IPC streaming backpressure. Implementing task-granularity
backpressure cross-process turns out to be more or less impossible
without augmenting out streaming protocol (likely at the cost of
performance). Further allowing overruns requires special care since
any blocking of the runtime RPC msg loop task effectively can block
control msgs such as cancels and stream terminations.

The implementation details per abstraction layer are as follows.

._streaming.Context:
- add a new contructor factor func `mk_context()` which provides
  a strictly private init-er whilst allowing us to not have to define
  an `.__init__()` on the type def.
- add public `.cancel_called` and `.cancel_called_remote` properties.
- general rename of what was the internal `._backpressure` var to
  `._allow_overruns: bool`.
- move the old contents of `Actor._push_result()` into a new
  `._deliver_msg()` allowing for better encapsulation of per-ctx
  msg handling.
 - always check for received 'error' msgs and process them with the new
   `_maybe_cancel_and_set_remote_error()` **before** any msg delivery to
   the local task, thus guaranteeing error and cancellation handling
   despite any overflow handling.
- add a new `._drain_overflows()` task-method for use with new
  `._allow_overruns: bool = True` mode.
 - add back a `._scope_nursery: trio.Nursery` (allocated in
   `Portal.open_context()`) who's sole purpose is to spawn a single task
   which runs the above method; anything else is an error.
 - augment `._deliver_msg()` to start a task and run the above method
   when operating in no overrun mode; the task queues overflow msgs and
   attempts to send them to the underlying mem chan using a blocking
   `.send()` call.
 - on context exit, any existing "drainer task" will be cancelled and
   remaining overflow queued msgs are discarded with a warning.
- rename `._error` -> `_remote_error` and set it in a new method
  `_maybe_cancel_and_set_remote_error()` which is called before
  processing
- adjust `.result()` to always call `._maybe_raise_remote_err()` at its
  start such that whenever a `ContextCancelled` arrives we do logic for
  whether or not to immediately raise that error or ignore it due to the
  current actor being the one who requested the cancel, by checking the
  error's `.canceller` field.
 - set the default value of `._result` to be `id(Context()` thus avoiding
   conflict with any `.result()` actually being `False`..

._runtime.Actor:
- augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to
  take a `requesting_uid: tuple` indicating the source actor of every
  cancellation request.
- pass through the new `Context._allow_overruns` through `.get_context()`
- call the new `Context._deliver_msg()` from `._push_result()` (since
  the factoring out that method's contents).

._runtime._invoke:
- `TastStatus.started()` back a `Context` (unless an error is raised)
  instead of the cancel scope to make it easy to set/get state on that
  context for the purposes of cancellation and remote error relay.
- always raise any remote error via `Context._maybe_raise_remote_err()`
  before doing any `ContextCancelled` logic.
- assign any `Context._cancel_called_remote` set by the `requesting_uid`
  cancel methods (mentioned above) to the `ContextCancelled.canceller`.

._runtime.process_messages:
- always pass a `requesting_uid: tuple` to `Actor.cancel()` and
  `._cancel_task` to that any corresponding `ContextCancelled.canceller`
  can be set inside `._invoke()`.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 90e41016b9 Only tuplize `.canceller` if non-`None` 2023-05-15 10:00:45 -04:00
Tyler Goodlet f54c415060 Move `NoRuntime` import inside `current_actor()` to avoid cycle 2023-05-15 10:00:45 -04:00
Tyler Goodlet 03644f59cc Augment test cases for callee-returns-result early
Turns out stuff was totally broken in these cases because we're either
closing the underlying mem chan too early or not handling the
"allow_overruns" mode's cancellation correctly..
2023-05-15 10:00:45 -04:00
Tyler Goodlet 67f82c6ebd Add new remote error introspection attrs
To handle both remote cancellation this adds `ContextCanceled.canceller:
tuple` the uid of the cancel requesting actor and is expected to be set
by the runtime when servicing any remote cancel request. This makes it
possible for `ContextCancelled` receivers to know whether "their actor
runtime" is the source of the cancellation.

Also add an explicit `RemoteActor.src_actor_uid` which better formalizes
the notion of "which remote actor" the error originated from.

Both of these new attrs are expected to be packed in the `.msgdata` when
the errors are loaded locally.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 71cd445319 Add new set of context cancellation tests
These will verify new changes to the runtime/messaging core which allows
us to adopt an "ignore cancel if requested by us" style handling of
`ContextCancelled` more like how `trio` does with
`trio.Nursery.cancel_scope.cancel()`. We now expect
a `ContextCancelled.canceller: tuple` which is set to the actor uid of
the actor which requested the cancellation which eventually resulted in
the remote error-msg.

Also adds some experimental tweaks to the "backpressure" test which it
turns out is very problematic in coordination with context cancellation
since blocking on the feed mem chan to some task will block the ipc msg
loop and thus handling of cancellation.. More to come to both the test
and core to address this hopefully since right now this test is failing.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 220b244508 Log waiter task cancelling msg as cancel-level 2023-05-15 10:00:45 -04:00
Tyler Goodlet 831790377b Assign `RemoteActorError` boxed error type for context cancelleds 2023-05-15 10:00:45 -04:00
Tyler Goodlet e80e0a551f Change a bunch of log levels to cancel, including any `ContextCancelled` handling 2023-05-15 10:00:45 -04:00
Tyler Goodlet b3f9251eda Add some log-level method doc-strings 2023-05-15 10:00:45 -04:00
Tyler Goodlet 903537ce04 Tweak context doc str 2023-05-15 10:00:45 -04:00
Tyler Goodlet d75343106b More single doc-strs in discovery mod 2023-05-15 10:00:45 -04:00
Tyler Goodlet cfb2bc0fee Enable `Context` backpressure by default; avoid startup race-crashes? 2023-05-15 10:00:45 -04:00
goodboy e5ee2e3de8
Merge pull request from goodboy/switch_to_pdbp
Switch to `pdbp` 🏄🏼
2023-05-15 09:58:58 -04:00
Tyler Goodlet 41aa91c8eb Add news file 2023-05-15 09:35:59 -04:00
Tyler Goodlet 6758e4487c Drop lingering `pdbpp` comment-refs in tests 2023-05-15 09:14:42 -04:00
Tyler Goodlet 1c3893a383 Drop commented `pdbpp` import logic 2023-05-15 09:01:55 -04:00
Tyler Goodlet 73befac9bc Switch to `pdbp` in test reqs 2023-05-15 09:01:27 -04:00
Tyler Goodlet 79622bbeea Restore `breakpoint()` hook after runtime exits
Previously we were leaking our (pdb++) override into the Python runtime
which would always result in a runtime error whenever `breakpoint()` is
called outside our runtime; after exit of the root actor . This
explicitly restores any previous hook override (detected during startup)
or deletes the hook and restores the environment if none existed prior.

Also adds a new WIP debugging example script to ensure breakpointing
works as normal after runtime close; this will be added to the test
suite.
2023-05-15 00:47:29 -04:00
Tyler Goodlet 95535b2226 Some more 3.10+ optional type sigs 2023-05-15 00:47:29 -04:00
Tyler Goodlet 87c6e09d6b Switch readme links to point @ `pdbp` B) 2023-05-14 22:52:24 -04:00
Tyler Goodlet 9ccd3a74b6 More detailed preface description 2023-05-14 22:38:47 -04:00
Tyler Goodlet ae4ff5dc8d pdbp: adding typing to config settings vars 2023-05-14 22:38:46 -04:00
Tyler Goodlet 705538398f `pdbp`: turn off line truncating by default, fixes terminal resizing stuff 2023-05-14 22:38:16 -04:00
Tyler Goodlet 86aef5238d Hide actor nursery exit frame 2023-05-14 21:24:26 -04:00
Tyler Goodlet cc82447db6 First try: switch debug machinery over to `pdbp` B) 2023-05-14 21:24:26 -04:00
Tyler Goodlet 23cffbd940 Use multiline import for debug mod 2023-05-14 21:24:26 -04:00
Tyler Goodlet 3d202272c4 Change over debugger tests to use `PROMPT` var.. 2023-05-14 21:24:26 -04:00
Tyler Goodlet 63cdb0891f Switch to `pdbp` since noone is maintaining `pdbpp` 2023-05-14 21:24:26 -04:00
goodboy 0f7db27b68
Merge pull request from goodboy/drop_proc_actxmngr
`trio.Process.aclose()`?
2023-05-14 20:59:53 -04:00
Tyler Goodlet c53d62d2f7 Add news file 2023-05-14 20:31:26 -04:00
Tyler Goodlet f667d16d66 Copy the now deprecated `trio.Process.aclose()`
Move it into our `_spawn.do_hard_kill()` since we do indeed rely on
the particular process killing sequence on "soft kill" failure cases.
2023-05-14 19:31:50 -04:00
Tyler Goodlet 24a062341e Just call `trio.Process.aclose()` directly for now? 2023-04-02 14:34:41 -04:00
goodboy e714bec8db
Merge pull request from kehrazy/patch-1
fixed the `Zombie` example having wrong indentation
2023-04-01 12:11:47 -04:00
Igor 009cd6552e
fixed the `Zombie` example having wrong indentation 2023-03-31 17:50:46 +03:00
goodboy 649c5e7504
Merge pull request from goodboy/breceiver_internals
Avoid inf recursion in `BroadcastReceiver.receive()`
2023-01-30 14:01:13 -05:00
Tyler Goodlet 203f95615c Add nooz 2023-01-30 12:42:26 -05:00
Tyler Goodlet efb8bec828 Add a basic no-raise-on lag test 2023-01-30 12:26:07 -05:00
Tyler Goodlet 8637778739 Expose `raise_on_lag: bool` flag through factory 2023-01-30 12:18:23 -05:00
Tyler Goodlet 47166e45f0 Be explicit with passthrough kwargs (there's so few) 2023-01-29 17:31:21 -05:00
Tyler Goodlet 4ce2dcd12b Switch back to raising `Lagged` by default
Makes the broadcast test suite not hang xD, and is our expected default
behaviour. Also removes a ton of commented legacy cruft from before the
refactor to remove the `.receive()` recursion and fixes some typing.

Oh right, and in the case where there's only one subscriber left we warn
log about it since in theory we could actually entirely unwind the
bcaster back to the original underlying, though not sure if that's sane
or works for some use cases (like wanting to have some other subscriber
get added dynamically later).
2023-01-29 15:03:34 -05:00
Tyler Goodlet 80f983818f Ignore monkey patched `.send()` type annot 2023-01-29 15:03:34 -05:00
Tyler Goodlet 6ba29f8d56 Recurse and get the last value when in warn mode 2023-01-29 15:03:34 -05:00
Tyler Goodlet 2707a0e971 Add `._raise_on_lag` flag to disable `Lag` raising 2023-01-29 15:03:34 -05:00