tractor

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	7d5fda4485	Be ultra-correct in `Portal.open_context()` This took way too long to get right but hopefully will give us grok-able and correct context exit semantics going forward B) The main fixes were: - always shielding the `MsgStream.aclose()` call on teardown to avoid bubbling a `Cancelled`. - properly absorbing any `ContextCancelled` in cases due to "self cancellation" using the new `Context.canceller` in the logic. - capturing any error raised by the `Context.result()` call in the "normal exit, result received" case and setting it as the `Context._local_error` so that self-cancels can be easily measured via `Context.cancelled_caught` in same way as remote-error caused cancellations. - extremely detailed comments around all of the cancellation-error cases to avoid ever getting confused about the control flow in the future XD	2025-03-14 13:44:08 -04:00
Tyler Goodlet	f5fcd8ca2e	Be mega-pedantic with `ContextCancelled` semantics As part of extremely detailed inter-peer-actor testing, add much more granular `Context` cancellation state tracking via the following (new) fields: - `.canceller: tuple[str, str]` the uuid of the actor responsible for the cancellation condition - always set by `Context._maybe_cancel_and_set_remote_error()` and replaces `._cancelled_remote` and `.cancel_called_remote`. If set, this value should normally always match a value from some `ContextCancelled` raised or caught by one side of the context. - `._local_error` which is always set to the locally raised (and caller or callee task's scope-internal) error which caused any eventual cancellation/error condition and thus any closure of the context's per-task-side-`trio.Nursery`. - `.cancelled_caught: bool` is now always `True` whenever the local task catches (or "silently absorbs") a `ContextCancelled` (a `ctxc`) that indeed originated from one of the context's linked tasks or any other context which raised its own `ctxc` in the current `.open_context()` scope. => whenever there is a case that no `ContextCancelled` was raised in the `.open_context().__aexit__()` (eg. `ctx.result()` called after a call `ctx.cancel()`), we still consider the context's as having "caught a cancellation" since the `ctxc` was indeed silently handled by the cancel requester; all other error cases are already represented by mirroring the state of the `._scope: trio.CancelScope` => IOW there should be no case where an error is not raised in the context's scope and `.cancelled_caught: bool == False`, i.e. no case where `._scope.cancelled_caught == False and ._local_error is not None`! - always raise any `ctxc` from `.open_stream()` if `._cancel_called == True` - if the cancellation request has not already resulted in a `._remote_error: ContextCancelled` we raise a `RuntimeError` to indicate improper usage to the guilty side's task code. 
- make `._maybe_raise_remote_err()` a sync func and don't raise any `ctxc` which is matched against a `.canceller` determined to be the current actor, aka a "self cancel", and always set the `._local_error` to any such `ctxc`. - `.side: str` taken from inside `.cancel()` and unused as of now since it might be better re-written as a similar `.is_opener() -> bool`? - drop unused `._started_received: bool`.. - TONS and TONS of detailed comments/docs to attempt to explain all the possible cancellation/exit cases and how they should exhibit as either silent closes or raises from the `Context` API! Adjust the `._runtime._invoke()` code to match: - use `ctx._maybe_raise_remote_err()` in `._invoke()`. - adjust to new `.canceller` property. - more type hints. - better `log.cancel()` msging around self-cancels vs. peer-cancels. - always set the `._local_error: BaseException` for the "callee" task just like `Portal.open_context()` now will do B) Prior we were raising any `Context._remote_error` directly and doing (more or less) the same `ContextCancelled` "absorbing" logic (well kinda) in block; instead delegate to the method	2025-03-14 13:42:55 -04:00
Tyler Goodlet	04217f319a	Raise a `MessagingError` from the src error on msging edge cases	2025-03-14 13:42:15 -04:00
Tyler Goodlet	8cb8390201	Move `MessagingError` into `._exceptions` set	2025-03-14 13:42:15 -04:00
Tyler Goodlet	5035617adf	Dump `.msgdata` in `RemoteActorError.__repr__()`	2025-03-14 13:42:15 -04:00
Tyler Goodlet	715348c5c2	Port all tests to new `reg_addr` fixture name	2025-03-14 13:42:15 -04:00
Tyler Goodlet	fdf0c43bfa	Type out the full-fledged streaming ex.	2025-03-14 13:40:19 -04:00
Tyler Goodlet	f895c96600	Add masked super timeout line to `do_hard_kill()` for would-be runtime hackers	2025-03-14 13:40:19 -04:00
Tyler Goodlet	ca1a1476bb	Add a first serious inter-peer remote cancel suite Tests that appropriate `Context` exit state, the relay of a `ContextCancelled` error and its `.canceller: tuple[str, str]` value are set when an inter-peer cancellation happens via an "out of band" request method (in this case using `Portal.cancel_actor()` and that cancellation is propagated "horizontally" to other peers. Verify that any such cancellation scenario which also experiences an "error during `ContextCancelled` handling" DOES NOT result in that further error being suppressed and that the user's exception bubbles out of the `Context.open_context()` block(s) appropriately! Likely more tests to come as well as some factoring of the teardown state checks where possible. Pertains to serious testing the major work landing in #357	2025-03-14 13:40:19 -04:00
Tyler Goodlet	a7c36a9cbe	Tidy/clarify another `._runtime` comment	2025-03-14 13:40:19 -04:00
Tyler Goodlet	22e4b324b1	Get mega-pedantic in `Portal.open_context()` Specifically in the `.__aexit__()` phase to ensure remote, runtime-internal, and locally raised error-during-cancelled-handling exceptions are NEVER masked by a local `ContextCancelled` or any exception group of `trio.Cancelled`s. Also adds a ton of details to doc strings including extreme detail surrounding the `ContextCancelled` raising cases and their processing inside `.open_context()`'s exception handler blocks. Details, details: - internal rename `err`/`_err` stuff to just be `scope_err` since it's effectively the error bubbled up from the context's surrounding (and cross-actor) "scope". - always shield `._recv_chan.aclose()` to avoid any `Cancelled` from masking the `scope_err` with a runtime related `trio.Cancelled`. - explicitly catch the specific set of `scope_err: BaseException` that we can reasonably expect to handle instead of the catch-all parent type including exception groups, cancels and KBIs.	2025-03-14 13:40:18 -04:00
Tyler Goodlet	89ed8b67ff	Drop `msg` kwarg from `Context.cancel()` Well first off, turns out it's never used and generally speaking doesn't seem to help much with "runtime hacking/debugging"; why would we need to "fabricate" a msg when `.cancel()` is called to self-cancel? Also (and since `._maybe_cancel_and_set_remote_error()` now takes an `error: BaseException` as input and thus expects error-msg unpacking prior to being called), we now manually set `Context._cancel_msg: dict` just prior to any remote error assignment - so any case where we would have fabbed a "cancel msg" near calling `.cancel()`, just do the manual assign. In this vein some other subtle changes: - obviously don't set `._cancel_msg` in `.cancel()` since it's no longer an input. - generally do walrus-style `error := unpack_error()` before applying and setting remote error-msg state. - always raise any `._remote_error` in `.result()` instead of returning the exception instance and check before AND after the underlying mem chan read. - add notes/todos around `raise self._remote_error from None` masking of (runtime) errors in `._maybe_raise_remote_err()` and use it inside `.result()` since we had the inverse duplicate logic there anyway.. Further, this adds and extends a ton of (internal) interface docs and details comments around the `Context` API including many subtleties pertaining to calling `._maybe_cancel_and_set_remote_error()`.	2025-03-14 13:37:55 -04:00
Tyler Goodlet	11bbf15817	`._exceptions`: typing and error unpacking updates Bump type annotations to 3.10+ style throughout module as well as fill out doc strings a bit. Inside `unpack_error()` pop any `error_dict: dict` and, - return `None` early if not found, - versus pass directly as `**error_dict` to the error constructor instead of a double field read.	2025-03-14 13:36:16 -04:00
Tyler Goodlet	a18663213a	Add comments around diff between `C/context` refs	2025-03-14 13:36:16 -04:00
Tyler Goodlet	d4d09b6071	Factor non-yield stream msg processing into helper Since both `MsgStream.receive()` and `.receive_nowait()` need the same raising logic when a non-stream msg arrives (so that maybe an appropriate IPC translated error can be raised) move the `KeyError` handler code into a new `._streaming._raise_from_no_yield_msg()` func and call it from both methods to make the error-interface-raising symmetrical across both methods.	2025-03-14 13:36:16 -04:00
Tyler Goodlet	6d10f0c516	Always raise remote (cancelled) error if set Previously we weren't raising a remote error if the local scope was cancelled during a call to `Context.result()` which is problematic if the caller WAS NOT the requester for said remote cancellation; in that case we still want a `ContextCancelled` raised with the `.canceller: str` set to the cancelling actor uid. Further fix a naming bug where the (seemingly older) `._remote_err` was being set to such an error instead of `._remote_error` XD	2025-03-14 13:36:16 -04:00
Tyler Goodlet	fa9b57bae0	Write more comprehensive `Portal.cancel_actor()` doc str	2025-03-14 13:36:16 -04:00
Tyler Goodlet	81776a6238	Drop pause line from ctx cancel handler block in test	2025-03-14 13:36:16 -04:00
Tyler Goodlet	144d1f4d94	Msg-ified `ContextCancelled`s sub-error type should always be just, its type..	2025-03-14 13:36:16 -04:00
Tyler Goodlet	51fdf3524c	Start inter-peer cancellation test mod Move over relevant test from the "context semantics" test module which was already verifying peer-caused-`ContextCancelled.canceller: tuple` error info and propagation during an inter-peer cancellation scenario. Also begin a more general set of inter-peer cancellation tests starting with the simplest case where when a peer is cancelled the parent should NOT get an "muted" `trio.Cancelled` and instead a `tractor.ContextCancelled` with a `.canceller: tuple` which points to the sibling actor which requested the peer cancel.	2025-03-14 13:36:16 -04:00
Tyler Goodlet	cff69d07fe	Mk `gather_contexts()` support `@acm`s yielding `None` We were using a `all(<yielded values>)` condition which obviously won't work if the batched managers yield any non-truthy value. So instead see the `unwrapped: dict` with the `id(mngrs)` and only unblock once all values have been filled in to be something that is not that value.	2025-03-14 13:36:16 -04:00
Tyler Goodlet	ee94d6d62c	Teensie tidy up on actor doc string	2025-03-14 13:36:16 -04:00
Tyler Goodlet	89b84ed6c0	Make `NamespacePath` work on object refs Detect if the input ref is a non-func (like an `object` instance) in which case grab its type name using `type()`. Wrap all the name-getting into a new `_mk_fqpn()` static meth: gets the "fully qualified path name" and returns path and name in tuple; port other methds to use it. Refine and update the docs B)	2025-03-14 13:36:16 -04:00
Tyler Goodlet	f33f689f34	.log: more correct handling for `get_logger(__name__)` usage	2025-03-14 13:36:16 -04:00
Tyler Goodlet	7507e269ec	Just import `mp` top level in `._spawn`	2023-06-14 15:32:15 -04:00
Tyler Goodlet	17ae449160	Tidy up `typing` imports in broadcaster mod	2023-06-14 15:31:52 -04:00
Tyler Goodlet	6495688730	Drop `Optional` style from runtime mod	2023-05-25 16:00:05 -04:00
Tyler Goodlet	a0276f41c2	Remote cancellation runtime-internal vars renames - `Context._cancel_called_remote` -> `._cancelled_remote` since "called" implies the cancellation was "requested" when it could be due to another error and the actor uid is the value - only set once the far end task scope is terminated due to either error or cancel, which has nothing to do with what caused the cancellation. - `Actor._cancel_called_remote` -> `._cancel_called_by_remote` which emphasizes that this variable is only set IFF some remote actor requested that this actor's runtime be cancelled via `Actor.cancel()`.	2023-05-19 14:31:55 -04:00
Tyler Goodlet	ead9e418de	Expose `allow_overruns` to `Portal.open_context()` Turns out you can get a case where you might be opening multiple ctx-streams concurrently and during the context opening phase you block for all contexts to open, but then when you eventually start opening streams some slow to start context has caused the others become in an overrun state.. so we need to let the caller control whether that's an error ;) This also needs a test!	2023-05-15 10:00:45 -04:00
Tyler Goodlet	60791ed546	Oof, fix remaining `Actor.cancel()` in `Actor._from_parent()`	2023-05-15 10:00:45 -04:00
Tyler Goodlet	7293b82bcc	Tweak doc string	2023-05-15 10:00:45 -04:00
Tyler Goodlet	20d75ff934	Move move context code into new `._context` mod	2023-05-15 10:00:45 -04:00
Tyler Goodlet	041d7da721	Drop caller cancels overrun test; covered in new tests	2023-05-15 10:00:45 -04:00
Tyler Goodlet	04e4397a8f	Ignore drainer-task nursery RTE during context exit	2023-05-15 10:00:45 -04:00
Tyler Goodlet	968f13f9ef	Set `Context._scope_nursery` on callee side too Because obviously we probably want to support `allow_overruns` on the remote callee side as well XD Only found the bugs fixed in this patch this thanks to writing a much more exhaustive test set for overrun cases B)	2023-05-15 10:00:45 -04:00
Tyler Goodlet	f9911c22a4	Seriously cover all overrun cases This actually caught further runtime bugs so it's gud i tried.. Add overrun-ignore enabled / disabled cases and error catching for all of them. More or less this should cover every possible outcome when it comes to setting `allow_overruns: bool` i hope XD	2023-05-15 10:00:45 -04:00
Tyler Goodlet	63adf73b4b	Adjust aio test for silent cancellation by parent	2023-05-15 10:00:45 -04:00
Tyler Goodlet	f1e9c0be93	Fix cluster test to use `allow_overruns`	2023-05-15 10:00:45 -04:00
Tyler Goodlet	6db656fecf	Flip allocate log msgs to debug	2023-05-15 10:00:45 -04:00
Tyler Goodlet	6994d2026d	Drop brackpressure usage from fan out tests	2023-05-15 10:00:45 -04:00
Tyler Goodlet	c72026091e	Remote `Context` cancellation semantics rework B) This adds remote cancellation semantics to our `tractor.Context` machinery to more closely match that of `trio.CancelScope` but with operational differences to handle the nature of parallel tasks interoperating across multiple memory boundaries: - if an actor task cancels some context it has opened via `Context.cancel()`, the remote (scope linked) task will be cancelled using the normal `CancelScope` semantics of `trio` meaning the remote cancel scope surrounding the far side task is cancelled and `trio.Cancelled`s are expected to be raised in that scope as per normal `trio` operation, and in the case where no error is raised in that remote scope, a `ContextCancelled` error is raised inside the runtime machinery and relayed back to the opener/caller side of the context. - if any actor task cancels a full remote actor runtime using `Portal.cancel_actor()` the same semantics as above apply except every other remote actor task which also has an open context with the actor which was cancelled will also be sent a `ContextCancelled` but with the `.canceller` field set to the uid of the original cancel requesting actor. This changeset also includes a more "proper" solution to the issue of "allowing overruns" during streaming without attempting to implement any form of IPC streaming backpressure. Implementing task-granularity backpressure cross-process turns out to be more or less impossible without augmenting out streaming protocol (likely at the cost of performance). Further allowing overruns requires special care since any blocking of the runtime RPC msg loop task effectively can block control msgs such as cancels and stream terminations. The implementation details per abstraction layer are as follows. ._streaming.Context: - add a new contructor factor func `mk_context()` which provides a strictly private init-er whilst allowing us to not have to define an `.__init__()` on the type def. 
- add public `.cancel_called` and `.cancel_called_remote` properties. - general rename of what was the internal `._backpressure` var to `._allow_overruns: bool`. - move the old contents of `Actor._push_result()` into a new `._deliver_msg()` allowing for better encapsulation of per-ctx msg handling. - always check for received 'error' msgs and process them with the new `_maybe_cancel_and_set_remote_error()` before any msg delivery to the local task, thus guaranteeing error and cancellation handling despite any overflow handling. - add a new `._drain_overflows()` task-method for use with new `._allow_overruns: bool = True` mode. - add back a `._scope_nursery: trio.Nursery` (allocated in `Portal.open_context()`) who's sole purpose is to spawn a single task which runs the above method; anything else is an error. - augment `._deliver_msg()` to start a task and run the above method when operating in no overrun mode; the task queues overflow msgs and attempts to send them to the underlying mem chan using a blocking `.send()` call. - on context exit, any existing "drainer task" will be cancelled and remaining overflow queued msgs are discarded with a warning. - rename `._error` -> `_remote_error` and set it in a new method `_maybe_cancel_and_set_remote_error()` which is called before processing - adjust `.result()` to always call `._maybe_raise_remote_err()` at its start such that whenever a `ContextCancelled` arrives we do logic for whether or not to immediately raise that error or ignore it due to the current actor being the one who requested the cancel, by checking the error's `.canceller` field. - set the default value of `._result` to be `id(Context()` thus avoiding conflict with any `.result()` actually being `False`.. ._runtime.Actor: - augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to take a `requesting_uid: tuple` indicating the source actor of every cancellation request. 
- pass through the new `Context._allow_overruns` through `.get_context()` - call the new `Context._deliver_msg()` from `._push_result()` (since the factoring out that method's contents). ._runtime._invoke: - `TastStatus.started()` back a `Context` (unless an error is raised) instead of the cancel scope to make it easy to set/get state on that context for the purposes of cancellation and remote error relay. - always raise any remote error via `Context._maybe_raise_remote_err()` before doing any `ContextCancelled` logic. - assign any `Context._cancel_called_remote` set by the `requesting_uid` cancel methods (mentioned above) to the `ContextCancelled.canceller`. ._runtime.process_messages: - always pass a `requesting_uid: tuple` to `Actor.cancel()` and `._cancel_task` to that any corresponding `ContextCancelled.canceller` can be set inside `._invoke()`.	2023-05-15 10:00:45 -04:00
Tyler Goodlet	90e41016b9	Only tuplize `.canceller` if non-`None`	2023-05-15 10:00:45 -04:00
Tyler Goodlet	f54c415060	Move `NoRuntime` import inside `current_actor()` to avoid cycle	2023-05-15 10:00:45 -04:00
Tyler Goodlet	03644f59cc	Augment test cases for callee-returns-result early Turns out stuff was totally broken in these cases because we're either closing the underlying mem chan too early or not handling the "allow_overruns" mode's cancellation correctly..	2023-05-15 10:00:45 -04:00
Tyler Goodlet	67f82c6ebd	Add new remote error introspection attrs To handle both remote cancellation this adds `ContextCanceled.canceller: tuple` the uid of the cancel requesting actor and is expected to be set by the runtime when servicing any remote cancel request. This makes it possible for `ContextCancelled` receivers to know whether "their actor runtime" is the source of the cancellation. Also add an explicit `RemoteActor.src_actor_uid` which better formalizes the notion of "which remote actor" the error originated from. Both of these new attrs are expected to be packed in the `.msgdata` when the errors are loaded locally.	2023-05-15 10:00:45 -04:00
Tyler Goodlet	71cd445319	Add new set of context cancellation tests These will verify new changes to the runtime/messaging core which allows us to adopt an "ignore cancel if requested by us" style handling of `ContextCancelled` more like how `trio` does with `trio.Nursery.cancel_scope.cancel()`. We now expect a `ContextCancelled.canceller: tuple` which is set to the actor uid of the actor which requested the cancellation which eventually resulted in the remote error-msg. Also adds some experimental tweaks to the "backpressure" test which it turns out is very problematic in coordination with context cancellation since blocking on the feed mem chan to some task will block the ipc msg loop and thus handling of cancellation.. More to come to both the test and core to address this hopefully since right now this test is failing.	2023-05-15 10:00:45 -04:00
Tyler Goodlet	220b244508	Log waiter task cancelling msg as cancel-level	2023-05-15 10:00:45 -04:00
Tyler Goodlet	831790377b	Assign `RemoteActorError` boxed error type for context cancelleds	2023-05-15 10:00:45 -04:00
Tyler Goodlet	e80e0a551f	Change a bunch of log levels to cancel, including any `ContextCancelled` handling	2023-05-15 10:00:45 -04:00
Tyler Goodlet	b3f9251eda	Add some log-level method doc-strings	2023-05-15 10:00:45 -04:00

1712 Commits (pkg_tidying)
