tractor

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	20d75ff934	Move move context code into new `._context` mod	2023-05-15 10:00:45 -04:00
Tyler Goodlet	968f13f9ef	Set `Context._scope_nursery` on callee side too Because obviously we probably want to support `allow_overruns` on the remote callee side as well XD Only found the bugs fixed in this patch this thanks to writing a much more exhaustive test set for overrun cases B)	2023-05-15 10:00:45 -04:00
Tyler Goodlet	c72026091e	Remote `Context` cancellation semantics rework B) This adds remote cancellation semantics to our `tractor.Context` machinery to more closely match that of `trio.CancelScope` but with operational differences to handle the nature of parallel tasks interoperating across multiple memory boundaries: - if an actor task cancels some context it has opened via `Context.cancel()`, the remote (scope linked) task will be cancelled using the normal `CancelScope` semantics of `trio` meaning the remote cancel scope surrounding the far side task is cancelled and `trio.Cancelled`s are expected to be raised in that scope as per normal `trio` operation, and in the case where no error is raised in that remote scope, a `ContextCancelled` error is raised inside the runtime machinery and relayed back to the opener/caller side of the context. - if any actor task cancels a full remote actor runtime using `Portal.cancel_actor()` the same semantics as above apply except every other remote actor task which also has an open context with the actor which was cancelled will also be sent a `ContextCancelled` but with the `.canceller` field set to the uid of the original cancel requesting actor. This changeset also includes a more "proper" solution to the issue of "allowing overruns" during streaming without attempting to implement any form of IPC streaming backpressure. Implementing task-granularity backpressure cross-process turns out to be more or less impossible without augmenting out streaming protocol (likely at the cost of performance). Further allowing overruns requires special care since any blocking of the runtime RPC msg loop task effectively can block control msgs such as cancels and stream terminations. The implementation details per abstraction layer are as follows. ._streaming.Context: - add a new contructor factor func `mk_context()` which provides a strictly private init-er whilst allowing us to not have to define an `.__init__()` on the type def. - add public `.cancel_called` and `.cancel_called_remote` properties. - general rename of what was the internal `._backpressure` var to `._allow_overruns: bool`. - move the old contents of `Actor._push_result()` into a new `._deliver_msg()` allowing for better encapsulation of per-ctx msg handling. - always check for received 'error' msgs and process them with the new `_maybe_cancel_and_set_remote_error()` before any msg delivery to the local task, thus guaranteeing error and cancellation handling despite any overflow handling. - add a new `._drain_overflows()` task-method for use with new `._allow_overruns: bool = True` mode. - add back a `._scope_nursery: trio.Nursery` (allocated in `Portal.open_context()`) who's sole purpose is to spawn a single task which runs the above method; anything else is an error. - augment `._deliver_msg()` to start a task and run the above method when operating in no overrun mode; the task queues overflow msgs and attempts to send them to the underlying mem chan using a blocking `.send()` call. - on context exit, any existing "drainer task" will be cancelled and remaining overflow queued msgs are discarded with a warning. - rename `._error` -> `_remote_error` and set it in a new method `_maybe_cancel_and_set_remote_error()` which is called before processing - adjust `.result()` to always call `._maybe_raise_remote_err()` at its start such that whenever a `ContextCancelled` arrives we do logic for whether or not to immediately raise that error or ignore it due to the current actor being the one who requested the cancel, by checking the error's `.canceller` field. - set the default value of `._result` to be `id(Context()` thus avoiding conflict with any `.result()` actually being `False`.. ._runtime.Actor: - augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to take a `requesting_uid: tuple` indicating the source actor of every cancellation request. - pass through the new `Context._allow_overruns` through `.get_context()` - call the new `Context._deliver_msg()` from `._push_result()` (since the factoring out that method's contents). ._runtime._invoke: - `TastStatus.started()` back a `Context` (unless an error is raised) instead of the cancel scope to make it easy to set/get state on that context for the purposes of cancellation and remote error relay. - always raise any remote error via `Context._maybe_raise_remote_err()` before doing any `ContextCancelled` logic. - assign any `Context._cancel_called_remote` set by the `requesting_uid` cancel methods (mentioned above) to the `ContextCancelled.canceller`. ._runtime.process_messages: - always pass a `requesting_uid: tuple` to `Actor.cancel()` and `._cancel_task` to that any corresponding `ContextCancelled.canceller` can be set inside `._invoke()`.	2023-05-15 10:00:45 -04:00
Tyler Goodlet	e80e0a551f	Change a bunch of log levels to cancel, including any `ContextCancelled` handling	2023-05-15 10:00:45 -04:00
Tyler Goodlet	d75343106b	More single doc-strs in discovery mod	2023-05-15 10:00:45 -04:00
Tyler Goodlet	df01294bb2	Show more functiony syntax in ctx-cancelled log msgs	2023-01-29 14:55:02 -05:00
Tyler Goodlet	ddf3d0d1b3	Show tracebacks for un-shipped/propagated errors	2023-01-29 14:55:02 -05:00
Tyler Goodlet	97d5f7233b	Fix uid2nursery lookup table type annot	2023-01-29 14:55:02 -05:00
Tyler Goodlet	d27c081a15	Ensure arbiter sockaddr type before usage	2023-01-29 14:55:02 -05:00
Tyler Goodlet	4f977189c0	Handle broken mem chan on `Actor._push_result()` When backpressure is used and a feeder mem chan breaks during msg delivery (usually because the IPC allocating task already terminated) instead of raising we simply warn as we do for the non-backpressure case. Also, add a proper `Actor.is_arbiter` test inside `._invoke()` to avoid doing an arbiter-registry lookup if the current actor is the registrar.	2023-01-29 14:55:02 -05:00
Tyler Goodlet	fca2e7c10e	Simplify closed abruptly log msg	2023-01-26 12:44:13 -05:00
Tyler Goodlet	6c8cacc9d1	Adjust all default is `None` annots (per new `mypy`)	2022-12-12 13:18:22 -05:00
Tyler Goodlet	0956d5f461	Restore the `trio` SIGINT handler, cancel root lock tasks on no-peers Pretty sure this is the final touch to alleviate all our debug lock headaches! Instead of trying to revert to the "last" handler (as `pdb` does internally in the stdlib) we always just revert to the handler `trio` registers during startup. Further this seems to allow cancelling the root-side locking task if it's detected as stale IFF we only do this when the root actor is in a "no more IPC peers" state. Deatz: - (always) set `._debug.Lock._trio_handler` as the `trio` version, not some last used handler to make sure we're getting the ctrl-c handling we want when not in debug mode. - assign the trio handler in `open_root_actor()` `._runtime._async_main()` to be sure it's applied in subactors as well as the root. - only do debug lock blocking and root-side-locking-task cancels when a "no peers" condition is detected in the root actor: i.e. no IPC channels are detected by the root meaning it's impossible any actor has a sane lock-state ongoing for debug mode.	2022-10-14 18:18:01 -04:00
Tyler Goodlet	50fe098e06	First pass, swap `MultiError` for `BaseExceptionGroup`	2022-10-14 18:16:51 -04:00
Tyler Goodlet	b81b6be98a	Drop extra log msgs, some old commented code	2022-10-12 12:35:35 -04:00
Tyler Goodlet	fb721f36ef	Support debug-lock blocking, use on no-more IPC This is a lingering debugger locking race case we needed to handle: - child crashes acquires TTY lock in root and attaches to `pdb` - child IPC goes down such that all channels to the root are broken / non-functional. - root is stuck thinking the child is still in debug even though it can't be contacted and the child actor machinery hasn't been cancelled by its parent. - root get's stuck in deadlock with child since it won't send a cancel request until the child is finished debugging, but the child can't unlock the debugger bc IPC is down. To avoid this scenario add debug lock blocking list via `._debug.Lock._blocked: set[tuple]` which holds actor uids for any actor that is detected by the root as having no transport channel connections with said root (of which at least one should exist if this sub-actor at some point acquired the debug lock). The root consequently checks this list for any actor that tries to (re)acquire the lock and blocks with a `ContextCancelled`. When a debug condition is tested in `._runtime._invoke` the context's `._enter_debugger_on_cancel` which is set to `False` if the actor is on the block list in which case the post-mortem entry is skipped. Further this adds a root-locking-task side cancel scope to `Lock._root_local_task_cs_in_debug` which can be cancelled by the root runtime when a stale lock is detected after all IPC channels for the actor have been torn down. NOTE: right now we're NOT doing this since it seems to cause test failures likely due because it may cause pre-mature cancellation and maybe needs a bit more experimenting?	2022-10-11 20:00:05 -04:00
Tyler Goodlet	e609183242	Expose lifetime stack as class attr, add base test suite	2022-09-15 23:50:15 -04:00
Tyler Goodlet	7548dba8f2	Change to new doc string style	2022-09-15 23:41:28 -04:00
Tyler Goodlet	208d56af2c	Make `async_main()` a module func	2022-09-15 23:41:28 -04:00
Tyler Goodlet	a3a5bc267e	Make `process_messages()` a mod func	2022-09-15 23:41:28 -04:00
Tyler Goodlet	d4084b2032	Rename our core module to `_runtime`	2022-09-15 23:41:28 -04:00

1 2

71 Commits (636b29e4406f9202dfb21ad1df5804ba666f4f2e)