tractor

jc211

Author	SHA1	Message	Date
Tyler Goodlet	04e4397a8f	Ignore drainer-task nursery RTE during context exit	2023-05-15 10:00:45 -04:00
Tyler Goodlet	f9911c22a4	Seriously cover all overrun cases This actually caught further runtime bugs so it's gud i tried.. Add overrun-ignore enabled / disabled cases and error catching for all of them. More or less this should cover every possible outcome when it comes to setting `allow_overruns: bool` i hope XD	2023-05-15 10:00:45 -04:00
Tyler Goodlet	c72026091e	Remote `Context` cancellation semantics rework B) This adds remote cancellation semantics to our `tractor.Context` machinery to more closely match that of `trio.CancelScope` but with operational differences to handle the nature of parallel tasks interoperating across multiple memory boundaries: - if an actor task cancels some context it has opened via `Context.cancel()`, the remote (scope linked) task will be cancelled using the normal `CancelScope` semantics of `trio` meaning the remote cancel scope surrounding the far side task is cancelled and `trio.Cancelled`s are expected to be raised in that scope as per normal `trio` operation, and in the case where no error is raised in that remote scope, a `ContextCancelled` error is raised inside the runtime machinery and relayed back to the opener/caller side of the context. - if any actor task cancels a full remote actor runtime using `Portal.cancel_actor()` the same semantics as above apply except every other remote actor task which also has an open context with the actor which was cancelled will also be sent a `ContextCancelled` but with the `.canceller` field set to the uid of the original cancel requesting actor. This changeset also includes a more "proper" solution to the issue of "allowing overruns" during streaming without attempting to implement any form of IPC streaming backpressure. Implementing task-granularity backpressure cross-process turns out to be more or less impossible without augmenting our streaming protocol (likely at the cost of performance). Further allowing overruns requires special care since any blocking of the runtime RPC msg loop task effectively can block control msgs such as cancels and stream terminations. The implementation details per abstraction layer are as follows. ._streaming.Context: - add a new constructor factory func `mk_context()` which provides a strictly private init-er whilst allowing us to not have to define an `.__init__()` on the type def. 
- add public `.cancel_called` and `.cancel_called_remote` properties. - general rename of what was the internal `._backpressure` var to `._allow_overruns: bool`. - move the old contents of `Actor._push_result()` into a new `._deliver_msg()` allowing for better encapsulation of per-ctx msg handling. - always check for received 'error' msgs and process them with the new `_maybe_cancel_and_set_remote_error()` before any msg delivery to the local task, thus guaranteeing error and cancellation handling despite any overflow handling. - add a new `._drain_overflows()` task-method for use with new `._allow_overruns: bool = True` mode. - add back a `._scope_nursery: trio.Nursery` (allocated in `Portal.open_context()`) whose sole purpose is to spawn a single task which runs the above method; anything else is an error. - augment `._deliver_msg()` to start a task and run the above method when operating in no overrun mode; the task queues overflow msgs and attempts to send them to the underlying mem chan using a blocking `.send()` call. - on context exit, any existing "drainer task" will be cancelled and remaining overflow queued msgs are discarded with a warning. - rename `._error` -> `_remote_error` and set it in a new method `_maybe_cancel_and_set_remote_error()` which is called before processing - adjust `.result()` to always call `._maybe_raise_remote_err()` at its start such that whenever a `ContextCancelled` arrives we do logic for whether or not to immediately raise that error or ignore it due to the current actor being the one who requested the cancel, by checking the error's `.canceller` field. - set the default value of `._result` to be `id(Context()` thus avoiding conflict with any `.result()` actually being `False`.. ._runtime.Actor: - augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to take a `requesting_uid: tuple` indicating the source actor of every cancellation request. 
- pass the new `Context._allow_overruns` through `.get_context()` - call the new `Context._deliver_msg()` from `._push_result()` (since factoring out that method's contents). ._runtime._invoke: - `TaskStatus.started()` back a `Context` (unless an error is raised) instead of the cancel scope to make it easy to set/get state on that context for the purposes of cancellation and remote error relay. - always raise any remote error via `Context._maybe_raise_remote_err()` before doing any `ContextCancelled` logic. - assign any `Context._cancel_called_remote` set by the `requesting_uid` cancel methods (mentioned above) to the `ContextCancelled.canceller`. ._runtime.process_messages: - always pass a `requesting_uid: tuple` to `Actor.cancel()` and `._cancel_task` so that any corresponding `ContextCancelled.canceller` can be set inside `._invoke()`.	2023-05-15 10:00:45 -04:00
Tyler Goodlet	903537ce04	Tweak context doc str	2023-05-15 10:00:45 -04:00
Tyler Goodlet	cfb2bc0fee	Enable `Context` backpressure by default; avoid startup race-crashes?	2023-05-15 10:00:45 -04:00
Tyler Goodlet	80f983818f	Ignore monkey patched `.send()` type annot	2023-01-29 15:03:34 -05:00
Tyler Goodlet	9f9907271b	Merge `ReceiveMsgStream` and `MsgStream` Since one-way streaming can be accomplished by just not sending on one side (and/or thus wrapping such usage in a more restrictive API), we just drop the recv-only parent type. The only method different was `MsgStream.send()`, now merged in. Further in usage of `.subscribe()` we monkey patch the underlying stream's `.send()` onto the delivered broadcast receiver so that subscriber tasks can two-way stream as though using the stream directly. This allows us to more definitively drop `tractor.open_stream_from()` in the longer run if we so choose as well; note currently this will potentially create an issue if a caller tries to `.send()` on such a one way stream.	2023-01-29 15:03:34 -05:00
Tyler Goodlet	aa4871b13d	Call `MsgStream.aclose()` in `Context.open_stream.__aexit__()` We weren't doing this originally I think just because of the path dependent nature of the way the code was developed (originally being mega pedantic about one-way vs. bidirectional streams) but, it doesn't seem like there's any issue just calling the stream's `.aclose()`; also have the benefit of just being less code and logic checks B)	2023-01-29 14:55:02 -05:00
Tyler Goodlet	de04bbb2bb	Don't raise on a broken IPC-context when sending stop msg	2023-01-29 14:55:02 -05:00
Tyler Goodlet	fb721f36ef	Support debug-lock blocking, use on no-more IPC This is a lingering debugger locking race case we needed to handle: - child crashes, acquires the TTY lock in root and attaches to `pdb` - child IPC goes down such that all channels to the root are broken / non-functional. - root is stuck thinking the child is still in debug even though it can't be contacted and the child actor machinery hasn't been cancelled by its parent. - root gets stuck in deadlock with child since it won't send a cancel request until the child is finished debugging, but the child can't unlock the debugger bc IPC is down. To avoid this scenario add a debug lock blocking list via `._debug.Lock._blocked: set[tuple]` which holds actor uids for any actor that is detected by the root as having no transport channel connections with said root (of which at least one should exist if this sub-actor at some point acquired the debug lock). The root consequently checks this list for any actor that tries to (re)acquire the lock and blocks with a `ContextCancelled`. When a debug condition is tested in `._runtime._invoke`, the context's `._enter_debugger_on_cancel` is checked; it is set to `False` if the actor is on the block list, in which case the post-mortem entry is skipped. Further this adds a root-locking-task side cancel scope to `Lock._root_local_task_cs_in_debug` which can be cancelled by the root runtime when a stale lock is detected after all IPC channels for the actor have been torn down. NOTE: right now we're NOT doing this since it seems to cause test failures, likely because it may cause premature cancellation and maybe needs a bit more experimenting?	2022-10-11 20:00:05 -04:00
Tyler Goodlet	1c480e6c92	Add `Context` cancel message and debug toggle flag In the case of a callee-side context cancelling itself it can be handy to let the caller-side task know (even if through logging) that the cancel was due to some known reason. Make `.cancel()` accept such a message on the callee side and have it included in the `._runtime._invoke()` raised `ContextCancelled` emission. Also add a `Context._trigger_debugger_on_cancel: bool` flag which can be set to `False` to avoid the debugger post-mortem crash mode from engaging on cross-context tasks which cancel themselves for a known reason (as is needed for blocked tasks in the debug TTY-lock machinery).	2022-10-11 20:00:05 -04:00
Tyler Goodlet	10eeda2d2b	Use built-ins for all data-structure-type annotations	2022-09-15 23:41:28 -04:00
Tyler Goodlet	c7035be2fc	Tolerate double `.remove()`s of stream on portal teardowns	2022-07-27 11:40:02 -04:00
Tyler Goodlet	bf0ac3116c	Only cancel/get-result from a ctx if transport is up There's no point in sending a cancel message to the remote linked task and especially no reason to block waiting on a result from that task if the transport layer is detected to be disconnected. We expect that the transport shouldn't go down at the layer of the message loop (reconnection logic should be handled in the transport layer itself) so if we detect the channel is not connected we don't bother requesting cancels nor waiting on a final result message. Why? - if the connection goes down in error the caller side won't have a way to know "how long" it should block to wait for a cancel ack or result and causes a potential hang that may require an additional ctrl-c from the user especially if using the debugger or if the traceback is not seen on console. - obviously there's no point in waiting for messages when there's no transport to deliver them XD Further, add some more detailed cancel logging detailing the task and actor ids.	2022-07-27 11:40:02 -04:00
Tyler Goodlet	e2139c2bf0	Don't set `Context._error` to expected `ContextCancelled` If the one side of an inter-actor context cancels the other then that side should always expect back a `ContextCancelled` message. However we should not set this error in this case (where the cancel request was sent and a `ContextCancelled` msg was received back) since it may override some other error that caused the cancellation request to be sent out in the first place. As an example when a context opens another context to a peer and some error happens which causes the second peer context to be cancelled but we want to propagate the original error. Fixes the issue found in https://github.com/pikers/piker/issues/244	2021-12-16 19:59:10 -05:00
Tyler Goodlet	5d424e3703	Hide the key error tb on remote starting errors	2021-12-16 19:59:10 -05:00
Tyler Goodlet	57f2aca18c	Set eoc on closure (again)	2021-12-16 16:19:15 -05:00
Tyler Goodlet	f2ba961e81	Mark stream with EOC when stop message is received	2021-12-16 16:18:58 -05:00
Tyler Goodlet	6f94ffc304	Re-license code base for distribution under AGPL This commit obviously denotes a re-license of all applicable parts of the code base. Acknowledgement of this change was completed in #274 by the majority of the current set of contributors. From here henceforth all changes will be AGPL licensed and distributed. This is purely an effort to maintain the same copy-left policy whilst closing the (perceived) SaaS loophole the GPL allows for. It is merely for this loophole: to avoid code hiding by any potential "network providers" who are attempting to use the project to make a profit without either compensating the authors or re-distributing their changes. I thought quite a bit about this change and can't see a reason not to close the SaaS loophole in our current license. We still are (hard) copy-left and I plan to keep the code base this way for a couple reasons: - The code base produces income/profit through parent projects and is demonstrably of high value. - I believe firms should not get free lunch for the sake of "contributions from their employees" or "usage as a service" which I have found to be a dubious argument at best. - If a firm who intends to profit from the code base wants to use it they can propose a secondary commercial license to purchase with the proceeds going to the project's authors under some form of well defined contract. - Many successful projects like Qt use this model; I see no reason it can't work in this case until such a time as the authors feel it should be loosened. There has been detailed discussion in #103 on licensing alternatives. The main point of this AGPL change is to protect the code base for the time being from exploitation while it grows and as we move into the next phase of development which will include extension into the multi-host distributed software space.	2021-12-14 23:33:27 -05:00
Tyler Goodlet	c9132de7dc	Move maybe-raise-error-msg logic into context A context method handling all this logic makes the most sense since it contains all the state related to whether the error should be raised in a nursery scope or is expected to be raised by a consumer task which reads and processes the msg directly (via a `Portal` API call). This also makes it easy to always process remote errors even when there is no (stream) overrun condition.	2021-12-06 16:32:23 -05:00
Tyler Goodlet	b826ec8103	Better idea, enable backpressure on opened streams Keeping it disabled on context open will help with detecting any stream connection which was never opened on one side of the task pair. In that case we can report that there was an overrun and a stream wasn't opened versus if the stream is explicitly configured not to use bp then we throw the standard overflow. Use `trio.Nursery._closed` to detect "closure" XD since it seems to be the most reliable way to determine if a spawn call will trigger a runtime error.	2021-12-06 11:54:21 -05:00
Tyler Goodlet	92b540d518	Add internal msg stream backpressure controls In preparation for supporting both backpressure detection (through an optional error) as well as control over the msg channel buffer size, add internal configuration flags for both to contexts. Also adjust `Context._err_on_from_remote_msg()` -> `._maybe..` such that it can be called and will only raise if a scope nursery has been set. Add a `Context._error` for stashing the remote task's error that may be delivered in an `'error'` message.	2021-12-05 19:19:53 -05:00
Tyler Goodlet	d307eab118	Rework `Actor.send_cmd()` to `.start_remote_task()` This more formally declares the runtime's remote task starting API and uses it throughout all the dependent `Portal` API methods. Allows dropping `Portal._submit()` and simplifying `.run_in_actor()` style result waiting to be delegated to the context APIs at remote task `return` response time. We now also track the remote entrypoint "type" as `Context._remote_func_type`.	2021-12-04 18:20:43 -05:00
Tyler Goodlet	c5c3f7e789	Use `tractor.Context` throughout the runtime core Instead of tracking feeder mem chans per RPC dialog, store `Context` instances which (now) hold refs to the underlying RPC-task feeder chans and track them inside a `Actor._contexts` map. This begins a transition to making the "context" idea the primitive abstraction for representing messaging dialogs between tasks in different memory domains (i.e. usually separate processes). A slew of changes made this possible: - change `Actor.get_memchans()` -> `.get_context()`. - Add new `Context._send_chan` and `._recv_chan` vars. - implicitly create a new context on every `Actor.send_cmd()` call. - use the context created by `.send_cmd()` in `Portal.open_context()` instead of manually creating one. - call `Actor.get_context()` inside tasks run from `._invoke()` such that feeder chans are implicitly created for callee tasks thus fixing the bug #265. NB: We might change some of the internal semantics to do with when the feeder chans are actually created to denote whether or not a far end task is actually ready to receive messages. For example, in the cases where it never will be ready to receive messages (one-way streaming, a context that never opens a stream, etc.) we will likely want some kind of error or at least warning to the caller that messages can't be sent (yet).	2021-12-03 14:49:55 -05:00
Tyler Goodlet	f4793af2b9	Error on mal-use of `Context.started()` Previously we were ignoring a race where the callee of an opened task context could enter `Context.open_stream()` before calling `.started()`. Disallow this as well as calling `.started()` more than once.	2021-12-03 10:08:55 -05:00
Tyler Goodlet	4114eb1d25	Move broadcast channel parts into trionics	2021-10-23 15:54:40 -04:00
Tyler Goodlet	10f66e5141	De-noise warnings, add a 'cancel' log level Now that we're on our way to a (somewhat) serious beta release I think it's about time to start de-noising the logging emissions. Since we're trying out this approach of "stack layer oriented" log levels, I figured this is a good time to move most of the "warnings" to what they should be: cancellation monitoring status messages. The level is set to 16 which is just above our "runtime" level but just below the traditional "info" level. I think this will be a decent approach since usually if you're confused about why your `tractor` app is behaving unlike you expect, it's 90% of the time going to be to do with cancellation or error propagation. With this setup a user can specify the 'cancel' level and see all the msgs pertaining to both actor and task-in-actor cancellation mechanics.	2021-10-06 17:02:13 -04:00
Tyler Goodlet	8b416e6bba	Stream and context api tweaks - drop `shield` input to `MsgStream` - check for cancel called prior to loading the feeder mem chan in `Context.open_stream()` - warn on a timeout when trying to cancel a remote task from `Context.cancel()` - drop noop endofchannel handler block	2021-10-04 10:38:44 -04:00
Tyler Goodlet	63ec740e27	Add some bcaster ref sanity asserts around subscriptions	2021-09-02 21:12:54 -04:00
Tyler Goodlet	a4cb0ef21f	Fix `.receive()` re-assignment, drop `.clone()`	2021-09-02 21:12:54 -04:00
Tyler Goodlet	6c17c7367a	Store handle to underlying channel's `.receive()` This allows for wrapping an existing stream by re-assigning its receive method to the allocated broadcaster's `.receive()` so as to avoid expecting any original consumer(s) of the stream to now know about the broadcaster; this instead mutates the stream to delegate to the new receive call behind the scenes any time `.subscribe()` is called. Add a `typing.Protocol` for so called "cloneable channels" until we decide/figure out a better keying system for each subscription and mask all undesired typing failures.	2021-09-02 21:12:54 -04:00
Tyler Goodlet	2d1c24112b	Add subscription support to message streams Add `ReceiveMsgStream.subscribe()` which allows allocating a broadcast receiver around the stream for use by multiple actor-local consumer tasks. Entering this context manager idempotently mutates the stream's receive machinery which for now can not be undone. Move `.clone()` to the receive stream type. Resolves #204	2021-09-02 21:12:54 -04:00
Tyler Goodlet	b4d95e9543	Update docs to new close semantics	2021-09-02 08:24:18 -04:00
Tyler Goodlet	af85d35685	Drop stream shielding; it was from a legacy design The whole origin was not having an explicit open/close semantic for streams. We have that now so this internal mechanic isn't needed and further our streams become more correct by having `.aclose()` be independent of cancellation.	2021-09-02 08:24:18 -04:00
Tyler Goodlet	fde52d2464	Mypy fixes	2021-07-08 12:48:34 -04:00
Tyler Goodlet	b21e2a6caa	Add pre-stream open error conditions	2021-07-06 08:23:30 -04:00
Tyler Goodlet	91640facbc	Always shield cancel the caller on cancel-causing-errors, add teardown logging	2021-07-06 08:23:30 -04:00
Tyler Goodlet	3423ea4011	Add temp warning msg for context cancel call	2021-07-06 08:23:29 -04:00
Tyler Goodlet	af701c16ee	Consider relaying context error via raised-in-scope-nursery task	2021-07-06 08:23:29 -04:00
Tyler Goodlet	1703171bea	Set stream "end of channel" after shielded check! Another face palm that was causing serious issues for code that is using the `.shielded` feature.. Add a bunch more detailed comments for all this subtlety and hopefully get it right once and for all. Also aggregated the `trio` errors that should trigger closure inside `.aclose()`, hopefully that's right too.	2021-07-06 08:23:29 -04:00
Tyler Goodlet	3d633408fc	Don't clobber msg loop mem chan on rx stream close Revert this change since it really is poking at internals and doesn't make a lot of sense. If the context is going to be cancelled then the msg loop will tear down the feed memory channel when ready, we don't need to be clobbering it and confusing the runtime machinery lol.	2021-07-06 08:23:29 -04:00
Tyler Goodlet	348148ff1e	Explicitly formalize context/streaming teardown Add clear teardown semantics for `Context` such that the remote side cancellation propagation happens only on error or if client code explicitly requests it (either by exit flag to `Portal.open_context()` or by manually calling `Context.cancel()`). Add `Context.result()` to wait on and capture the final result from a remote context function; any lingering msg sequence will be consumed/discarded. Changes in order to make this possible: - pass the runtime msg loop's feeder receive channel in to the context on the calling (portal opening) side such that a final 'return' msg can be waited upon using `Context.result()` which delivers the final return value from the callee side `@tractor.context` async function. - always await a final result from the target context function in `Portal.open_context()`'s `__aexit__()` if the context has not been (requested to be) cancelled by client code on block exit. - add an internal `Context._cancel_called` for context "cancel requested" tracking (much like `trio`'s cancel scope). - allow flagging a stream as terminated using an internal `._eoc` flag which will mark the stream as stopped for iteration. - drop `StopAsyncIteration` catching in `.receive()`; it does nothing.	2021-07-06 08:23:29 -04:00
Tyler Goodlet	9a4244b9a6	Support no arg to `Context.started()` like trio	2021-07-06 08:23:29 -04:00
Tyler Goodlet	a2e2f7e7a8	Only send stop msg if not received from far end	2021-07-06 08:23:29 -04:00
Tyler Goodlet	08eb6bd019	Fix typing	2021-07-06 08:23:29 -04:00
Tyler Goodlet	1f8966ba64	Support passing `shield` at stream construction	2021-07-06 08:23:29 -04:00
Tyler Goodlet	e3955bb62b	Add initial bi-directional streaming This mostly adds the api described in https://github.com/goodboy/tractor/issues/53#issuecomment-806258798 The first draft summary: - formalize bidir streaming using the `trio.Channel` style interface which we derive as a `MsgStream` type. - add `Portal.open_context()` which provides a `trio.Nursery.start()` remote task invocation style for setting up and tearing down task contexts in remote actors. - add a distinct `'started'` message to the ipc protocol to facilitate `Context.start()` with a first return value. - for our `ReceiveMsgStream` type, don't cancel the remote task in `.aclose()`; this is now done explicitly by the surrounding `Context` usage: `Context.cancel()`. - streams in either direction still use a `'yield'` message keeping the proto mostly symmetric without having to worry about which side is the caller / portal opener. - subtlety: only allow sending a `'stop'` message during a 2-way streaming context from `ReceiveStream.aclose()`, detailed comment with explanation is included. Relates to #53	2021-07-06 08:23:29 -04:00
Tyler Goodlet	fc36e73628	Comment out `MsgStream` for now	2021-04-28 16:40:38 -04:00
Tyler Goodlet	80c96cab01	Add a warning for soon to be deprecated `ctx` use in `@stream` func	2021-04-28 12:23:08 -04:00
Tyler Goodlet	7f38b7225d	Aggregate and organize streaming components Move receive stream into streaming modules and rebrand as a "message stream". Factor out cancellation mechanics in `.aclose()` into the `Context` type which will soon provide the api for cancelling portal invocations. Comment-stage a few methods on both types in anticipation of a new bi-directional streaming api. Add a `MsgStream` bidirectional channel type which will be the eventual type yielded from `Context.open_stream()`. Adjust the response/dialog types to be the set `{'asyncfun', 'asyncgen', 'context'}`. OH, and add async func checking in `Portal.run()` to catch and error on sync funcs early.	2021-04-28 12:23:08 -04:00
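
Commit 10f66e5141 above describes a 'cancel' log level set to 16, just below the traditional "info" level. The following is a minimal sketch of that idea using only the standard `logging` module; it is not tractor's actual logging code, and the names `CANCEL`, `make_logger` and `demo_cancel_level` are hypothetical, chosen only for illustration.

```python
import io
import logging

# Value taken from the commit message; it sits below logging.INFO (20)
# and above logging.DEBUG (10). The name "CANCEL" is an assumption.
CANCEL = 16
logging.addLevelName(CANCEL, "CANCEL")


def make_logger(name: str, level: int, stream) -> logging.Logger:
    """Build an isolated logger that writes '<LEVEL> <message>' lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.handlers = [handler]
    return logger


def demo_cancel_level() -> str:
    """Emit three records with the threshold set at the cancel level."""
    buf = io.StringIO()
    log = make_logger("cancel_level_demo", CANCEL, buf)
    log.debug("hidden: debug is below the cancel level")
    log.log(CANCEL, "task cancelled by peer")
    log.info("still shown: info is above the cancel level")
    return buf.getvalue()
```

Selecting the cancel level as the threshold shows cancellation messages and everything more severe, while lower-level noise stays hidden.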

54 Commits (04e4397a8f5ab078935b7964e078ad389815e977)
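
Commit c72026091e says that `.result()` checks a `ContextCancelled` error's `.canceller` field to decide whether to raise it or ignore it because the current actor requested the cancel. A rough, library-free sketch of that decision follows. The classes here are stand-ins and not tractor's real types; `SketchContext`, `local_uid` and the method name without a leading underscore are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


class ContextCancelled(Exception):
    """Stand-in error carrying the uid of the actor that requested the cancel."""

    def __init__(self, msg: str, canceller: tuple[str, str]) -> None:
        super().__init__(msg)
        self.canceller = canceller


@dataclass
class SketchContext:
    """Hypothetical, heavily simplified model of one side of a context."""

    local_uid: tuple[str, str]
    cancel_called: bool = False
    remote_error: Exception | None = None

    def maybe_raise_remote_err(self) -> None:
        err = self.remote_error
        if err is None:
            return
        # A cancel that this side requested itself is expected, so ignore it.
        if (
            isinstance(err, ContextCancelled)
            and self.cancel_called
            and err.canceller == self.local_uid
        ):
            return
        # Anything else (a peer's cancel or a real remote error) is raised.
        raise err
```

For example, a context whose own uid matches `.canceller` returns quietly, whereas a `ContextCancelled` naming some other actor is raised to the caller.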