Now supports use from any `trio` task, any sync thread started with
`trio.to_thread.run_sync()` AND also via the `breakpoint()` builtin API!
The only bit missing now is support for `asyncio` tasks when in infected
mode.. Bo
`greenback` setup/API adjustments:
- move `._rpc.maybe_import_gb()` to -> `devx._debug` and factor out the cached
import checking into a sync func whilst placing the async `.ensure_portal()`
bootstrapping into a new async `maybe_init_greenback()`.
- use the new init-er func inside `open_root_actor()` with the output
  predicating whether we override the `breakpoint()` hook (see the
  sketch below).
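A rough sketch of that split; only the names come from this change,
the bodies are illustrative:

```python
import importlib

_gb_mod = None  # cached import result

def maybe_import_gb():
    # sync, cached check for an available `greenback` install
    global _gb_mod
    if _gb_mod is None:
        try:
            _gb_mod = importlib.import_module('greenback')
        except ModuleNotFoundError:
            _gb_mod = False
    return _gb_mod or None

async def maybe_init_greenback(**kwargs) -> bool:
    if (gb := maybe_import_gb()) is None:
        return False
    # bootstrap the portal needed by `.pause_from_sync()`
    await gb.ensure_portal()
    return True
```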
core `devx._debug` implementation deatz:
- make `mk_mpdb()` only return the `pdbp.Pdb` subtype instance since
the sigint unshielding func is now accessible from the `Lock`
singleton from anywhere.
- add non-main thread support (at least for `trio.to_thread` use cases)
  to our `Lock` with a new `.is_trio_thread()` predicate that delegates
  directly to `trio`'s internal version (see the sketch after this
  list).
- do `Lock.is_trio_thread()` checks inside any methods which require
special provisions when invoked from a non-main `trio` thread:
- `.[un]shield_sigint()` methods since `signal.signal` usage is only
allowed from cpython's main thread.
- `.release()` since `trio.StrictFIFOLock` can only be called from
a `trio` task.
- rework `.pause_from_sync()` itself to directly call `._set_trace()`
  and not bother with `greenback._await()` when we're already calling
  it from a `.to_thread.run_sync()` thread, oh and try to use the
  thread/task name when setting `Lock.local_task_in_debug`.
- make it an RTE (`RuntimeError`) for now if you try to use
  `.pause_from_sync()` from any infected-`asyncio` task, but support is
  (hopefully) coming soon!
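A sketch of those provisions, assuming `trio._util.is_main_thread()`
is the internal helper we delegate to and that the re-entry from
a `to_thread` worker goes via `trio.from_thread`:

```python
import signal
import trio

class Lock:
    _debug_lock = trio.StrictFIFOLock()

    @staticmethod
    def is_trio_thread() -> bool:
        # delegate to `trio`'s internal main-thread check
        return trio._util.is_main_thread()

    @classmethod
    def unshield_sigint(cls, handler) -> None:
        # `signal.signal()` is only legal from cpython's main thread
        if cls.is_trio_thread():
            signal.signal(signal.SIGINT, handler)

    @classmethod
    def release(cls) -> None:
        if cls.is_trio_thread():
            cls._debug_lock.release()
        else:
            # re-enter the `trio` run from this worker thread since
            # the lock can only be released from a `trio` task
            trio.from_thread.run_sync(cls._debug_lock.release)
```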
For testing we add a new `test_debugger.py::test_pause_from_sync()`
which includes a ctrl-c parametrization around the
`examples/debugging/sync_bp.py` script, which exercises all currently
supported/working usages (sketched below):
- `tractor.pause_from_sync()`.
- via `breakpoint()` overload.
- from a `trio.to_thread.run_sync()` spawn.
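A minimal recreation in the spirit of that script (not the verbatim
file):

```python
import trio
import tractor

def do_sync_pause():
    # explicit sync-pause API
    tractor.pause_from_sync()

async def main():
    async with tractor.open_root_actor(debug_mode=True):
        do_sync_pause()
        # the builtin hook is overridden when `debug_mode=True`
        breakpoint()
        # and it also works from a `to_thread`-spawned sync thread
        await trio.to_thread.run_sync(do_sync_pause)

if __name__ == '__main__':
    trio.run(main)
```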
- `trio_typing` is nearly obsolete since `trio >= 0.23`
- `exceptiongroup` is built-in to python 3.11
- `async_generator` primitives have lived in `contextlib` for quite
a while!
Not sure if it's really that useful other than for reporting errors from
`current_actor()` but at least it alerts `tractor` devs and/or users
when the runtime has already terminated vs. hasn't been started
yet/correctly.
Set the `._last_actor_terminated: tuple` in the root's final block which
allows testing for an already terminated tree, which is the case where
`._state._current_actor == None` and `._last_actor_terminated` is set.
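A sketch of the reporting logic this enables (error msgs are
illustrative):

```python
_current_actor = None          # live actor instance, if any
_last_actor_terminated = None  # tuple, set in the root's final block

def current_actor():
    if _current_actor is None:
        if _last_actor_terminated:
            raise RuntimeError(
                'The actor tree was already terminated!'
            )
        raise RuntimeError(
            'The runtime has not (yet) been started correctly?'
        )
    return _current_actor
```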
If `stackscope` is importable and `debug_mode` is enabled then by
default we call `.devx.enable_stack_on_sig()` and report that it's set B)
This makes debugging unexpected (SIGINT ignoring) hangs a cinch!
Where `.devx` is "developer experience", a hopefully broad enough subpkg
name for all the slick stuff planned to augment working on the actor
runtime 💥
Move the `._debug` module into the new subpkg and adjust rest of core
code base to reflect import path change. Also add a new
`.devx._debug.open_crash_handler()` manager for wrapping any sync code
outside a `trio.run()` which is handy for eventual CLI addons for
popular frameworks like `click`/`typer`.
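For example a (hypothetical) `click` entrypoint could wrap its sync
body like:

```python
import click
from tractor.devx._debug import open_crash_handler

@click.command()
def cli() -> None:
    # any crash inside this block drops into a debugger REPL
    # instead of unwinding with a plain traceback
    with open_crash_handler():
        raise ValueError('whoopsie')  # hypothetical sync crash

if __name__ == '__main__':
    cli()
```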
Since we'd like to eventually allow a diverse set of transport
(protocol) methods and stacks, and a multi-peer discovery system for
distributed actor-tree applications, this reworks all runtime internals
to support multi-homing for any given tree on a logical host. In other
words any actor can now bind its transport server (currently only
unsecured TCP + `msgspec`) to more than one address available in its
(linux) network namespace. Further, registry actors (now dubbed
"registrars" instead of "arbiters") can also similarly bind to multiple
network addresses and provide discovery services to remote actors via
multiple addresses which can now be provided at runtime startup.
Deats:
- adjust `._runtime` internals to use a `list[tuple[str, int]]` (and
thus pluralized) socket address sequence where applicable for transport
server socket binds, now exposed via `Actor.accept_addrs`:
- `Actor.__init__()` now takes a `registry_addrs: list`.
- `Actor.is_arbiter` -> `.is_registrar`.
- `._arb_addr` -> `._reg_addrs: list[tuple]`.
- always reg and de-reg from all registrars in `async_main()`.
- only set the global runtime var `'_root_mailbox'` to the loopback
address since normally all in-tree processes should have access to
it, right?
- `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]`.
- make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]`
  and default it when not passed (see the usage sketch after this list).
- change `ActorNursery.start_..()` methods to take `bind_addrs: list`
  and pass them down through the spawning layer(s) via the
  parent-seed-msg.
- generalize all `._discovery()` APIs to accept `registry_addrs`-like
inputs and move all relevant subsystems to adopt the "registry" style
naming instead of "arbiter":
- make `find_actor()` support batched concurrent portal queries over
all provided input addresses using `.trionics.gather_contexts()` Bo
- syntax: move to using `async with <tuples>` 3.9+ style chained
@acms.
- a general modernization of the code to a python 3.9+ style.
- start deprecation and change to "registry" naming / semantics:
- `._discovery.get_arbiter()` -> `.get_registry()`
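A usage sketch of the pluralized API (addresses are hypothetical):

```python
import tractor

async def main():
    async with tractor.open_root_actor(
        # the registrar can now be reached at multiple addrs
        registry_addrs=[
            ('127.0.0.1', 1616),
            ('192.168.1.10', 1616),  # hypothetical LAN bind
        ],
    ):
        # concurrently queries every registry addr under the hood
        async with tractor.find_actor('service') as portal:
            ...
```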
This adds remote cancellation semantics to our `tractor.Context`
machinery to more closely match that of `trio.CancelScope` but with
operational differences to handle the nature of parallel tasks
interoperating across multiple memory boundaries:
- if an actor task cancels some context it has opened via
`Context.cancel()`, the remote (scope linked) task will be cancelled
using the normal `CancelScope` semantics of `trio` meaning the remote
cancel scope surrounding the far side task is cancelled and
`trio.Cancelled`s are expected to be raised in that scope as per
normal `trio` operation, and in the case where no error is raised
in that remote scope, a `ContextCancelled` error is raised inside the
runtime machinery and relayed back to the opener/caller side of the
context.
- if any actor task cancels a full remote actor runtime using
`Portal.cancel_actor()` the same semantics as above apply except every
other remote actor task which also has an open context with the actor
which was cancelled will also be sent a `ContextCancelled` **but**
with the `.canceller` field set to the uid of the original cancel
requesting actor.
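A sketch of the first case from the opener's side (`sleeper_ep` is
a hypothetical endpoint):

```python
import trio
import tractor

@tractor.context
async def sleeper_ep(ctx: tractor.Context) -> None:
    # hypothetical far-side endpoint that just idles
    await ctx.started()
    await trio.sleep_forever()

async def canceller(portal: tractor.Portal) -> None:
    async with portal.open_context(sleeper_ep) as (ctx, first):
        # cancels the far-side scope; `trio.Cancelled` is raised
        # in the remote task as per normal `trio` operation
        await ctx.cancel()
    # no error propagates here: the relayed `ContextCancelled` has
    # `.canceller` == our uid, so the runtime swallows it.
```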
This changeset also includes a more "proper" solution to the issue of
"allowing overruns" during streaming without attempting to implement any
form of IPC streaming backpressure. Implementing task-granularity
backpressure cross-process turns out to be more or less impossible
without augmenting our streaming protocol (likely at the cost of
performance). Further, allowing overruns requires special care since
any blocking of the runtime RPC msg loop task effectively can block
control msgs such as cancels and stream terminations.
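Opting in is per-context at open time; a sketch (`fast_ep` is
hypothetical):

```python
import trio
import tractor

@tractor.context
async def fast_ep(ctx: tractor.Context) -> None:
    # hypothetical high-rate producer
    await ctx.started()
    async with ctx.open_stream() as stream:
        for i in range(2**10):
            await stream.send(i)

async def consume(portal: tractor.Portal) -> None:
    async with portal.open_context(
        fast_ep,
        allow_overruns=True,  # overflow msgs are queued + drained
    ) as (ctx, first):
        async with ctx.open_stream() as stream:
            async for msg in stream:
                # a slow consumer here no longer errors the ctx
                await trio.sleep(0.1)
```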
The implementation details per abstraction layer are as follows.
._streaming.Context:
- add a new constructor factory func `mk_context()` which provides
a strictly private init-er whilst allowing us to not have to define
an `.__init__()` on the type def.
- add public `.cancel_called` and `.cancel_called_remote` properties.
- general rename of what was the internal `._backpressure` var to
`._allow_overruns: bool`.
- move the old contents of `Actor._push_result()` into a new
`._deliver_msg()` allowing for better encapsulation of per-ctx
msg handling.
- always check for received 'error' msgs and process them with the new
`_maybe_cancel_and_set_remote_error()` **before** any msg delivery to
the local task, thus guaranteeing error and cancellation handling
despite any overflow handling.
- add a new `._drain_overflows()` task-method for use with new
`._allow_overruns: bool = True` mode.
- add back a `._scope_nursery: trio.Nursery` (allocated in
  `Portal.open_context()`) whose sole purpose is to spawn a single task
  which runs the above method; anything else is an error.
- augment `._deliver_msg()` to start a task and run the above method
  when operating in the new allow-overruns mode; the task queues
  overflow msgs and
attempts to send them to the underlying mem chan using a blocking
`.send()` call.
- on context exit, any existing "drainer task" will be cancelled and
remaining overflow queued msgs are discarded with a warning.
- rename `._error` -> `._remote_error` and set it in a new method
  `_maybe_cancel_and_set_remote_error()` which is called before any
  msg delivery to the local task (as per the above).
- adjust `.result()` to always call `._maybe_raise_remote_err()` at its
start such that whenever a `ContextCancelled` arrives we do logic for
whether or not to immediately raise that error or ignore it due to the
current actor being the one who requested the cancel, by checking the
error's `.canceller` field.
- set the default value of `._result` to be `id(Context())` thus
  avoiding conflict with any `.result()` actually being `False`..
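I.e. the usual unique-sentinel trick; illustrative only, the
`_result_is_set` helper is hypothetical:

```python
class Context:
    def __init__(self) -> None:
        # an object id can never collide with a real user result
        # value such as `False` or `None`
        self._result = id(self)

    @property
    def _result_is_set(self) -> bool:
        return self._result != id(self)
```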
._runtime.Actor:
- augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to
take a `requesting_uid: tuple` indicating the source actor of every
cancellation request.
- pass the new `Context._allow_overruns` through `.get_context()`.
- call the new `Context._deliver_msg()` from `._push_result()` (since
  factoring out that method's contents).
._runtime._invoke:
- `TaskStatus.started()` now hands back a `Context` (unless an error is
  raised) instead of the cancel scope to make it easy to set/get state
  on that context for the purposes of cancellation and remote error
  relay.
- always raise any remote error via `Context._maybe_raise_remote_err()`
before doing any `ContextCancelled` logic.
- assign any `Context._cancel_called_remote` set by the `requesting_uid`
cancel methods (mentioned above) to the `ContextCancelled.canceller`.
._runtime.process_messages:
- always pass a `requesting_uid: tuple` to `Actor.cancel()` and
  `._cancel_task()` so that any corresponding `ContextCancelled.canceller`
  can be set inside `._invoke()`.
Previously we were leaking our (pdb++) override into the Python runtime
which would always result in a runtime error whenever `breakpoint()` is
called outside our runtime, i.e. after exit of the root actor. This
explicitly restores any previous hook override (detected during startup)
or deletes the hook and restores the environment if none existed prior.
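A sketch of that save/restore dance (func names are hypothetical):

```python
import os
import sys

_orig_hook = None

def install_pause_hook(pause_hook) -> None:
    # detect any pre-existing override before installing ours
    global _orig_hook
    _orig_hook = sys.breakpointhook
    sys.breakpointhook = pause_hook

def restore_builtin_behaviour() -> None:
    if _orig_hook is not sys.__breakpointhook__:
        # restore the user's prior override
        sys.breakpointhook = _orig_hook
    else:
        # none existed prior: revert to default + scrub the env var
        sys.breakpointhook = sys.__breakpointhook__
        os.environ.pop('PYTHONBREAKPOINT', None)
```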
Also adds a new WIP debugging example script to ensure breakpointing
works as normal after runtime close; this will be added to the test
suite.
Pretty sure this is the final touch to alleviate all our debug lock
headaches! Instead of trying to revert to the "last" handler (as `pdb`
does internally in the stdlib) we always just revert to the handler
`trio` registers during startup. Further this seems to allow cancelling
the root-side locking task if it's detected as stale IFF we only do this
when the root actor is in a "no more IPC peers" state.
Deatz:
- (always) set `._debug.Lock._trio_handler` as the `trio` version, not
some last used handler to make sure we're getting the ctrl-c handling
we want when not in debug mode.
- assign the trio handler in both `open_root_actor()` and
  `._runtime._async_main()` to be sure it's applied in subactors as well
  as the root.
- only do debug lock blocking and root-side-locking-task cancels when
a "no peers" condition is detected in the root actor: i.e. no IPC
channels are detected by the root meaning it's impossible any actor
has a sane lock-state ongoing for debug mode.
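In sketch form (the real `Lock` has more state than this):

```python
import signal

class Lock:
    # the handler `trio.run()` installed at startup
    _trio_handler = None

def stash_trio_handler() -> None:
    # called in both `open_root_actor()` and `._async_main()`
    Lock._trio_handler = signal.getsignal(signal.SIGINT)

def unshield_sigint() -> None:
    # always revert to `trio`'s handler, never some "last used" one
    if Lock._trio_handler:
        signal.signal(signal.SIGINT, Lock._trio_handler)
```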
Instead of logic branching, create a table `._spawn._methods`
which is used to lookup the desired backend framework (in this case
still only one of `multiprocessing` or `trio`) and make the top level
`.new_proc()` do the lookup and any common logic. Use a `typing.Literal`
to define the lookup table's key set.
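In sketch form (the backend impl names are hypothetical):

```python
from typing import Awaitable, Callable, Literal

SpawnMethodKey = Literal['trio', 'multiprocessing']

async def _trio_new_proc(*args, **kwargs) -> None: ...  # hypothetical
async def _mp_new_proc(*args, **kwargs) -> None: ...    # backend impls

_methods: dict[SpawnMethodKey, Callable[..., Awaitable]] = {
    'trio': _trio_new_proc,
    'multiprocessing': _mp_new_proc,
}

async def new_proc(method: SpawnMethodKey, *args, **kwargs) -> None:
    # any common logic lives here; backend specifics are looked up
    return await _methods[method](*args, **kwargs)
```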
Repair and ignore a bunch of type-annot related stuff to do with `mypy`
updates and backend-specific process typing.
This commit obviously denotes a re-license of all applicable parts of
the code base. Acknowledgement of this change was completed in #274 by
the majority of the current set of contributors. From here henceforth
all changes will be AGPL licensed and distributed. This is purely an
effort to maintain the same copy-left policy whilst closing the
(perceived) SaaS loophole the GPL allows for. It is merely for this
loophole: to avoid code hiding by any potential "network providers" who
are attempting to use the project to make a profit without either
compensating the authors or re-distributing their changes.
I thought quite a bit about this change and can't see a reason not to
close the SaaS loophole in our current license. We still are (hard)
copy-left and I plan to keep the code base this way for a couple
reasons:
- The code base produces income/profit through parent projects and is
demonstrably of high value.
- I believe firms should not get free lunch for the sake of
"contributions from their employees" or "usage as a service" which
I have found to be a dubious argument at best.
- If a firm that intends to profit from the code base wants to use it
  they can propose a secondary commercial license to purchase with the
proceeds going to the project's authors under some form of well
defined contract.
- Many successful projects like Qt use this model; I see no reason it
can't work in this case until such a time as the authors feel it
should be loosened.
There has been detailed discussion in #103 on licensing alternatives.
The main point of this AGPL change is to protect the code base for the
time being from exploitation while it grows and as we move into the next
phase of development which will include extension into the multi-host
distributed software space.
Now that we're on our way to a (somewhat) serious beta release I think
it's about time to start de-noising the logging emissions. Since we're
trying out this approach of "stack layer oriented" log levels, I figured
this is a good time to move most of the "warnings" to what they should
be: cancellation monitoring status messages. The level is set to 16
which is just above our "runtime" level but just below the traditional
"info" level. I think this will be a decent approach since usually if
you're confused about why your `tractor` app is behaving unlike you
expect, it's 90% of the time going to be to do with cancellation or
error propagation. This this setup a user can specify the 'cancel' level
and see all the msgs pertaining to both actor and task-in-actor
cancellation mechanics.
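The stdlib mechanics behind such a custom level, as a sketch (the real
`tractor.log` wrapper differs):

```python
import logging

# 'runtime' < CANCEL < INFO (20); value per the msg above
CANCEL: int = 16
logging.addLevelName(CANCEL, 'CANCEL')

log = logging.getLogger('tractor')
log.setLevel(CANCEL)

# cancellation-monitoring status msgs now emit at their own level
log.log(CANCEL, 'actor was remotely cancelled by peer')
```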
Since we currently have no real "discovery protocol" between process
trees, the current naive approach is to check via a connect and drop to
see if a TCP server is bound to a particular address during root actor
startup. This was a historical decision and had no real grounding beyond
taking a simple approach to get something working when the project
was first started.
This is obviously problematic from an error handling perspective since
we need to prevent such quick connect-and-drops from cancelling
an "arbiter"'s (registry actor's) channel-msg loop machinery (which
would propagate and cancel the actor).
For now we map this particular TCP error, which gets remapped by `trio`
as a `trio.BrokenResourceError`, to our own internal `TransportClosed`
which is swallowed by channel message loop processing and indicates
a graceful teardown of the far end actor.
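The shape of that mapping as a sketch (not the verbatim `._ipc` code):

```python
import trio

class TransportClosed(Exception):
    '''The far end gracefully closed its end of the channel.'''

async def recv_once(stream: trio.SocketStream) -> bytes:
    try:
        return await stream.receive_some()
    except trio.BrokenResourceError as err:
        # a quick connect-and-drop shows up as this `trio` error;
        # remap it so the msg loop can swallow it instead of
        # cancelling the whole actor
        raise TransportClosed('far end dropped the connection') from err
```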