forked from goodboy/tractor
1
0
Fork 0
Commit Graph

55 Commits (72b4dc14616ceb8372d4728ef6d922cd28220507)

Author SHA1 Message Date
Tyler Goodlet 4f863a6989 Refine and test `tractor.pause_from_sync()`
Now supports use from any `trio` task, any sync thread started with
`trio.to_thread.run_sync()` AND also via `breakpoint()` builtin API!
The only bit missing now is support for `asyncio` tasks when in infected
mode.. Bo

`greenback` setup/API adjustments:
- move `._rpc.maybe_import_gb()` to -> `devx._debug` and factor out the cached
  import checking into a sync func whilst placing the async `.ensure_portal()`
  bootstrapping into a new async `maybe_init_greenback()`.
- use the new init-er func inside `open_root_actor()` with the output
  predicating whether we override the `breakpoint()` hook.

core `devx._debug` implementation deatz:
- make `mk_mpdb()` only return the `pdp.Pdb` subtype instance since
  the sigint unshielding func is now accessible from the `Lock`
  singleton from anywhere.

- add non-main thread support (at least for `trio.to_thread` use cases)
  to our `Lock` with a new `.is_trio_thread()` predicate that delegates
  directly to `trio`'s internal version.

- do `Lock.is_trio_thread()` checks inside any methods which require
  special provisions when invoked from a non-main `trio` thread:
  - `.[un]shield_sigint()` methods since `signal.signal` usage is only
    allowed from cpython's main thread.
  - `.release()` since `trio.StrictFIFOLock` can only be called from
    a `trio` task.

- rework `.pause_from_sync()` itself to directly call `._set_trace()`
  and don't bother with `greenback._await()` when we're already calling
  it from a `.to_thread.run_sync()` thread, oh and try to use the
  thread/task name when setting `Lock.local_task_in_debug`.

- make it an RTE for now if you try to use `.pause_from_sync()` from any
  infected-`asyncio` task, but support is (hopefully) coming soon!

For testing we add a new `test_debugger.py::test_pause_from_sync()`
which includes a ctrl-c parametrization around the
`examples/debugging/sync_bp.py` script which includes all currently
supported/working usages:
- `tractor.pause_from_sync()`.
- via `breakpoint()` overload.
- from a `trio.to_thread.run_sync()` spawn.
2024-03-22 19:58:25 -04:00
Tyler Goodlet 71de56b09a Drop now-deprecated deps on modern `trio`/Python
- `trio_typing` is nearly obsolete since `trio >= 0.23`
- `exceptiongroup` is built-in to python 3.11
- `async_generator` primitives have lived in `contextlib` for quite
  a while!
2024-03-13 18:41:24 -04:00
Tyler Goodlet ededa2e88f More spaceless union type annots 2024-03-11 10:33:06 -04:00
Tyler Goodlet 4c3c3e4b56 Support a `._state.last_actor()` getter
Not sure if it's really that useful other then for reporting errors from
`current_actor()` but at least it alerts `tractor` devs and/or users
when the runtime has already terminated vs. hasn't been started
yet/correctly.

Set the `._last_actor_terminated: tuple` in the root's final block which
allows testing for an already terminated tree which is the case where
`._state._current_actor == None` and the last is set.
2024-03-08 14:11:17 -05:00
Tyler Goodlet e536057fea `._entry`: use same msg info in start/terminate log 2024-03-05 12:30:34 -05:00
Tyler Goodlet 28fefe4ffe Make stream draining status logs `.debug()` level 2024-03-01 19:27:10 -05:00
Tyler Goodlet 930d498841 Call `actor.cancel(None)` from root to avoid mismatch with (any future) meth sig changes 2024-02-22 14:45:08 -05:00
Tyler Goodlet 1d7cf7d1dd Enable `stackscope` render via root in debug mode
If `stackscope` is importable and debug_mode is enabled then we by
default call and report `.devx.enable_stack_on_sig()` is set B)

This makes debugging unexpected (SIGINT ignoring) hangs a cinch!
2024-02-20 13:23:16 -05:00
Tyler Goodlet 0294455c5e `_root`: drop unused `typing` import 2024-01-02 18:43:43 -05:00
Tyler Goodlet 87cd725adb Add `open_root_actor(ensure_registry: bool)`
Allows forcing the opened actor to either obtain the passed registry
addrs or raise a runtime error.
2023-11-07 16:45:24 -05:00
Tyler Goodlet 9da3b63644 Change remaining internals to use `Actor.reg_addrs` 2023-10-19 12:40:37 -04:00
Tyler Goodlet 022bf8ce75 Ensure `registry_addrs` is always set to something 2023-10-18 19:08:35 -04:00
Tyler Goodlet fcc8cee9d3 ._root: set a `_default_lo_addrs` and apply it when not provided by caller 2023-10-18 14:12:58 -04:00
Tyler Goodlet a09b8560bb Oof, default reg addrs needs to be in `list[tuple]` form.. 2023-10-07 18:52:37 -04:00
Tyler Goodlet 4314a59327 Add post-mortem catch around failed transport addr binds to aid with runtime debugging 2023-10-03 10:54:46 -04:00
Tyler Goodlet fa9a9cfb1d Kick off `.devx` subpkg for our dev tools B)
Where `.devx` is "developer experience", a hopefully broad enough subpkg
name for all the slick stuff planned to augment working on the actor
runtime 💥

Move the `._debug` module into the new subpkg and adjust rest of core
code base to reflect import path change. Also add a new
`.devx._debug.open_crash_handler()` manager for wrapping any sync code
outside a `trio.run()` which is handy for eventual CLI addons for
popular frameworks like `click`/`typer`.
2023-09-28 14:14:50 -04:00
Tyler Goodlet 3d0e95513c Init-support for "multi homed" transports
Since we'd like to eventually allow a diverse set of transport
(protocol) methods and stacks, and a multi-peer discovery system for
distributed actor-tree applications, this reworks all runtime internals
to support multi-homing for any given tree on a logical host. In other
words any actor can now bind its transport server (currently only
unsecured TCP + `msgspec`) to more then one address available in its
(linux) network namespace. Further, registry actors (now dubbed
"registars" instead of "arbiters") can also similarly bind to multiple
network addresses and provide discovery services to remote actors via
multiple addresses which can now be provided at runtime startup.

Deats:
- adjust `._runtime` internals to use a `list[tuple[str, int]]` (and
  thus pluralized) socket address sequence where applicable for transport
  server socket binds, now exposed via `Actor.accept_addrs`:
  - `Actor.__init__()` now takes a `registry_addrs: list`.
  - `Actor.is_arbiter` -> `.is_registrar`.
  - `._arb_addr` -> `._reg_addrs: list[tuple]`.
  - always reg and de-reg from all registrars in `async_main()`.
  - only set the global runtime var `'_root_mailbox'` to the loopback
    address since normally all in-tree processes should have access to
    it, right?
  - `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]`
- make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]`
  and defaults when not passed.
- change `ActorNursery.start_..()` methods take `bind_addrs: list` and
  pass down through the spawning layer(s) via the parent-seed-msg.
- generalize all `._discovery()` APIs to accept `registry_addrs`-like
  inputs and move all relevant subsystems to adopt the "registry" style
  naming instead of "arbiter":
  - make `find_actor()` support batched concurrent portal queries over
    all provided input addresses using `.trionics.gather_contexts()` Bo
  - syntax: move to using `async with <tuples>` 3.9+ style chained
    @acms.
  - a general modernization of the code to a python 3.9+ style.
  - start deprecation and change to "registry" naming / semantics:
    - `._discovery.get_arbiter()` -> `.get_registry()`
2023-09-27 16:25:21 -04:00
Tyler Goodlet b36b3d522f Map `breakpoint()` built-in to new `.pause_from_sync()` ep 2023-07-07 15:35:52 -04:00
Tyler Goodlet c72026091e Remote `Context` cancellation semantics rework B)
This adds remote cancellation semantics to our `tractor.Context`
machinery to more closely match that of `trio.CancelScope` but
with operational differences to handle the nature of parallel tasks interoperating
across multiple memory boundaries:

- if an actor task cancels some context it has opened via
  `Context.cancel()`, the remote (scope linked) task will be cancelled
  using the normal `CancelScope` semantics of `trio` meaning the remote
  cancel scope surrounding the far side task is cancelled and
  `trio.Cancelled`s are expected to be raised in that scope as per
  normal `trio` operation, and in the case where no error is raised
  in that remote scope, a `ContextCancelled` error is raised inside the
  runtime machinery and relayed back to the opener/caller side of the
  context.
- if any actor task cancels a full remote actor runtime using
  `Portal.cancel_actor()` the same semantics as above apply except every
  other remote actor task which also has an open context with the actor
  which was cancelled will also be sent a `ContextCancelled` **but**
  with the `.canceller` field set to the uid of the original cancel
  requesting actor.

This changeset also includes a more "proper" solution to the issue of
"allowing overruns" during streaming without attempting to implement any
form of IPC streaming backpressure. Implementing task-granularity
backpressure cross-process turns out to be more or less impossible
without augmenting out streaming protocol (likely at the cost of
performance). Further allowing overruns requires special care since
any blocking of the runtime RPC msg loop task effectively can block
control msgs such as cancels and stream terminations.

The implementation details per abstraction layer are as follows.

._streaming.Context:
- add a new contructor factor func `mk_context()` which provides
  a strictly private init-er whilst allowing us to not have to define
  an `.__init__()` on the type def.
- add public `.cancel_called` and `.cancel_called_remote` properties.
- general rename of what was the internal `._backpressure` var to
  `._allow_overruns: bool`.
- move the old contents of `Actor._push_result()` into a new
  `._deliver_msg()` allowing for better encapsulation of per-ctx
  msg handling.
 - always check for received 'error' msgs and process them with the new
   `_maybe_cancel_and_set_remote_error()` **before** any msg delivery to
   the local task, thus guaranteeing error and cancellation handling
   despite any overflow handling.
- add a new `._drain_overflows()` task-method for use with new
  `._allow_overruns: bool = True` mode.
 - add back a `._scope_nursery: trio.Nursery` (allocated in
   `Portal.open_context()`) who's sole purpose is to spawn a single task
   which runs the above method; anything else is an error.
 - augment `._deliver_msg()` to start a task and run the above method
   when operating in no overrun mode; the task queues overflow msgs and
   attempts to send them to the underlying mem chan using a blocking
   `.send()` call.
 - on context exit, any existing "drainer task" will be cancelled and
   remaining overflow queued msgs are discarded with a warning.
- rename `._error` -> `_remote_error` and set it in a new method
  `_maybe_cancel_and_set_remote_error()` which is called before
  processing
- adjust `.result()` to always call `._maybe_raise_remote_err()` at its
  start such that whenever a `ContextCancelled` arrives we do logic for
  whether or not to immediately raise that error or ignore it due to the
  current actor being the one who requested the cancel, by checking the
  error's `.canceller` field.
 - set the default value of `._result` to be `id(Context()` thus avoiding
   conflict with any `.result()` actually being `False`..

._runtime.Actor:
- augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to
  take a `requesting_uid: tuple` indicating the source actor of every
  cancellation request.
- pass through the new `Context._allow_overruns` through `.get_context()`
- call the new `Context._deliver_msg()` from `._push_result()` (since
  the factoring out that method's contents).

._runtime._invoke:
- `TastStatus.started()` back a `Context` (unless an error is raised)
  instead of the cancel scope to make it easy to set/get state on that
  context for the purposes of cancellation and remote error relay.
- always raise any remote error via `Context._maybe_raise_remote_err()`
  before doing any `ContextCancelled` logic.
- assign any `Context._cancel_called_remote` set by the `requesting_uid`
  cancel methods (mentioned above) to the `ContextCancelled.canceller`.

._runtime.process_messages:
- always pass a `requesting_uid: tuple` to `Actor.cancel()` and
  `._cancel_task` to that any corresponding `ContextCancelled.canceller`
  can be set inside `._invoke()`.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 79622bbeea Restore `breakpoint()` hook after runtime exits
Previously we were leaking our (pdb++) override into the Python runtime
which would always result in a runtime error whenever `breakpoint()` is
called outside our runtime; after exit of the root actor . This
explicitly restores any previous hook override (detected during startup)
or deletes the hook and restores the environment if none existed prior.

Also adds a new WIP debugging example script to ensure breakpointing
works as normal after runtime close; this will be added to the test
suite.
2023-05-15 00:47:29 -04:00
Tyler Goodlet 121a8cc891 Drop `Optional` usage from root mod 2023-01-26 16:00:08 -05:00
Tyler Goodlet c54b8ca4ba Begin deprecation of `arbiter_addr` -> `registry_addr` 2023-01-26 16:00:08 -05:00
Tyler Goodlet 0956d5f461 Restore the `trio` SIGINT handler, cancel root lock tasks on no-peers
Pretty sure this is the final touch to alleviate all our debug lock
headaches! Instead of trying to revert to the "last" handler (as `pdb`
does internally in the stdlib) we always just revert to the handler
`trio` registers during startup. Further this seems to allow cancelling
the root-side locking task if it's detected as stale IFF we only do this
when the root actor is in a "no more IPC peers" state.

Deatz:
- (always) set `._debug.Lock._trio_handler` as the `trio` version, not
  some last used handler to make sure we're getting the ctrl-c handling
  we want when not in debug mode.
- assign the trio handler in `open_root_actor()`
  `._runtime._async_main()` to be sure it's applied in subactors as well
  as the root.
- only do debug lock blocking and root-side-locking-task cancels when
  a "no peers" condition is detected in the root actor: i.e. no IPC
  channels are detected by the root meaning it's impossible any actor
  has a sane lock-state ongoing for debug mode.
2022-10-14 18:18:01 -04:00
Tyler Goodlet 50fe098e06 First pass, swap `MultiError` for `BaseExceptionGroup` 2022-10-14 18:16:51 -04:00
Tyler Goodlet 90f4912580 Organize process spawning into lookup table
Instead of the logic branching create a table `._spawn._methods`
which is used to lookup the desired backend framework (in this case
still only one of `multiprocessing` or `trio`) and make the top level
`.new_proc()` do the lookup and any common logic. Use a `typing.Literal`
to define the lookup table's key set.

Repair and ignore a bunch of type-annot related stuff todo with `mypy`
updates and backend-specific process typing.
2022-10-09 16:51:21 -04:00
Tyler Goodlet ad19bf2cf1 Remove `tractor.run()` once and for all
It's been deprecated for a while now and all docs and tests have been
changed.

Closes #183
2022-09-15 23:41:28 -04:00
Tyler Goodlet 208d56af2c Make `async_main()` a module func 2022-09-15 23:41:28 -04:00
Tyler Goodlet d4084b2032 Rename our core module to `_runtime` 2022-09-15 23:41:28 -04:00
Tyler Goodlet c0cd99e374 Timeout on arbiter ping, avoid TCP SYN hangs in CI? 2022-08-02 12:17:28 -04:00
Tyler Goodlet ff3f5959e9 Always enable debug level logging if mode enabled 2022-08-02 12:16:58 -04:00
Tyler Goodlet d80f8d7a39 WIP redo asyncio async gen streaming 2021-12-17 09:38:04 -05:00
Tyler Goodlet 213447008b Add draft code for waiting on all nurseries in root 2021-12-16 18:02:03 -05:00
Tyler Goodlet 6f94ffc304 Re-license code base for distribution under AGPL
This commit obviously denotes a re-license of all applicable parts of
the code base. Acknowledgement of this change was completed in #274 by
the majority of the current set of contributors. From here henceforth
all changes will be AGPL licensed and distributed. This is purely an
effort to maintain the same copy-left policy whilst closing the
(perceived) SaaS loophole the GPL allows for. It is merely for this
loophole: to avoid code hiding by any potential "network providers" who
are attempting to use the project to make a profit without either
compensating the authors or re-distributing their changes.

I thought quite a bit about this change and can't see a reason not to
close the SaaS loophole in our current license. We still are (hard)
copy-left and I plan to keep the code base this way for a couple
reasons:

- The code base produces income/profit through parent projects and is
  demonstrably of high value.
- I believe firms should not get free lunch for the sake of
  "contributions from their employees" or "usage as a service" which
  I have found to be a dubious argument at best.
- If a firm who intends to profit from the code base wants to use it
  they can propose a secondary commercial license to purchase with the
  proceeds going to the project's authors under some form of well
  defined contract.
- Many successful projects like Qt use this model; I see no reason it
  can't work in this case until such a time as the authors feel it
  should be loosened.

There has been detailed discussion in #103 on licensing alternatives.
The main point of this AGPL change is to protect the code base for the
time being from exploitation while it grows and as we move into the next
phase of development which will include extension into the multi-host
distributed software space.
2021-12-14 23:33:27 -05:00
Tyler Goodlet 92c6ec1882 `get_loglevel()` always returns a str 2021-12-07 13:17:00 -05:00
Tyler Goodlet 72eef2a4a1 Config debug mode log level *after* initial setup 2021-12-07 13:16:07 -05:00
Tyler Goodlet 9bd5226e76 Only adjust logging in debug mode if not noisy enough already 2021-12-07 13:13:04 -05:00
Tyler Goodlet 10f66e5141 De-noise warnings, add a 'cancel' log level
Now that we're on our way to a (somewhat) serious beta release I think
it's about time to start de-noising the logging emissions. Since we're
trying out this approach of "stack layer oriented" log levels, I figured
this is a good time to move most of the "warnings" to what they should
be: cancellation monitoring status messages. The level is set to 16
which is just above our "runtime" level but just below the traditional
"info" level. I think this will be a decent approach since usually if
you're confused about why your `tractor` app is behaving unlike you
expect, it's 90% of the time going to be to do with cancellation or
error propagation. This this setup a user can specify the 'cancel' level
and see all the msgs pertaining to both actor and task-in-actor
cancellation mechanics.
2021-10-06 17:02:13 -04:00
Tyler Goodlet 1382ad653d Ugh, appease mypy yet again 2021-10-05 13:37:17 -04:00
Tyler Goodlet 19d6885243 Ensure tuple for passed in arbiter addr 2021-10-05 13:37:17 -04:00
Tyler Goodlet 7431e8ea01 Don't log cancelled inceptions seen by the root 2021-08-02 21:15:42 -04:00
Tyler Goodlet 674fbbc6b3 Docs and comments tidying 2021-08-01 10:44:13 -04:00
Tyler Goodlet 09f00a5a00 Go back to only logging tbs on no debugger 2021-07-31 12:46:40 -04:00
Tyler Goodlet 443ebea165 Use "pdb" level logging in debug mode 2021-07-08 13:02:33 -04:00
Tyler Goodlet b1cd7fdedf Don't shield on root cancel it can causes hangs 2021-07-06 08:23:30 -04:00
Tyler Goodlet 6aab16f877 Drop added logging around root cancel 2021-07-04 11:00:08 -04:00
Tyler Goodlet 1edf5c2f06 Specially remap TCP 104-connection-reset to `TransportClosed`
Since we currently have no real "discovery protocol" between process
trees, the current naive approach is to check via a connect and drop to
see if a TCP server is bound to a particular address during root actor
startup. This was a historical decision and had no real grounding beyond
taking a simple approach to get something working when the project
was first started.

This  is obviously problematic from an error handling perspective since
we need to be able to avoid such quick connect-and-drops from cancelling
an "arbiter"'s (registry actor's) channel-msg loop machinery (which
would propagate and cancel the actor).

For now we map this particular TCP error, which gets remapped by `trio`
as a `trio.BrokenResourceError` to our own internal `TransportClosed`
which is swallowed by channel message loop processing and indicates
a graceful teardown of the far end actor.
2021-07-03 18:57:54 -04:00
Tyler Goodlet c4b42000eb Shield around root actor cancel 2021-05-06 12:05:17 -04:00
Tyler Goodlet 47565cfbf3 Use root as default name from `tractor.run()` 2021-02-25 08:51:28 -05:00
Tyler Goodlet cd636b270e Update debug tests to expect 'root' actor name 2021-02-24 13:38:20 -05:00
Tyler Goodlet b285db4c58 Factor OCA supervisor into new func 2021-02-24 13:13:38 -05:00