Compare commits

...

215 Commits

Author SHA1 Message Date
Tyler Goodlet 72b4dc1461 Provision for infected-`asyncio` debug mode support
It's **almost** there, we're just missing the final translation code to
let an `asyncio`-side task call
`.devx._debug.wait_for_parent_stdin_hijack()` to do root actor TTY
locking. Then we just need to ensure internals also do the right thing
with `greenback` for equivalent sync `breakpoint()` style pause
points.

Since i'm deferring this until later, tossing in some xfail tests to
`test_infected_asyncio` with TODOs for the needed implementation as well
as eventual test org.

By "provision" it means we add:
- `greenback` init block to `_run_asyncio_task()` when debug mode is
  enabled (but which will currently RTE when `asyncio` is detected)
  using `.bestow_portal()` around the `asyncio.Task`.
- a call to `_debug.maybe_init_greenback()` in the `run_as_asyncio_guest()`
  guest-mode entry point.
- as part of `._debug.Lock.is_main_trio_thread()`, whenever the async-lib
  is not 'trio', error-log the backend name (which is obvi `'asyncio'`
  in this use case).
2024-03-25 16:09:32 -04:00
Tyler Goodlet 90bfdaf58c Drop extra newline from log msg 2024-03-25 15:03:33 -04:00
Tyler Goodlet 507cd96904 Change all `| None` -> `|None` in `._runtime` 2024-03-25 14:15:36 -04:00
Tyler Goodlet 2588e54867 Add todo-notes for hiding `@acm` frames
In the particular case of the `Portal.open_context().__aexit__()` frame,
due to usage of `contextlib.asynccontextmanager`, we can't easily
monkeypatch in a `__tracebackhide__` setting nor catch-n-reraise around
the block exit without defining our own `.__aexit__()` impl. Thus, it's
prolly most sane to do something with an override of
`contextlib._AsyncGeneratorContextManager` or the publicly exposed
`AsyncContextDecorator` (which uses the former internally right?).

Also fixup some old `._invoke` mod paths in comments and just show
`str(eoc)` in `.open_stream().__aexit__()` terminated-by-EoC log msg
since the `repr()` form won't pprint the IPC msg nicely..
2024-03-24 16:49:07 -04:00
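
Roughly, the override idea would look something like this (just a sketch, not the committed code; `hidden_frame_acm` is a made-up name):

```python
# sketch: subclass the (private) `_AsyncGeneratorContextManager` so the
# `.__aexit__()` frame can be marked hidden for `pdbp`/`pytest` tb rendering.
import contextlib
from functools import wraps

class _HiddenExitFrameAGCM(contextlib._AsyncGeneratorContextManager):
    async def __aexit__(self, typ, value, tb):
        __tracebackhide__: bool = True  # hide this frame in rendered tbs
        return await super().__aexit__(typ, value, tb)

def hidden_frame_acm(func):
    # drop-in stand-in for `@asynccontextmanager` using the subtype above
    @wraps(func)
    def helper(*args, **kwds):
        return _HiddenExitFrameAGCM(func, args, kwds)
    return helper
```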
Tyler Goodlet 0055c1d954 Tweak main thread predicate to ensure `trio.run()`
Change the name to `Lock.is_main_trio_thread()` indicating that when
`True` the thread is both the main one **and** the one that called
`trio.run()`. Add a todo for just copying the
`trio._util.is_main_thread()` impl (since it's private / may change) and
some brief notes about potential usage of
`trio.from_thread.check_cancelled()` to detect non-`.to_thread` thread
spawns.
2024-03-24 16:47:28 -04:00
Tyler Goodlet 4f863a6989 Refine and test `tractor.pause_from_sync()`
Now supports use from any `trio` task, any sync thread started with
`trio.to_thread.run_sync()` AND also via `breakpoint()` builtin API!
The only bit missing now is support for `asyncio` tasks when in infected
mode.. Bo

`greenback` setup/API adjustments:
- move `._rpc.maybe_import_gb()` to -> `devx._debug` and factor out the cached
  import checking into a sync func whilst placing the async `.ensure_portal()`
  bootstrapping into a new async `maybe_init_greenback()`.
- use the new init-er func inside `open_root_actor()` with the output
  predicating whether we override the `breakpoint()` hook.

core `devx._debug` implementation deatz:
- make `mk_mpdb()` only return the `pdbp.Pdb` subtype instance since
  the sigint unshielding func is now accessible from the `Lock`
  singleton from anywhere.

- add non-main thread support (at least for `trio.to_thread` use cases)
  to our `Lock` with a new `.is_trio_thread()` predicate that delegates
  directly to `trio`'s internal version.

- do `Lock.is_trio_thread()` checks inside any methods which require
  special provisions when invoked from a non-main `trio` thread:
  - `.[un]shield_sigint()` methods since `signal.signal` usage is only
    allowed from cpython's main thread.
  - `.release()` since `trio.StrictFIFOLock` can only be called from
    a `trio` task.

- rework `.pause_from_sync()` itself to directly call `._set_trace()`
  and don't bother with `greenback.await_()` when we're already calling
  it from a `.to_thread.run_sync()` thread, oh and try to use the
  thread/task name when setting `Lock.local_task_in_debug`.

- make it an RTE for now if you try to use `.pause_from_sync()` from any
  infected-`asyncio` task, but support is (hopefully) coming soon!

For testing we add a new `test_debugger.py::test_pause_from_sync()`
which includes a ctrl-c parametrization around the
`examples/debugging/sync_bp.py` script covering all currently
supported/working usages:
- `tractor.pause_from_sync()`.
- via `breakpoint()` overload.
- from a `trio.to_thread.run_sync()` spawn.
2024-03-22 19:58:25 -04:00
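
Rough usage sketch of those entry points (loosely mirroring the `sync_bp.py` example, not a copy of it):

```python
import trio
import tractor

def sync_pause_from_thread():
    # no `await` available here; called via `trio.to_thread.run_sync()`
    tractor.pause_from_sync()

async def main():
    async with tractor.open_nursery(debug_mode=True):
        tractor.pause_from_sync()   # directly from sync code in a `trio` task
        breakpoint()                # via the overridden `breakpoint()` hook
        await trio.to_thread.run_sync(sync_pause_from_thread)

if __name__ == '__main__':
    trio.run(main)
```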
Tyler Goodlet c04d77a3c9 First draft workin minus non-main-thread usage! 2024-03-20 19:13:13 -04:00
Tyler Goodlet 8e66f45e23 Lul, don't overwrite 'tb_str' with src actor's..
This is what was breaking the nested debugger test (where it was failing
on the traceback content matching) and it makes sense.. XD
=> We always want to use the locally boxed `RemoteActorError`'s
traceback content NOT overwrite it with that from the src actor..

Also gets rid of setting the `'relay_uid'` since it's pulled from the
final element in the `'relay_path'` anyway.
2024-03-20 11:36:39 -04:00
Tyler Goodlet 290b0a86b1 Another cancel-req-invalid log msg fmt tweak 2024-03-20 10:42:17 -04:00
Tyler Goodlet d5e5174d97 Extend inter-peer cancel tests for "inceptions"
Use new `RemoteActorError` fields in various assertions particularly
ensuring that an RTE relayed through the spawner from the little_bro
shows up at the client with the right number of entries in the
`.relay_path` and that the error is raised in the client as desired in
the original use case from `modden`'s remote spawn request API
(which was kinda the whole original motivation to finally get all this
multi-actor error relay stuff workin).

Case extensions:
- RTE relayed from little_bro through spawner to client when
  `raise_sub_spawn_error_after` is set; in this case test should raise
  the relayed and RAE boxed RTE right up to the `trio.run()`.
  -> ensure the `rae.src_uid`, `.relay_uid` are set correctly.
  -> ensure ctx cancels are not acked.
- use `expect_ctxc()` around root's `tell_little_bro()` usage.
- do `debug_mode` assertions when enabled by test harness in each actor
  layer.
- obvi use new `.src_type`/`.boxed_type` for final error propagation
  assertions.
2024-03-20 10:29:40 -04:00
Tyler Goodlet 8ab5e08830 Adjust advanced faults test(s) for absorbed EoCs
More or less just simplifies to not seeing the stream closure errors and
instead expecting KBIs from the simulated user who 'ctrl-cs after hang'.

Toss in a little `stuff_hangin_ctlc()` to the script to wrap all that
and always check stream closure before sending the final KBI.
2024-03-19 19:33:06 -04:00
Tyler Goodlet 668016d37b Absorb EoCs via `Context.open_stream()` silently
I swear long ago it used to operate this way but, I guess this finalizes
the design decision. It makes a lot more sense to *not* propagate any
`trio.EndOfChannel` raised from a `Context.open_stream() as stream:`
block when that EoC is due to graceful-explicit stream termination.
We use the EoC much like a `StopAsyncIteration` where the error
indicates termination of the stream due to either:
- reception of a stop IPC msg indicating the far end ended the stream
  (gracefully),
- closure of the underlying `Context._recv_chan` either by the runtime
  or due to user code having called `MsgStream.aclose()`.

User code shouldn't expect to handle EoC outside the block since the
`@acm` having closed should indicate exactly the same lifetime state
(of said stream) ;)

Deats:
- add special EoC handler in `.open_stream()` which silently "absorbs"
  the error only when the stream is already marked as closed (meaning
  the EoC indeed corresponds to IPC closure) with an assert for now
  ensuring the error is the same as set to `MsgStream._eoc`.
- in `MsgStream.receive()` break up the handlers for EoC and
  `trio.ClosedResourceError` since the error instances are saved to
  different variables and we **don't** want to rewrite the exception in
  the eoc case (normally to mask `trio` internals in tbs) bc we need the
  instance to be the exact one for doing checks inside
  `.open_stream().__aexit__()` to absorb it.

Other surrounding "improvements":
- start using the new `Context.maybe_raise()` helper where it can easily
  replace existing equivalent block-sections.
- use new `RemoteActorError.src_uid` as required.
2024-03-19 18:40:50 -04:00
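
From app code the absorbed-EoC behaviour roughly reads like the following sketch (illustrative only, not one of the test scripts):

```python
import trio
import tractor

@tractor.context
async def streamer(ctx: tractor.Context) -> None:
    await ctx.started()
    async with ctx.open_stream() as stream:
        for i in range(3):
            await stream.send(i)
    # exiting the block gracefully stops the stream for the far end

async def main():
    async with tractor.open_nursery() as an:
        portal = await an.start_actor('streamer', enable_modules=[__name__])
        async with (
            portal.open_context(streamer) as (ctx, _),
            ctx.open_stream() as stream,
        ):
            async for msg in stream:
                print(msg)
            # the graceful EoC is absorbed by `.open_stream().__aexit__()`;
            # no `trio.EndOfChannel` escapes to this scope.
        await portal.cancel_actor()

if __name__ == '__main__':
    trio.run(main)
```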
Tyler Goodlet 9221c57234 Adjust all `RemoteActorError.type` using tests
To instead use the new `.boxed_type` B)
2024-03-19 18:08:54 -04:00
Tyler Goodlet 78434f6317 Fix `.boxed_type` facepalm, drop `.src_actor_uid`
The misname of `._boxed_type` as `._src_type` was only manifesting as
a really strange boxing error with a packed exception-group, not sure
how or why only that but it's fixed now XD

Start refining/cleaning out stuff for sure we don't need (based on
multiple local test runs):

- discard `.src_actor_uid` fully since test set has been moved over to
  `.src_uid`; this means also removing the `.msgdata` insertion from
  `pack_error()`; a patch to all internals is coming next obvi!

- don't pass `boxed_type` to `RemoteActorError.__init__()` from
  `unpack_error()` since it's now set directly via the
  `.msgdata["boxed_type_str"]`/`error_msg: dict` input , but in the case
  where **it is passed as an arg** (only for ctxc in `._rpc._invoke()`
  rn) make sure we only do the `.__init__()` insert when `boxed_type is
  not None`.
2024-03-19 14:20:59 -04:00
Tyler Goodlet 5fb5682269 First try "relayed boxed errors", or "inceptions"
Since adding more complex inter-peer (actor) testing scenarios, we
definitely have an immediate need for `trio`'s style of "inceptions" but
for nesting `RemoteActorError`s as they're relayed through multiple
actor-IPC hops. So for example, a remote error relayed "through" some
proxy actor to another ends up packing a `RemoteActorError` into another
one such that there are 2 layers of RAEs with the first
containing/boxing an original src actor error (type).

In support of this extension to `RemoteActorError` we add:

- `get_err_type()` error type resolver helper (factored from the
  body of `unpack_error()`) to be used whenever rendering
  `.src_type`/`.boxed_type`.

- `.src_type_str: str` which is pulled from `.msgdata` and holds the
  above (eventually when unpacked) type as `str`.
- `._src_type: BaseException|None` for the original
  "source" actor's error as unpacked in any remote (actor's) env and
  exposed as a readonly property `.src_type`.

- `.boxed_type_str: str` the same as above but for the "last" boxed
  error's type; when the RAE is unpacked at its first hop this will
  be **the same as** `.src_type_str`.
- `._boxed_type: BaseException` which now similarly should be "rendered"
  from the below type-`str` field instead of passed in as a error-type
  via `boxed_type` (though we still do for the ctxc case atm, see
  notes).
 |_ new sanity checks in `.__init__()` mostly as a reminder to handle
   that ^ ctxc case ^ more elegantly at some point..
 |_ obvi we discard the previous `suberror_type` input arg.

- fully remove the `.type`/`.type_str` properties instead expecting
  usage of `.boxed_/.src_` equivalents.
- start deprecation of `.src_actor_uid` and make it delegate to new
  `.src_uid`
- add `.relay_uid` property for the last relay/hop's actor uid.
- add `.relay_path: list[str]` which holds the per-hop updated sequence
  of relay actor uid's which consecutively did boxing of an RAE.
- only include `.src_uid` and `.relay_path` in reprol() output.
- factor field-to-str rendering into a new `_mk_fields_str()`
  and use it in `.__repr__()`/`.reprol()`.
- add an `.unwrap()` to (attempt to) render the src error.

- rework `pack_error()` to handle inceptions including,
  - packing the correct field-values for the new `boxed_type_str`, `relay_uid`,
    `src_uid`, `src_type_str`.
  - always updating the `relay_path` sequence with the uid of the
    current actor.

- adjust `unpack_error()` to match all these changes,
  - pulling `boxed_type_str` and passing any resolved `boxed_type` to
    `RemoteActorError.__init__()`.
  - use the new `Context.maybe_raise()` convenience method.

Adjust `._rpc` packing to `ContextCancelled(boxed_type=trio.Cancelled)`
and tweak some more log msg formats.
2024-03-18 14:28:24 -04:00
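
A toy sketch of just the relay-path bookkeeping idea (NOT the actual `pack_error()` impl; field names taken from the bullets above):

```python
# each hop that boxes an error appends its own uid to `relay_path`, while the
# src_* fields are preserved from any already-boxed (relayed) error.
def pack_error_fields(
    exc: BaseException,
    packing_actor_uid: tuple[str, str],
) -> dict:
    prior: dict = getattr(exc, 'msgdata', {})  # present on an already-boxed RAE
    return {
        'src_uid': prior.get('src_uid', packing_actor_uid),
        'src_type_str': prior.get('src_type_str', type(exc).__name__),
        'boxed_type_str': type(exc).__name__,
        'relay_path': [*prior.get('relay_path', []), packing_actor_uid],
    }
```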
Tyler Goodlet 71de56b09a Drop now-deprecated deps on modern `trio`/Python
- `trio_typing` is nearly obsolete since `trio >= 0.23`
- `exceptiongroup` is built into Python 3.11
- `async_generator` primitives have lived in `contextlib` for quite
  a while!
2024-03-13 18:41:24 -04:00
Tyler Goodlet e5cb39804c Pin to `trio>=0.24` to avoid `trio_typing` 2024-03-13 16:21:30 -04:00
Tyler Goodlet d28c7e17c6 Add `.trionics._broadcast` todos for py 3.12 2024-03-13 16:09:31 -04:00
Tyler Goodlet d23d8c1779 Start a `._rpc` module
Since `._runtime` was getting pretty long (> 2k LOC) and much of the RPC
low-level machinery is fairly isolated to a handful of task-funcs, it
makes sense to re-org the RPC task scheduling and driving msg loop to
its own code space.

The move includes:
- `process_messages()` which is the main IPC business logic.
- `try_ship_error_to_remote()` helper, to box local errors for the wire.
- `_invoke()`, the core task scheduler entrypoint used in the msg loop.
- `_invoke_non_context()`, holds impls for non-`@context` task starts.
- `_errors_relayed_via_ipc()` which does all error catch-n-boxing for
   wire-msg shipment using `try_ship_error_to_remote()` internally.

Also inside `._runtime` improve some `Actor` methods docs.
2024-03-13 15:57:15 -04:00
Tyler Goodlet 58cc57a422 Move `Portal.open_context()` impl to `._context`
Finally, since normally you need the content from `._context.Context`
and surroundings in order to effectively grok `Portal.open_context()`
anyways, might as well move the impl to the ctx module as
`open_context_from_portal()` and just bind it on the `Portal` class def.

Associated/required tweaks:
- avoid circ import on `.devx` by only importing
  `.maybe_wait_for_debugger()` when debug mode is set.
- drop `async_generator` usage, not sure why this hadn't already been
  changed to `contextlib`?
- use `@acm` alias throughout `._portal`
2024-03-13 12:09:38 -04:00
Tyler Goodlet da913ef2bb Attempt at better internal traceback hiding
Previously i was trying to approach this using lots of
`__tracebackhide__`'s in various internal funcs but since it's not
exactly straight forward to do this inside core deps like `trio` and the
stdlib, it makes a bit more sense to optionally catch and re-raise
certain classes of errors from their originals using `raise from` syntax
as per:
https://docs.python.org/3/library/exceptions.html#exception-context

Deats:
- litter `._context` methods with `__tracebackhide__`/`hide_tb` which
  were previously being shown but that don't need to be to application
  code now that cancel semantics testing is finished up.
- i originally did the same but later commented it all out in `._ipc`
  since error catch and re-raise instead in higher level layers
  (above the transport) seems to be a much saner approach.
- add catch-n-reraise-from in `MsgStream.send()`/`.receive()` to avoid
  seeing the depths of `trio` and/or our `._ipc` layers on comms errors.

Further this patch adds some refactoring to use the
same remote-error shipper routine from both the actor-core in the RPC
invoker:
- rename it as `try_ship_error_to_remote()` and call it from
  `._invoke()` as well as its prior usage.
- make it optionally accept a `cid: str`, a `remote_descr: str` and of
  course a `hide_tb: bool`.

Other misc tweaks:
- add some todo notes around `Actor.load_modules()` debug hooking.
- tweak the zombie reaper log msg and timeout value ;)
2024-03-13 10:44:51 -04:00
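
The catch-n-reraise-from pattern in isolation (illustrative stand-in names, not the actual `._ipc` code):

```python
class TransportClosed(Exception):
    'Illustrative stand-in for the real IPC-layer exception.'

def _send_over_wire(payload: bytes) -> None:
    # stand-in for the low-level transport write that may blow up
    raise BrokenPipeError('socket went away')

def send(payload: bytes, hide_tb: bool = True) -> None:
    __tracebackhide__: bool = hide_tb  # pytest/pdbp skip this frame when set
    try:
        _send_over_wire(payload)
    except BrokenPipeError as trans_err:
        # `raise from` keeps the original as `.__cause__` so nothing is lost,
        # but app-facing tracebacks stay shallow.
        raise TransportClosed('IPC transport already closed!') from trans_err
```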
Tyler Goodlet 96992bcbb9 Add (back) a `tractor._testing` sub-pkg
Since importing from our top level `conftest.py` is not scalable
or as "future forward thinking" in terms of:
- LoC-wise (it's only one file),
- prevents "external" (aka non-test) example scripts from importing
  content easily,
- seemingly(?) can't be used via abs-import if using
  a `[tool.pytest.ini_options]` in a `pyproject.toml` vs.
  a `pytest.ini`, see:
  https://docs.pytest.org/en/8.0.x/reference/customize.html#pyproject-toml)

=> Go back to having an internal "testing" pkg like `trio` (kinda) does.

Deats:
- move generic top level helpers into pkg-mod including the new
  `expect_ctxc()` (which i needed in the advanced faults testing script).
- move `@tractor_test` into `._testing.pytest` sub-mod.
- adjust all the helper imports to be a `from tractor._testing import <..>`

Rework `test_ipc_channel_break_during_stream()` and backing script:
- make test(s) pull `debug_mode` from new fixture (which is now
  controlled manually from `--tpdb` flag) and drop the previous
  parametrized input.
- update logic in ^ test for "which-side-fails" cases to better match
  recently updated/stricter cancel/failure semantics in terms of
  `ClosedResourceError` vs. `EndOfChannel` expectations.
- handle `ExceptionGroup`s with expected embedded errors in test.
- better pedantics around whether to expect a user simulated KBI.
- for `examples/advanced_faults/ipc_failure_during_stream.py` script:
  - generalize ipc breakage in new `break_ipc()` with support for diff
    internal `trio` methods and a #TODO for future disti frameworks
  - only make one sub-actor task break and the other just stream.
  - use new `._testing.expect_ctxc()` around ctx block.
  - add a bit of exception handling with `print()`s around ctxc (unused
    except if 'msg' break method is set) and eoc cases.
  - don't break parent side ipc in loop any more than once
    after first break, checked via flag var.
  - add a `pre_close: bool` flag to control whether
    `MsgStream.aclose()` is called *before* any ipc breakage method.

Still TODO:
- drop `pytest.ini` and add the alt section to `pyproject.toml`.
 -> currently can't get `--rootdir=` opt to work.. not showing in
   console header.
 -> ^ also breaks on 'tests' `enable_modules` imports in subactors
   during discovery tests?
2024-03-13 09:09:08 -04:00
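
Presumably the new helper has roughly this shape (a guess, not the actual `._testing` code; the `yay` arg is assumed):

```python
from contextlib import asynccontextmanager as acm
import pytest
import tractor

@acm
async def expect_ctxc(yay: bool = True):
    # wrap a ctx usage block which may (or may not) raise a ctxc
    if yay:
        with pytest.raises(tractor.ContextCancelled):
            yield
    else:
        yield
```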
Tyler Goodlet 6533285d7d Add `an: ActorNursery` var placeholder for final log msg 2024-03-12 08:56:17 -04:00
Tyler Goodlet 8c39b8b124 Tweak some tests for spurious failures
The seeming cause is that some cases occasionally raise
`ExceptionGroup` instead of a (collapsed out) single error; in
those cases at least try to check that `.exceptions` has the original
error.
2024-03-11 10:37:34 -04:00
Tyler Goodlet ededa2e88f More spaceless union type annots 2024-03-11 10:33:06 -04:00
Tyler Goodlet dd168184c3 Add a open-ctx-with-self test
Found exactly why trying this won't work when playing around with
opening workspaces in `modden` using a `Portal.open_context()` back to
the 'bigd' root actor: the RPC machinery only registers one entry in
`Actor._contexts` which will get overwritten by each task's side and
then experience race-based IPC msging errors (eg. rxing `{'started': _}`
on the callee side..). Instead make opening a ctx back to the self-actor
a runtime error describing it as an invalid op.

To match:
- add a new test `test_ctx_with_self_actor()` to the context semantics
  suite.
- tried out adding a new `side: str` to the `Actor.get_context()` (and
  callers) but ran into not being able to determine the value from in
  `._push_result()` where it's needed to figure out which side to push
  to.. So, just leaving the commented arg (passing) in the runtime core
  for now in case we can come back to trying to make it work, tho i'm
  thinking it's not the right hack anyway XD
2024-03-11 10:29:42 -04:00
Tyler Goodlet 37ee477aee Let `MsgStream.receive_nowait()` take in msg key list
Call it `allow_msg_keys: list[str] = ['yield']` and set it to accept
`['yield', 'return']` from the drain loop in `.aclose()`. Only pass the
last key error to `_raise_from_no_key_in_msg()` in the fall-through
case.

Somehow this seems to prevent all the intermittent test failures i was
seeing in local runs including when running the entire suite all in
sequence; i ain't complaining B)
2024-03-11 10:20:55 -04:00
Tyler Goodlet f067cf48a7 Unify some log msgs in `.to_asyncio`
Much like similar recent changes throughout the core, build out `msg:
str` depending on error cases and emit with `.cancel()` level as
appropes. Also mute (via level) some duplication in the cancel case
inside `_run_asyncio_task()` for console noise reduction.
2024-03-08 16:07:17 -05:00
Tyler Goodlet c56d4b0a79 Assign `ctx._local_error` ASAP from `.open_context()`
Such that `.outcome` related fields render nicely asap for logging
within `Portal.open_context()` itself.
2024-03-08 16:03:13 -05:00
Tyler Goodlet 7cafb59ab7 Tweak `Context.repr_outcome()` for KBIs
Since apparently `str(KeyboardInterrupt()) == ''`? So instead add little
`<str> or repr(merr)` expressions throughout to avoid blank strings
rendering if various `repr()`/`.__str__()` outputs..
2024-03-08 15:46:42 -05:00
Tyler Goodlet 7458f99733 Add a `._state._runtime_vars['_registry_addrs']`
Such that it's set to whatever `Actor.reg_addrs: list[tuple]` is during
the actor's init-after-spawn guaranteeing each actor has at least the
registry infos from its parent. Ensure we read this if defined over
`_root._default_lo_addrs` in `._discovery` routines, namely
`.find_actor()` since it's the one API normally used without expecting
the runtime's `current_actor()` to be up.

Update the latest inter-peer cancellation test to use the `reg_addr`
fixture (and thus test this new runtime-vars value via `find_actor()`
usage) since it was failing if run *after* the infected `asyncio` suite
due to registry contact failure.
2024-03-08 15:34:20 -05:00
Tyler Goodlet 4c3c3e4b56 Support a `._state.last_actor()` getter
Not sure if it's really that useful other than for reporting errors from
`current_actor()` but at least it alerts `tractor` devs and/or users
when the runtime has already terminated vs. hasn't been started
yet/correctly.

Set the `._last_actor_terminated: tuple` in the root's final block which
allows testing for an already terminated tree which is the case where
`._state._current_actor == None` and the last is set.
2024-03-08 14:11:17 -05:00
Tyler Goodlet b29d33d603 Make `Actor._cancel_task(requesting_uid: tuple)` required arg 2024-03-08 14:03:18 -05:00
Tyler Goodlet 1617e0ff2c Woops, fix one last `ctx._cancelled_caught` in drain loop 2024-03-08 13:48:35 -05:00
Tyler Goodlet c025761f15 Adjust `asyncio` test for stricter ctx-self-cancels
Use `expect_ctxc()` around the portal cancellation case, toss in
a `'context'` parametrization and return just the `Context.outcome` from
`main()` B)
2024-03-07 21:33:07 -05:00
Tyler Goodlet 2e797ef7ee Update ctx test suites to stricter semantics
Including mostly tweaking asserts on relayed `ContextCancelled`s and
the new pub ctx properties: `.outcome`, `.maybe_error`, etc. as it
pertains to graceful (absorbed) remote cancellation vs. loud ctxc cases
expected to be raised by any `Portal.cancel_actor()` style teardown.

Start checking a variety of internals like `._remote/local_error`,
`._is_self_cancelled()`, `._is_final_result_set()`, `._cancel_msg`
where applicable.

Also factor out the new `expect_ctxc()` checker to our `conftest.py` for
use in other suites.
2024-03-07 21:26:57 -05:00
Tyler Goodlet c36deb1f4d Woops, fix `_post_mortem()` type sig..
We're passing an `extra_frames_up_when_async=2` now (from prior attempt
to hide `CancelScope.__exit__()` when `shield=True`) and thus both
`debug_func`s must accept it 🤦

On the brighter side found out that the `TypeError` from the call-sig
mismatch was actually being swallowed entirely so add some
`.exception()` msgs for such cases to at least alert the dev they broke
stuff XD
2024-03-07 21:24:34 -05:00
Tyler Goodlet fa7e37d6ed (Even) more pedantic `.cancel_acked: bool` def
Changes the condition logic to be more strict and moves it to a private
`._is_self_cancelled() -> bool` predicate which can be used elsewhere
(instead of having almost similar duplicate checks all over the
place..) and allows taking in a specific `remote_error` just for
verification purposes (like for tests).

Main strictness distinctions are now:
- obvi that `.cancel_called` is set (this filters any
  `Portal.cancel_actor()` or other out-of-band RPC),
- the received `ContextCancelled` **must** have its `.canceller` set to
  this side's `Actor.uid` (indicating we are the requester).
- `.src_actor_uid` **must** be the same as the `.chan.uid` (so the error
  must have originated from the opposite side's task).
- `ContextCancelled.canceller` should be already set to the `.chan.uid`
  indicating we received the msg via the runtime calling
  `._deliver_msg()` -> `_maybe_cancel_and_set_remote_error()` which
  ensures the error is specifically destined for this ctx-task exactly
  the same as how `Actor._cancel_task()` sets it from an input
  `requesting_uid` arg.

In support of the above adjust some impl deats:
- add `Context._actor: Actor` which is set once in `mk_context()` to
  avoid issues (particularly in testing) where `current_actor()` raises
  after the root actor / runtime is already exited. Use `._actor.uid` in
  both `.cancel_acked` (obvi) and `._maybe_cancel_and_set_remote_error()`
  when deciding whether to call `._scope.cancel()`.
- always cast `.canceller` to `tuple` if not null.
- delegate `.cancel_acked` directly to new private predicate (obvi).
- always set `._canceller` from any `RemoteActorError.src_actor_uid` or
  failing over to the `.chan.uid` when a non-remote error (tho that
  shouldn't ever happen right?).
- more extensive doc-string for `.cancel()` detailing the new strictness
  rules about whether an eventual `.cancel_acked` might be set.

Also tossed in even more logging format tweaks by adding a
`type_only: bool` to `.repr_outcome()` as desired for simpler output in
the `state: <outcome-repr-here>` and `.repr_rpc()` sections of the
`.__str__()`.
2024-03-07 20:35:43 -05:00
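
Condensed into predicate form the strictness rules read roughly like (a sketch using the field names above, not the exact method body):

```python
import tractor

def is_self_cancel_ack(
    ctx: tractor.Context,
    remote_error: BaseException|None = None,
) -> bool:
    re = remote_error or ctx._remote_error
    return bool(
        ctx.cancel_called  # we actually requested the cancel
        and isinstance(re, tractor.ContextCancelled)
        and re.canceller == ctx._actor.uid  # acked back to *us*, the requester
        and re.src_actor_uid == ctx.chan.uid  # raised by the peer's task
    )
```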
Tyler Goodlet 364ea91983 Set `._cancel_msg` to RPC `{cmd: 'self._cancel_task', ..}` msg
Like how we set `Context._cancel_msg` in `._deliver_msg()` (in
which case normally it's an `{'error': ..}` msg), do the same when any
RPC task is remotely cancelled via `Actor._cancel_task` where that task
doesn't have a cancel msg set yet.

This makes it much easier to distinguish between ctx cancellations due
to some remote error vs. explicit remote requests via any of
`Actor.cancel()`, `Portal.cancel_actor()` or `Context.cancel()`.
2024-03-07 18:24:00 -05:00
Tyler Goodlet 7ae9b5319b Tweak inter-peer `._scope` state asserts
We don't expect `._scope.cancelled_caught` to be set really ever on
inter-peer cancellation since no ctx is ever cancelling itself; a peer
cancels some other and that cancellation then bubbles back to all other peers.

Also add `ids: lambda` for `error_during_ctxerr_handling` param to
`test_peer_canceller()`
2024-03-06 16:09:38 -05:00
Tyler Goodlet 6156ff95f8 Add `shield: bool` support to `.pause()`
It's been on the todo for a while and I've given up trying to properly
hide the `trio.CancelScope.__exit__()` frame for now instead opting to
just `log.pdb()` a big apology XD

Users can obvi still just not use the flag and wrap `tractor.pause()` in
their own cs block if they want to avoid having to hit `'up'` in the pdb
REPL if needed in a cancelled task-scope.

Impl deatz:
- factor orig `.pause()` impl into new `._pause()` so that we can more tersely
  wrap the original content depending on `shield: bool` input; only open
  the cancel-scope when shield is set to avoid the aforementioned extra stack
  frame annoyance.
- pass through `shield` to underlying `_pause` and `debug_func()` so we
  can actually know when to log our apology.
- add a buncha notes to new `.pause()` wrapper regarding the inability
  to hide the cancel-scope `.__exit__()`, including that overriding the
  code in `trio._core._run.CancelScope` doesn't seem to solve the issue
  either..

Unrelated `maybe_wait_for_debugger()` tweaks:
- don't read `Lock.global_actor_in_debug` more than needed, rename local
  read var to `in_debug` (since it can also hold the root actor uid, not
  just sub-actors).
- shield the `await debug_complete.wait()` since ideally we avoid the
  root cancelling child-actors that are in debug even when the root calls this
  func in a cancelled scope.
2024-03-06 14:37:54 -05:00
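
Usage-wise the new flag allows something like (sketch):

```python
import trio
import tractor

async def main():
    async with tractor.open_root_actor(debug_mode=True):
        with trio.CancelScope() as cs:
            cs.cancel()
            # without `shield=True` this would be a cancelled checkpoint;
            # with it the REPL still engages (at the cost of the extra
            # `CancelScope.__exit__()` frame apologized for above).
            await tractor.pause(shield=True)

if __name__ == '__main__':
    trio.run(main)
```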
Tyler Goodlet 9e3f41a5b1 Tweak inter-peer tests for new/refined semantics
Buncha subtle details changed mostly to do with when `Context.cancel()`
gets called on "real" remote errors vs. (peer requested) cancellation
and then local side handling of `ContextCancelled`.

Specific changes to make tests pass:
- due to raciness with `sleeper_ctx.result()` raising the ctxc locally
  vs. the child-peers receiving similar ctxcs themselves (and then
  erroring and propagating back to the root parent), we might not see
  `._remote_error` set during the sub-ctx loops (except for the sleeper
  itself obvi).
- do not expect `.cancel_called`/`.cancel_caught` to be set on any
  sub-ctx since currently `Context.cancel()` is only called non-shielded
  and thus is not invoked when `._scope.cancel()` is called as part
  of each root-side ctx ref/block handling the inter-peer ctxc.
- do not expect `Context._scope.cancelled_caught` to be set in most cases
  (even the sleeper)

TODO Outstanding adjustments not fixed yet:
-[ ] `_scope.cancelled_caught` checks outside the `.open_context()`
  blocks.
2024-03-06 10:13:41 -05:00
Tyler Goodlet 7c22f76274 Yahh, add `.devx` package to installed subpkgs.. 2024-03-06 09:55:05 -05:00
Tyler Goodlet 04c99c2749 Woops, add `.msg` sub-pkg to install set 2024-03-06 09:48:46 -05:00
Tyler Goodlet e536057fea `._entry`: use same msg info in start/terminate log 2024-03-05 12:30:34 -05:00
Tyler Goodlet c6b4da5788 Tweak `._portal` log content to use `Context.repr_outcome()` 2024-03-05 12:26:33 -05:00
Tyler Goodlet 1f7f84fdfa Mk debugger tests work for arbitrary pre-REPL format
Since this was changed as part of overall project wide logging format
updates, and i ended up changing both the crash and pause `.pdb()`
msgs to include some multi-line-ascii-"stuff", might as well make the
pre-prompt checks in the test suite more flexible to match.

As such, this exposes 2 new constants inside the `.devx._debug` mod:
- `._pause_msg: str` for the pre `tractor.pause()` header emitted via
  `log.pdb()` and,
- `._crash_msg: str` for the pre `._post_mortem()` equiv when handling
  errors in debug mode.

Adjust the test suite to use these values and thus make us more capable
to absorb changes in the future as well:
- add a new `in_prompt_msg()` predicate, very similar to `assert_before()`
  but minus `assert`s which takes in a `parts: list[str]` to match
  in the pre-prompt stdout.
- delegate to `in_prompt_msg()` in `assert_before()` since it was mostly
  duplicate minus `assert`.
- adjust all previous `<patt> in before` asserts to instead use
  `in_prompt_msg()` with separated pre-prompt-header vs. actor-name
  `parts`.
- use new `._pause/crash_msg` values in all such calls including any
  `assert_before()` cases.
2024-03-05 12:22:04 -05:00
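
Roughly the shape of the new predicate (sketch; the real helper lives in the test suite):

```python
def in_prompt_msg(
    prompt: str,
    parts: list[str],
) -> bool:
    # check that every expected header chunk shows up in the pre-REPL stdout
    for part in parts:
        if part not in prompt:
            return False
    return True
```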
Tyler Goodlet a5bdc6db66 Flip rpc tests over to use `ExceptionGroup` on new `trio` 2024-03-05 10:34:32 -05:00
Tyler Goodlet 9a18b57d38 Mega-refactor on `._invoke()` targeting `@context`s
Since eventually we want to implement all other RPC "func types" as
contexts underneath this starts the rework to move all the other cases
into a separate func not only to simplify the main `._invoke()` body but
also as a reminder of the intention to do it XD

Details of re-factor:
- add a new `._invoke_non_context()` which just moves all the old blocks
  for non-context handling to a single def.
- factor what was basically just the `finally:` block handler (doing all
  the task bookkeeping) into a new `@acm`: `_errors_relayed_via_ipc()`
  with that content packed into the post-`yield` (also with a `hide_tb:
  bool` flag added of course).
  * include a `debug_kbis: bool` for when needed.
- since the `@context` block is the only type left in the main
  `_invoke()` body, de-dent it so it's more grok-able B)

Obviously this patch also includes a few improvements regarding
context-cancellation-semantics (for the `context` RPC case) on the
callee side in order to match previous changes to the `Context` api:
- always setting any ctxc as the `Context._local_error`.
- using the new convenience `.maybe_raise()` topically (for now).
- avoiding any previous reliance on `Context.cancelled_caught` for
  anything public of meaning.

Further included is more logging content updates:
- being pedantic in `.cancel()` msgs about whether termination is caused
  by error or ctxc.
- optional `._invoke()` traceback hiding via a `hide_tb: bool`.
- simpler log headers throughout instead leveraging new `.__repr__()` on
  primitives.
- buncha `<= <actor-uid>` style "sent a msg" emissions.
- simplified handshake statuses reporting.

Other subsys api changes we need to match:
- change to `Channel.transport`.
- avoiding any `local_nursery: ActorNursery` waiting when the
  `._implicit_runtime_started` is set.

And yes, lotsa more comments for #TODOs dawg.. since there's always
somethin!
2024-03-02 22:12:00 -05:00
Tyler Goodlet ed10632d97 Avoid `ctx.cancel()` after ctxc rxed in `.open_context()`
In the case where the callee side delivers us a ctxc with `.canceller`
set we can presume that remote cancellation already has taken place and
thus we don't need to do the normal call-`Context.cancel()`-on-error
step. Further, in the case where we do call it also handle any
`trio.ClosedResourceError` gracefully with a `.warning()`.

Also, originally I had added a post-`yield`-maybe-raise to attempt
handling any remote ctxc the same as for the local case (i.e. raised
from `yield` line) wherein if we get a remote ctxc the same handler
branch-path would trigger, thus avoiding different behaviour in that
case. I ended up masking it out (but can't remember why.. ) as it seems
the normal `.result()` call and its internal handling gets the same
behaviour? I've left in the heavily commented code in case it ends up
being the better way to go; likely making the move to having a single
code path in both cases is better even if it is just a matter of deciding
whether to swallow the ctxc or not in the `.cancel_acked` case.

Further teensie improvements:
- obvi improve/simplify log msg contents as in prior patches.
- use the new `maybe_wait_for_debugger(header_msg: str)` if/when waiting
  to exit in debug mode.
- another `hide_tb: bool` frame hider flag.
- rando type-annot updates of course :)
2024-03-02 17:18:55 -05:00
Tyler Goodlet 299429a278 Deep `Context` refinements
Spanning from the pub API, to instance `repr()` customization (for
logging/REPL content), to the impl details around the notion of a "final
outcome" and surrounding IPC msg draining mechanics during teardown.

A few API and field updates:

- new `.cancel_acked: bool` to replace what we were mostly using
  `.cancelled_caught: bool` for but, for purposes of better mapping the
  semantics of remote cancellation of parallel executing tasks; it's set
  only when `.cancel_called` is set and a ctxc arrives with
  a `.canceller` field set to the current actor uid indicating we
  requested and received acknowledgement from the other side's task
  that is cancelled gracefully.

- strongly document and delegate (and prolly eventually remove as a pub
  attr) the `.cancelled_caught` property entirely to the underlying
  `._scope: trio.CancelScope`; the `trio` semantics don't really map
  well to the "parallel with IPC msging"  case in the sense that for
  us it breaks the concept of the ctx/scope closure having "caught"
  something instead of having "received" a msg that the other side has
  "acknowledged" (i.e. which for us is the completion of cancellation).

- new `.__repr__()`/`.__str__()` format that tries to tersely yet
  comprehensively as possible display everything you need to know about
  the 3 main layers of an SC-linked-IPC-context:
  * ipc: the transport + runtime layers net-addressing and prot info.
  * rpc: the specific linked caller-callee task signature details
    including task and msg-stream instances.
  * state: current execution and final outcome state of the task pair.
  * a teensie extra `.repr_rpc` for a condensed rpc signature.

- new `.dst_maddr` to get a `libp2p` style "multi-address" (though right
  now it's just showing the transport layers so maybe we should move it
  to our `Channel`?)

- new public instance-var fields supporting more granular remote
  cancellation/result/error state:
  * `.maybe_error: Exception|None` for any final (remote) error/ctxc
    which computes logic on the values of `._remote_error`/`._local_error`
    to determine the "final error" (if any) on termination.
  * `.outcome` to the final error or result (or `None` if un-terminated)
  * `.repr_outcome()` for a console/logging friendly version of the
    final result or error as needed for the `.__str__()`.

- new private interface bits to support all of ^:
  * a new "no result yet" sentinel value, `Unresolved`, using a module
    level class singleton that `._result` is set to (instead of
    `id(self)`) to both determine if and present when no final result
    from the callee has-yet-been/was delivered (ever).
    => really we should get rid of `.result()` and change it to
    `.wait_for_result()` (or something)
  * `_final_result_is_set()` predicate to avoid waiting for an already
    delivered result.
  * `._maybe_raise()` proto-impl that we should use to replace all the
    `if re:` blocks it can XD
  * new `._stream: MsgStream|None` for when a stream is opened to aid
    with the state repr mentioned above.

Tweaks to the termination drain loop `_drain_to_final_msg()`:

- obviously (obvi) use all the changes above when determining whether or
  not a "final outcome" has arrived and thus breaking from the loop ;)
  * like the `.outcome`, `.maybe_error` and `._final_ctx_is_set()` in
    the `while` pred expression.

- drop the `_recv_chan.receive_nowait()` + guard logic since it seems
  with all the surrounding (and coming soon) changes to
  `Portal.open_context()` using all the new API stuff (mentioned in
  first bullet set above) we never hit the case of inf-block?

Oh right and obviously a ton of (hopefully improved) logging msg content
changes, commented code removal and detailed comment-docs strewn about!
2024-03-01 22:37:32 -05:00
Tyler Goodlet 28fefe4ffe Make stream draining status logs `.debug()` level 2024-03-01 19:27:10 -05:00
Tyler Goodlet 08a6a51cb8 Add `._implicit_runtime_started` mark, better logs
After some deep logging improvements to many parts of `._runtime`,
I realized a silly detail where we are always waiting on any opened
`local_nursery: ActorNursery` to signal exit from
`Actor._stream_handler()` even in the case of being an implicitly opened
root actor (`open_root_actor()` wasn't called by user/app code) via
`._supervise.open_nursery()`..

So, to address this add a `ActorNursery._implicit_runtime_started: bool`
that can be set and then checked to avoid doing the unnecessary
`.exited.wait()` (and any subsequent warn logging on an exit timeout) in
that special but most common case XD

Matching with other subsys log format refinements, improve readability
and simplicity of the actor-nursery supervisory log msgs, including:
- simplify and/or remove any content that more or less duplicates msg
  content found in emissions from lower-level primitives and sub-systems
  (like `._runtime`, `_context`, `_portal` etc.).
- add a specific `._open_and_supervise_one_cancels_all_nursery()`
  handler block for `ContextCancelled` to log with `.cancel()` level
  noting that the case is a "remote cancellation".
- put the nursery-exit and actor-tree shutdown status into a single msg
  in the `implicit_runtime` case.
2024-03-01 15:44:01 -05:00
Tyler Goodlet 50465d4b34 Spawn naming and log format tweaks
- rename `.soft_wait()` -> `.soft_kill()`
- rename `.do_hard_kill()` -> `.hard_kill()`
- adjust any `trio.Process.__repr__()` log msg contents to have the
  little tree branch prefix: `'|_'`
2024-03-01 11:37:23 -05:00
Tyler Goodlet 4f69af872c Add field-first subproc `.info()` to `._entry` 2024-02-29 20:01:39 -05:00
Tyler Goodlet 9bc6a61c93 Add "fancier" remote-error `.__repr__()`-ing
Our remote error box types `RemoteActorError`, `ContextCancelled` and
`StreamOverrun` needed a console display makeover particularly for
logging content and `repr()` in higher level primitives like `Context`.

This adds a more "dramatic" str-representation to showcase the
underlying boxed traceback content more sensationally (via ascii-art
emphasis) as well as support a more terse `.reprol()` (representation
for one-line) format that can be used for types that track remote
errors/cancels like with `Context._remote_error`.

Impl deats:
- change `RemoteActorError.__repr__()` formatting to show (sub-type
  specific) `.msgdata` fields in a multi-line format (similar to our new
  `.msg.types.Struct` style) followed by some ascii accented delimiter
  lines to emphasize any `.msgdata["tb_str"]` packed by the remote actor.
- for rae and subtypes allow picking the specifically relevant fields
  via a type defined `.reprol_fields: list[str]` and pick for each
  subtype:
   |_ `RemoteActorError.src_actor_uid`
   |_ `ContextCancelled.canceller`
   |_ `StreamOverrun.sender`

- add `.reprol()` to show a `repr()`-on-one-line formatted string that
  can be used by other multi-line-field-`repr()` styled composite types
  as needed in (high level) logging info.
- toss in some mod level `_body_fields: list[str]` for summary of such
  fields (if needed).
- add some new rae (remote-actor-error) props:
  - `.type` around a newly named `.boxed_type`
  - `.type_str: str`
  - `.tb_str: str`
2024-02-29 18:56:31 -05:00
Tyler Goodlet 23aa97692e Fix `Channel.__repr__()` safety, renames to `._transport`
Hit a really weird bug in the `._runtime` IPC msg handling loop where
it seems that by `str.format()`-ing a `Channel` before initializing it
would put the `._MsgTransport._agen()` in an already started state
causing an irrecoverable core startup failure..

I presume it's something to do with delegating to the
`MsgpackTCPStream.__repr__()` and, something something.. the
`.set_msg_transport(stream)` getting called too early such that
`.msgstream.__init__()` is called thus init-ing the `._agen()` before
necessary? I'm sure there's a design lesson to be learned in here
somewhere XD

This was discovered while trying to add more "fancy" logging throughout
said core for the purposes of cobbling together an init attempt at
libp2p style multi-address representations for our IPC primitives. Thus
I also tinker here with adding some new fields to `MsgpackTCPStream`:
- `layer_key`: int = 4
- `name_key`: str = 'tcp'
- `codec_key`: str = 'msgpack'

Anyway, just changed it so that if `.msgstream` ain't set then we just
return a little "null repr" `str` value thinger.

Also renames `Channel.msgstream` internally to `._transport` with
appropriate pub `@property`s added such that everything else won't break
;p

Also drops `Optional` typing vis-a-vis modern union syntax B)
2024-02-29 18:37:04 -05:00
Tyler Goodlet 1e5810e56c Make `NamespacePath` kinda support methods..
Obviously we can't deterministically call `.load_ref()` (since you'd
have to point to an `id()` or something and presume a particular
py-runtime + virt-mem space for it to exist?) but it at least helps with
the `str` formatting for logging purposes (like `._cancel_rpc_tasks()`)
when `repr`-ing ctxs and their specific "rpc signatures".

Maybe in the future getting this working at least for singleton types
per process (like `Actor` XD ) will be a thing we can support and make
some sense of.. Bo
2024-02-29 17:37:02 -05:00
Tyler Goodlet b54cb6682c Add #TODO for generating func-sig type-annots as `str` for pprinting 2024-02-29 17:21:43 -05:00
Tyler Goodlet 3ed309f019 Add test for `modden` sub-spawner-server hangs on cancel
As per a lot of the recent refinements to `Context` cancellation, add
a new test case to replicate the original hang-on-cancel found with
`modden` when using a client actor to spawn a subactor in some other
tree where despite `Context.cancel()` being called the requesting client
would hang on the opened context with the server.

The specific scenario added here is to have,
- root actor spawns 2 children: a client and a spawn server.
- the spawn server opens with a spawn-request serve loop and begins to
  wait for the client.
- client spawns and connects to the sibling spawn server, requests to
  spawn a sub-actor, the "little bro", connects to it then does some
  echo streaming, cancels the request with its sibling (the spawn
  server) which should in turn cancel the root's grandchild and result
  in a cancel-ack back to the client's `.open_context()`.
- root ensures that it can also connect to the grandchild (little bro),
  do the same echo streaming, then ensure everything tears down
  correctly after cancelling all the children.

More refinements to come here obvi in the specific cancellation
semantics and possibly causes.

Also tweaks the other tests in suite to use the new `Context` properties
recently introduced and similarly updated in the previous patch to the
ctx-semantics suite.
2024-02-29 15:45:55 -05:00
Tyler Goodlet d08aeaeafe Make `@context`-cancelled tests more pedantic
In order to match a very significant and coming-soon patch set to the
IPC `Context` and `Channel` cancellation semantics with significant but
subtle changes to the primitives and runtime logic:

- a new set of `Context` state pub meth APIs for checking exact
  inter-actor-linked-task outcomes such as `.outcome`, `.maybe_error`,
  and `.cancel_acked`.

- trying to move away from `Context.cancelled_caught` usage since the
  semantics from `trio` don't really map well (in terms of cancel
  requests and how they result in cancel-scope graceful closure) and
  `.cancel_acked: bool` is a better approach for IPC req-resp msging.
  - change test usage to access `._scope.cancelled_caught` directly.

- more pedantic ctxc-raising expects around the "type of self
  cancellation" and final outcome in ctxc cases:
  - `ContextCancelled` is raised by ctx (`Context.result()`) consumer
    methods when `Portal.cancel_actor()` is called (since it's an
    out-of-band request) despite `Channel._cancel_called` being set.
  - also raised by `.open_context().__aexit__()` on close.
  - `.outcome` is always `.maybe_error` is always one of
    `._local/remote_error`.
2024-02-28 19:25:27 -05:00
Tyler Goodlet c6ee4e5dc1 Add a `pytest.ini` config 2024-02-22 20:37:12 -05:00
Tyler Goodlet ad5eee5666 WIP final impl of ctx-cancellation-semantics 2024-02-22 18:33:18 -05:00
Tyler Goodlet fc72d75061 Support `maybe_wait_for_debugger(header_msg: str)`
Allow callers to stick in a header to the `.pdb()` level emitted msg(s)
such that any "waiting status" content is only shown if the caller
actually gets blocked waiting for the debug lock; use it inside the
`._spawn` sub-process reaper call.

Also, return early if `Lock.global_actor_in_debug == None` and thus
only enter the poll loop when actually needed, consequently raise
if we fall through the loop without acquisition.
2024-02-22 15:08:10 -05:00
Tyler Goodlet de1843dc84 Few more log msg tweaks in runtime 2024-02-22 15:06:39 -05:00
Tyler Goodlet 930d498841 Call `actor.cancel(None)` from root to avoid mismatch with (any future) meth sig changes 2024-02-22 14:45:08 -05:00
Tyler Goodlet 5ea112699d Tweak broadcast fanout test to never inf loop
Since a bug in the new `MsgStream.aclose()` impl's drain block logic was
triggering an actual inf loop (by never cancelling the streamer child
actor), make sure we put a loop limit on the `inf_streamer()` XD

Also add a bit more deats to the test `print()`s in each actor and toss
in `debug_mode` fixture support.
2024-02-22 14:41:28 -05:00
Tyler Goodlet e244747bc3 Add note that maybe `Context._eoc` should be set by caller? 2024-02-22 14:22:45 -05:00
Tyler Goodlet 5a09ccf459 Tweak `Actor` cancel method signatures
Besides improving a bunch more log msg contents similarly as before this
changes the cancel method signatures slightly with different arg names:

for `.cancel()`:
- instead of `requesting_uid: str` take in a `req_chan: Channel`
  since we can always just read its `.uid: tuple` for logging and
  further we can then offer the `chan=None` case indicating a
  "self cancel" (since there's no "requesting channel").
- the semantics of "requesting" here better indicate that the IPC connection
  is an IPC peer and further (eventually) will allow permission checking
  against given peers for cancellation requests.
- when `chan==None` we also define a meth-internal `requester_type: str`
  differently for logging content :)
- add much more detailed `.cancel()` content around the requester, its
  type, and any debugger related locking steps.

for `._cancel_task()`:
- change the `chan` arg to `parent_chan: Channel` since "parent"
  correctly indicates that the channel is the parent of the locally
  spawned rpc task to cancel; in fact no other chan should be able to
  cancel tasks parented/spawned by other channels obvi!
- also add more extensive meth-internal `.cancel()` logging with a #TODO
  around showing only the "relevant/latest" `Context` state vars in such
  logging content.

for `.cancel_rpc_tasks()`:
- shorten `requesting_uid` -> `req_uid`.
- add `parent_chan: Channel` to be similar as above in `._cancel_task()`
  (since it's internally delegated to anyway) which replaces the prior
  `only_chan` and use it to filter to only tasks spawned by this channel
  (thus as their "parent") as before.
- instead of `if tasks:` to enter, invert and `return` early on
  `if not tasks`, for less indentation B)
- add WIP str-repr format (for `.cancel()` emissions) to show
  a multi-address (maddr) + task func (via the new `Context._nsf`) and
  report all cancel task targets with it a "tree"; include #TODO to
  finalize and implement some utils for all this!

To match ensure we adjust `process_messages()` self/`Actor` cancel
handling blocks to provide the new `kwargs` (now with `dict`-merge
syntax) to `._invoke()`.
2024-02-22 14:22:08 -05:00
Tyler Goodlet ce1bcf6d36 Fix overruns test to avoid return-beats-ctxc race
Turns out that py3.11 might be so fast that iterating an EoC-ed
`MsgStream` 1k times is faster than a `Context.cancel()` msg
transmission from a parent actor to its child (which i guess makes
sense). So tweak the test to delay 5ms between stream async-for iteration
attempts when the stream is detected to be `.closed: bool` (coming in
patch) or `ctx.cancel_called == true`.
2024-02-21 13:53:25 -05:00
Tyler Goodlet 28ba5e5435 Add `pformat()` of `ActorNursery._children` to logging
Such that you see the children entries prior to exit instead of the
prior somewhat detail/use-less logging. Also, rename all `anursery` vars
to just `an` as is the convention in most examples.
2024-02-21 13:21:28 -05:00
Tyler Goodlet 10adf34be5 Set any `._eoc` to the err in `_raise_from_no_key_in_msg()`
Since that's what we're now doing in `MsgStream._eoc` internal
assignments (coming in future patch), do the same in this exception
re-raise-helper and include more extensive doc string detailing all
the msg-type-to-raised-error cases. Also expose a `hide_tb: bool` like
we have already in `unpack_error()`.
2024-02-21 13:17:37 -05:00
Tyler Goodlet 82dcaff8db Better logging for cancel requests in IPC msg loop
As similarly improved in other parts of the runtime, adds much more
pedantic (`.cancel()`) logging content to indicate the src of remote
cancellation request particularly for `Actor.cancel()` and
`._cancel_task()` cases prior to `._invoke()` task scheduling. Also add
detailed case comments and much more info to the
"request-to-cancel-already-terminated-RPC-task" log emission to include
the `Channel` and `Context.cid` deats.

This helped me find the src of a race condition causing a test to fail
where a callee ctx task was returning a result *before* an expected
`ctx.cancel()` request arrived B). Adding much more pedantic
`.cancel()` msg contents around the requester's deats should ensure
these cases are much easier to detect going forward!

Also, simplify the `._invoke()` final result/error log msg to only put
*one of either* the final error or returned result above the `Context`
pprint.
2024-02-21 13:05:22 -05:00
Tyler Goodlet 621b252b0c Use `NamespacePath` in `Context` mgmt internals
The only case where we can't is in `Portal.run_from_ns()` usage (since we
pass a path with `self:<Actor.meth>`) and because `.to_tuple()`
internally uses `.load_ref()` which will of course fail on such a path..

So for now impl as,
- mk `Actor.start_remote_task()` take a `nsf: NamespacePath` but also
  offer a `load_nsf: bool = False` such that by default we bypass ref
  loading (maybe this is fine for perf long run as well?) for the
  `Actor`/'self:'` case mentioned above.
- mk `.get_context()` take an instance `nsf` obvi.

More logging msg format tweaks:
- change msg-flow related content to show the `Context._nsf`, which,
  right, is coming follow up commit..
- bunch more `.runtime()` format updates to show `msg: dict` contents
  and internal primitives with trailing `'\n'` for easier reading.
- report import loading `stackscope` in subactors.
2024-02-20 16:15:48 -05:00
Tyler Goodlet 20a089c331 Drop extra "\n" when logging actor nursery errors 2024-02-20 15:58:11 -05:00
Tyler Goodlet df50d78042 Fix `.devx.maybe_wait_for_debugger()` polling deats
When entered by the root actor avoid excessive polling cycles by,
- blocking on the `Lock.no_remote_has_tty: trio.Event` and breaking
  *immediately* when set (though we should really also lock
  it from the root right?) to avoid extra loops..
- shielding the `await trio.sleep(poll_delay)` call to avoid any local
  cancellation causing the (presumably root-actor task) caller to move
  on (possibly to cancel its children) and instead to continue
  poll-blocking until the lock is actually released by its user.
- `break` the poll loop immediately if no remote locker is detected.
- use `.pdb()` level for reporting lock state changes.

Also add a #TODO to handle calls by non-root actors as it pertains to
2024-02-20 15:57:31 -05:00
Tyler Goodlet 114ec36436 Add `stackscope` as dep, drop legacy `pdb` issue cruft 2024-02-20 15:29:31 -05:00
Tyler Goodlet 179d7d2b04 Add `NamespacePath._ns` todo for `self:<ns.meth>` support 2024-02-20 15:28:11 -05:00
Tyler Goodlet f568fca98f Emit warning on any `ContextCancelled.canceller == None` 2024-02-20 15:26:14 -05:00
Tyler Goodlet 6c9bc627d8 Make ctx tests support `debug_mode: bool` fixture
Such that with `--tpdb` passed (sub)actors will engage the `pdbp` REPL
automatically and so that we can use the new `stackscope` support when
complex cases hang Bo

Also,
- simplified some type-annots (ns paths),
- doc-ed an inter-peer test func with some ascii msg flows,
- added a bottom #TODO for replicating the scenario i hit in `modden`
  where a separate client actor-tree was hanging on cancelling a `bigd`
  sub-workspace..
2024-02-20 15:14:58 -05:00
Tyler Goodlet 1d7cf7d1dd Enable `stackscope` render via root in debug mode
If `stackscope` is importable and debug_mode is enabled then by
default we call `.devx.enable_stack_on_sig()` and report that it's set B)

This makes debugging unexpected (SIGINT ignoring) hangs a cinch!
2024-02-20 13:23:16 -05:00
Tyler Goodlet 54a0a0000d .log: more multi-line styling 2024-02-20 13:22:44 -05:00
Tyler Goodlet 0268b2ce91 Better subproc supervisor logging, todo for #320
Given i just similarly revamped a buncha `._runtime` log msg formatting,
might as well do something similar inside the spawning machinery such
that groking teardown sequences of each supervising task is much more
sane XD

Mostly this includes doing similar `'<field>: <value>\n'` multi-line
formatting when reporting various subproc supervision steps as well as
showing a detailed `trio.Process.__repr__()` as appropriate.

Also adds a detailed #TODO according to the needs of #320 for which
we're going to need some internal mechanism for intermediary parent
actors to determine if a given debug tty locker (sub-actor) is one of
*their* (transitive) children and thus stall the normal
cancellation/teardown sequence until that locker is complete.
2024-02-20 13:12:51 -05:00
Tyler Goodlet 81f8e2d4ac _supervise: iter nice expanded multi-line `._children` tups with typing 2024-02-20 09:18:22 -05:00
Tyler Goodlet bf0739c194 Add `stackscope` tree pprinter triggered by SIGUSR1
Can be optionally enabled via a new `enable_stack_on_sig()` which will
swap in the SIGUSR1 handler. Much thanks to @oremanj for writing this
amazing project, it's thus far helped me fix some very subtle hangs
inside our new IPC-context cancellation machinery that would have
otherwise taken much more manual pdb-ing and hair pulling XD

Full credit for `dump_task_tree()` goes to the original project author
with some minor tweaks as was handed to me via the trio-general matrix
room B)

Slight changes from orig version:
- use a `log.pdb()` emission to pprint to console
- toss in an example sh CLI cmd to trigger the dump from another terminal
  using `kill` + `pgrep`.
2024-02-20 09:05:34 -05:00
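For reference, a minimal sketch of the SIGUSR1-triggered dump described above, assuming the `stackscope.extract()` API with `recurse_child_tasks=True` per that project's docs (the real `enable_stack_on_sig()` / `dump_task_tree()` differ in detail):

```python
import os
import signal

import trio
import stackscope


def dump_task_tree() -> None:
    # pretty-print the full trio task tree starting from the root task;
    # usable from a signal handler since it runs in the same (main)
    # thread as the trio run loop
    tree = stackscope.extract(
        trio.lowlevel.current_root_task(),
        recurse_child_tasks=True,
    )
    print(str(tree))


def enable_stack_on_sig(sig: int = signal.SIGUSR1) -> None:
    # swap in the SIGUSR1 handler; trigger it from another terminal with
    # something like: kill -SIGUSR1 $(pgrep -f <your_script>.py)
    signal.signal(sig, lambda signum, frame: dump_task_tree())
    print(f'pid {os.getpid()} will dump its task tree on signal {sig}')
```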
Tyler Goodlet 5fe3f58ea9 Add a `debug_mode: bool` fixture via `--tpdb` flag
Allows tests (including any `@tractor_test`s) to subscribe to a CLI flag
`--tpdb` (for "tractor python debugger") which the session can provide
to tests which can then proxy the value to `open_root_actor()` (via
`open_nursery()`) when booting the runtime - thus enabling our debug
mode globally to any subscribers B)

This is real handy if you have some failures but can't determine the
root issue without jumping into a `pdbp` REPL inside a (sub-)actor's
spawned-task.
2024-02-20 08:53:37 -05:00
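A simplified `conftest.py`-style sketch of the flag-to-fixture plumbing described above (approximate, not the project's actual test config):

```python
import pytest


def pytest_addoption(parser):
    parser.addoption(
        '--tpdb',
        action='store_true',
        dest='debug_mode',
        default=False,
        help='enable tractor debug mode (pdbp REPL) in spawned (sub)actors',
    )


@pytest.fixture(scope='session')
def debug_mode(request) -> bool:
    # tests then proxy this straight through, eg.
    # `tractor.open_nursery(debug_mode=debug_mode)`
    return request.config.option.debug_mode
```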
Tyler Goodlet 3e1d033708 WIP: solved the modden client hang.. 2024-02-19 17:00:46 -05:00
Tyler Goodlet c35576e196 Baboso! fix `chan.send(None)` indent.. 2024-02-19 14:41:03 -05:00
Tyler Goodlet 8ce26d692f Improved log msg formatting in core
As part of solving some final edge cases to do with inter-peer remote
cancellation (particularly a remote cancel from a separate actor
tree-client hanging on the request side in `modden`..) I needed less
dense, more line-delimited log msg formats when understanding ipc
channel and context cancels from console logging; this adds a ton of
that to:
- `._invoke()` which now does,
  - better formatting of `Context`-task info as multi-line
    `'<field>: <value>\n'` messages,
  - use of `trio.Task` (from `.lowlevel.current_task()`) for full
    rpc-func namespace-path info,
  - better "msg flow annotations" with `<=` for understanding
    `ContextCancelled` flow.
- `Actor._stream_handler()` where in we break down IPC peers reporting
  better as multi-line `|_<Channel>` log msgs instead of all jammed on
  one line..
- `._ipc.Channel.send()` use `pformat()` for repr of packet.

Also tweak some optional deps imports for debug mode:
- add `maybe_import_gb()` for attempting to import `greenback`.
- maybe enable `stackscope` tree pprinter on `SIGUSR1` if installed.

Add a further stale-debugger-lock guard before removal:
- read the `._debug.Lock.global_actor_in_debug: tuple` uid and possibly
  `maybe_wait_for_debugger()` when the child-user is known to have
  a live process in our tree.
- only cancel `Lock._root_local_task_cs_in_debug: CancelScope` when
  the disconnected channel maps to the `Lock.global_actor_in_debug`,
  though not sure this is correct yet?

Started adding missing type annots in sections that were modified.
2024-02-19 14:00:23 -05:00
Tyler Goodlet 7f29fd8dcf Let `pack_error()` take a msg injected `cid: str|None` 2024-02-18 17:17:31 -05:00
Tyler Goodlet 7fbada8a15 Add `StreamOverrun.sender: tuple` for better handling
Since it's generally useful to know who is the cause of an overrun (say
bc you want your system to then adjust the writer side to slow tf down)
might as well pack an extra `.sender: tuple[str, str]` actor uid field
which can be relayed through `RemoteActorError` boxing. Add an extra
case for the exc-type to `unpack_error()` to match B)
2024-02-16 15:23:02 -05:00
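Illustrative only, a bare-bones version of an overrun error carrying the sender's uid as described above (field name per the commit text, everything else hypothetical):

```python
class StreamOverrun(Exception):
    '''
    Raised when a stream receiver can't keep up with the writer.

    '''
    def __init__(
        self,
        msg: str,
        sender: tuple[str, str] | None = None,  # (name, uuid) of the overrunning actor
    ) -> None:
        super().__init__(msg)
        self.sender = sender
```

A receiver catching this can then, for example, tell the `.sender` peer to slow its send rate.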
Tyler Goodlet 286e75d342 Offer `unpack_error(hid_tb: bool)` for `pdbp` REPL config 2024-02-14 16:13:32 -05:00
Tyler Goodlet df641d9d31 Bring in pretty-ified `msgspec.Struct` extension
Originally designed and used throughout `piker`, the subtype adds some
handy pprinting and field diffing extras often handy when viewing struct
types in logging or REPL console interfaces B)

Obvi this rejigs the `tractor.msg` mod into a sub-pkg and moves the
existing namespace obj-pointer stuff into a new `.msg.ptr` sub mod.
2024-01-28 16:33:10 -05:00
Tyler Goodlet 35b0c4bef0 Never mask original `KeyError` in portal-error unwrapper, for now? 2024-01-23 11:14:10 -05:00
Tyler Goodlet c4496f21fc Try allowing multi-pops of `_Cache.locks` for now? 2024-01-23 11:13:07 -05:00
Tyler Goodlet 7e0e627921 Use `import <blah> as blah` over `__all__` in `.trionics` 2024-01-23 11:09:38 -05:00
Tyler Goodlet 28ea8e787a Bump timeout on resource cache test a bitty bit. 2024-01-03 22:27:05 -05:00
Tyler Goodlet 0294455c5e `_root`: drop unused `typing` import 2024-01-02 18:43:43 -05:00
Tyler Goodlet 734bc09b67 Move missing-key-in-msg raiser to `._exceptions`
Since we use basically the exact same set of logic in
`Portal.open_context()` when expecting the first `'started'` msg, factor
and generalize `._streaming._raise_from_no_yield_msg()` into a new
`._exceptions._raise_from_no_key_in_msg()` (as per the lingering todo)
which obvi requires a more generalized / optional signature including
a caller specific `log` obj. Obvi call the new func from all the other
modules X)
2024-01-02 18:34:15 -05:00
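The general shape of such a shared "missing key" raiser, sketched with hypothetical names (the real `._exceptions._raise_from_no_key_in_msg()` also does IPC error translation):

```python
import logging


def raise_from_no_key_in_msg(
    msg: dict,
    key: str,
    src_err: KeyError,
    log: logging.Logger,
) -> None:
    # if the far end sent an error msg instead of the expected payload,
    # the real impl would unpack and raise that translated error here
    if 'error' in msg:
        log.warning(f'received remote error instead of {key!r}:\n{msg!r}')

    raise RuntimeError(
        f'IPC msg is missing expected key {key!r}:\n{msg!r}'
    ) from src_err
```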
Tyler Goodlet 0bcdea28a0 Fmt repr as multi-line style call 2024-01-02 11:28:55 -05:00
Tyler Goodlet fdf3a1b01b Only use `greenback` if actor-runtime is up.. 2024-01-02 11:28:02 -05:00
Tyler Goodlet ce7b8a5e18 Drop unused walrus assign of `re` 2024-01-02 11:21:20 -05:00
Tyler Goodlet 00024181cd `StackLevelAdapter._log(stacklevel: int)` for custom levels..
Apparently (and i don't know if this was always broken [i feel like no?]
or is a recent change to stdlib's `logging` stuff) we need to increment the
`stacklevel` input by one for our custom level methods now? Without this
you're going to see the path to the method's-callstack-frame on every
emission instead of to the caller's. I first noticed this when debugging
the workspace layer spawning in `modden.bigd` and then verified it in
other dependent projects..

I guess we should add some tests for this as well XD
2024-01-02 10:38:04 -05:00
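A sketch of the stacklevel-bumping pattern referenced above; the exact increment needed depends on how many wrapper frames sit between the caller and `Logger.log()` (and on the Python version, which is the wrinkle the commit ran into), and all names here are illustrative:

```python
import logging

CANCEL = 16  # custom level between DEBUG (10) and INFO (20)
logging.addLevelName(CANCEL, 'CANCEL')


class StackLevelAdapter(logging.LoggerAdapter):

    def _log(self, level: int, msg: str, stacklevel: int = 3) -> None:
        # bump `stacklevel` so %(filename)s/%(lineno)d point back at the
        # caller of `.cancel()` rather than at this adapter method; on
        # older interpreters the wrapper frames are counted differently
        # so the right value may be larger
        self.logger.log(level, msg, stacklevel=stacklevel)

    def cancel(self, msg: str) -> None:
        self._log(CANCEL, msg)


logging.basicConfig(
    level=CANCEL,
    format='%(filename)s:%(lineno)d [%(levelname)s] %(message)s',
)
log = StackLevelAdapter(logging.getLogger('demo'), {})
log.cancel('cancelling some task..')  # ideally reports this call site
```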
Tyler Goodlet 814384848d Use `import <name> as <name>,` style over `__all__` in pkg mod 2024-01-02 10:25:17 -05:00
Tyler Goodlet bea31f6d19 ._child: remove some unused imports.. 2024-01-02 10:24:39 -05:00
Tyler Goodlet 250275d98d Guarding for IPC failures in `._runtime._invoke()`
Took me longer than i wanted to figure out the source of
a failed-response to a remote-cancellation (in this case in `modden`
where a client was cancelling a workspace layer.. but disconnected before
receiving the ack msg) that was triggering an IPC error when sending the
error msg for the cancellation of an `Actor._cancel_task()`, but since
this (non-rpc) `._invoke()` task was trying to send to a now
disconnected canceller it was resulting in a `BrokenPipeError` (or similar)
error.

Now, we except for such IPC errors and only raise them when,
1. the transport `Channel` is for sure up (bc otherwise what's the point of
   trying to send an error on the thing that caused it..)
2. it's definitely for handling an RPC task

Similarly if the entire main invoke `try:` excepts,
- we only hide the call-stack frame from the debugger (with
  `__tracebackhide__: bool`) if it's an RPC task that has a connected
  channel since we always want to see the frame when debugging internal
  task or IPC failures.
- we don't bother trying to send errors to the context caller (actor)
  when it's a non-RPC request since failures on actor-runtime-internal
  tasks shouldn't really ever be reported remotely, only maybe raised
  locally.

Also some other tidying,
- this properly corrects for the self-cancel case where an RPC context
  is cancelled due to a local (runtime) task calling a method like
  `Actor.cancel_soon()`. We now set our own `.uid` as the
  `ContextCancelled.canceller` value so that other-end tasks know that
  the cancellation was due to a self-cancellation by the actor itself.
  We still need to properly test for this though!
- add a more detailed module doc-str.
- more explicit imports for `trio` core types throughout.
2024-01-02 10:23:45 -05:00
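A distilled sketch of that guard (duck-typed `chan`, not the real `Channel`/`._invoke()` code):

```python
import trio


async def maybe_relay_error(
    chan,            # Channel-like object with `.connected()` and async `.send()`
    err_msg: dict,
    is_rpc: bool,
) -> None:
    # only bother relaying when the transport is definitely up AND this
    # was an actual RPC request; runtime-internal task failures should
    # only ever be raised locally
    if not (is_rpc and chan.connected()):
        return

    try:
        await chan.send(err_msg)
    except (
        trio.BrokenResourceError,
        trio.ClosedResourceError,
    ):
        # the far end (the original canceller) already disconnected,
        # there's nothing left to report to
        pass
```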
Tyler Goodlet f415fc43ce `.discovery.get_arbiter()`: add warning around this now deprecated usage 2023-12-11 19:37:45 -05:00
Tyler Goodlet 3f15923537 More thorough hard kill doc strings 2023-12-11 18:17:42 -05:00
Tyler Goodlet 87cd725adb Add `open_root_actor(ensure_registry: bool)`
Allows forcing the opened actor to either obtain the passed registry
addrs or raise a runtime error.
2023-11-07 16:45:24 -05:00
Tyler Goodlet 48accbd28f Fix doc string "its" typo.. 2023-11-06 15:44:21 -05:00
Tyler Goodlet 227c9ea173 Test with `any(portals)` since `gather_contexts()` will return `list[None | tuple]` 2023-11-06 15:43:43 -05:00
Tyler Goodlet d651f3d8e9 Tons of interpeer test cleanup
Drop all the nested `@acm` blocks and defunct comments from initial
validations. Add some todos for cases that are still unclear such as
whether the caller / streamer should have `.cancelled_caught == True` in
its teardown.
2023-10-25 15:21:41 -04:00
Tyler Goodlet ef0cfc4b20 Get inter-peer suite passing with all `Context` state checks!
Definitely needs some cleaning and refinement but this gets us to stage
1 of being pretty frickin correct i'd say 💃
2023-10-23 18:24:23 -04:00
Tyler Goodlet ecb525a2bc Adjust test details where `Context.cancel()` is called
We can now make asserts on `.cancelled_caught` and `_remote_error` vs.
`_local_error`. Expect a runtime error when `Context.open_stream()` is
called AFTER `.cancel()` and the remote `ContextCancelled` hasn't
arrived (yet). Adjust to `'itself'` string in self-cancel case.
2023-10-23 17:49:02 -04:00
Tyler Goodlet b77d123edd Fix `Context.result()` call to be in runtime scope 2023-10-23 17:48:34 -04:00
Tyler Goodlet f4e63465de Tweak `Channel._cancel_called` comment 2023-10-23 17:47:55 -04:00
Tyler Goodlet df31047ecb Be ultra-correct in `Portal.open_context()`
This took way too long to get right but hopefully will give us grok-able
and correct context exit semantics going forward B)

The main fixes were:
- always shielding the `MsgStream.aclose()` call on teardown to avoid
  bubbling a `Cancelled`.
- properly absorbing any `ContextCancelled` in cases due to "self
  cancellation" using the new `Context.canceller` in the logic.
- capturing any error raised by the `Context.result()` call in the
  "normal exit, result received" case and setting it as the
  `Context._local_error` so that self-cancels can be easily measured via
  `Context.cancelled_caught` in same way as remote-error caused
  cancellations.
- extremely detailed comments around all of the cancellation-error cases
  to avoid ever getting confused about the control flow in the future XD
2023-10-23 17:34:28 -04:00
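The "always shield the stream close" rule in its smallest form; `stream` stands in for a `MsgStream` and this is not the actual `.__aexit__()` logic:

```python
import trio


async def teardown_stream(stream) -> None:
    # shield so an in-flight `trio.Cancelled` can't interrupt the close
    # and end up masking whatever error actually terminated the context
    with trio.CancelScope(shield=True):
        await stream.aclose()
```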
Tyler Goodlet 131674eabd Be mega-pedantic with `ContextCancelled` semantics
As part of extremely detailed inter-peer-actor testing, add much more
granular `Context` cancellation state tracking via the following (new)
fields:
- `.canceller: tuple[str, str]` the uuid of the actor responsible for
  the cancellation condition - always set by
  `Context._maybe_cancel_and_set_remote_error()` and replaces
  `._cancelled_remote` and `.cancel_called_remote`. If set, this value
  should normally always match a value from some `ContextCancelled`
  raised or caught by one side of the context.
- `._local_error` which is always set to the locally raised (and caller
  or callee task's scope-internal) error which caused any
  eventual cancellation/error condition and thus any closure of the
  context's per-task-side-`trio.Nursery`.
- `.cancelled_caught: bool` is now always `True` whenever the local task
  catches (or "silently absorbs") a `ContextCancelled` (a `ctxc`) that
  indeed originated from one of the context's linked tasks or any other
  context which raised its own `ctxc` in the current `.open_context()` scope.
  => whenever there is a case that no `ContextCancelled` was raised
  **in** the `.open_context().__aexit__()` (eg. `ctx.result()` called
  after a call to `ctx.cancel()`), we still consider the context as
  having "caught a cancellation" since the `ctxc` was indeed silently
  handled by the cancel requester; all other error cases are already
  represented by mirroring the state of the `._scope: trio.CancelScope`
  => IOW there should be **no case** where an error is **not raised** in
  the context's scope and `.cancelled_caught: bool == False`, i.e. no
  case where `._scope.cancelled_caught == False and ._local_error is not
  None`!
- always raise any `ctxc` from `.open_stream()` if `._cancel_called ==
  True` - if the cancellation request has not already resulted in
  a `._remote_error: ContextCancelled` we raise a `RuntimeError` to
  indicate improper usage to the guilty side's task code.
- make `._maybe_raise_remote_err()` a sync func and don't raise
  any `ctxc` which is matched against a `.canceller` determined to
  be the current actor, aka a "self cancel", and always set the
  `._local_error` to any such `ctxc`.
- `.side: str` taken from inside `.cancel()` and unused as of now since
  it might be better re-written as a similar `.is_opener() -> bool`?
- drop unused `._started_received: bool`..
- TONS and TONS of detailed comments/docs to attempt to explain all the
  possible cancellation/exit cases and how they should exhibit as either
  silent closes or raises from the `Context` API!

Adjust the `._runtime._invoke()` code to match:
- use `ctx._maybe_raise_remote_err()` in `._invoke()`.
- adjust to new `.canceller` property.
- more type hints.
- better `log.cancel()` msging around self-cancels vs. peer-cancels.
- always set the `._local_error: BaseException` for the "callee" task
  just like `Portal.open_context()` now will do B)

Prior we were raising any `Context._remote_error` directly and doing
(more or less) the same `ContextCancelled` "absorbing" logic (well
kinda) inline in that block; instead we now delegate to the method
2023-10-23 16:24:54 -04:00
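The self-cancel "absorb" check at the heart of the above, reduced to a hypothetical predicate (the real logic lives across `Context` methods and `Portal.open_context()`):

```python
def should_absorb_ctxc(
    ctxc,                        # a ContextCancelled-like error exposing `.canceller`
    our_uid: tuple[str, str],    # (name, uuid) of the current actor
    we_requested_cancel: bool,   # i.e. `Context.cancel()` was called locally
) -> bool:
    # swallow the ctxc only when we were the ones who asked for the
    # cancel; any peer-caused ctxc must bubble up to user code
    return (
        we_requested_cancel
        and ctxc.canceller == our_uid
    )
```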
Tyler Goodlet 5a94e8fb5b Raise a `MessagingError` from the src error on msging edge cases 2023-10-23 14:34:12 -04:00
Tyler Goodlet 0518b3ab04 Move `MessagingError` into `._exceptions` set 2023-10-23 14:17:36 -04:00
Tyler Goodlet 2f0bed3018 Ignore `greenback` import error if not installed 2023-10-19 12:41:15 -04:00
Tyler Goodlet 9da3b63644 Change remaining internals to use `Actor.reg_addrs` 2023-10-19 12:40:37 -04:00
Tyler Goodlet 1d6f55543d Expose per-actor registry addrs via `.reg_addrs`
Since it's handy to be able to debug the *writing* of this instance var
(particularly when checking state passed down to a child in
`Actor._from_parent()`), rename and wrap the underlying
`Actor._reg_addrs` as a settable `@property` and add validation to
the `.setter` for sanity - actor discovery is a critical functionality.

Other tweaks:
- fix `.cancel_soon()` to pass expected argument..
- update internal runtime error message to be simpler and link to GH issues.
- use new `Actor.reg_addrs` throughout core.
2023-10-19 12:38:27 -04:00
Tyler Goodlet a3ed30e62b Get remaining suites passing..
..by ensuring `reg_addr` fixture value passthrough to subactor eps
2023-10-19 11:51:47 -04:00
Tyler Goodlet 42d621bba7 Always dynamically re-read the `._root._default_lo_addrs` value in `find_actor()` 2023-10-18 19:10:04 -04:00
Tyler Goodlet 2e81ccf5b4 Dump `.msgdata` in `RemoteActorError.__repr__()` 2023-10-18 19:09:07 -04:00
Tyler Goodlet 022bf8ce75 Ensure `registry_addrs` is always set to something 2023-10-18 19:08:35 -04:00
Tyler Goodlet 0e9457299c Port all tests to new `reg_addr` fixture name 2023-10-18 15:39:20 -04:00
Tyler Goodlet 6b1ceee19f Type out the full-fledged streaming ex. 2023-10-18 15:36:00 -04:00
Tyler Goodlet 1e689ee701 Rename fixture `arb_addr` -> `reg_addr` and set the session value globally as `._root._default_lo_addrs` 2023-10-18 15:35:35 -04:00
Tyler Goodlet 190845ce1d Add masked super timeout line to `do_hard_kill()` for would-be runtime hackers 2023-10-18 15:29:43 -04:00
Tyler Goodlet 0c74b04c83 Facepalm, `wait_for_actor()` dun take an addr `list`.. 2023-10-18 15:22:54 -04:00
Tyler Goodlet 215fec1d41 Change old `._debug._pause()` name, cherry to #362 re `greenback` 2023-10-18 15:01:04 -04:00
Tyler Goodlet fcc8cee9d3 ._root: set a `_default_lo_addrs` and apply it when not provided by caller 2023-10-18 14:12:58 -04:00
Tyler Goodlet ca3f7a1b6b Add a first serious inter-peer remote cancel suite
Tests that appropriate `Context` exit state, the relay of
a `ContextCancelled` error and its `.canceller: tuple[str, str]` value
are set when an inter-peer cancellation happens via an "out of band"
request method (in this case using `Portal.cancel_actor()`) and that
cancellation is propagated "horizontally" to other peers. Verify that
any such cancellation scenario which also experiences an "error during
`ContextCancelled` handling" DOES NOT result in that further error being
suppressed and that the user's exception bubbles out of the
`Context.open_context()` block(s) appropriately!

Likely more tests to come as well as some factoring of the teardown
state checks where possible.

Pertains to serious testing the major work landing in #357
2023-10-18 13:59:08 -04:00
Tyler Goodlet 87c1113de4 Always set default reg addr in `find_actor()` if not defined 2023-10-18 13:20:29 -04:00
Tyler Goodlet 43b659dbe4 Tidy/clarify another `._runtime` comment 2023-10-18 13:19:34 -04:00
Tyler Goodlet 63b1488ab6 Get mega-pedantic in `Portal.open_context()`
Specifically in the `.__aexit__()` phase to ensure remote,
runtime-internal, and locally raised error-during-cancelled-handling
exceptions are NEVER masked by a local `ContextCancelled` or any
exception group of `trio.Cancelled`s.

Also adds a ton of details to doc strings including extreme detail
surrounding the `ContextCancelled` raising cases and their processing
inside `.open_context()`'s exception handler blocks.

Details, details:
- internal rename `err`/`_err` stuff to just be `scope_err` since it's
  effectively the error bubbled up from the context's surrounding (and
  cross-actor) "scope".
- always shield `._recv_chan.aclose()` to avoid any `Cancelled` from
  masking the `scope_err` with a runtime related `trio.Cancelled`.
- explicitly catch the specific set of `scope_err: BaseException` that
  we can reasonably expect to handle instead of the catch-all parent
  type including exception groups, cancels and KBIs.
2023-10-18 13:18:29 -04:00
Tyler Goodlet 7eb31f3fea Runtime import `.get_root()` in stdin hijacker to avoid import cycle 2023-10-17 16:52:31 -04:00
Tyler Goodlet 534e5d150d Drop `msg` kwarg from `Context.cancel()`
Well first off, turns out it's never used and generally speaking
doesn't seem to help much with "runtime hacking/debugging"; why would
we need to "fabricate" a msg when `.cancel()` is called to self-cancel?

Also (and since `._maybe_cancel_and_set_remote_error()` now takes an
`error: BaseException` as input and thus expects error-msg unpacking
prior to being called), we now manually set `Context._cancel_msg: dict`
just prior to any remote error assignment - so any case where we would
have fabbed a "cancel msg" near calling `.cancel()`, just do the manual
assign.

In this vein some other subtle changes:
- obviously don't set `._cancel_msg` in `.cancel()` since it's no longer
  an input.
- generally do walrus-style `error := unpack_error()` before applying
  and setting remote error-msg state.
- always raise any `._remote_error` in `.result()` instead of returning
  the exception instance and check before AND after the underlying mem
  chan read.
- add notes/todos around `raise self._remote_error from None` masking of
  (runtime) errors in `._maybe_raise_remote_err()` and use it inside
  `.result()` since we had the inverse duplicate logic there anyway..

Further, this adds and extends a ton of (internal) interface docs and
details comments around the `Context` API including many subtleties
pertaining to calling `._maybe_cancel_and_set_remote_error()`.
2023-10-17 16:50:52 -04:00
Tyler Goodlet e4a6223256 `._exceptions`: typing and error unpacking updates
Bump type annotations to 3.10+ style throughout module as well as fill
out doc strings a bit. Inside `unpack_error()` pop any `error_dict: dict`
and,
- return `None` early if not found,
- versus pass directly as `**error_dict` to the error constructor
  instead of a double field read.
2023-10-16 16:23:30 -04:00
Tyler Goodlet ab2664da70 Runtime level log on debug REPL exits 2023-10-16 15:46:21 -04:00
Tyler Goodlet ae326cbb9a Ignore kbis in `open_crash_handler()` by default 2023-10-16 15:45:34 -04:00
Tyler Goodlet 07cec02303 Add comments around diff between `C/context` refs 2023-10-16 15:45:02 -04:00
Tyler Goodlet 2fdb8fc25a Factor non-yield stream msg processing into helper
Since both `MsgStream.receive()` and `.receive_nowait()` need the same
raising logic when a non-stream msg arrives (so that maybe an
appropriate IPC translated error can be raised) move the `KeyError`
handler code into a new `._streaming._raise_from_no_yield_msg()` func
and call it from both methods to make the error-interface-raising
symmetrical across both methods.
2023-10-16 15:35:16 -04:00
Tyler Goodlet 6d951c526a Comment all `.pause(shield=True)` attempts again, need to solve cancel scope `.__exit__()` frame hiding issue.. 2023-10-10 09:55:11 -04:00
Tyler Goodlet 575a24adf1 Always raise remote (cancelled) error if set
Previously we weren't raising a remote error if the local scope was
cancelled during a call to `Context.result()` which is problematic if
the caller WAS NOT the requester for said remote cancellation; in that
case we still want a `ContextCancelled` raised with the `.canceller:
str` set to the cancelling actor uid.

Further fix a naming bug where the (seemingly older) `._remote_err` was
being set to such an error instead of `._remote_error` XD
2023-10-10 09:45:49 -04:00
Tyler Goodlet 919e462f88 Write more comprehensive `Portal.cancel_actor()` doc str 2023-10-08 15:57:18 -04:00
Tyler Goodlet a09b8560bb Oof, default reg addrs needs to be in `list[tuple]` form.. 2023-10-07 18:52:37 -04:00
Tyler Goodlet c4cd573b26 Drop pause line from ctx cancel handler block in test 2023-10-07 18:51:59 -04:00
Tyler Goodlet d24a9e158f Msg-ified `ContextCancelled`s sub-error type should always be just, its type.. 2023-10-07 18:51:03 -04:00
Tyler Goodlet 18a1634025 Add shielding support to `.pause()`
Implement it like you'd expect using simply a wrapping
`trio.CancelScope` which is itself shielded by the input `shield: bool`
B)

There's seemingly still some issues with the frame selection when the
REPL engages and not sure how to resolve it yet but at least this does
indeed work for practical purposes. Still needs a test obviously!
2023-10-06 15:49:23 -04:00
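The wrapping-scope idea in isolation; `_pause()` below is a stand-in for the real REPL-engaging implementation:

```python
import trio


async def _pause() -> None:
    # placeholder for the actual TTY-lock + `pdbp` REPL entry
    await trio.lowlevel.checkpoint()


async def pause(shield: bool = False) -> None:
    # the actual pause is wrapped in a cancel scope whose shielding is
    # driven by the caller-provided `shield: bool`
    with trio.CancelScope(shield=shield):
        await _pause()
```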
Tyler Goodlet 78c0d2b234 Start inter-peer cancellation test mod
Move over relevant test from the "context semantics" test module which
was already verifying peer-caused-`ContextCancelled.canceller: tuple`
error info and propagation during an inter-peer cancellation scenario.

Also begin a more general set of inter-peer cancellation tests starting
with the simplest case where when a peer is cancelled the parent should
NOT get a "muted" `trio.Cancelled` and instead
a `tractor.ContextCancelled` with a `.canceller: tuple` which points to
the sibling actor which requested the peer cancel.
2023-10-06 15:44:26 -04:00
Tyler Goodlet 4314a59327 Add post-mortem catch around failed transport addr binds to aid with runtime debugging 2023-10-03 10:54:46 -04:00
Tyler Goodlet e94f1261b5 Move `maybe_open_crash_handler()` CLI `--pdb`-driven wrapper to debug mod 2023-10-02 18:10:34 -04:00
Tyler Goodlet 86da79a854 Rename to `parse_maddr()` and fill out doc strings 2023-09-29 14:49:18 -04:00
Tyler Goodlet de89e3a9c4 Add libp2p style "multi-address" parser from `piker`
Details are in the module docs; this is a first draft with lotsa room
for refinement and extension.
2023-09-29 14:11:31 -04:00
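A toy cut of the multi-address idea, splitting alternating `<protocol>/<value>` layers; the real `parse_maddr()` handles more layers and validation, and the exact address grammar shown here is only for illustration:

```python
def parse_maddr_sketch(maddr: str) -> dict[str, str]:
    '''
    Parse a libp2p-style multiaddr, eg. '/ipv4/127.0.0.1/tcp/1616',
    into a {protocol: value} mapping.

    '''
    parts: list[str] = [p for p in maddr.split('/') if p]
    if len(parts) % 2:
        raise ValueError(f'unbalanced multiaddr: {maddr!r}')

    # pair up every protocol name with its value
    return dict(zip(parts[::2], parts[1::2]))


assert parse_maddr_sketch('/ipv4/127.0.0.1/tcp/1616') == {
    'ipv4': '127.0.0.1',
    'tcp': '1616',
}
```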
Tyler Goodlet 7bed470f5c Start `.devx.cli` extensions for pop CLI frameworks
Starting of with just a `typer` (and thus transitively `click`)
`typer.Typer.callback` hook which allows passthrough of the `--ll
<loglevel: str>` and `--pdb <debug_mode: bool>` flags for use when
building CLIs that use the runtime Bo

Still needs lotsa refinement and obviously better docs but, the doc
string for `load_runtime_vars()` shows how to use the underlying
`.devx._debug.open_crash_handler()` via a wrapper that can be passed the
`--pdb` flag and then enable debug mode throughout the entire actor
system.
2023-09-28 15:36:24 -04:00
Tyler Goodlet fa9a9cfb1d Kick off `.devx` subpkg for our dev tools B)
Where `.devx` is "developer experience", a hopefully broad enough subpkg
name for all the slick stuff planned to augment working on the actor
runtime 💥

Move the `._debug` module into the new subpkg and adjust rest of core
code base to reflect import path change. Also add a new
`.devx._debug.open_crash_handler()` manager for wrapping any sync code
outside a `trio.run()` which is handy for eventual CLI addons for
popular frameworks like `click`/`typer`.
2023-09-28 14:14:50 -04:00
Tyler Goodlet 3d0e95513c Init-support for "multi homed" transports
Since we'd like to eventually allow a diverse set of transport
(protocol) methods and stacks, and a multi-peer discovery system for
distributed actor-tree applications, this reworks all runtime internals
to support multi-homing for any given tree on a logical host. In other
words any actor can now bind its transport server (currently only
unsecured TCP + `msgspec`) to more than one address available in its
(linux) network namespace. Further, registry actors (now dubbed
"registrars" instead of "arbiters") can also similarly bind to multiple
network addresses and provide discovery services to remote actors via
multiple addresses which can now be provided at runtime startup.

Deats:
- adjust `._runtime` internals to use a `list[tuple[str, int]]` (and
  thus pluralized) socket address sequence where applicable for transport
  server socket binds, now exposed via `Actor.accept_addrs`:
  - `Actor.__init__()` now takes a `registry_addrs: list`.
  - `Actor.is_arbiter` -> `.is_registrar`.
  - `._arb_addr` -> `._reg_addrs: list[tuple]`.
  - always reg and de-reg from all registrars in `async_main()`.
  - only set the global runtime var `'_root_mailbox'` to the loopback
    address since normally all in-tree processes should have access to
    it, right?
  - `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]`
- make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]`
  and defaults when not passed.
- change `ActorNursery.start_..()` methods take `bind_addrs: list` and
  pass down through the spawning layer(s) via the parent-seed-msg.
- generalize all `._discovery()` APIs to accept `registry_addrs`-like
  inputs and move all relevant subsystems to adopt the "registry" style
  naming instead of "arbiter":
  - make `find_actor()` support batched concurrent portal queries over
    all provided input addresses using `.trionics.gather_contexts()` Bo
  - syntax: move to using `async with <tuples>` 3.9+ style chained
    @acms.
  - a general modernization of the code to a python 3.9+ style.
  - start deprecation and change to "registry" naming / semantics:
    - `._discovery.get_arbiter()` -> `.get_registry()`
2023-09-27 16:25:21 -04:00
Tyler Goodlet ee151b00af Mk `gather_contexts()` support `@acm`s yielding `None`
We were using an `all(<yielded values>)` condition which obviously won't
work if the batched managers yield any non-truthy value. So instead seed
the `unwrapped: dict` with the `id(mngr)` keys and only unblock once all
values have been filled in to be something that is not that seed value.
2023-09-27 14:05:22 -04:00
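A stripped-down two-manager version of the sentinel-seeding fix described above (the real `gather_contexts()` is generic over any number of managers):

```python
from contextlib import asynccontextmanager as acm

import trio

_sentinel = object()


@acm
async def gather_two(mngr_a, mngr_b):
    # results keyed by manager id, pre-seeded with a unique sentinel so
    # that a manager yielding `None` (or any falsy value) still counts
    # as having "entered"
    unwrapped: dict = {id(mngr_a): _sentinel, id(mngr_b): _sentinel}
    all_entered = trio.Event()

    async def enter(mngr) -> None:
        async with mngr as value:
            unwrapped[id(mngr)] = value
            if not any(v is _sentinel for v in unwrapped.values()):
                all_entered.set()
            await trio.sleep_forever()

    async with trio.open_nursery() as tn:
        tn.start_soon(enter, mngr_a)
        tn.start_soon(enter, mngr_b)
        await all_entered.wait()
        yield tuple(unwrapped.values())
        tn.cancel_scope.cancel()
```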
Tyler Goodlet 22c14e235e Expose `Channel` @ pkg level, drop `_debug.pp()` alias 2023-08-18 10:18:25 -04:00
Tyler Goodlet 1102843087 Teensie tidy up on actor doc string 2023-08-18 10:10:36 -04:00
Tyler Goodlet e03bec5efc Move `.to_asyncio` to modern optional value type annots 2023-07-21 15:08:46 -04:00
Tyler Goodlet bee2c36072 Make `NamespacePath` work on object refs
Detect if the input ref is a non-func (like an `object` instance) in
which case grab its type name using `type()`. Wrap all the name-getting
into a new `_mk_fqpn()` static meth: gets the "fully qualified path
name" and returns path and name in a tuple; port other methods to use it.
Refine and update the docs B)
2023-07-12 13:07:30 -04:00
Tyler Goodlet b36b3d522f Map `breakpoint()` built-in to new `.pause_from_sync()` ep 2023-07-07 15:35:52 -04:00
Tyler Goodlet 4ace8f6037 Fix frame-selection display on first REPL entry
For whatever reason pdb(p), and in general, will show the frame of the
*next* python instruction/LOC on initial entry (at least using
`.set_trace()`), as such remove the `try/finally` block in the sync
code entrypoint `.pause_from_sync()`, and also since it doesn't seem like
we really need it anyway.

Further, and to this end:
- enable hidden frames support in our default config.
- fix/drop/mask all the frame ref-ing/mangling we had prior since it's no
  longer needed as well as manual `Lock` releasing which seems to work
  already by having the `greenback` spawned task do its normal thing?
- move to no `Union` type annots.
- hide all frames that can add "this is the runtime confusion" to
  traces.
2023-07-07 14:51:44 -04:00
Tyler Goodlet 98a7326c85 ._runtime: log level tweaks, use crit for stale debug lock detection 2023-07-07 14:49:23 -04:00
Tyler Goodlet 46972df041 .log: more correct handling for `get_logger(__name__)` usage 2023-07-07 14:48:37 -04:00
Tyler Goodlet 565d7c3ee5 Add longer "required reading" list B) 2023-07-07 14:47:42 -04:00
Tyler Goodlet ac695a05bf Updates from latest `piker.data._sharedmem` changes 2023-06-22 17:16:17 -04:00
Tyler Goodlet fc56971a2d First proto: use `greenback` for sync func breakpointing
This works now for supporting a new `tractor.pause_from_sync()`
`tractor`-aware-replacement for `Pdb.set_trace()` from sync functions
which are also scheduled from our runtime. Uses `greenback` to do all
the magic of scheduling the bg `tractor._debug._pause()` task and
engaging the normal TTY locking machinery triggered by `await
tractor.breakpoint()`

Further this starts some public API renaming, making a switch to
`tractor.pause()` from `.breakpoint()` which IMO much better expresses
the semantics of the runtime intervention required to suffice
multi-process "breakpointing"; it also is an alternate name for the same
in computer science more generally: https://en.wikipedia.org/wiki/Breakpoint
It also avoids using the same name as the `breakpoint()` built-in which
is important since there **is a lot more going on** when you call our
equivalent API.

Deats of that:
- add deprecation warning for `tractor.breakpoint()`
- add `tractor.pause()` and a shorthand, easier-to-type, alias `.pp()`
  for "pause-point" B)
- add `pause_from_sync()` as the new `breakpoint()`-from-sync-function
  hack which does all the `greenback` stuff for the user.

Still TODO:
- figure out where in the runtime and when to call
  `greenback.ensure_portal()`.
- fix the frame selection issue where
  `trio._core._ki._ki_protection_decorator:wrapper` seems to be always
  shown on REPL start as the selected frame..
2023-06-21 16:08:18 -04:00
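The core `greenback` trick in miniature, assuming the `greenback.ensure_portal()` / `greenback.await_()` APIs from that project's docs; `_pause()` again stands in for the real REPL machinery:

```python
import greenback
import trio


async def _pause() -> None:
    # placeholder for the real TTY-locking + REPL entry
    await trio.lowlevel.checkpoint()


def pause_from_sync() -> None:
    # legal from plain sync code *iff* the enclosing trio task has a
    # greenback portal installed
    greenback.await_(_pause())


async def main() -> None:
    await greenback.ensure_portal()   # once per task that needs sync pauses
    pause_from_sync()


if __name__ == '__main__':
    trio.run(main)
```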
Tyler Goodlet ee87cf0e29 Add a debug-mode-breakpoint-causes-hang case!
Only found this by luck more or less (while working on something in
a client project) and it turns out we can actually get to (yet another)
hang state where SIGINT will be ignored by the root actor on teardown..

I've added all the necessary logic flags to reproduce. We obviously need
a follow up bug issue and a test suite to replicate!

It appears as though the following are required based on very light
tinkering:
- infected asyncio mode active
- debug mode active
- the `trio` context must breakpoint *before* `.started()`-ing
- the `asyncio` must **not** error
2023-06-21 14:07:31 -04:00
Tyler Goodlet ebcb275cd8 Add (first-draft) infected-`asyncio` actor task uses debugger example 2023-06-21 14:07:31 -04:00
Tyler Goodlet f745da9fb2 Add `numpy` for testing optional integrated shm API layer 2023-06-15 12:20:20 -04:00
Tyler Goodlet 4f442efbd7 Pass `str` dtype for `use_str` case 2023-06-15 12:20:20 -04:00
Tyler Goodlet f9a84f0732 Allocate size-specced "empty" sequence from default values by type 2023-06-15 12:20:20 -04:00
Tyler Goodlet e0bf964ff0 Mod define `_USE_POSIX`, add a lot of todos 2023-06-15 12:20:20 -04:00
Tyler Goodlet a9fc4c1b91 Parametrize rw test with variable frame sizes
Demonstrates fixed size frame-oriented reads by the child where the
parent only transmits a "read" stream msg on "frame fill events" such
that the child incrementally reads the shm list data (much like in
a real-time-buffered streaming system).
2023-06-15 12:20:20 -04:00
Tyler Goodlet b52ff270c5 Add `ShmList` slice support in `.__getitem__()` 2023-06-15 12:20:20 -04:00
Tyler Goodlet 1713ecd9f8 Rename token type to `NDToken` in the style of `nptyping` 2023-06-15 12:20:20 -04:00
Tyler Goodlet edb82fdd78 Don't require runtime (for now), type annot fixing 2023-06-15 12:20:20 -04:00
Tyler Goodlet 339d787cf8 Add repetitive attach to existing segment test 2023-06-15 12:20:20 -04:00
Tyler Goodlet c32b21b4b1 Add initial readers-writer shm list tests 2023-06-15 12:20:20 -04:00
Tyler Goodlet 71477290fc Add `ShmList` wrapping the stdlib's `ShareableList`
First attempt at getting `multiprocessing.shared_memory.ShareableList`
working; we wrap the stdlib type with a readonly attr and a `.key` for
cross-actor lookup. Also, rename all `numpy` specific routines to have
a `ndarray` suffix in the func names.
2023-06-15 12:20:20 -04:00
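A bare-bones illustration of the wrapper idea (read-only flag plus a `.key` for cross-actor attach); the real `ShmList` carries more API:

```python
from multiprocessing.shared_memory import ShareableList


class ShmListSketch:
    def __init__(
        self,
        shml: ShareableList,
        readonly: bool = False,
    ) -> None:
        self._shml = shml
        self._readonly = readonly

    @property
    def key(self) -> str:
        # the shm segment name another actor can use to attach via
        # `ShareableList(name=key)`
        return self._shml.shm.name

    def __getitem__(self, i):
        return self._shml[i]

    def __setitem__(self, i, val) -> None:
        if self._readonly:
            raise ValueError('shm list is read-only in this actor')
        self._shml[i] = val
```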
Tyler Goodlet 9716d86825 Initial module import from `piker.data._sharemem`
More or less a verbatim copy-paste minus some edgy variable naming and
internal `piker` module imports. There is a bunch of OHLC related
defaults that need to be dropped and we need to adjust to an optional
dependence on `numpy` by supporting shared lists as per the mp docs.
2023-06-15 12:20:20 -04:00
Tyler Goodlet 7507e269ec Just import `mp` top level in `._spawn` 2023-06-14 15:32:15 -04:00
Tyler Goodlet 17ae449160 Tidy up `typing` imports in broadcaster mod 2023-06-14 15:31:52 -04:00
Tyler Goodlet 6495688730 Drop `Optional` style from runtime mod 2023-05-25 16:00:05 -04:00
Tyler Goodlet a0276f41c2 Remote cancellation runtime-internal vars renames
- `Context._cancel_called_remote` -> `._cancelled_remote` since "called"
  implies the cancellation was "requested" when it could be due to
  another error and the actor uid is the value - only set once the far
  end task scope is terminated due to either error or cancel, which has
  nothing to do with *what* caused the cancellation.
- `Actor._cancel_called_remote` -> `._cancel_called_by_remote` which
  emphasizes that this variable is **only set** IFF some remote actor
  **requested that** this actor's runtime be cancelled via
  `Actor.cancel()`.
2023-05-19 14:31:55 -04:00
Tyler Goodlet ead9e418de Expose `allow_overruns` to `Portal.open_context()`
Turns out you can get a case where you might be opening multiple
ctx-streams concurrently and during the context opening phase you block
for all contexts to open, but then when you eventually start opening
streams some slow-to-start context has caused the others to go into an
overrun state.. so we need to let the caller control whether that's an
error ;)

This also needs a test!
2023-05-15 10:00:45 -04:00
Tyler Goodlet 60791ed546 Oof, fix remaining `Actor.cancel()` in `Actor._from_parent()` 2023-05-15 10:00:45 -04:00
Tyler Goodlet 7293b82bcc Tweak doc string 2023-05-15 10:00:45 -04:00
Tyler Goodlet 20d75ff934 Move context code into new `._context` mod 2023-05-15 10:00:45 -04:00
Tyler Goodlet 041d7da721 Drop caller cancels overrun test; covered in new tests 2023-05-15 10:00:45 -04:00
Tyler Goodlet 04e4397a8f Ignore drainer-task nursery RTE during context exit 2023-05-15 10:00:45 -04:00
Tyler Goodlet 968f13f9ef Set `Context._scope_nursery` on callee side too
Because obviously we probably want to support `allow_overruns` on the
remote callee side as well XD

Only found the bugs fixed in this patch thanks to writing a much
more exhaustive test set for overrun cases B)
2023-05-15 10:00:45 -04:00
Tyler Goodlet f9911c22a4 Seriously cover all overrun cases
This actually caught further runtime bugs so it's gud i tried..
Add overrun-ignore enabled / disabled cases and error catching for all
of them. More or less this should cover every possible outcome when
it comes to setting `allow_overruns: bool` i hope XD
2023-05-15 10:00:45 -04:00
Tyler Goodlet 63adf73b4b Adjust aio test for silent cancellation by parent 2023-05-15 10:00:45 -04:00
Tyler Goodlet f1e9c0be93 Fix cluster test to use `allow_overruns` 2023-05-15 10:00:45 -04:00
Tyler Goodlet 6db656fecf Flip allocate log msgs to debug 2023-05-15 10:00:45 -04:00
Tyler Goodlet 6994d2026d Drop backpressure usage from fan out tests 2023-05-15 10:00:45 -04:00
Tyler Goodlet c72026091e Remote `Context` cancellation semantics rework B)
This adds remote cancellation semantics to our `tractor.Context`
machinery to more closely match that of `trio.CancelScope` but
with operational differences to handle the nature of parallel tasks interoperating
across multiple memory boundaries:

- if an actor task cancels some context it has opened via
  `Context.cancel()`, the remote (scope linked) task will be cancelled
  using the normal `CancelScope` semantics of `trio` meaning the remote
  cancel scope surrounding the far side task is cancelled and
  `trio.Cancelled`s are expected to be raised in that scope as per
  normal `trio` operation, and in the case where no error is raised
  in that remote scope, a `ContextCancelled` error is raised inside the
  runtime machinery and relayed back to the opener/caller side of the
  context.
- if any actor task cancels a full remote actor runtime using
  `Portal.cancel_actor()` the same semantics as above apply except every
  other remote actor task which also has an open context with the actor
  which was cancelled will also be sent a `ContextCancelled` **but**
  with the `.canceller` field set to the uid of the original cancel
  requesting actor.

This changeset also includes a more "proper" solution to the issue of
"allowing overruns" during streaming without attempting to implement any
form of IPC streaming backpressure. Implementing task-granularity
backpressure cross-process turns out to be more or less impossible
without augmenting our streaming protocol (likely at the cost of
performance). Further allowing overruns requires special care since
any blocking of the runtime RPC msg loop task effectively can block
control msgs such as cancels and stream terminations.

The implementation details per abstraction layer are as follows.

._streaming.Context:
- add a new constructor factory func `mk_context()` which provides
  a strictly private init-er whilst allowing us to not have to define
  an `.__init__()` on the type def.
- add public `.cancel_called` and `.cancel_called_remote` properties.
- general rename of what was the internal `._backpressure` var to
  `._allow_overruns: bool`.
- move the old contents of `Actor._push_result()` into a new
  `._deliver_msg()` allowing for better encapsulation of per-ctx
  msg handling.
 - always check for received 'error' msgs and process them with the new
   `_maybe_cancel_and_set_remote_error()` **before** any msg delivery to
   the local task, thus guaranteeing error and cancellation handling
   despite any overflow handling.
- add a new `._drain_overflows()` task-method for use with new
  `._allow_overruns: bool = True` mode.
 - add back a `._scope_nursery: trio.Nursery` (allocated in
   `Portal.open_context()`) whose sole purpose is to spawn a single task
   which runs the above method; anything else is an error.
 - augment `._deliver_msg()` to start a task and run the above method
   when operating in no overrun mode; the task queues overflow msgs and
   attempts to send them to the underlying mem chan using a blocking
   `.send()` call.
 - on context exit, any existing "drainer task" will be cancelled and
   remaining overflow queued msgs are discarded with a warning.
- rename `._error` -> `_remote_error` and set it in a new method
  `_maybe_cancel_and_set_remote_error()` which is called before
  processing
- adjust `.result()` to always call `._maybe_raise_remote_err()` at its
  start such that whenever a `ContextCancelled` arrives we do logic for
  whether or not to immediately raise that error or ignore it due to the
  current actor being the one who requested the cancel, by checking the
  error's `.canceller` field.
 - set the default value of `._result` to be `id(Context())` thus avoiding
   conflict with any `.result()` actually being `False`..

._runtime.Actor:
- augment `.cancel()` and `._cancel_task()` and `.cancel_rpc_tasks()` to
  take a `requesting_uid: tuple` indicating the source actor of every
  cancellation request.
- pass through the new `Context._allow_overruns` through `.get_context()`
- call the new `Context._deliver_msg()` from `._push_result()` (since
  the factoring out of that method's contents).

._runtime._invoke:
- `TaskStatus.started()` now gives back a `Context` (unless an error is raised)
  instead of the cancel scope to make it easy to set/get state on that
  context for the purposes of cancellation and remote error relay.
- always raise any remote error via `Context._maybe_raise_remote_err()`
  before doing any `ContextCancelled` logic.
- assign any `Context._cancel_called_remote` set by the `requesting_uid`
  cancel methods (mentioned above) to the `ContextCancelled.canceller`.

._runtime.process_messages:
- always pass a `requesting_uid: tuple` to `Actor.cancel()` and
  `._cancel_task` so that any corresponding `ContextCancelled.canceller`
  can be set inside `._invoke()`.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 90e41016b9 Only tuplize `.canceller` if non-`None` 2023-05-15 10:00:45 -04:00
Tyler Goodlet f54c415060 Move `NoRuntime` import inside `current_actor()` to avoid cycle 2023-05-15 10:00:45 -04:00
Tyler Goodlet 03644f59cc Augment test cases for callee-returns-result early
Turns out stuff was totally broken in these cases because we're either
closing the underlying mem chan too early or not handling the
"allow_overruns" mode's cancellation correctly..
2023-05-15 10:00:45 -04:00
Tyler Goodlet 67f82c6ebd Add new remote error introspection attrs
To handle remote cancellation this adds `ContextCancelled.canceller:
tuple`, the uid of the cancel-requesting actor, which is expected to be set
by the runtime when servicing any remote cancel request. This makes it
possible for `ContextCancelled` receivers to know whether "their actor
runtime" is the source of the cancellation.

Also add an explicit `RemoteActorError.src_actor_uid` which better formalizes
the notion of "which remote actor" the error originated from.

Both of these new attrs are expected to be packed in the `.msgdata` when
the errors are loaded locally.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 71cd445319 Add new set of context cancellation tests
These will verify new changes to the runtime/messaging core which allows
us to adopt an "ignore cancel if requested by us" style handling of
`ContextCancelled` more like how `trio` does with
`trio.Nursery.cancel_scope.cancel()`. We now expect
a `ContextCancelled.canceller: tuple` which is set to the actor uid of
the actor which requested the cancellation which eventually resulted in
the remote error-msg.

Also adds some experimental tweaks to the "backpressure" test which it
turns out is very problematic in coordination with context cancellation
since blocking on the feed mem chan to some task will block the ipc msg
loop and thus handling of cancellation.. More to come to both the test
and core to address this hopefully since right now this test is failing.
2023-05-15 10:00:45 -04:00
Tyler Goodlet 220b244508 Log waiter task cancelling msg as cancel-level 2023-05-15 10:00:45 -04:00
Tyler Goodlet 831790377b Assign `RemoteActorError` boxed error type for context cancelleds 2023-05-15 10:00:45 -04:00
Tyler Goodlet e80e0a551f Change a bunch of log levels to cancel, including any `ContextCancelled` handling 2023-05-15 10:00:45 -04:00
Tyler Goodlet b3f9251eda Add some log-level method doc-strings 2023-05-15 10:00:45 -04:00
Tyler Goodlet 903537ce04 Tweak context doc str 2023-05-15 10:00:45 -04:00
Tyler Goodlet d75343106b More single doc-strs in discovery mod 2023-05-15 10:00:45 -04:00
Tyler Goodlet cfb2bc0fee Enable `Context` backpressure by default; avoid startup race-crashes? 2023-05-15 10:00:45 -04:00
67 changed files with 12779 additions and 3271 deletions

View File

@ -3,8 +3,8 @@
|gh_actions|
|docs|
``tractor`` is a `structured concurrent`_, multi-processing_ runtime
built on trio_.
``tractor`` is a `structured concurrent`_, (optionally
distributed_) multi-processing_ runtime built on trio_.
Fundamentally, ``tractor`` gives you parallelism via
``trio``-"*actors*": independent Python processes (aka
@ -17,11 +17,20 @@ protocol" constructed on top of multiple Pythons each running a ``trio``
scheduled runtime - a call to ``trio.run()``.
We believe the system adheres to the `3 axioms`_ of an "`actor model`_"
but likely *does not* look like what *you* probably think an "actor
model" looks like, and that's *intentional*.
but likely **does not** look like what **you** probably *think* an "actor
model" looks like, and that's **intentional**.
The first step to grok ``tractor`` is to get the basics of ``trio`` down.
A great place to start is the `trio docs`_ and this `blog post`_.
Where do i start!?
------------------
The first step to grok ``tractor`` is to get an intermediate
knowledge of ``trio`` and **structured concurrency** B)
Some great places to start are,
- the seminal `blog post`_
- obviously the `trio docs`_
- wikipedia's nascent SC_ page
- the fancy diagrams @ libdill-docs_
Features
@ -593,6 +602,7 @@ matrix seems too hip, we're also mostly all in the the `trio gitter
channel`_!
.. _structured concurrent: https://trio.discourse.group/t/concise-definition-of-structured-concurrency/228
.. _distributed: https://en.wikipedia.org/wiki/Distributed_computing
.. _multi-processing: https://en.wikipedia.org/wiki/Multiprocessing
.. _trio: https://github.com/python-trio/trio
.. _nurseries: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#nurseries-a-structured-replacement-for-go-statements
@ -611,8 +621,9 @@ channel`_!
.. _trio docs: https://trio.readthedocs.io/en/latest/
.. _blog post: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
.. _structured concurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _SC: https://en.wikipedia.org/wiki/Structured_concurrency
.. _libdill-docs: https://sustrik.github.io/libdill/structured-concurrency.html
.. _structured chadcurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _structured concurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _unrequirements: https://en.wikipedia.org/wiki/Actor_model#Direct_communication_and_asynchrony
.. _async generators: https://www.python.org/dev/peps/pep-0525/
.. _trio-parallel: https://github.com/richardsheridan/trio-parallel

View File

@ -6,47 +6,115 @@ been an outage) and we want to ensure that despite being in debug mode
actor tree will eventually be cancelled without leaving any zombies.
'''
import trio
from contextlib import asynccontextmanager as acm
from functools import partial
from tractor import (
open_nursery,
context,
Context,
ContextCancelled,
MsgStream,
_testing,
)
import trio
import pytest
async def break_channel_silently_then_error(
async def break_ipc(
stream: MsgStream,
):
async for msg in stream:
await stream.send(msg)
method: str|None = None,
pre_close: bool = False,
# XXX: close the channel right after an error is raised
# purposely breaking the IPC transport to make sure the parent
# doesn't get stuck in debug or hang on the connection join.
# this more or less simulates an infinite msg-receive hang on
# the other end.
await stream._ctx.chan.send(None)
assert 0
def_method: str = 'eof',
) -> None:
'''
XXX: close the channel right after an error is raised
purposely breaking the IPC transport to make sure the parent
doesn't get stuck in debug or hang on the connection join.
this more or less simulates an infinite msg-receive hang on
the other end.
async def close_stream_and_error(
stream: MsgStream,
):
async for msg in stream:
await stream.send(msg)
# wipe out channel right before raising
await stream._ctx.chan.send(None)
'''
# close channel via IPC prot msging before
# any transport breakage
if pre_close:
await stream.aclose()
assert 0
method: str = method or def_method
print(
'#################################\n'
'Simulating CHILD-side IPC BREAK!\n'
f'method: {method}\n'
f'pre `.aclose()`: {pre_close}\n'
'#################################\n'
)
match method:
case 'trans_aclose':
await stream._ctx.chan.transport.stream.aclose()
case 'eof':
await stream._ctx.chan.transport.stream.send_eof()
case 'msg':
await stream._ctx.chan.send(None)
# TODO: the actual real-world simulated cases like
# transport layer hangs and/or lower layer 2-gens type
# scenarios..
#
# -[ ] already have some issues for this general testing
# area:
# - https://github.com/goodboy/tractor/issues/97
# - https://github.com/goodboy/tractor/issues/124
# - PR from @guille:
# https://github.com/goodboy/tractor/pull/149
# case 'hang':
# TODO: framework research:
#
# - https://github.com/GuoTengda1993/pynetem
# - https://github.com/shopify/toxiproxy
# - https://manpages.ubuntu.com/manpages/trusty/man1/wirefilter.1.html
case _:
raise RuntimeError(
f'IPC break method unsupported: {method}'
)
async def break_ipc_then_error(
stream: MsgStream,
break_ipc_with: str|None = None,
pre_close: bool = False,
):
await break_ipc(
stream=stream,
method=break_ipc_with,
pre_close=pre_close,
)
async for msg in stream:
await stream.send(msg)
assert 0
async def iter_ipc_stream(
stream: MsgStream,
break_ipc_with: str|None = None,
pre_close: bool = False,
):
async for msg in stream:
await stream.send(msg)
@context
async def recv_and_spawn_net_killers(
ctx: Context,
break_ipc_after: bool | int = False,
break_ipc_after: bool|int = False,
pre_close: bool = False,
) -> None:
'''
@ -61,26 +129,53 @@ async def recv_and_spawn_net_killers(
async for i in stream:
print(f'child echoing {i}')
await stream.send(i)
if (
break_ipc_after
and i > break_ipc_after
and
i >= break_ipc_after
):
'#################################\n'
'Simulating child-side IPC BREAK!\n'
'#################################'
n.start_soon(break_channel_silently_then_error, stream)
n.start_soon(close_stream_and_error, stream)
n.start_soon(
iter_ipc_stream,
stream,
)
n.start_soon(
partial(
break_ipc_then_error,
stream=stream,
pre_close=pre_close,
)
)
@acm
async def stuff_hangin_ctlc(timeout: float = 1) -> None:
with trio.move_on_after(timeout) as cs:
yield timeout
if cs.cancelled_caught:
# pretend to be a user seeing no streaming action
# thinking it's a hang, and then hitting ctl-c..
print(
f"i'm a user on the PARENT side and thingz hangin "
f'after timeout={timeout} ???\n\n'
'MASHING CTlR-C..!?\n'
)
raise KeyboardInterrupt
async def main(
debug_mode: bool = False,
start_method: str = 'trio',
loglevel: str = 'cancel',
# by default we break the parent IPC first (if configured to break
# at all), but this can be changed so the child does first (even if
# both are set to break).
break_parent_ipc_after: int | bool = False,
break_child_ipc_after: int | bool = False,
break_parent_ipc_after: int|bool = False,
break_child_ipc_after: int|bool = False,
pre_close: bool = False,
) -> None:
@ -91,60 +186,123 @@ async def main(
# NOTE: even debugger is used we shouldn't get
# a hang since it never engages due to broken IPC
debug_mode=debug_mode,
loglevel='warning',
loglevel=loglevel,
) as an,
):
sub_name: str = 'chitty_hijo'
portal = await an.start_actor(
'chitty_hijo',
sub_name,
enable_modules=[__name__],
)
async with portal.open_context(
recv_and_spawn_net_killers,
break_ipc_after=break_child_ipc_after,
async with (
stuff_hangin_ctlc(timeout=2) as timeout,
_testing.expect_ctxc(
yay=(
break_parent_ipc_after
or break_child_ipc_after
),
# TODO: we CAN'T remove this right?
# since we need the ctxc to bubble up from either
# the stream API after the `None` msg is sent
# (which actually implicitly cancels all remote
# tasks in the hijo) or from simluated
# KBI-mash-from-user
# or should we expect that a KBI triggers the ctxc
# and KBI in an eg?
reraise=True,
),
) as (ctx, sent):
portal.open_context(
recv_and_spawn_net_killers,
break_ipc_after=break_child_ipc_after,
pre_close=pre_close,
) as (ctx, sent),
):
rx_eoc: bool = False
ipc_break_sent: bool = False
async with ctx.open_stream() as stream:
for i in range(1000):
if (
break_parent_ipc_after
and i > break_parent_ipc_after
and
i > break_parent_ipc_after
and
not ipc_break_sent
):
print(
'#################################\n'
'Simulating parent-side IPC BREAK!\n'
'#################################'
'Simulating PARENT-side IPC BREAK!\n'
'#################################\n'
)
await stream._ctx.chan.send(None)
# TODO: other methods? see break func above.
# await stream._ctx.chan.send(None)
# await stream._ctx.chan.transport.stream.send_eof()
await stream._ctx.chan.transport.stream.aclose()
ipc_break_sent = True
# it actually breaks right here in the
# mp_spawn/forkserver backends and thus the zombie
# reaper never even kicks in?
print(f'parent sending {i}')
await stream.send(i)
try:
await stream.send(i)
except ContextCancelled as ctxc:
print(
'parent received ctxc on `stream.send()`\n'
f'{ctxc}\n'
)
assert 'root' in ctxc.canceller
assert sub_name in ctx.canceller
with trio.move_on_after(2) as cs:
# TODO: is this needed or no?
raise
# timeout: int = 1
# with trio.move_on_after(timeout) as cs:
async with stuff_hangin_ctlc() as timeout:
print(
f'PARENT `stream.receive()` with timeout={timeout}\n'
)
# NOTE: in the parent side IPC failure case this
# will raise an ``EndOfChannel`` after the child
# is killed and sends a stop msg back to it's
# caller/this-parent.
rx = await stream.receive()
try:
rx = await stream.receive()
print(
"I'm a happy PARENT user and echoed to me is\n"
f'{rx}\n'
)
except trio.EndOfChannel:
rx_eoc: bool = True
print('MsgStream got EoC for PARENT')
raise
print(f"I'm a happy user and echoed to me is {rx}")
print(
'Streaming finished and we got Eoc.\n'
'Canceling `.open_context()` in root with\n'
'CTlR-C..'
)
if rx_eoc:
assert stream.closed
try:
await stream.send(i)
pytest.fail('stream not closed?')
except (
trio.ClosedResourceError,
trio.EndOfChannel,
) as send_err:
if rx_eoc:
assert send_err is stream._eoc
else:
assert send_err is stream._closed
if cs.cancelled_caught:
# pretend to be a user seeing no streaming action
# thinking it's a hang, and then hitting ctl-c..
print("YOO i'm a user anddd thingz hangin..")
print(
"YOO i'm mad send side dun but thingz hangin..\n"
'MASHING CTlR-C Ctl-c..'
)
raise KeyboardInterrupt
raise KeyboardInterrupt
if __name__ == '__main__':

View File

@ -0,0 +1,119 @@
import asyncio
import trio
import tractor
from tractor import to_asyncio
async def aio_sleep_forever():
await asyncio.sleep(float('inf'))
async def bp_then_error(
to_trio: trio.MemorySendChannel,
from_trio: asyncio.Queue,
raise_after_bp: bool = True,
) -> None:
# sync with ``trio``-side (caller) task
to_trio.send_nowait('start')
# NOTE: what happens here inside the hook needs some refinement..
# => seems like it's still `._debug._set_trace()` but
# we set `Lock.local_task_in_debug = 'sync'`, we probably want
# some further, at least, meta-data about the task/actoq in debug
# in terms of making it clear it's asyncio mucking about.
breakpoint()
# short checkpoint / delay
await asyncio.sleep(0.5)
if raise_after_bp:
raise ValueError('blah')
# TODO: test case with this so that it gets cancelled?
else:
# XXX NOTE: this is required in order to get the SIGINT-ignored
# hang case documented in the module script section!
await aio_sleep_forever()
@tractor.context
async def trio_ctx(
ctx: tractor.Context,
bp_before_started: bool = False,
):
# this will block until the ``asyncio`` task sends a "first"
# message, see first line in above func.
async with (
to_asyncio.open_channel_from(
bp_then_error,
raise_after_bp=not bp_before_started,
) as (first, chan),
trio.open_nursery() as n,
):
assert first == 'start'
if bp_before_started:
await tractor.breakpoint()
await ctx.started(first)
n.start_soon(
to_asyncio.run_task,
aio_sleep_forever,
)
await trio.sleep_forever()
async def main(
bps_all_over: bool = False,
) -> None:
async with tractor.open_nursery(
# debug_mode=True,
) as n:
p = await n.start_actor(
'aio_daemon',
enable_modules=[__name__],
infect_asyncio=True,
debug_mode=True,
loglevel='cancel',
)
async with p.open_context(
trio_ctx,
bp_before_started=bps_all_over,
) as (ctx, first):
assert first == 'start'
if bps_all_over:
await tractor.breakpoint()
# await trio.sleep_forever()
await ctx.cancel()
assert 0
# TODO: case where we cancel from trio-side while asyncio task
# has debugger lock?
# await p.cancel_actor()
if __name__ == '__main__':
# works fine B)
trio.run(main)
# will hang and ignores SIGINT !!
# NOTE: you'll need to send a SIGQUIT (via ctl-\) to kill it
# manually..
# trio.run(main, True)


@ -0,0 +1,9 @@
'''
Reproduce a bug where enabling debug mode for a sub-actor actually causes
a hang on teardown...
'''
import asyncio
import trio
import tractor


@ -32,7 +32,7 @@ async def main():
try:
await p1.run(name_error)
except tractor.RemoteActorError as rae:
assert rae.type is NameError
assert rae.boxed_type is NameError
async for i in stream:


@ -0,0 +1,73 @@
import trio
import tractor
def sync_pause(
use_builtin: bool = True,
error: bool = False,
):
if use_builtin:
breakpoint()
else:
tractor.pause_from_sync()
if error:
raise RuntimeError('yoyo sync code error')
@tractor.context
async def start_n_sync_pause(
ctx: tractor.Context,
):
# sync to requesting peer
await ctx.started()
actor: tractor.Actor = tractor.current_actor()
print(f'entering SYNC PAUSE in {actor.uid}')
sync_pause()
print(f'back from SYNC PAUSE in {actor.uid}')
async def main() -> None:
async with tractor.open_nursery(
debug_mode=True,
) as an:
p: tractor.Portal = await an.start_actor(
'subactor',
enable_modules=[__name__],
# infect_asyncio=True,
debug_mode=True,
loglevel='cancel',
)
# TODO: 3 sub-actor usage cases:
# -[ ] via a `.run_in_actor()` call
# -[ ] via a `.run()`
# -[ ] via a `.open_context()`
#
async with p.open_context(
start_n_sync_pause,
) as (ctx, first):
assert first is None
await tractor.pause()
sync_pause()
# TODO: make this work!!
await trio.to_thread.run_sync(
sync_pause,
abandon_on_cancel=False,
)
await ctx.cancel()
# TODO: case where we cancel from trio-side while asyncio task
# has debugger lock?
await p.cancel_actor()
if __name__ == '__main__':
trio.run(main)


@ -65,21 +65,28 @@ async def aggregate(seed):
print("AGGREGATOR COMPLETE!")
# this is the main actor and *arbiter*
async def main():
# a nursery which spawns "actors"
async with tractor.open_nursery(
arbiter_addr=('127.0.0.1', 1616)
) as nursery:
async def main() -> list[int]:
'''
This is the "root" actor's main task's entrypoint.
By default (and if not otherwise specified) that root process
also acts as a "registry actor" / "registrar" on the localhost
for the purposes of multi-actor "service discovery".
'''
# yes, a nursery which spawns `trio`-"actors" B)
nursery: tractor.ActorNursery
async with tractor.open_nursery() as nursery:
seed = int(1e3)
pre_start = time.time()
portal = await nursery.start_actor(
portal: tractor.Portal = await nursery.start_actor(
name='aggregator',
enable_modules=[__name__],
)
stream: tractor.MsgStream
async with portal.open_stream_from(
aggregate,
seed=seed,


@ -8,7 +8,10 @@ This uses no extra threads, fancy semaphores or futures; all we need
is ``tractor``'s channels.
"""
from contextlib import asynccontextmanager
from contextlib import (
asynccontextmanager as acm,
aclosing,
)
from typing import Callable
import itertools
import math
@ -16,7 +19,6 @@ import time
import tractor
import trio
from async_generator import aclosing
PRIMES = [
@ -44,7 +46,7 @@ async def is_prime(n):
return True
@asynccontextmanager
@acm
async def worker_pool(workers=4):
"""Though it's a trivial special case for ``tractor``, the well
known "worker pool" seems to be the defacto "but, I want this


@ -13,7 +13,7 @@ async def simple_rpc(
'''
# signal to parent that we're up much like
# ``trio_typing.TaskStatus.started()``
# ``trio.TaskStatus.started()``
await ctx.started(data + 1)
async with ctx.open_stream() as stream:


@ -26,3 +26,23 @@ all_bullets = true
directory = "trivial"
name = "Trivial/Internal Changes"
showcontent = true
[tool.pytest.ini_options]
minversion = '6.0'
testpaths = [
'tests'
]
addopts = [
# TODO: figure out why this isn't working..
'--rootdir=./tests',
'--import-mode=importlib',
# don't show frickin captured logs AGAIN in the report..
'--show-capture=no',
]
log_cli = false
# TODO: maybe some of these layout choices?
# https://docs.pytest.org/en/8.0.x/explanation/goodpractices.html#choosing-a-test-layout-import-rules
# pythonpath = "src"


@ -6,3 +6,4 @@ mypy
trio_typing
pexpect
towncrier
numpy


@ -26,7 +26,7 @@ with open('docs/README.rst', encoding='utf-8') as f:
setup(
name="tractor",
version='0.1.0a6dev0', # alpha zone
description='structured concurrrent `trio`-"actors"',
description='structured concurrent `trio`-"actors"',
long_description=readme,
license='AGPLv3',
author='Tyler Goodlet',
@ -36,20 +36,24 @@ setup(
platforms=['linux', 'windows'],
packages=[
'tractor',
'tractor.experimental',
'tractor.trionics',
'tractor.experimental', # wacky ideas
'tractor.trionics', # trio extensions
'tractor.msg', # lowlevel data types
'tractor.devx', # "dev-experience"
],
install_requires=[
# trio related
# proper range spec:
# https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/#id5
'trio >= 0.22',
'async_generator',
'trio_typing',
'exceptiongroup',
'trio >= 0.24',
# 'async_generator', # in stdlib mostly!
# 'trio_typing', # trio==0.23.0 has type hints!
# 'exceptiongroup', # in stdlib as of 3.11!
# tooling
'stackscope',
'tricycle',
'trio_typing',
'colorlog',
@ -61,16 +65,15 @@ setup(
# debug mode REPL
'pdbp',
# TODO: distributed transport using
# linux kernel networking
# 'pyroute2',
# pip ref docs on these specs:
# https://pip.pypa.io/en/stable/reference/requirement-specifiers/#examples
# and pep:
# https://peps.python.org/pep-0440/#version-specifiers
# windows deps workaround for ``pdbpp``
# https://github.com/pdbpp/pdbpp/issues/498
# https://github.com/pdbpp/fancycompleter/issues/37
'pyreadline3 ; platform_system == "Windows"',
],
tests_require=['pytest'],
python_requires=">=3.10",


@ -7,94 +7,19 @@ import os
import random
import signal
import platform
import pathlib
import time
import inspect
from functools import partial, wraps
import pytest
import trio
import tractor
from tractor._testing import (
examples_dir as examples_dir,
tractor_test as tractor_test,
expect_ctxc as expect_ctxc,
)
# TODO: include wtv plugin(s) we build in `._testing.pytest`?
pytest_plugins = ['pytester']
def tractor_test(fn):
"""
Use:
@tractor_test
async def test_whatever():
await ...
If fixtures:
- ``arb_addr`` (a socket addr tuple where arbiter is listening)
- ``loglevel`` (logging level passed to tractor internals)
- ``start_method`` (subprocess spawning backend)
are defined in the `pytest` fixture space they will be automatically
injected to tests declaring these funcargs.
"""
@wraps(fn)
def wrapper(
*args,
loglevel=None,
arb_addr=None,
start_method=None,
**kwargs
):
# __tracebackhide__ = True
if 'arb_addr' in inspect.signature(fn).parameters:
# injects test suite fixture value to test as well
# as `run()`
kwargs['arb_addr'] = arb_addr
if 'loglevel' in inspect.signature(fn).parameters:
# allows test suites to define a 'loglevel' fixture
# that activates the internal logging
kwargs['loglevel'] = loglevel
if start_method is None:
if platform.system() == "Windows":
start_method = 'trio'
if 'start_method' in inspect.signature(fn).parameters:
# set of subprocess spawning backends
kwargs['start_method'] = start_method
if kwargs:
# use explicit root actor start
async def _main():
async with tractor.open_root_actor(
# **kwargs,
arbiter_addr=arb_addr,
loglevel=loglevel,
start_method=start_method,
# TODO: only enable when pytest is passed --pdb
# debug_mode=True,
):
await fn(*args, **kwargs)
main = _main
else:
# use implicit root actor start
main = partial(fn, *args, **kwargs)
return trio.run(main)
return wrapper
_arb_addr = '127.0.0.1', random.randint(1000, 9999)
# Sending signal.SIGINT on subprocess fails on windows. Use CTRL_* alternatives
if platform.system() == 'Windows':
_KILL_SIGNAL = signal.CTRL_BREAK_EVENT
@ -114,41 +39,45 @@ no_windows = pytest.mark.skipif(
)
def repodir() -> pathlib.Path:
'''
Return the abspath to the repo directory.
'''
# 2 parents up to step up through tests/<repo_dir>
return pathlib.Path(__file__).parent.parent.absolute()
def examples_dir() -> pathlib.Path:
'''
Return the abspath to the examples directory as `pathlib.Path`.
'''
return repodir() / 'examples'
def pytest_addoption(parser):
parser.addoption(
"--ll", action="store", dest='loglevel',
"--ll",
action="store",
dest='loglevel',
default='ERROR', help="logging level to set when testing"
)
parser.addoption(
"--spawn-backend", action="store", dest='spawn_backend',
"--spawn-backend",
action="store",
dest='spawn_backend',
default='trio',
help="Processing spawning backend to use for test run",
)
parser.addoption(
"--tpdb", "--debug-mode",
action="store_true",
dest='tractor_debug_mode',
# default=False,
help=(
'Enable a flag that can be used by tests to set the '
'`debug_mode: bool` for engaging the internal '
'multi-proc debugger sys.'
),
)
def pytest_configure(config):
backend = config.option.spawn_backend
tractor._spawn.try_set_start_method(backend)
@pytest.fixture(scope='session')
def debug_mode(request):
return request.config.option.tractor_debug_mode
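# NOTE: a minimal usage sketch (assumes only the `--tpdb` option and
# `debug_mode` fixture wired up above): a test requests the
# session-scoped fixture and forwards it into the actor runtime, e.g.
#
#   def test_something(debug_mode: bool):
#       async def main():
#           async with tractor.open_nursery(
#               debug_mode=debug_mode,
#           ):
#               ...
#       trio.run(main)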
@pytest.fixture(scope='session', autouse=True)
def loglevel(request):
orig = tractor.log._default_loglevel
@ -168,14 +97,35 @@ _ci_env: bool = os.environ.get('CI', False)
@pytest.fixture(scope='session')
def ci_env() -> bool:
"""Detect CI envoirment.
"""
'''
Detect CI envoirment.
'''
return _ci_env
# TODO: also move this to `._testing` for now?
# -[ ] possibly generalize and re-use for multi-tree spawning
# along with the new stuff for multi-addrs in distribute_dis
# branch?
#
# choose randomly at import time
_reg_addr: tuple[str, int] = (
'127.0.0.1',
random.randint(1000, 9999),
)
@pytest.fixture(scope='session')
def arb_addr():
return _arb_addr
def reg_addr() -> tuple[str, int]:
# globally override the runtime to the per-test-session-dynamic
# addr so that all tests never conflict with any other actor
# tree using the default.
from tractor import _root
_root._default_lo_addrs = [_reg_addr]
return _reg_addr
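# Usage sketch (the pattern used throughout this change set): tests pass
# the fixture value through explicitly when opening a nursery/root actor:
#
#   async with tractor.open_nursery(
#       registry_addrs=[reg_addr],
#   ) as an:
#       ...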
def pytest_generate_tests(metafunc):
@ -212,34 +162,40 @@ def sig_prog(proc, sig):
assert ret
# TODO: factor into @cm and move to `._testing`?
@pytest.fixture
def daemon(
loglevel: str,
testdir,
arb_addr: tuple[str, int],
reg_addr: tuple[str, int],
):
'''
Run a daemon actor as a "remote arbiter".
Run a daemon root actor as a separate actor-process tree and
"remote registrar" for discovery-protocol related tests.
'''
if loglevel in ('trace', 'debug'):
# too much logging will lock up the subproc (smh)
loglevel = 'info'
# XXX: too much logging will lock up the subproc (smh)
loglevel: str = 'info'
cmdargs = [
sys.executable, '-c',
"import tractor; tractor.run_daemon([], registry_addr={}, loglevel={})"
.format(
arb_addr,
"'{}'".format(loglevel) if loglevel else None)
code: str = (
"import tractor; "
"tractor.run_daemon([], registry_addrs={reg_addrs}, loglevel={ll})"
).format(
reg_addrs=str([reg_addr]),
ll="'{}'".format(loglevel) if loglevel else None,
)
cmd: list[str] = [
sys.executable,
'-c', code,
]
kwargs = dict()
kwargs = {}
if platform.system() == 'Windows':
# without this, tests hang on windows forever
kwargs['creationflags'] = subprocess.CREATE_NEW_PROCESS_GROUP
proc = testdir.popen(
cmdargs,
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
**kwargs,


@ -3,22 +3,29 @@ Sketchy network blackoutz, ugly byzantine gens, can you hear the
cancellation?..
'''
import itertools
from functools import partial
from types import ModuleType
import pytest
from _pytest.pathlib import import_path
import trio
import tractor
from conftest import (
from tractor._testing import (
examples_dir,
)
@pytest.mark.parametrize(
'debug_mode',
[False, True],
ids=['no_debug_mode', 'debug_mode'],
'pre_aclose_msgstream',
[
False,
True,
],
ids=[
'no_msgstream_aclose',
'pre_aclose_msgstream',
],
)
@pytest.mark.parametrize(
'ipc_break',
@ -63,8 +70,10 @@ from conftest import (
)
def test_ipc_channel_break_during_stream(
debug_mode: bool,
loglevel: str,
spawn_backend: str,
ipc_break: dict | None,
ipc_break: dict|None,
pre_aclose_msgstream: bool,
):
'''
Ensure we can have an IPC channel break its connection during
@ -83,70 +92,130 @@ def test_ipc_channel_break_during_stream(
# requires the user to do ctl-c to cancel the actor tree.
expect_final_exc = trio.ClosedResourceError
mod = import_path(
mod: ModuleType = import_path(
examples_dir() / 'advanced_faults' / 'ipc_failure_during_stream.py',
root=examples_dir(),
)
expect_final_exc = KeyboardInterrupt
# when ONLY the child breaks we expect the parent to get a closed
# resource error on the next `MsgStream.receive()` and then fail out
# and cancel the child from there.
# by def we expect KBI from user after a simulated "hang
# period" wherein the user eventually hits ctl-c to kill the
# root-actor tree.
expect_final_exc: BaseException = KeyboardInterrupt
if (
# only expect EoC if trans is broken on the child side,
ipc_break['break_child_ipc_after'] is not False
# AND we tell the child to call `MsgStream.aclose()`.
and pre_aclose_msgstream
):
# expect_final_exc = trio.EndOfChannel
# ^XXX NOPE! XXX^ since now `.open_stream()` absorbs this
# gracefully!
expect_final_exc = KeyboardInterrupt
# only child breaks
(
ipc_break['break_child_ipc_after']
and ipc_break['break_parent_ipc_after'] is False
)
# both break but, parent breaks first
or (
ipc_break['break_child_ipc_after'] is not False
and (
ipc_break['break_parent_ipc_after']
> ipc_break['break_child_ipc_after']
)
# NOTE when ONLY the child breaks or it breaks BEFORE the
# parent we expect the parent to get a closed resource error
# on the next `MsgStream.receive()` and then fail out and
# cancel the child from there.
#
# ONLY CHILD breaks
if (
ipc_break['break_child_ipc_after']
and
ipc_break['break_parent_ipc_after'] is False
):
# NOTE: we DO NOT expect this any more since
# the child side's channel will be broken silently
# and nothing on the parent side will indicate this!
# expect_final_exc = trio.ClosedResourceError
# NOTE: child will send a 'stop' msg before it breaks
# the transport channel BUT, that will be absorbed by the
# `ctx.open_stream()` block and thus the `.open_context()`
# should hang, after which the test script simulates
# a user sending ctl-c by raising a KBI.
if pre_aclose_msgstream:
expect_final_exc = KeyboardInterrupt
# XXX OLD XXX
# if child calls `MsgStream.aclose()` then expect EoC.
# ^ XXX not any more ^ since eoc is always absorbed
# gracefully and NOT bubbled to the `.open_context()`
# block!
# expect_final_exc = trio.EndOfChannel
# BOTH but, CHILD breaks FIRST
elif (
ipc_break['break_child_ipc_after'] is not False
and (
ipc_break['break_parent_ipc_after']
> ipc_break['break_child_ipc_after']
)
):
if pre_aclose_msgstream:
expect_final_exc = KeyboardInterrupt
# NOTE when the parent IPC side dies (even if the child's does as well
# but the child fails BEFORE the parent) we always expect the
# IPC layer to raise a closed-resource, NEVER do we expect
# a stop msg since the parent-side ctx apis will error out
# IMMEDIATELY before the child ever sends any 'stop' msg.
#
# ONLY PARENT breaks
elif (
ipc_break['break_parent_ipc_after']
and
ipc_break['break_child_ipc_after'] is False
):
expect_final_exc = trio.ClosedResourceError
# when the parent IPC side dies (even if the child's does as well
# but the child fails BEFORE the parent) we expect the channel to be
# sent a stop msg from the child at some point which will signal the
# parent that the stream has been terminated.
# NOTE: when the parent breaks "after" the child you get this same
# case as well, the child breaks the IPC channel with a stop msg
# before any closure takes place.
# BOTH but, PARENT breaks FIRST
elif (
# only parent breaks
(
ipc_break['break_parent_ipc_after'] is not False
and (
ipc_break['break_child_ipc_after']
>
ipc_break['break_parent_ipc_after']
and ipc_break['break_child_ipc_after'] is False
)
# both break but, child breaks first
or (
ipc_break['break_parent_ipc_after'] is not False
and (
ipc_break['break_child_ipc_after']
> ipc_break['break_parent_ipc_after']
)
)
):
expect_final_exc = trio.EndOfChannel
expect_final_exc = trio.ClosedResourceError
with pytest.raises(expect_final_exc):
trio.run(
partial(
mod.main,
debug_mode=debug_mode,
start_method=spawn_backend,
**ipc_break,
with pytest.raises(
expected_exception=(
expect_final_exc,
ExceptionGroup,
),
) as excinfo:
try:
trio.run(
partial(
mod.main,
debug_mode=debug_mode,
start_method=spawn_backend,
loglevel=loglevel,
pre_close=pre_aclose_msgstream,
**ipc_break,
)
)
except KeyboardInterrupt as kbi:
_err = kbi
if expect_final_exc is not KeyboardInterrupt:
pytest.fail(
'Rxed unexpected KBI !?\n'
f'{repr(kbi)}'
)
raise
# get raw instance from pytest wrapper
value = excinfo.value
if isinstance(value, ExceptionGroup):
value = next(
itertools.dropwhile(
lambda exc: not isinstance(exc, expect_final_exc),
value.exceptions,
)
)
assert value
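# The exception-group unwrap above recurs in several suites (see also the
# infected-`asyncio` tests); a possible helper sketch (hypothetical, not
# defined anywhere in this change set):
#
#   def first_matching(eg: BaseException, exc_type: type) -> BaseException:
#       if isinstance(eg, ExceptionGroup):
#           return next(
#               exc for exc in eg.exceptions
#               if isinstance(exc, exc_type)
#           )
#       return eg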
@tractor.context
@ -169,25 +238,29 @@ def test_stream_closed_right_after_ipc_break_and_zombie_lord_engages():
'''
async def main():
async with tractor.open_nursery() as n:
portal = await n.start_actor(
'ipc_breaker',
enable_modules=[__name__],
)
with trio.fail_after(3):
async with tractor.open_nursery() as n:
portal = await n.start_actor(
'ipc_breaker',
enable_modules=[__name__],
)
with trio.move_on_after(1):
async with (
portal.open_context(
break_ipc_after_started
) as (ctx, sent),
):
async with ctx.open_stream():
await trio.sleep(0.5)
with trio.move_on_after(1):
async with (
portal.open_context(
break_ipc_after_started
) as (ctx, sent),
):
async with ctx.open_stream():
await trio.sleep(0.5)
print('parent waiting on context')
print('parent waiting on context')
print('parent exited context')
raise KeyboardInterrupt
print(
'parent exited context\n'
'parent raising KBI..\n'
)
raise KeyboardInterrupt
with pytest.raises(KeyboardInterrupt):
trio.run(main)


@ -6,6 +6,7 @@ from collections import Counter
import itertools
import platform
import pytest
import trio
import tractor
@ -143,8 +144,16 @@ def test_dynamic_pub_sub():
try:
trio.run(main)
except trio.TooSlowError:
pass
except (
trio.TooSlowError,
ExceptionGroup,
) as err:
if isinstance(err, ExceptionGroup):
for suberr in err.exceptions:
if isinstance(suberr, trio.TooSlowError):
break
else:
pytest.fail('Never got a `TooSlowError` ?')
@tractor.context
@ -298,44 +307,69 @@ async def inf_streamer(
async with (
ctx.open_stream() as stream,
trio.open_nursery() as n,
trio.open_nursery() as tn,
):
async def bail_on_sentinel():
async def close_stream_on_sentinel():
async for msg in stream:
if msg == 'done':
print(
'streamer RXed "done" sentinel msg!\n'
'CLOSING `MsgStream`!'
)
await stream.aclose()
else:
print(f'streamer received {msg}')
else:
print('streamer exited recv loop')
# start termination detector
n.start_soon(bail_on_sentinel)
tn.start_soon(close_stream_on_sentinel)
for val in itertools.count():
cap: int = 10000 # so that we don't spin forever when there's a bug..
for val in range(cap):
try:
print(f'streamer sending {val}')
await stream.send(val)
if val > cap:
raise RuntimeError(
'Streamer never cancelled by sentinel?'
)
await trio.sleep(0.001)
# close out the stream gracefully
except trio.ClosedResourceError:
# close out the stream gracefully
print('transport closed on streamer side!')
assert stream.closed
break
else:
raise RuntimeError(
'Streamer not cancelled before finished sending?'
)
print('terminating streamer')
print('streamer exited .open_streamer() block')
def test_local_task_fanout_from_stream():
def test_local_task_fanout_from_stream(
debug_mode: bool,
):
'''
Single stream with multiple local consumer tasks using the
``MsgStream.subscribe()`` api.
Ensure all tasks receive all values after stream completes sending.
Ensure all tasks receive all values after stream completes
sending.
'''
consumers = 22
consumers: int = 22
async def main():
counts = Counter()
async with tractor.open_nursery() as tn:
p = await tn.start_actor(
async with tractor.open_nursery(
debug_mode=debug_mode,
) as tn:
p: tractor.Portal = await tn.start_actor(
'inf_streamer',
enable_modules=[__name__],
)
@ -343,7 +377,6 @@ def test_local_task_fanout_from_stream():
p.open_context(inf_streamer) as (ctx, _),
ctx.open_stream() as stream,
):
async def pull_and_count(name: str):
# name = trio.lowlevel.current_task().name
async with stream.subscribe() as recver:
@ -352,7 +385,7 @@ def test_local_task_fanout_from_stream():
tractor.trionics.BroadcastReceiver
)
async for val in recver:
# print(f'{name}: {val}')
print(f'bx {name} rx: {val}')
counts[name] += 1
print(f'{name} bcaster ended')
@ -362,10 +395,14 @@ def test_local_task_fanout_from_stream():
with trio.fail_after(3):
async with trio.open_nursery() as nurse:
for i in range(consumers):
nurse.start_soon(pull_and_count, i)
nurse.start_soon(
pull_and_count,
i,
)
# delay to let bcast consumers pull msgs
await trio.sleep(0.5)
print('\nterminating')
print('terminating nursery of bcast rxer consumers!')
await stream.send('done')
print('closed stream connection')


@ -8,15 +8,13 @@ import platform
import time
from itertools import repeat
from exceptiongroup import (
BaseExceptionGroup,
ExceptionGroup,
)
import pytest
import trio
import tractor
from conftest import tractor_test, no_windows
from tractor._testing import (
tractor_test,
)
from conftest import no_windows
def is_win():
@ -47,17 +45,19 @@ async def do_nuthin():
],
ids=['no_args', 'unexpected_args'],
)
def test_remote_error(arb_addr, args_err):
"""Verify an error raised in a subactor that is propagated
def test_remote_error(reg_addr, args_err):
'''
Verify an error raised in a subactor that is propagated
to the parent nursery, contains the underlying boxed builtin
error type info and causes cancellation and reraising all the
way up the stack.
"""
'''
args, errtype = args_err
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as nursery:
# on a remote type error caused by bad input args
@ -65,7 +65,9 @@ def test_remote_error(arb_addr, args_err):
# an exception group outside the nursery since the error
# here and the far end task error are one in the same?
portal = await nursery.run_in_actor(
assert_err, name='errorer', **args
assert_err,
name='errorer',
**args
)
# get result(s) from main task
@ -75,7 +77,7 @@ def test_remote_error(arb_addr, args_err):
# of this actor nursery.
await portal.result()
except tractor.RemoteActorError as err:
assert err.type == errtype
assert err.boxed_type == errtype
print("Look Maa that actor failed hard, hehh")
raise
@ -84,7 +86,7 @@ def test_remote_error(arb_addr, args_err):
with pytest.raises(tractor.RemoteActorError) as excinfo:
trio.run(main)
assert excinfo.value.type == errtype
assert excinfo.value.boxed_type == errtype
else:
# the root task will also error on the `.result()` call
@ -94,10 +96,10 @@ def test_remote_error(arb_addr, args_err):
# ensure boxed errors
for exc in excinfo.value.exceptions:
assert exc.type == errtype
assert exc.boxed_type == errtype
def test_multierror(arb_addr):
def test_multierror(reg_addr):
'''
Verify we raise a ``BaseExceptionGroup`` out of a nursery where
more than one actor errors.
@ -105,7 +107,7 @@ def test_multierror(arb_addr):
'''
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as nursery:
await nursery.run_in_actor(assert_err, name='errorer1')
@ -115,7 +117,7 @@ def test_multierror(arb_addr):
try:
await portal2.result()
except tractor.RemoteActorError as err:
assert err.type == AssertionError
assert err.boxed_type == AssertionError
print("Look Maa that first actor failed hard, hehh")
raise
@ -130,14 +132,14 @@ def test_multierror(arb_addr):
@pytest.mark.parametrize(
'num_subactors', range(25, 26),
)
def test_multierror_fast_nursery(arb_addr, start_method, num_subactors, delay):
def test_multierror_fast_nursery(reg_addr, start_method, num_subactors, delay):
"""Verify we raise a ``BaseExceptionGroup`` out of a nursery where
more than one actor errors and also with a delay before failure
to test failure during an ongoing spawning.
"""
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as nursery:
for i in range(num_subactors):
@ -167,7 +169,7 @@ def test_multierror_fast_nursery(arb_addr, start_method, num_subactors, delay):
for exc in exceptions:
assert isinstance(exc, tractor.RemoteActorError)
assert exc.type == AssertionError
assert exc.boxed_type == AssertionError
async def do_nothing():
@ -175,15 +177,20 @@ async def do_nothing():
@pytest.mark.parametrize('mechanism', ['nursery_cancel', KeyboardInterrupt])
def test_cancel_single_subactor(arb_addr, mechanism):
"""Ensure a ``ActorNursery.start_actor()`` spawned subactor
def test_cancel_single_subactor(reg_addr, mechanism):
'''
Ensure a ``ActorNursery.start_actor()`` spawned subactor
cancels when the nursery is cancelled.
"""
'''
async def spawn_actor():
"""Spawn an actor that blocks indefinitely.
"""
'''
Spawn an actor that blocks indefinitely then cancel via
either `ActorNursery.cancel()` or an exception raise.
'''
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as nursery:
portal = await nursery.start_actor(
@ -303,7 +310,7 @@ async def test_some_cancels_all(num_actors_and_errs, start_method, loglevel):
await portal.run(func, **kwargs)
except tractor.RemoteActorError as err:
assert err.type == err_type
assert err.boxed_type == err_type
# we only expect this first error to propagate
# (all other daemons are cancelled before they
# can be scheduled)
@ -322,11 +329,11 @@ async def test_some_cancels_all(num_actors_and_errs, start_method, loglevel):
assert len(err.exceptions) == num_actors
for exc in err.exceptions:
if isinstance(exc, tractor.RemoteActorError):
assert exc.type == err_type
assert exc.boxed_type == err_type
else:
assert isinstance(exc, trio.Cancelled)
elif isinstance(err, tractor.RemoteActorError):
assert err.type == err_type
assert err.boxed_type == err_type
assert n.cancelled is True
assert not n._children
@ -405,7 +412,7 @@ async def test_nested_multierrors(loglevel, start_method):
elif isinstance(subexc, tractor.RemoteActorError):
# on windows it seems we can't exactly be sure wtf
# will happen..
assert subexc.type in (
assert subexc.boxed_type in (
tractor.RemoteActorError,
trio.Cancelled,
BaseExceptionGroup,
@ -415,7 +422,7 @@ async def test_nested_multierrors(loglevel, start_method):
for subsub in subexc.exceptions:
if subsub in (tractor.RemoteActorError,):
subsub = subsub.type
subsub = subsub.boxed_type
assert type(subsub) in (
trio.Cancelled,
@ -430,16 +437,16 @@ async def test_nested_multierrors(loglevel, start_method):
# we get back the (sent) cancel signal instead
if is_win():
if isinstance(subexc, tractor.RemoteActorError):
assert subexc.type in (
assert subexc.boxed_type in (
BaseExceptionGroup,
tractor.RemoteActorError
)
else:
assert isinstance(subexc, BaseExceptionGroup)
else:
assert subexc.type is ExceptionGroup
assert subexc.boxed_type is ExceptionGroup
else:
assert subexc.type in (
assert subexc.boxed_type in (
tractor.RemoteActorError,
trio.Cancelled
)


@ -6,14 +6,15 @@ sub-sub-actor daemons.
'''
from typing import Optional
import asyncio
from contextlib import asynccontextmanager as acm
from contextlib import (
asynccontextmanager as acm,
aclosing,
)
import pytest
import trio
from trio_typing import TaskStatus
import tractor
from tractor import RemoteActorError
from async_generator import aclosing
async def aio_streamer(
@ -141,7 +142,7 @@ async def open_actor_local_nursery(
)
def test_actor_managed_trio_nursery_task_error_cancels_aio(
asyncio_mode: bool,
arb_addr
reg_addr: tuple,
):
'''
Verify that a ``trio`` nursery created managed in a child actor
@ -170,4 +171,4 @@ def test_actor_managed_trio_nursery_task_error_cancels_aio(
# verify boxed error
err = excinfo.value
assert isinstance(err.type(), NameError)
assert err.boxed_type is NameError


@ -5,9 +5,7 @@ import trio
import tractor
from tractor import open_actor_cluster
from tractor.trionics import gather_contexts
from conftest import tractor_test
from tractor._testing import tractor_test
MESSAGE = 'tractoring at full speed'
@ -49,7 +47,7 @@ async def worker(
await ctx.started()
async with ctx.open_stream(
backpressure=True,
allow_overruns=True,
) as stream:
# TODO: this with the below assert causes a hang bug?

File diff suppressed because it is too large


@ -10,12 +10,13 @@ TODO:
- wonder if any of it'll work on OS X?
"""
from functools import partial
import itertools
from os import path
# from os import path
from typing import Optional
import platform
import pathlib
import sys
# import sys
import time
import pytest
@ -25,8 +26,14 @@ from pexpect.exceptions import (
EOF,
)
from conftest import (
from tractor.devx._debug import (
_pause_msg,
_crash_msg,
)
from tractor._testing import (
examples_dir,
)
from conftest import (
_ci_env,
)
@ -78,7 +85,7 @@ has_nested_actors = pytest.mark.has_nested_actors
def spawn(
start_method,
testdir,
arb_addr,
reg_addr,
) -> 'pexpect.spawn':
if start_method != 'trio':
@ -123,20 +130,52 @@ def expect(
raise
def in_prompt_msg(
prompt: str,
parts: list[str],
pause_on_false: bool = False,
print_prompt_on_false: bool = True,
) -> bool:
'''
Predicate check if (the prompt's) std-streams output has all
`str`-parts in it.
Can be used in test asserts for bulk matching expected
log/REPL output for a given `pdb` interact point.
'''
for part in parts:
if part not in prompt:
if pause_on_false:
import pdbp
pdbp.set_trace()
if print_prompt_on_false:
print(prompt)
return False
return True
def assert_before(
child,
patts: list[str],
**kwargs,
) -> None:
before = str(child.before.decode())
# as in before the prompt end
before: str = str(child.before.decode())
assert in_prompt_msg(
prompt=before,
parts=patts,
for patt in patts:
try:
assert patt in before
except AssertionError:
print(before)
raise
**kwargs
)
@pytest.fixture(
@ -166,7 +205,7 @@ def ctlc(
# XXX: disable pygments highlighting for auto-tests
# since some envs (like actions CI) will struggle
# with the added color-char encoding..
from tractor._debug import TractorConfig
from tractor.devx._debug import TractorConfig
TractorConfig.use_pygements = False
yield use_ctlc
@ -195,7 +234,10 @@ def test_root_actor_error(spawn, user_in_out):
before = str(child.before.decode())
# make sure expected logging and error arrives
assert "Attaching to pdb in crashed actor: ('root'" in before
assert in_prompt_msg(
before,
[_crash_msg, "('root'"]
)
assert 'AssertionError' in before
# send user command
@ -332,7 +374,10 @@ def test_subactor_error(
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error'" in before
assert in_prompt_msg(
before,
[_crash_msg, "('name_error'"]
)
if do_next:
child.sendline('n')
@ -353,9 +398,15 @@ def test_subactor_error(
before = str(child.before.decode())
# root actor gets debugger engaged
assert "Attaching to pdb in crashed actor: ('root'" in before
assert in_prompt_msg(
before,
[_crash_msg, "('root'"]
)
# error is a remote error propagated from the subactor
assert "RemoteActorError: ('name_error'" in before
assert in_prompt_msg(
before,
[_crash_msg, "('name_error'"]
)
# another round
if ctlc:
@ -380,7 +431,10 @@ def test_subactor_breakpoint(
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
assert in_prompt_msg(
before,
[_pause_msg, "('breakpoint_forever'"]
)
# do some "next" commands to demonstrate recurrent breakpoint
# entries
@ -396,7 +450,10 @@ def test_subactor_breakpoint(
child.sendline('continue')
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
assert in_prompt_msg(
before,
[_pause_msg, "('breakpoint_forever'"]
)
if ctlc:
do_ctlc(child)
@ -441,7 +498,10 @@ def test_multi_subactors(
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
assert in_prompt_msg(
before,
[_pause_msg, "('breakpoint_forever'"]
)
if ctlc:
do_ctlc(child)
@ -461,7 +521,10 @@ def test_multi_subactors(
# first name_error failure
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error'" in before
assert in_prompt_msg(
before,
[_crash_msg, "('name_error'"]
)
assert "NameError" in before
if ctlc:
@ -487,7 +550,10 @@ def test_multi_subactors(
child.sendline('c')
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
assert in_prompt_msg(
before,
[_pause_msg, "('breakpoint_forever'"]
)
if ctlc:
do_ctlc(child)
@ -527,17 +593,21 @@ def test_multi_subactors(
child.expect(PROMPT)
before = str(child.before.decode())
assert_before(child, [
# debugger attaches to root
"Attaching to pdb in crashed actor: ('root'",
assert_before(
child, [
# debugger attaches to root
# "Attaching to pdb in crashed actor: ('root'",
_crash_msg,
"('root'",
# expect a multierror with exceptions for each sub-actor
"RemoteActorError: ('breakpoint_forever'",
"RemoteActorError: ('name_error'",
"RemoteActorError: ('spawn_error'",
"RemoteActorError: ('name_error_1'",
'bdb.BdbQuit',
])
# expect a multierror with exceptions for each sub-actor
"RemoteActorError: ('breakpoint_forever'",
"RemoteActorError: ('name_error'",
"RemoteActorError: ('spawn_error'",
"RemoteActorError: ('name_error_1'",
'bdb.BdbQuit',
]
)
if ctlc:
do_ctlc(child)
@ -574,15 +644,22 @@ def test_multi_daemon_subactors(
# the root's tty lock first so anticipate either crash
# message on the first entry.
bp_forever_msg = "Attaching pdb to actor: ('bp_forever'"
bp_forev_parts = [_pause_msg, "('bp_forever'"]
bp_forev_in_msg = partial(
in_prompt_msg,
parts=bp_forev_parts,
)
name_error_msg = "NameError: name 'doggypants' is not defined"
name_error_parts = [name_error_msg]
before = str(child.before.decode())
if bp_forever_msg in before:
next_msg = name_error_msg
if bp_forev_in_msg(prompt=before):
next_parts = name_error_parts
elif name_error_msg in before:
next_msg = bp_forever_msg
next_parts = bp_forev_parts
else:
raise ValueError("Neither log msg was found !?")
@ -599,7 +676,10 @@ def test_multi_daemon_subactors(
child.sendline('c')
child.expect(PROMPT)
assert_before(child, [next_msg])
assert_before(
child,
next_parts,
)
# XXX: hooray the root clobbering the child here was fixed!
# IMO, this demonstrates the true power of SC system design.
@ -607,7 +687,7 @@ def test_multi_daemon_subactors(
# now the root actor won't clobber the bp_forever child
# during it's first access to the debug lock, but will instead
# wait for the lock to release, by the edge triggered
# ``_debug.Lock.no_remote_has_tty`` event before sending cancel messages
# ``devx._debug.Lock.no_remote_has_tty`` event before sending cancel messages
# (via portals) to its underlings B)
# at some point here there should have been some warning msg from
@ -623,9 +703,15 @@ def test_multi_daemon_subactors(
child.expect(PROMPT)
try:
assert_before(child, [bp_forever_msg])
assert_before(
child,
bp_forev_parts,
)
except AssertionError:
assert_before(child, [name_error_msg])
assert_before(
child,
name_error_parts,
)
else:
if ctlc:
@ -637,7 +723,10 @@ def test_multi_daemon_subactors(
child.sendline('c')
child.expect(PROMPT)
assert_before(child, [name_error_msg])
assert_before(
child,
name_error_parts,
)
# wait for final error in root
# where it crashes with boxed error
@ -647,7 +736,7 @@ def test_multi_daemon_subactors(
child.expect(PROMPT)
assert_before(
child,
[bp_forever_msg]
bp_forev_parts
)
except AssertionError:
break
@ -656,7 +745,9 @@ def test_multi_daemon_subactors(
child,
[
# boxed error raised in root task
"Attaching to pdb in crashed actor: ('root'",
# "Attaching to pdb in crashed actor: ('root'",
_crash_msg,
"('root'",
"_exceptions.RemoteActorError: ('name_error'",
]
)
@ -770,7 +861,7 @@ def test_multi_nested_subactors_error_through_nurseries(
child = spawn('multi_nested_subactors_error_up_through_nurseries')
timed_out_early: bool = False
# timed_out_early: bool = False
for send_char in itertools.cycle(['c', 'q']):
try:
@ -871,11 +962,14 @@ def test_root_nursery_cancels_before_child_releases_tty_lock(
if not timed_out_early:
before = str(child.before.decode())
assert_before(child, [
"tractor._exceptions.RemoteActorError: ('spawner0'",
"tractor._exceptions.RemoteActorError: ('name_error'",
"NameError: name 'doggypants' is not defined",
])
assert_before(
child,
[
"tractor._exceptions.RemoteActorError: ('spawner0'",
"tractor._exceptions.RemoteActorError: ('name_error'",
"NameError: name 'doggypants' is not defined",
],
)
def test_root_cancels_child_context_during_startup(
@ -909,8 +1003,10 @@ def test_different_debug_mode_per_actor(
# only one actor should enter the debugger
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('debugged_boi'" in before
assert "RuntimeError" in before
assert in_prompt_msg(
before,
[_crash_msg, "('debugged_boi'", "RuntimeError"],
)
if ctlc:
do_ctlc(child)
@ -931,3 +1027,67 @@ def test_different_debug_mode_per_actor(
# instead crashed completely
assert "tractor._exceptions.RemoteActorError: ('crash_boi'" in before
assert "RuntimeError" in before
def test_pause_from_sync(
spawn,
ctlc: bool
):
'''
Verify we can use the `pdbp` REPL from sync functions AND from
any thread spawned with `trio.to_thread.run_sync()`.
`examples/debugging/sync_bp.py`
'''
child = spawn('sync_bp')
child.expect(PROMPT)
assert_before(
child,
[
'`greenback` portal opened!',
# pre-prompt line
_pause_msg, "('root'",
]
)
if ctlc:
do_ctlc(child)
child.sendline('c')
child.expect(PROMPT)
# XXX shouldn't see gb loaded again
before = str(child.before.decode())
assert not in_prompt_msg(
before,
['`greenback` portal opened!'],
)
assert_before(
child,
[_pause_msg, "('root'",],
)
if ctlc:
do_ctlc(child)
child.sendline('c')
child.expect(PROMPT)
assert_before(
child,
[_pause_msg, "('subactor'",],
)
if ctlc:
do_ctlc(child)
child.sendline('c')
child.expect(PROMPT)
# non-main thread case
# TODO: should we augment the pre-prompt msg in this case?
assert_before(
child,
[_pause_msg, "('root'",],
)
if ctlc:
do_ctlc(child)
child.sendline('c')
child.expect(pexpect.EOF)


@ -9,25 +9,24 @@ import itertools
import pytest
import tractor
from tractor._testing import tractor_test
import trio
from conftest import tractor_test
@tractor_test
async def test_reg_then_unreg(arb_addr):
async def test_reg_then_unreg(reg_addr):
actor = tractor.current_actor()
assert actor.is_arbiter
assert len(actor._registry) == 1 # only self is registered
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as n:
portal = await n.start_actor('actor', enable_modules=[__name__])
uid = portal.channel.uid
async with tractor.get_arbiter(*arb_addr) as aportal:
async with tractor.get_arbiter(*reg_addr) as aportal:
# this local actor should be the arbiter
assert actor is aportal.actor
@ -53,15 +52,27 @@ async def hi():
return the_line.format(tractor.current_actor().name)
async def say_hello(other_actor):
async def say_hello(
other_actor: str,
reg_addr: tuple[str, int],
):
await trio.sleep(1) # wait for other actor to spawn
async with tractor.find_actor(other_actor) as portal:
async with tractor.find_actor(
other_actor,
registry_addrs=[reg_addr],
) as portal:
assert portal is not None
return await portal.run(__name__, 'hi')
async def say_hello_use_wait(other_actor):
async with tractor.wait_for_actor(other_actor) as portal:
async def say_hello_use_wait(
other_actor: str,
reg_addr: tuple[str, int],
):
async with tractor.wait_for_actor(
other_actor,
registry_addr=reg_addr,
) as portal:
assert portal is not None
result = await portal.run(__name__, 'hi')
return result
@ -69,21 +80,29 @@ async def say_hello_use_wait(other_actor):
@tractor_test
@pytest.mark.parametrize('func', [say_hello, say_hello_use_wait])
async def test_trynamic_trio(func, start_method, arb_addr):
"""Main tractor entry point, the "master" process (for now
acts as the "director").
"""
async def test_trynamic_trio(
func,
start_method,
reg_addr,
):
'''
Root actor acting as the "director" and running one-shot-task-actors
for the directed subs.
'''
async with tractor.open_nursery() as n:
print("Alright... Action!")
donny = await n.run_in_actor(
func,
other_actor='gretchen',
reg_addr=reg_addr,
name='donny',
)
gretchen = await n.run_in_actor(
func,
other_actor='donny',
reg_addr=reg_addr,
name='gretchen',
)
print(await gretchen.result())
@ -131,7 +150,7 @@ async def unpack_reg(actor_or_portal):
async def spawn_and_check_registry(
arb_addr: tuple,
reg_addr: tuple,
use_signal: bool,
remote_arbiter: bool = False,
with_streaming: bool = False,
@ -139,9 +158,9 @@ async def spawn_and_check_registry(
) -> None:
async with tractor.open_root_actor(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
):
async with tractor.get_arbiter(*arb_addr) as portal:
async with tractor.get_arbiter(*reg_addr) as portal:
# runtime needs to be up to call this
actor = tractor.current_actor()
@ -213,17 +232,19 @@ async def spawn_and_check_registry(
def test_subactors_unregister_on_cancel(
start_method,
use_signal,
arb_addr,
reg_addr,
with_streaming,
):
"""Verify that cancelling a nursery results in all subactors
'''
Verify that cancelling a nursery results in all subactors
deregistering themselves with the arbiter.
"""
'''
with pytest.raises(KeyboardInterrupt):
trio.run(
partial(
spawn_and_check_registry,
arb_addr,
reg_addr,
use_signal,
remote_arbiter=False,
with_streaming=with_streaming,
@ -237,7 +258,7 @@ def test_subactors_unregister_on_cancel_remote_daemon(
daemon,
start_method,
use_signal,
arb_addr,
reg_addr,
with_streaming,
):
"""Verify that cancelling a nursery results in all subactors
@ -248,7 +269,7 @@ def test_subactors_unregister_on_cancel_remote_daemon(
trio.run(
partial(
spawn_and_check_registry,
arb_addr,
reg_addr,
use_signal,
remote_arbiter=True,
with_streaming=with_streaming,
@ -262,7 +283,7 @@ async def streamer(agen):
async def close_chans_before_nursery(
arb_addr: tuple,
reg_addr: tuple,
use_signal: bool,
remote_arbiter: bool = False,
) -> None:
@ -275,9 +296,9 @@ async def close_chans_before_nursery(
entries_at_end = 1
async with tractor.open_root_actor(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
):
async with tractor.get_arbiter(*arb_addr) as aportal:
async with tractor.get_arbiter(*reg_addr) as aportal:
try:
get_reg = partial(unpack_reg, aportal)
@ -329,7 +350,7 @@ async def close_chans_before_nursery(
def test_close_channel_explicit(
start_method,
use_signal,
arb_addr,
reg_addr,
):
"""Verify that closing a stream explicitly and killing the actor's
"root nursery" **before** the containing nursery tears down also
@ -339,7 +360,7 @@ def test_close_channel_explicit(
trio.run(
partial(
close_chans_before_nursery,
arb_addr,
reg_addr,
use_signal,
remote_arbiter=False,
),
@ -351,7 +372,7 @@ def test_close_channel_explicit_remote_arbiter(
daemon,
start_method,
use_signal,
arb_addr,
reg_addr,
):
"""Verify that closing a stream explicitly and killing the actor's
"root nursery" **before** the containing nursery tears down also
@ -361,7 +382,7 @@ def test_close_channel_explicit_remote_arbiter(
trio.run(
partial(
close_chans_before_nursery,
arb_addr,
reg_addr,
use_signal,
remote_arbiter=True,
),


@ -11,8 +11,7 @@ import platform
import shutil
import pytest
from conftest import (
from tractor._testing import (
examples_dir,
)
@ -21,7 +20,7 @@ from conftest import (
def run_example_in_subproc(
loglevel: str,
testdir,
arb_addr: tuple[str, int],
reg_addr: tuple[str, int],
):
@contextmanager


@ -8,15 +8,16 @@ import builtins
import itertools
import importlib
from exceptiongroup import BaseExceptionGroup
import pytest
import trio
import tractor
from tractor import (
to_asyncio,
RemoteActorError,
ContextCancelled,
)
from tractor.trionics import BroadcastReceiver
from tractor._testing import expect_ctxc
async def sleep_and_err(
@ -46,7 +47,7 @@ async def trio_cancels_single_aio_task():
await tractor.to_asyncio.run_task(sleep_forever)
def test_trio_cancels_aio_on_actor_side(arb_addr):
def test_trio_cancels_aio_on_actor_side(reg_addr):
'''
Spawn an infected actor that is cancelled by the ``trio`` side
task using std cancel scope apis.
@ -54,7 +55,7 @@ def test_trio_cancels_aio_on_actor_side(arb_addr):
'''
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr
registry_addrs=[reg_addr]
) as n:
await n.run_in_actor(
trio_cancels_single_aio_task,
@ -67,7 +68,7 @@ def test_trio_cancels_aio_on_actor_side(arb_addr):
async def asyncio_actor(
target: str,
expect_err: Optional[Exception] = None
expect_err: Exception|None = None
) -> None:
@ -93,7 +94,7 @@ async def asyncio_actor(
raise
def test_aio_simple_error(arb_addr):
def test_aio_simple_error(reg_addr):
'''
Verify a simple remote asyncio error propagates back through trio
to the parent actor.
@ -102,7 +103,7 @@ def test_aio_simple_error(arb_addr):
'''
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr
registry_addrs=[reg_addr]
) as n:
await n.run_in_actor(
asyncio_actor,
@ -111,15 +112,26 @@ def test_aio_simple_error(arb_addr):
infect_asyncio=True,
)
with pytest.raises(RemoteActorError) as excinfo:
with pytest.raises(
expected_exception=(RemoteActorError, ExceptionGroup),
) as excinfo:
trio.run(main)
err = excinfo.value
# might get multiple `trio.Cancelled`s as well inside an inception
if isinstance(err, ExceptionGroup):
err = next(itertools.dropwhile(
lambda exc: not isinstance(exc, tractor.RemoteActorError),
err.exceptions
))
assert err
assert isinstance(err, RemoteActorError)
assert err.type == AssertionError
assert err.boxed_type == AssertionError
def test_tractor_cancels_aio(arb_addr):
def test_tractor_cancels_aio(reg_addr):
'''
Verify we can cancel a spawned asyncio task gracefully.
@ -138,7 +150,7 @@ def test_tractor_cancels_aio(arb_addr):
trio.run(main)
def test_trio_cancels_aio(arb_addr):
def test_trio_cancels_aio(reg_addr):
'''
Much like the above test with ``tractor.Portal.cancel_actor()``
except we just use a standard ``trio`` cancellation api.
@ -189,11 +201,12 @@ async def trio_ctx(
@pytest.mark.parametrize(
'parent_cancels', [False, True],
'parent_cancels',
['context', 'actor', False],
ids='parent_actor_cancels_child={}'.format
)
def test_context_spawns_aio_task_that_errors(
arb_addr,
reg_addr,
parent_cancels: bool,
):
'''
@ -213,26 +226,53 @@ def test_context_spawns_aio_task_that_errors(
# debug_mode=True,
loglevel='cancel',
)
async with p.open_context(
trio_ctx,
) as (ctx, first):
async with (
expect_ctxc(
yay=parent_cancels == 'actor',
),
p.open_context(
trio_ctx,
) as (ctx, first),
):
assert first == 'start'
if parent_cancels:
if parent_cancels == 'actor':
await p.cancel_actor()
await trio.sleep_forever()
elif parent_cancels == 'context':
await ctx.cancel()
with pytest.raises(RemoteActorError) as excinfo:
trio.run(main)
else:
await trio.sleep_forever()
async with expect_ctxc(
yay=parent_cancels == 'actor',
):
await ctx.result()
if parent_cancels == 'context':
# to tear down sub-actor
await p.cancel_actor()
return ctx.outcome
err = excinfo.value
assert isinstance(err, RemoteActorError)
if parent_cancels:
assert err.type == trio.Cancelled
# bc the parent made the cancel request,
# the error is not raised locally but instead
# the context is exited silently
res = trio.run(main)
assert isinstance(res, ContextCancelled)
assert 'root' in res.canceller[0]
else:
assert err.type == AssertionError
expect = RemoteActorError
with pytest.raises(expect) as excinfo:
trio.run(main)
err = excinfo.value
assert isinstance(err, expect)
assert err.boxed_type == AssertionError
async def aio_cancel():
@ -248,7 +288,7 @@ async def aio_cancel():
await sleep_forever()
def test_aio_cancelled_from_aio_causes_trio_cancelled(arb_addr):
def test_aio_cancelled_from_aio_causes_trio_cancelled(reg_addr):
async def main():
async with tractor.open_nursery() as n:
@ -259,11 +299,22 @@ def test_aio_cancelled_from_aio_causes_trio_cancelled(arb_addr):
infect_asyncio=True,
)
with pytest.raises(RemoteActorError) as excinfo:
with pytest.raises(
expected_exception=(RemoteActorError, ExceptionGroup),
) as excinfo:
trio.run(main)
# might get multiple `trio.Cancelled`s as well inside an inception
err = excinfo.value
if isinstance(err, ExceptionGroup):
err = next(itertools.dropwhile(
lambda exc: not isinstance(exc, tractor.RemoteActorError),
err.exceptions
))
assert err
# ensure boxed error is correct
assert excinfo.value.type == to_asyncio.AsyncioCancelled
assert err.boxed_type == to_asyncio.AsyncioCancelled
# TODO: verify open_channel_from will fail on this..
@ -385,7 +436,7 @@ async def stream_from_aio(
'fan_out', [False, True],
ids='fan_out_w_chan_subscribe={}'.format
)
def test_basic_interloop_channel_stream(arb_addr, fan_out):
def test_basic_interloop_channel_stream(reg_addr, fan_out):
async def main():
async with tractor.open_nursery() as n:
portal = await n.run_in_actor(
@ -399,7 +450,7 @@ def test_basic_interloop_channel_stream(arb_addr, fan_out):
# TODO: parametrize the above test and avoid the duplication here?
def test_trio_error_cancels_intertask_chan(arb_addr):
def test_trio_error_cancels_intertask_chan(reg_addr):
async def main():
async with tractor.open_nursery() as n:
portal = await n.run_in_actor(
@ -415,10 +466,10 @@ def test_trio_error_cancels_intertask_chan(arb_addr):
# ensure boxed errors
for exc in excinfo.value.exceptions:
assert exc.type == Exception
assert exc.boxed_type == Exception
def test_trio_closes_early_and_channel_exits(arb_addr):
def test_trio_closes_early_and_channel_exits(reg_addr):
async def main():
async with tractor.open_nursery() as n:
portal = await n.run_in_actor(
@ -433,7 +484,7 @@ def test_trio_closes_early_and_channel_exits(arb_addr):
trio.run(main)
def test_aio_errors_and_channel_propagates_and_closes(arb_addr):
def test_aio_errors_and_channel_propagates_and_closes(reg_addr):
async def main():
async with tractor.open_nursery() as n:
portal = await n.run_in_actor(
@ -449,7 +500,7 @@ def test_aio_errors_and_channel_propagates_and_closes(arb_addr):
# ensure boxed errors
for exc in excinfo.value.exceptions:
assert exc.type == Exception
assert exc.boxed_type == Exception
@tractor.context
@ -510,7 +561,7 @@ async def trio_to_aio_echo_server(
ids='raise_error={}'.format,
)
def test_echoserver_detailed_mechanics(
arb_addr,
reg_addr,
raise_error_mid_stream,
):
@ -550,7 +601,8 @@ def test_echoserver_detailed_mechanics(
pass
else:
pytest.fail(
"stream wasn't stopped after sentinel?!")
'stream not stopped after sentinel ?!'
)
# TODO: the case where this blocks and
# is cancelled by kbi or out of task cancellation
@ -562,3 +614,37 @@ def test_echoserver_detailed_mechanics(
else:
trio.run(main)
# TODO: debug_mode tests once we get support for `asyncio`!
#
# -[ ] need tests to wrap both scripts:
# - [ ] infected_asyncio_echo_server.py
# - [ ] debugging/asyncio_bp.py
# -[ ] consider moving ^ (some of) these ^ to `test_debugger`?
#
# -[ ] missing impl outstanding includes:
# - [x] for sync pauses we need to ensure we open yet another
# `greenback` portal in the asyncio task
# => completed using `.bestow_portal(task)` inside
# `.to_asyncio._run_asyncio_task()` right?
# -[ ] translation func to get from `asyncio` task calling to
# `._debug.wait_for_parent_stdin_hijack()` which does root
# call to do TTY locking.
#
def test_sync_breakpoint():
'''
Verify we can do sync-func/code breakpointing using the
`breakpoint()` builtin inside infected mode actors.
'''
pytest.xfail('This support is not implemented yet!')
def test_debug_mode_crash_handling():
'''
Verify multi-actor crash handling works with a combo of infected-`asyncio`-mode
and normal `trio` actors despite nested process trees.
'''
pytest.xfail('This support is not implemented yet!')

File diff suppressed because it is too large


@ -9,7 +9,7 @@ import trio
import tractor
import pytest
from conftest import tractor_test
from tractor._testing import tractor_test
def test_must_define_ctx():
@ -55,7 +55,7 @@ async def context_stream(
async def stream_from_single_subactor(
arb_addr,
reg_addr,
start_method,
stream_func,
):
@ -64,7 +64,7 @@ async def stream_from_single_subactor(
# only one per host address, spawns an actor if None
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
start_method=start_method,
) as nursery:
@ -115,13 +115,13 @@ async def stream_from_single_subactor(
@pytest.mark.parametrize(
'stream_func', [async_gen_stream, context_stream]
)
def test_stream_from_single_subactor(arb_addr, start_method, stream_func):
def test_stream_from_single_subactor(reg_addr, start_method, stream_func):
"""Verify streaming from a spawned async generator.
"""
trio.run(
partial(
stream_from_single_subactor,
arb_addr,
reg_addr,
start_method,
stream_func=stream_func,
),
@ -225,14 +225,14 @@ async def a_quadruple_example():
return result_stream
async def cancel_after(wait, arb_addr):
async with tractor.open_root_actor(arbiter_addr=arb_addr):
async def cancel_after(wait, reg_addr):
async with tractor.open_root_actor(registry_addrs=[reg_addr]):
with trio.move_on_after(wait):
return await a_quadruple_example()
@pytest.fixture(scope='module')
def time_quad_ex(arb_addr, ci_env, spawn_backend):
def time_quad_ex(reg_addr, ci_env, spawn_backend):
if spawn_backend == 'mp':
"""no idea but the mp *nix runs are flaking out here often...
"""
@ -240,7 +240,7 @@ def time_quad_ex(arb_addr, ci_env, spawn_backend):
timeout = 7 if platform.system() in ('Windows', 'Darwin') else 4
start = time.time()
results = trio.run(cancel_after, timeout, arb_addr)
results = trio.run(cancel_after, timeout, reg_addr)
diff = time.time() - start
assert results
return results, diff
@ -260,14 +260,14 @@ def test_a_quadruple_example(time_quad_ex, ci_env, spawn_backend):
list(map(lambda i: i/10, range(3, 9)))
)
def test_not_fast_enough_quad(
arb_addr, time_quad_ex, cancel_delay, ci_env, spawn_backend
reg_addr, time_quad_ex, cancel_delay, ci_env, spawn_backend
):
"""Verify we can cancel midway through the quad example and all actors
cancel gracefully.
"""
results, diff = time_quad_ex
delay = max(diff - cancel_delay, 0)
results = trio.run(cancel_after, delay, arb_addr)
results = trio.run(cancel_after, delay, reg_addr)
system = platform.system()
if system in ('Windows', 'Darwin') and results is not None:
# In CI environments it seems later runs are quicker than the first
@ -280,7 +280,7 @@ def test_not_fast_enough_quad(
@tractor_test
async def test_respawn_consumer_task(
arb_addr,
reg_addr,
spawn_backend,
loglevel,
):


@ -7,7 +7,7 @@ import pytest
import trio
import tractor
from conftest import tractor_test
from tractor._testing import tractor_test
@pytest.mark.trio
@ -24,7 +24,7 @@ async def test_no_runtime():
@tractor_test
async def test_self_is_registered(arb_addr):
async def test_self_is_registered(reg_addr):
"Verify waiting on the arbiter to register itself using the standard api."
actor = tractor.current_actor()
assert actor.is_arbiter
@ -34,20 +34,20 @@ async def test_self_is_registered(arb_addr):
@tractor_test
async def test_self_is_registered_localportal(arb_addr):
async def test_self_is_registered_localportal(reg_addr):
"Verify waiting on the arbiter to register itself using a local portal."
actor = tractor.current_actor()
assert actor.is_arbiter
async with tractor.get_arbiter(*arb_addr) as portal:
async with tractor.get_arbiter(*reg_addr) as portal:
assert isinstance(portal, tractor._portal.LocalPortal)
with trio.fail_after(0.2):
sockaddr = await portal.run_from_ns(
'self', 'wait_for_actor', name='root')
assert sockaddr[0] == arb_addr
assert sockaddr[0] == reg_addr
def test_local_actor_async_func(arb_addr):
def test_local_actor_async_func(reg_addr):
"""Verify a simple async function in-process.
"""
nums = []
@ -55,7 +55,7 @@ def test_local_actor_async_func(arb_addr):
async def print_loop():
async with tractor.open_root_actor(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
):
# arbiter is started in-proc if dne
assert tractor.current_actor().is_arbiter


@ -7,8 +7,10 @@ import time
import pytest
import trio
import tractor
from conftest import (
from tractor._testing import (
tractor_test,
)
from conftest import (
sig_prog,
_INT_SIGNAL,
_INT_RETURN_CODE,
@ -28,9 +30,9 @@ def test_abort_on_sigint(daemon):
@tractor_test
async def test_cancel_remote_arbiter(daemon, arb_addr):
async def test_cancel_remote_arbiter(daemon, reg_addr):
assert not tractor.current_actor().is_arbiter
async with tractor.get_arbiter(*arb_addr) as portal:
async with tractor.get_arbiter(*reg_addr) as portal:
await portal.cancel_actor()
time.sleep(0.1)
@ -39,16 +41,16 @@ async def test_cancel_remote_arbiter(daemon, arb_addr):
# no arbiter socket should exist
with pytest.raises(OSError):
async with tractor.get_arbiter(*arb_addr) as portal:
async with tractor.get_arbiter(*reg_addr) as portal:
pass
def test_register_duplicate_name(daemon, arb_addr):
def test_register_duplicate_name(daemon, reg_addr):
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
) as n:
assert not tractor.current_actor().is_arbiter

View File

@ -5,8 +5,7 @@ import pytest
import trio
import tractor
from tractor.experimental import msgpub
from conftest import tractor_test
from tractor._testing import tractor_test
def test_type_checks():
@ -160,7 +159,7 @@ async def test_required_args(callwith_expecterror):
)
def test_multi_actor_subs_arbiter_pub(
loglevel,
arb_addr,
reg_addr,
pub_actor,
):
"""Try out the neato @pub decorator system.
@ -170,7 +169,7 @@ def test_multi_actor_subs_arbiter_pub(
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
enable_modules=[__name__],
) as n:
@ -255,12 +254,12 @@ def test_multi_actor_subs_arbiter_pub(
def test_single_subactor_pub_multitask_subs(
loglevel,
arb_addr,
reg_addr,
):
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
enable_modules=[__name__],
) as n:

View File

@ -34,7 +34,6 @@ def test_resource_only_entered_once(key_on):
global _resource
_resource = 0
kwargs = {}
key = None
if key_on == 'key_value':
key = 'some_common_key'
@ -139,7 +138,7 @@ def test_open_local_sub_to_stream():
N local tasks using ``trionics.maybe_open_context():``.
'''
timeout = 3 if platform.system() != "Windows" else 10
timeout: float = 3.6 if platform.system() != "Windows" else 10
async def main():

View File

@ -1,6 +1,8 @@
"""
RPC related
"""
'''
RPC (or maybe better labelled as "RTS: remote task scheduling"?)
related API and error checks.
'''
import itertools
import pytest
@ -13,9 +15,19 @@ async def sleep_back_actor(
func_name,
func_defined,
exposed_mods,
*,
reg_addr: tuple,
):
if actor_name:
async with tractor.find_actor(actor_name) as portal:
async with tractor.find_actor(
actor_name,
# NOTE: must be set manually since
# the subactor doesn't have the reg_addr
# fixture code run in it!
# TODO: maybe we should just set this once in the
# _state mod and derive to all children?
registry_addrs=[reg_addr],
) as portal:
try:
await portal.run(__name__, func_name)
except tractor.RemoteActorError as err:
@ -24,7 +36,7 @@ async def sleep_back_actor(
if not exposed_mods:
expect = tractor.ModuleNotExposed
assert err.type is expect
assert err.boxed_type is expect
raise
else:
await trio.sleep(float('inf'))
@ -42,14 +54,25 @@ async def short_sleep():
(['tmp_mod'], 'import doggy', ModuleNotFoundError),
(['tmp_mod'], '4doggy', SyntaxError),
],
ids=['no_mods', 'this_mod', 'this_mod_bad_func', 'fail_to_import',
'fail_on_syntax'],
ids=[
'no_mods',
'this_mod',
'this_mod_bad_func',
'fail_to_import',
'fail_on_syntax',
],
)
def test_rpc_errors(arb_addr, to_call, testdir):
"""Test errors when making various RPC requests to an actor
def test_rpc_errors(
reg_addr,
to_call,
testdir,
):
'''
Test errors when making various RPC requests to an actor
that either doesn't have the requested module exposed or doesn't define
the named function.
"""
'''
exposed_mods, funcname, inside_err = to_call
subactor_exposed_mods = []
func_defined = globals().get(funcname, False)
@ -77,8 +100,13 @@ def test_rpc_errors(arb_addr, to_call, testdir):
# spawn a subactor which calls us back
async with tractor.open_nursery(
arbiter_addr=arb_addr,
registry_addrs=[reg_addr],
enable_modules=exposed_mods.copy(),
# NOTE: will halt test in REPL if uncommented, so only
# do that if actually debugging subactor but keep it
# disabled for the test.
# debug_mode=True,
) as n:
actor = tractor.current_actor()
@ -95,6 +123,7 @@ def test_rpc_errors(arb_addr, to_call, testdir):
exposed_mods=exposed_mods,
func_defined=True if func_defined else False,
enable_modules=subactor_exposed_mods,
reg_addr=reg_addr,
)
def run():
@ -105,18 +134,20 @@ def test_rpc_errors(arb_addr, to_call, testdir):
run()
else:
# underlying errors aren't propagated upwards (yet)
with pytest.raises(remote_err) as err:
with pytest.raises(
expected_exception=(remote_err, ExceptionGroup),
) as err:
run()
# get raw instance from pytest wrapper
value = err.value
# might get multiple `trio.Cancelled`s as well inside an inception
if isinstance(value, trio.MultiError):
if isinstance(value, ExceptionGroup):
value = next(itertools.dropwhile(
lambda exc: not isinstance(exc, tractor.RemoteActorError),
value.exceptions
))
if getattr(value, 'type', None):
assert value.type is inside_err
assert value.boxed_type is inside_err

View File

@ -8,7 +8,7 @@ import pytest
import trio
import tractor
from conftest import tractor_test
from tractor._testing import tractor_test
_file_path: str = ''
@ -64,7 +64,8 @@ async def test_lifetime_stack_wipes_tmpfile(
except (
tractor.RemoteActorError,
tractor.BaseExceptionGroup,
# tractor.BaseExceptionGroup,
BaseExceptionGroup,
):
pass

tests/test_shm.py (new file, 167 lines, mode 100644)
View File

@ -0,0 +1,167 @@
"""
Shared mem primitives and APIs.
"""
import uuid
# import numpy
import pytest
import trio
import tractor
from tractor._shm import (
open_shm_list,
attach_shm_list,
)
@tractor.context
async def child_attach_shml_alot(
ctx: tractor.Context,
shm_key: str,
) -> None:
await ctx.started(shm_key)
# now try to attach a boatload of times in a loop..
for _ in range(1000):
shml = attach_shm_list(
key=shm_key,
readonly=False,
)
assert shml.shm.name == shm_key
await trio.sleep(0.001)
def test_child_attaches_alot():
async def main():
async with tractor.open_nursery() as an:
# allocate writeable list in parent
key = f'shml_{uuid.uuid4()}'
shml = open_shm_list(
key=key,
)
portal = await an.start_actor(
'shm_attacher',
enable_modules=[__name__],
)
async with (
portal.open_context(
child_attach_shml_alot,
shm_key=shml.key,
) as (ctx, start_val),
):
assert start_val == key
await ctx.result()
await portal.cancel_actor()
trio.run(main)
@tractor.context
async def child_read_shm_list(
ctx: tractor.Context,
shm_key: str,
use_str: bool,
frame_size: int,
) -> None:
# attach in child
shml = attach_shm_list(
key=shm_key,
# dtype=str if use_str else float,
)
await ctx.started(shml.key)
async with ctx.open_stream() as stream:
async for i in stream:
print(f'(child): reading shm list index: {i}')
if use_str:
expect = str(float(i))
else:
expect = float(i)
if frame_size == 1:
val = shml[i]
assert expect == val
print(f'(child): reading value: {val}')
else:
frame = shml[i - frame_size:i]
print(f'(child): reading frame: {frame}')
@pytest.mark.parametrize(
'use_str',
[False, True],
ids=lambda i: f'use_str_values={i}',
)
@pytest.mark.parametrize(
'frame_size',
[1, 2**6, 2**10],
ids=lambda i: f'frame_size={i}',
)
def test_parent_writer_child_reader(
use_str: bool,
frame_size: int,
):
async def main():
async with tractor.open_nursery(
# debug_mode=True,
) as an:
portal = await an.start_actor(
'shm_reader',
enable_modules=[__name__],
debug_mode=True,
)
# allocate writeable list in parent
key = 'shm_list'
seq_size = int(2 * 2 ** 10)
shml = open_shm_list(
key=key,
size=seq_size,
dtype=str if use_str else float,
readonly=False,
)
async with (
portal.open_context(
child_read_shm_list,
shm_key=key,
use_str=use_str,
frame_size=frame_size,
) as (ctx, sent),
ctx.open_stream() as stream,
):
assert sent == key
for i in range(seq_size):
val = float(i)
if use_str:
val = str(val)
# print(f'(parent): writing {val}')
shml[i] = val
# only on frame fills do we
# signal to the child that a frame's
# worth is ready.
if (i % frame_size) == 0:
print(f'(parent): signalling frame full on {val}')
await stream.send(i)
else:
print(f'(parent): signalling final frame on {val}')
await stream.send(i)
await portal.cancel_actor()
trio.run(main)
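
The new `tests/test_shm.py` suite exercises the (likewise new) `tractor._shm` list API. A minimal sketch of that API, using only the calls the tests show (`open_shm_list()`, `attach_shm_list()`, `.key`, `.shm.name`, indexing) and attaching in-process purely for illustration:

    from tractor._shm import (
        open_shm_list,
        attach_shm_list,
    )

    # writer side: allocate a writable shared list under a well-known key
    shml = open_shm_list(
        key='demo_shml',
        size=16,
        dtype=float,
        readonly=False,
    )
    shml[0] = 0.5

    # reader side (normally run in a subactor): attach by the same key
    reader = attach_shm_list(key=shml.key, readonly=False)
    assert reader.shm.name == shml.key
    assert reader[0] == 0.5
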

View File

@ -8,7 +8,7 @@ import pytest
import trio
import tractor
from conftest import tractor_test
from tractor._testing import tractor_test
data_to_pass_down = {'doggy': 10, 'kitty': 4}
@ -16,14 +16,14 @@ data_to_pass_down = {'doggy': 10, 'kitty': 4}
async def spawn(
is_arbiter: bool,
data: dict,
arb_addr: tuple[str, int],
reg_addr: tuple[str, int],
):
namespaces = [__name__]
await trio.sleep(0.1)
async with tractor.open_root_actor(
arbiter_addr=arb_addr,
arbiter_addr=reg_addr,
):
actor = tractor.current_actor()
@ -32,8 +32,7 @@ async def spawn(
if actor.is_arbiter:
async with tractor.open_nursery(
) as nursery:
async with tractor.open_nursery() as nursery:
# forks here
portal = await nursery.run_in_actor(
@ -41,7 +40,7 @@ async def spawn(
is_arbiter=False,
name='sub-actor',
data=data,
arb_addr=arb_addr,
reg_addr=reg_addr,
enable_modules=namespaces,
)
@ -55,12 +54,14 @@ async def spawn(
return 10
def test_local_arbiter_subactor_global_state(arb_addr):
def test_local_arbiter_subactor_global_state(
reg_addr,
):
result = trio.run(
spawn,
True,
data_to_pass_down,
arb_addr,
reg_addr,
)
assert result == 10
@ -140,7 +141,7 @@ async def check_loglevel(level):
def test_loglevel_propagated_to_subactor(
start_method,
capfd,
arb_addr,
reg_addr,
):
if start_method == 'mp_forkserver':
pytest.skip(
@ -152,7 +153,7 @@ def test_loglevel_propagated_to_subactor(
async with tractor.open_nursery(
name='arbiter',
start_method=start_method,
arbiter_addr=arb_addr,
arbiter_addr=reg_addr,
) as tn:
await tn.run_in_actor(

View File

@ -66,13 +66,13 @@ async def ensure_sequence(
async def open_sequence_streamer(
sequence: list[int],
arb_addr: tuple[str, int],
reg_addr: tuple[str, int],
start_method: str,
) -> tractor.MsgStream:
async with tractor.open_nursery(
arbiter_addr=arb_addr,
arbiter_addr=reg_addr,
start_method=start_method,
) as tn:
@ -86,14 +86,14 @@ async def open_sequence_streamer(
) as (ctx, first):
assert first is None
async with ctx.open_stream(backpressure=True) as stream:
async with ctx.open_stream(allow_overruns=True) as stream:
yield stream
await portal.cancel_actor()
def test_stream_fan_out_to_local_subscriptions(
arb_addr,
reg_addr,
start_method,
):
@ -103,7 +103,7 @@ def test_stream_fan_out_to_local_subscriptions(
async with open_sequence_streamer(
sequence,
arb_addr,
reg_addr,
start_method,
) as stream:
@ -138,7 +138,7 @@ def test_stream_fan_out_to_local_subscriptions(
]
)
def test_consumer_and_parent_maybe_lag(
arb_addr,
reg_addr,
start_method,
task_delays,
):
@ -150,7 +150,7 @@ def test_consumer_and_parent_maybe_lag(
async with open_sequence_streamer(
sequence,
arb_addr,
reg_addr,
start_method,
) as stream:
@ -211,7 +211,7 @@ def test_consumer_and_parent_maybe_lag(
def test_faster_task_to_recv_is_cancelled_by_slower(
arb_addr,
reg_addr,
start_method,
):
'''
@ -225,7 +225,7 @@ def test_faster_task_to_recv_is_cancelled_by_slower(
async with open_sequence_streamer(
sequence,
arb_addr,
reg_addr,
start_method,
) as stream:
@ -302,7 +302,7 @@ def test_subscribe_errors_after_close():
def test_ensure_slow_consumers_lag_out(
arb_addr,
reg_addr,
start_method,
):
'''This is a pure local task test; no tractor
@ -413,8 +413,8 @@ def test_ensure_slow_consumers_lag_out(
seq = brx._state.subs[brx.key]
assert seq == len(brx._state.queue) - 1
# all backpressured entries in the underlying
# channel should have been copied into the caster
# all no_overruns entries in the underlying
# channel should have been copied into the bcaster
# queue trailing-window
async for i in rx:
print(f'bped: {i}')

View File

@ -5,7 +5,7 @@ want to see changed.
'''
import pytest
import trio
from trio_typing import TaskStatus
from trio import TaskStatus
@pytest.mark.parametrize(

View File

@ -15,72 +15,50 @@
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
tractor: structured concurrent "actors".
tractor: structured concurrent ``trio``-"actors".
"""
from exceptiongroup import BaseExceptionGroup
from ._clustering import open_actor_cluster
from ._ipc import Channel
from ._clustering import (
open_actor_cluster as open_actor_cluster,
)
from ._context import (
Context as Context, # the type
context as context, # a func-decorator
)
from ._streaming import (
Context,
MsgStream,
stream,
context,
MsgStream as MsgStream,
stream as stream,
)
from ._discovery import (
get_arbiter,
find_actor,
wait_for_actor,
query_actor,
get_arbiter as get_arbiter,
find_actor as find_actor,
wait_for_actor as wait_for_actor,
query_actor as query_actor,
)
from ._supervise import (
open_nursery as open_nursery,
ActorNursery as ActorNursery,
)
from ._supervise import open_nursery
from ._state import (
current_actor,
is_root_process,
current_actor as current_actor,
is_root_process as is_root_process,
)
from ._exceptions import (
RemoteActorError,
ModuleNotExposed,
ContextCancelled,
RemoteActorError as RemoteActorError,
ModuleNotExposed as ModuleNotExposed,
ContextCancelled as ContextCancelled,
)
from ._debug import (
breakpoint,
post_mortem,
from .devx import (
breakpoint as breakpoint,
pause as pause,
pause_from_sync as pause_from_sync,
post_mortem as post_mortem,
)
from . import msg
from . import msg as msg
from ._root import (
run_daemon,
open_root_actor,
run_daemon as run_daemon,
open_root_actor as open_root_actor,
)
from ._portal import Portal
from ._runtime import Actor
__all__ = [
'Actor',
'Channel',
'Context',
'ContextCancelled',
'ModuleNotExposed',
'MsgStream',
'BaseExceptionGroup',
'Portal',
'RemoteActorError',
'breakpoint',
'context',
'current_actor',
'find_actor',
'get_arbiter',
'is_root_process',
'msg',
'open_actor_cluster',
'open_nursery',
'open_root_actor',
'post_mortem',
'query_actor',
'run_daemon',
'stream',
'to_asyncio',
'wait_for_actor',
]
from ._ipc import Channel as Channel
from ._portal import Portal as Portal
from ._runtime import Actor as Actor
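
The switch away from an `__all__` list to `import X as X` uses the explicit re-export form recognized by type checkers (PEP 484 stub semantics, mypy's `--no-implicit-reexport`), e.g.:

    # aliasing an import to its own name marks it as an intentional
    # public re-export, so no separate `__all__` entry is required:
    from ._portal import Portal as Portal
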

View File

@ -18,8 +18,6 @@
This is the "bootloader" for actors started using the native trio backend.
"""
import sys
import trio
import argparse
from ast import literal_eval
@ -37,8 +35,6 @@ def parse_ipaddr(arg):
return (str(host), int(port))
from ._entry import _trio_main
if __name__ == "__main__":
parser = argparse.ArgumentParser()

tractor/_context.py (2487 lines changed, mode 100644)

File diff suppressed because it is too large

View File

@ -15,50 +15,82 @@
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Actor discovery API.
Discovery (protocols) API for automatic addressing and location
management of (service) actors.
"""
from __future__ import annotations
from typing import (
Optional,
Union,
AsyncGenerator,
AsyncContextManager,
TYPE_CHECKING,
)
from contextlib import asynccontextmanager as acm
import warnings
from .trionics import gather_contexts
from ._ipc import _connect_chan, Channel
from ._portal import (
Portal,
open_portal,
LocalPortal,
)
from ._state import current_actor, _runtime_vars
from ._state import (
current_actor,
_runtime_vars,
)
if TYPE_CHECKING:
from ._runtime import Actor
@acm
async def get_arbiter(
async def get_registry(
host: str,
port: int,
) -> AsyncGenerator[Union[Portal, LocalPortal], None]:
'''Return a portal instance connected to a local or remote
) -> AsyncGenerator[
Portal | LocalPortal | None,
None,
]:
'''
Return a portal instance connected to a local or remote
arbiter.
'''
actor = current_actor()
if not actor:
raise RuntimeError("No actor instance has been defined yet?")
if actor.is_arbiter:
if actor.is_registrar:
# we're already the arbiter
# (likely a re-entrant call from the arbiter actor)
yield LocalPortal(actor, Channel((host, port)))
yield LocalPortal(
actor,
Channel((host, port))
)
else:
async with _connect_chan(host, port) as chan:
async with (
_connect_chan(host, port) as chan,
open_portal(chan) as regstr_ptl,
):
yield regstr_ptl
async with open_portal(chan) as arb_portal:
yield arb_portal
# TODO: deprecate and this remove _arbiter form!
@acm
async def get_arbiter(*args, **kwargs):
warnings.warn(
'`tractor.get_arbiter()` is now deprecated!\n'
'Use `.get_registry()` instead!',
DeprecationWarning,
stacklevel=2,
)
async with get_registry(*args, **kwargs) as to_yield:
yield to_yield
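
Migration sketch for the rename (the `tractor._discovery` import path and the local registry address are assumptions; the old `get_arbiter()` spelling still works but now emits a `DeprecationWarning`):

    import trio
    import tractor
    from tractor._discovery import get_registry  # assumed import path

    async def main():
        reg_addr = ('127.0.0.1', 1616)  # assumed local registry address
        async with tractor.open_root_actor(registry_addrs=[reg_addr]):
            # deprecated spelling (still works, but warns):
            #   async with tractor.get_arbiter(*reg_addr) as portal: ...
            async with get_registry(*reg_addr) as portal:
                sockaddr = await portal.run_from_ns(
                    'self', 'wait_for_actor', name='root',
                )
                print(sockaddr)

    trio.run(main)
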
@acm
@ -66,51 +98,80 @@ async def get_root(
**kwargs,
) -> AsyncGenerator[Portal, None]:
# TODO: rename mailbox to `_root_maddr` when we finally
# add and impl libp2p multi-addrs?
host, port = _runtime_vars['_root_mailbox']
assert host is not None
async with _connect_chan(host, port) as chan:
async with open_portal(chan, **kwargs) as portal:
yield portal
async with (
_connect_chan(host, port) as chan,
open_portal(chan, **kwargs) as portal,
):
yield portal
@acm
async def query_actor(
name: str,
arbiter_sockaddr: Optional[tuple[str, int]] = None,
arbiter_sockaddr: tuple[str, int] | None = None,
regaddr: tuple[str, int] | None = None,
) -> AsyncGenerator[tuple[str, int], None]:
) -> AsyncGenerator[
tuple[str, int] | None,
None,
]:
'''
Simple address lookup for a given actor name.
Make a transport address lookup for an actor name to a specific
registrar.
Returns the (socket) address or ``None``.
Returns the (socket) address or ``None`` if no entry under that
name exists for the given registrar listening @ `regaddr`.
'''
actor = current_actor()
async with get_arbiter(
*arbiter_sockaddr or actor._arb_addr
) as arb_portal:
actor: Actor = current_actor()
if (
name == 'registrar'
and actor.is_registrar
):
raise RuntimeError(
'The current actor IS the registry!?'
)
sockaddr = await arb_portal.run_from_ns(
if arbiter_sockaddr is not None:
warnings.warn(
'`tractor.query_actor(regaddr=<blah>)` is deprecated.\n'
'Use `registry_addrs: list[tuple]` instead!',
DeprecationWarning,
stacklevel=2,
)
regaddr: list[tuple[str, int]] = arbiter_sockaddr
reg_portal: Portal
regaddr: tuple[str, int] = regaddr or actor.reg_addrs[0]
async with get_registry(*regaddr) as reg_portal:
# TODO: return portals to all available actors - for now
# just the last one that registered
sockaddr: tuple[str, int] = await reg_portal.run_from_ns(
'self',
'find_actor',
name=name,
)
# TODO: return portals to all available actors - for now just
# the last one that registered
if name == 'arbiter' and actor.is_arbiter:
raise RuntimeError("The current actor is the arbiter")
yield sockaddr if sockaddr else None
yield sockaddr
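
A lookup-only sketch against the updated `query_actor()` signature (the registrar address is an assumption; when `regaddr` is omitted the current actor's first registry addr is used):

    import tractor

    async def where_is(name: str) -> tuple[str, int] | None:
        # yields the peer's socket address, or ``None`` if not registered
        async with tractor.query_actor(
            name,
            regaddr=('127.0.0.1', 1616),  # assumed registrar addr
        ) as sockaddr:
            return sockaddr
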
@acm
async def find_actor(
name: str,
arbiter_sockaddr: tuple[str, int] | None = None
arbiter_sockaddr: tuple[str, int]|None = None,
registry_addrs: list[tuple[str, int]]|None = None,
) -> AsyncGenerator[Optional[Portal], None]:
only_first: bool = True,
raise_on_none: bool = False,
) -> AsyncGenerator[
Portal | list[Portal] | None,
None,
]:
'''
Ask the arbiter to find actor(s) by name.
@ -118,39 +179,116 @@ async def find_actor(
known to the arbiter.
'''
async with query_actor(
name=name,
arbiter_sockaddr=arbiter_sockaddr,
) as sockaddr:
if arbiter_sockaddr is not None:
warnings.warn(
'`tractor.find_actor(arbiter_sockaddr=<blah>)` is deprecated.\n'
'Use `registry_addrs: list[tuple]` instead!',
DeprecationWarning,
stacklevel=2,
)
registry_addrs: list[tuple[str, int]] = [arbiter_sockaddr]
if sockaddr:
async with _connect_chan(*sockaddr) as chan:
async with open_portal(chan) as portal:
yield portal
else:
@acm
async def maybe_open_portal_from_reg_addr(
addr: tuple[str, int],
):
async with query_actor(
name=name,
regaddr=addr,
) as sockaddr:
if sockaddr:
async with _connect_chan(*sockaddr) as chan:
async with open_portal(chan) as portal:
yield portal
else:
yield None
if not registry_addrs:
# XXX NOTE: make sure to dynamically read the value on
# every call since something may change it globally (eg.
# like in our discovery test suite)!
from . import _root
registry_addrs = (
_runtime_vars['_registry_addrs']
or
_root._default_lo_addrs
)
maybe_portals: list[
AsyncContextManager[tuple[str, int]]
] = list(
maybe_open_portal_from_reg_addr(addr)
for addr in registry_addrs
)
async with gather_contexts(
mngrs=maybe_portals,
) as portals:
# log.runtime(
# 'Gathered portals:\n'
# f'{portals}'
# )
# NOTE: `gather_contexts()` will return a
# `tuple[None, None, ..., None]` if no contact
# can be made with any registrar at any of the
# N provided addrs!
if not any(portals):
if raise_on_none:
raise RuntimeError(
f'No actor "{name}" found registered @ {registry_addrs}'
)
yield None
return
portals: list[Portal] = list(portals)
if only_first:
yield portals[0]
else:
# TODO: currently this may return multiple portals
# given there are multi-homed or multiple registrars..
# SO, we probably need de-duplication logic?
yield portals
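
Driving the new multi-registrar lookup from user code might look like this sketch (flag names are taken from the signature above):

    import tractor

    async def connect_to(
        name: str,
        reg_addrs: list[tuple[str, int]],
    ) -> None:
        async with tractor.find_actor(
            name,
            registry_addrs=reg_addrs,
            only_first=True,        # yield a single portal (the default)
            raise_on_none=False,    # yield ``None`` instead of raising
        ) as portal:
            if portal is None:
                print(f'no actor {name!r} found @ {reg_addrs}')
            else:
                print(f'found {name!r} at {portal.channel.raddr}')
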
@acm
async def wait_for_actor(
name: str,
arbiter_sockaddr: tuple[str, int] | None = None
arbiter_sockaddr: tuple[str, int] | None = None,
registry_addr: tuple[str, int] | None = None,
) -> AsyncGenerator[Portal, None]:
"""Wait on an actor to register with the arbiter.
'''
Wait on an actor to register with the arbiter.
A portal to the first registered actor is returned.
"""
actor = current_actor()
async with get_arbiter(
*arbiter_sockaddr or actor._arb_addr,
) as arb_portal:
sockaddrs = await arb_portal.run_from_ns(
'''
actor: Actor = current_actor()
if arbiter_sockaddr is not None:
warnings.warn(
'`tractor.wait_for_actor(arbiter_sockaddr=<foo>)` is deprecated.\n'
'Use `registry_addr: tuple` instead!',
DeprecationWarning,
stacklevel=2,
)
registry_addr: tuple[str, int] = arbiter_sockaddr
# TODO: use `.trionics.gather_contexts()` like
# above in `find_actor()` as well?
reg_portal: Portal
regaddr: tuple[str, int] = registry_addr or actor.reg_addrs[0]
async with get_registry(*regaddr) as reg_portal:
sockaddrs = await reg_portal.run_from_ns(
'self',
'wait_for_actor',
name=name,
)
sockaddr = sockaddrs[-1]
# get latest registered addr by default?
# TODO: offer multi-portal yields in multi-homed case?
sockaddr: tuple[str, int] = sockaddrs[-1]
async with _connect_chan(*sockaddr) as chan:
async with open_portal(chan) as portal:

View File

@ -47,8 +47,8 @@ log = get_logger(__name__)
def _mp_main(
actor: Actor, # type: ignore
accept_addr: tuple[str, int],
actor: Actor,
accept_addrs: list[tuple[str, int]],
forkserver_info: tuple[Any, Any, Any, Any, Any],
start_method: SpawnMethodKey,
parent_addr: tuple[str, int] | None = None,
@ -77,8 +77,8 @@ def _mp_main(
log.debug(f"parent_addr is {parent_addr}")
trio_main = partial(
async_main,
actor,
accept_addr,
actor=actor,
accept_addrs=accept_addrs,
parent_addr=parent_addr
)
try:
@ -96,7 +96,7 @@ def _mp_main(
def _trio_main(
actor: Actor, # type: ignore
actor: Actor,
*,
parent_addr: tuple[str, int] | None = None,
infect_asyncio: bool = False,
@ -106,25 +106,29 @@ def _trio_main(
Entry point for a `trio_run_in_process` subactor.
'''
log.info(f"Started new trio process for {actor.uid}")
if actor.loglevel is not None:
log.info(
f"Setting loglevel for {actor.uid} to {actor.loglevel}")
get_console_log(actor.loglevel)
log.info(
f"Started {actor.uid}")
_state._current_actor = actor
log.debug(f"parent_addr is {parent_addr}")
trio_main = partial(
async_main,
actor,
parent_addr=parent_addr
)
if actor.loglevel is not None:
get_console_log(actor.loglevel)
import os
actor_info: str = (
f'|_{actor}\n'
f' uid: {actor.uid}\n'
f' pid: {os.getpid()}\n'
f' parent_addr: {parent_addr}\n'
f' loglevel: {actor.loglevel}\n'
)
log.info(
'Started new trio process:\n'
+
actor_info
)
try:
if infect_asyncio:
actor._infected_aio = True
@ -132,7 +136,15 @@ def _trio_main(
else:
trio.run(trio_main)
except KeyboardInterrupt:
log.warning(f"Actor {actor.uid} received KBI")
log.cancel(
'Actor received KBI\n'
+
actor_info
)
finally:
log.info(f"Actor {actor.uid} terminated")
log.info(
'Actor terminated\n'
+
actor_info
)

View File

@ -14,22 +14,34 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
'''
Our classy exception set.
"""
'''
from __future__ import annotations
import builtins
import importlib
from pprint import pformat
from typing import (
Any,
Optional,
Type,
TYPE_CHECKING,
)
import importlib
import builtins
import textwrap
import traceback
import exceptiongroup as eg
import trio
from tractor._state import current_actor
from tractor.log import get_logger
if TYPE_CHECKING:
from ._context import Context
from .log import StackLevelAdapter
from ._stream import MsgStream
from ._ipc import Channel
log = get_logger('tractor')
_this_mod = importlib.import_module(__name__)
@ -38,36 +50,381 @@ class ActorFailure(Exception):
"General actor failure"
class InternalError(RuntimeError):
'''
Entirely unexpected internal machinery error indicating
a completely invalid state or interface.
'''
_body_fields: list[str] = [
'boxed_type',
'src_type',
# TODO: format this better if we're going to include it.
# 'relay_path',
'src_uid',
# only in sub-types
'canceller',
'sender',
]
_msgdata_keys: list[str] = [
'boxed_type_str',
] + _body_fields
def get_err_type(type_name: str) -> BaseException|None:
'''
Look up an exception type by name from the set of locally
known namespaces:
- `builtins`
- `tractor._exceptions`
- `trio`
'''
for ns in [
builtins,
_this_mod,
trio,
]:
if type_ref := getattr(
ns,
type_name,
False,
):
return type_ref
# TODO: rename to just `RemoteError`?
class RemoteActorError(Exception):
# TODO: local recontruction of remote exception deats
"Remote actor exception bundled locally"
'''
A box(ing) type which bundles a remote actor `BaseException` for
(near identical, and only if possible,) local object/instance
re-construction in the local process memory domain.
Normally each instance is expected to be constructed from
a special "error" IPC msg sent by some remote actor-runtime.
'''
reprol_fields: list[str] = [
'src_uid',
'relay_path',
]
def __init__(
self,
message: str,
suberror_type: Optional[Type[BaseException]] = None,
boxed_type: Type[BaseException]|None = None,
**msgdata
) -> None:
super().__init__(message)
self.type = suberror_type
self.msgdata = msgdata
# TODO: maybe a better name?
# - .errtype
# - .retype
# - .boxed_errtype
# - .boxed_type
# - .remote_type
# also pertains to our long long outstanding issue XD
# https://github.com/goodboy/tractor/issues/5
#
# TODO: always set ._boxed_type` as `None` by default
# and instead render if from `.boxed_type_str`?
self._boxed_type: BaseException = boxed_type
self._src_type: BaseException|None = None
self.msgdata: dict[str, Any] = msgdata
# TODO: mask out eventually or place in `pack_error()`
# pre-`return` lines?
# sanity on inceptions
if boxed_type is RemoteActorError:
assert self.src_type_str != 'RemoteActorError'
assert self.src_uid not in self.relay_path
# ensure type-str matches and round-tripping from that
# str results in same error type.
#
# TODO NOTE: this is currently exclusively for the
# `ContextCancelled(boxed_type=trio.Cancelled)` case as is
# used inside `._rpc._invoke()` atm though probably we
# should better emphasize that special (one off?) case
# either by customizing `ContextCancelled.__init__()` or
# through a special factor func?
elif boxed_type:
if not self.msgdata.get('boxed_type_str'):
self.msgdata['boxed_type_str'] = str(
type(boxed_type).__name__
)
assert self.boxed_type_str == self.msgdata['boxed_type_str']
assert self.boxed_type is boxed_type
@property
def src_type_str(self) -> str:
'''
String-name of the source error's type.
This should be the same as `.boxed_type_str` when unpacked
at the first relay/hop's receiving actor.
'''
return self.msgdata['src_type_str']
@property
def src_type(self) -> str:
'''
Error type raised by original remote faulting actor.
'''
if self._src_type is None:
self._src_type = get_err_type(
self.msgdata['src_type_str']
)
return self._src_type
@property
def boxed_type_str(self) -> str:
'''
String-name of the (last hop's) boxed error type.
'''
return self.msgdata['boxed_type_str']
@property
def boxed_type(self) -> str:
'''
Error type boxed by last actor IPC hop.
'''
if self._boxed_type is None:
self._boxed_type = get_err_type(
self.msgdata['boxed_type_str']
)
return self._boxed_type
@property
def relay_path(self) -> list[tuple]:
'''
Return the list of actors which consecutively relayed
a boxed `RemoteActorError` (the src error) up until THIS
actor's hop.
NOTE: a `list` field with the same name is expected to be
passed/updated in `.msgdata`.
'''
return self.msgdata['relay_path']
@property
def relay_uid(self) -> tuple[str, str]|None:
return tuple(
self.msgdata['relay_path'][-1]
)
@property
def src_uid(self) -> tuple[str, str]|None:
if src_uid := (
self.msgdata.get('src_uid')
):
return tuple(src_uid)
# TODO: use path lookup instead?
# return tuple(
# self.msgdata['relay_path'][0]
# )
@property
def tb_str(
self,
indent: str = ' '*3,
) -> str:
if remote_tb := self.msgdata.get('tb_str'):
return textwrap.indent(
remote_tb,
prefix=indent,
)
return ''
def _mk_fields_str(
self,
fields: list[str],
end_char: str = '\n',
) -> str:
_repr: str = ''
for key in fields:
val: Any|None = (
getattr(self, key, None)
or
self.msgdata.get(key)
)
# TODO: for `.relay_path` on multiline?
# if not isinstance(val, str):
# val_str = pformat(val)
# else:
val_str: str = repr(val)
if val:
_repr += f'{key}={val_str}{end_char}'
return _repr
def reprol(self) -> str:
'''
Represent this error for "one line" display, like in
a field of our `Context.__repr__()` output.
'''
# TODO: use this matryoshka emjoi XD
# => 🪆
reprol_str: str = f'{type(self).__name__}('
_repr: str = self._mk_fields_str(
self.reprol_fields,
end_char=' ',
)
return (
reprol_str
+
_repr
)
def __repr__(self) -> str:
'''
Nicely formatted boxed error meta data + traceback.
'''
fields: str = self._mk_fields_str(
_body_fields,
)
fields: str = textwrap.indent(
fields,
# prefix=' '*2,
prefix=' |_',
)
indent: str = ''*1
body: str = (
f'{fields}'
f' |\n'
f' ------ - ------\n\n'
f'{self.tb_str}\n'
f' ------ - ------\n'
f' _|\n'
)
if indent:
body: str = textwrap.indent(
body,
prefix=indent,
)
return (
f'<{type(self).__name__}(\n'
f'{body}'
')>'
)
def unwrap(
self,
) -> BaseException:
'''
Unpack the inner-most source error from its original IPC msg data.
We attempt to reconstruct (as best as we can) the original
`Exception` as it would have been raised in the
failing actor's remote env.
'''
src_type_ref: Type[BaseException] = self.src_type
if not src_type_ref:
raise TypeError(
'Failed to lookup src error type:\n'
f'{self.src_type_str}'
)
# TODO: better tb insertion and all the fancier dunder
# metadata stuff as per `.__context__` etc. and friends:
# https://github.com/python-trio/trio/issues/611
return src_type_ref(self.tb_str)
# TODO: local reconstruction of nested inception for a given
# "hop" / relay-node in this error's relay_path?
# => so would render a `RAE[RAE[RAE[Exception]]]` instance
# with all inner errors unpacked?
# -[ ] if this is useful shouldn't be too hard to impl right?
# def unbox(self) -> BaseException:
# '''
# Unbox to the prior relays (aka last boxing actor's)
# inner error.
# '''
# if not self.relay_path:
# return self.unwrap()
# # TODO..
# # return self.boxed_type(
# # boxed_type=get_type_ref(..
# raise NotImplementedError
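
Roughly how the new boxing metadata might be consumed caller-side; a sketch, where the child's ability to import `boom()` via `enable_modules=[__name__]` depends on how the script is run:

    import trio
    import tractor

    async def boom() -> None:
        raise ValueError('whoops')

    async def main():
        async with tractor.open_nursery() as an:
            portal = await an.run_in_actor(
                boom,
                enable_modules=[__name__],
            )
            try:
                await portal.result()
            except tractor.RemoteActorError as rae:
                assert rae.boxed_type is ValueError   # last-hop boxed type
                print(rae.src_uid, rae.src_type_str)  # original raiser info
                print(rae.tb_str)                     # remote traceback text
                # `rae.unwrap()` would give a local `ValueError`
                # re-construction built from that traceback str.

    trio.run(main)
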
class InternalActorError(RemoteActorError):
"""Remote internal ``tractor`` error indicating
failure of some primitive or machinery.
"""
'''
(Remote) internal `tractor` error indicating failure of some
primitive, machinery state or lowlevel task that should never
occur.
'''
class ContextCancelled(RemoteActorError):
'''
Inter-actor task context was cancelled by either a call to
``Portal.cancel_actor()`` or ``Context.cancel()``.
'''
reprol_fields: list[str] = [
'canceller',
]
@property
def canceller(self) -> tuple[str, str]|None:
'''
Return the (maybe) `Actor.uid` for the requesting-author
of this ctxc.
Emit a warning msg when `.canceller` has not been set,
which usually indicates that a `None` msg-loop sentinel was
sent before expected in the runtime. This can happen in
a few situations:
- (simulating) an IPC transport network outage
- a (malicious) pkt sent specifically to cancel an actor's
runtime non-gracefully without ensuring ongoing RPC tasks are
incrementally cancelled as is done with:
`Actor`
|_`.cancel()`
|_`.cancel_soon()`
|_`._cancel_task()`
'''
value = self.msgdata.get('canceller')
if value:
return tuple(value)
log.warning(
'IPC Context cancelled without a requesting actor?\n'
'Maybe the IPC transport ended abruptly?\n\n'
f'{self}'
)
# TODO: to make `.__repr__()` work uniformly?
# src_actor_uid = canceller
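
Typical consumption of the new `.canceller` field, sketched with generic `portal`/`ctx_fn` inputs:

    import tractor

    async def run_and_report(portal, ctx_fn) -> None:
        try:
            async with portal.open_context(ctx_fn) as (ctx, _first):
                await ctx.result()
        except tractor.ContextCancelled as ctxc:
            # uid of the actor which requested the cancellation,
            # or ``None`` when the msg-loop ended unexpectedly.
            who: tuple[str, str] | None = ctxc.canceller
            print(f'ctx cancelled by {who}')
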
class TransportClosed(trio.ClosedResourceError):
"Underlying channel transport was closed prior to use"
class ContextCancelled(RemoteActorError):
"Inter-actor task context cancelled itself on the callee side."
class NoResult(RuntimeError):
"No final result is expected for this actor"
@ -80,8 +437,22 @@ class NoRuntime(RuntimeError):
"The root actor has not been initialized yet"
class StreamOverrun(trio.TooSlowError):
"This stream was overrun by sender"
class StreamOverrun(
RemoteActorError,
trio.TooSlowError,
):
reprol_fields: list[str] = [
'sender',
]
'''
This stream was overrun by sender
'''
@property
def sender(self) -> tuple[str, str] | None:
value = self.msgdata.get('sender')
if value:
return tuple(value)
class AsyncioCancelled(Exception):
@ -92,71 +463,141 @@ class AsyncioCancelled(Exception):
'''
class MessagingError(Exception):
'Some kind of unexpected SC messaging dialog issue'
def pack_error(
exc: BaseException,
tb=None,
exc: BaseException|RemoteActorError,
) -> dict[str, Any]:
"""Create an "error message" for tranmission over
a channel (aka the wire).
"""
tb: str|None = None,
cid: str|None = None,
) -> dict[str, dict]:
'''
Create an "error message" which boxes a locally caught
exception's meta-data and encodes it for wire transport via an
IPC `Channel`; expected to be unpacked (and thus unboxed) on
the receiver side using `unpack_error()` below.
'''
if tb:
tb_str = ''.join(traceback.format_tb(tb))
else:
tb_str = traceback.format_exc()
return {
'error': {
'tb_str': tb_str,
'type_str': type(exc).__name__,
}
}
error_msg: dict[ # for IPC
str,
str | tuple[str, str]
] = {}
our_uid: tuple = current_actor().uid
if (
isinstance(exc, RemoteActorError)
):
error_msg.update(exc.msgdata)
# an onion/inception we need to pack
if (
type(exc) is RemoteActorError
and (boxed := exc.boxed_type)
and boxed != RemoteActorError
):
# sanity on source error (if needed when tweaking this)
assert (src_type := exc.src_type) != RemoteActorError
assert error_msg['src_type_str'] != 'RemoteActorError'
assert error_msg['src_type_str'] == src_type.__name__
assert error_msg['src_uid'] != our_uid
# set the boxed type to be another boxed type thus
# creating an "inception" when unpacked by
# `unpack_error()` in another actor who gets "relayed"
# this error Bo
#
# NOTE on WHY: since we are re-boxing and already
# boxed src error, we want to overwrite the original
# `boxed_type_str` and instead set it to the type of
# the input `exc` type.
error_msg['boxed_type_str'] = 'RemoteActorError'
else:
error_msg['src_uid'] = our_uid
error_msg['src_type_str'] = type(exc).__name__
error_msg['boxed_type_str'] = type(exc).__name__
# XXX always append us as the last relay in the error propagation path
error_msg.setdefault(
'relay_path',
[],
).append(our_uid)
# XXX NOTE: always ensure the traceback-str is from the
# locally raised error (**not** the prior relay's boxed
# content's `.msgdata`).
error_msg['tb_str'] = tb_str
pkt: dict = {'error': error_msg}
if cid:
pkt['cid'] = cid
return pkt
def unpack_error(
msg: dict[str, Any],
chan=None,
err_type=RemoteActorError
) -> Exception:
chan: Channel|None = None,
box_type: RemoteActorError = RemoteActorError,
hide_tb: bool = True,
) -> None|Exception:
'''
Unpack an 'error' message from the wire
into a local ``RemoteActorError``.
into a local `RemoteActorError` (subtype).
NOTE: this routine DOES not RAISE the embedded remote error,
which is the responsibility of the caller.
'''
__tracebackhide__ = True
error = msg['error']
__tracebackhide__: bool = hide_tb
tb_str = error.get('tb_str', '')
message = f"{chan.uid}\n" + tb_str
type_name = error['type_str']
suberror_type: Type[BaseException] = Exception
error_dict: dict[str, dict] | None
if (
error_dict := msg.get('error')
) is None:
# no error field, nothing to unpack.
return None
if type_name == 'ContextCancelled':
err_type = ContextCancelled
suberror_type = trio.Cancelled
# retrieve the remote error's msg encoded details
tb_str: str = error_dict.get('tb_str', '')
message: str = (
f'{chan.uid}\n'
+
tb_str
)
else: # try to lookup a suitable local error type
for ns in [
builtins,
_this_mod,
eg,
trio,
]:
try:
suberror_type = getattr(ns, type_name)
break
except AttributeError:
continue
# try to lookup a suitable error type from the local runtime
# env then use it to construct a local instance.
boxed_type_str: str = error_dict['boxed_type_str']
boxed_type: Type[BaseException] = get_err_type(boxed_type_str)
exc = err_type(
if boxed_type_str == 'ContextCancelled':
box_type = ContextCancelled
assert boxed_type is box_type
# TODO: already included by `_this_mod` in else loop right?
#
# we have an inception/onion-error so ensure
# we include the relay_path info and the
# original source error.
elif boxed_type_str == 'RemoteActorError':
assert boxed_type is RemoteActorError
assert len(error_dict['relay_path']) >= 1
exc = box_type(
message,
suberror_type=suberror_type,
# unpack other fields into error type init
**msg['error'],
**error_dict,
)
return exc
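
A rough round-trip sketch of the boxing helpers; note `pack_error()` reads `current_actor().uid` so it only works inside a running actor, and the receiving side passes the `Channel` the msg arrived on:

    from tractor._exceptions import pack_error, unpack_error

    def box_for_wire(exc: BaseException, cid: str) -> dict:
        # raise/except so `traceback.format_exc()` has an active tb
        try:
            raise exc
        except BaseException as boxed_src:
            # requires an active actor runtime
            return pack_error(boxed_src, cid=cid)

    # ..and on the receiving peer:
    #
    #   boxed = unpack_error(msg, chan=chan)  # -> RemoteActorError | None
    #   if boxed is not None:
    #       raise boxed
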
@ -164,14 +605,132 @@ def unpack_error(
def is_multi_cancelled(exc: BaseException) -> bool:
'''
Predicate to determine if a possible ``eg.BaseExceptionGroup`` contains
Predicate to determine if a possible ``BaseExceptionGroup`` contains
only ``trio.Cancelled`` sub-exceptions (and is likely the result of
cancelling a collection of subtasks).
'''
if isinstance(exc, eg.BaseExceptionGroup):
# if isinstance(exc, eg.BaseExceptionGroup):
if isinstance(exc, BaseExceptionGroup):
return exc.subgroup(
lambda exc: isinstance(exc, trio.Cancelled)
) is not None
return False
def _raise_from_no_key_in_msg(
ctx: Context,
msg: dict,
src_err: KeyError,
log: StackLevelAdapter, # caller specific `log` obj
expect_key: str = 'yield',
stream: MsgStream | None = None,
# allow "deeper" tbs when debugging B^o
hide_tb: bool = True,
) -> bool:
'''
Raise an appropriate local error when a
`MsgStream` msg arrives which does not
contain the expected (at least under normal
operation) `'yield'` field.
`Context` and any embedded `MsgStream` termination,
as well as remote task errors are handled in order
of priority as:
- any 'error' msg is re-boxed and raised locally as
-> `RemoteActorError`|`ContextCancelled`
- a `MsgStream` 'stop' msg is constructed, assigned
and raised locally as -> `trio.EndOfChannel`
- All other mis-keyed msgs (like say a "final result"
'return' msg, normally delivered from `Context.result()`)
are re-boxed inside a `MessagingError` with an explicit
exc content describing the missing IPC-msg-key.
'''
__tracebackhide__: bool = hide_tb
# an internal error should never get here
try:
cid: str = msg['cid']
except KeyError as src_err:
raise MessagingError(
f'IPC `Context` rx-ed msg without a ctx-id (cid)!?\n'
f'cid: {cid}\n\n'
f'{pformat(msg)}\n'
) from src_err
# TODO: test that shows stream raising an expected error!!!
# raise the error message in a boxed exception type!
if msg.get('error'):
raise unpack_error(
msg,
ctx.chan,
hide_tb=hide_tb,
) from None
# `MsgStream` termination msg.
# TODO: does it make more sense to pack
# the stream._eoc outside this in the caller always?
elif (
msg.get('stop')
or (
stream
and stream._eoc
)
):
log.debug(
f'Context[{cid}] stream was stopped by remote side\n'
f'cid: {cid}\n'
)
# TODO: if the a local task is already blocking on
# a `Context.result()` and thus a `.receive()` on the
# rx-chan, we close the chan and set state ensuring that
# an eoc is raised!
# XXX: this causes ``ReceiveChannel.__anext__()`` to
# raise a ``StopAsyncIteration`` **and** in our catch
# block below it will trigger ``.aclose()``.
eoc = trio.EndOfChannel(
f'Context stream ended due to msg:\n\n'
f'{pformat(msg)}\n'
)
# XXX: important to set so that a new `.receive()`
# call (likely by another task using a broadcast receiver)
# doesn't accidentally pull the `return` message
# value out of the underlying feed mem chan which is
# destined for the `Context.result()` call during ctx-exit!
stream._eoc: Exception = eoc
# in case there already is some underlying remote error
# that arrived which is probably the source of this stream
# closure
ctx.maybe_raise()
raise eoc from src_err
if (
stream
and stream._closed
):
raise trio.ClosedResourceError('This stream was closed')
# always re-raise the source error if no translation error case
# is activated above.
_type: str = 'Stream' if stream else 'Context'
raise MessagingError(
f"{_type} was expecting a '{expect_key}' message"
" BUT received a non-error msg:\n"
f'{pformat(msg)}'
) from src_err

View File

@ -19,34 +19,33 @@ Inter-process comms abstractions
"""
from __future__ import annotations
import platform
import struct
import typing
from collections.abc import (
AsyncGenerator,
AsyncIterator,
)
from contextlib import asynccontextmanager as acm
import platform
from pprint import pformat
import struct
import typing
from typing import (
Any,
runtime_checkable,
Optional,
Protocol,
Type,
TypeVar,
)
from tricycle import BufferedReceiveStream
import msgspec
from tricycle import BufferedReceiveStream
import trio
from async_generator import asynccontextmanager
from .log import get_logger
from ._exceptions import TransportClosed
from tractor.log import get_logger
from tractor._exceptions import TransportClosed
log = get_logger(__name__)
_is_windows = platform.system() == 'Windows'
log = get_logger(__name__)
def get_stream_addrs(stream: trio.SocketStream) -> tuple:
@ -112,6 +111,13 @@ class MsgpackTCPStream(MsgTransport):
using the ``msgspec`` codec lib.
'''
layer_key: int = 4
name_key: str = 'tcp'
# TODO: better naming for this?
# -[ ] check how libp2p does naming for such things?
codec_key: str = 'msgpack'
def __init__(
self,
stream: trio.SocketStream,
@ -199,7 +205,17 @@ class MsgpackTCPStream(MsgTransport):
else:
raise
async def send(self, msg: Any) -> None:
async def send(
self,
msg: Any,
# hide_tb: bool = False,
) -> None:
'''
Send a msgpack coded blob-as-msg over TCP.
'''
# __tracebackhide__: bool = hide_tb
async with self._send_lock:
bytes_data: bytes = self.encode(msg)
@ -267,7 +283,7 @@ class Channel:
def __init__(
self,
destaddr: Optional[tuple[str, int]],
destaddr: tuple[str, int]|None,
msg_transport_type_key: tuple[str, str] = ('msgpack', 'tcp'),
@ -285,18 +301,29 @@ class Channel:
# Either created in ``.connect()`` or passed in by
# user in ``.from_stream()``.
self._stream: Optional[trio.SocketStream] = None
self.msgstream: Optional[MsgTransport] = None
self._stream: trio.SocketStream|None = None
self._transport: MsgTransport|None = None
# set after handshake - always uid of far end
self.uid: Optional[tuple[str, str]] = None
self.uid: tuple[str, str]|None = None
self._agen = self._aiter_recv()
self._exc: Optional[Exception] = None # set if far end actor errors
self._exc: Exception|None = None # set if far end actor errors
self._closed: bool = False
# flag set on ``Portal.cancel_actor()`` indicating
# remote (peer) cancellation of the far end actor runtime.
self._cancel_called: bool = False # set on ``Portal.cancel_actor()``
# flag set by ``Portal.cancel_actor()`` indicating remote
# (possibly peer) cancellation of the far end actor
# runtime.
self._cancel_called: bool = False
@property
def msgstream(self) -> MsgTransport:
log.info('`Channel.msgstream` is an old name, use `._transport`')
return self._transport
@property
def transport(self) -> MsgTransport:
return self._transport
@classmethod
def from_stream(
@ -307,37 +334,44 @@ class Channel:
) -> Channel:
src, dst = get_stream_addrs(stream)
chan = Channel(destaddr=dst, **kwargs)
chan = Channel(
destaddr=dst,
**kwargs,
)
# set immediately here from provided instance
chan._stream = stream
chan._stream: trio.SocketStream = stream
chan.set_msg_transport(stream)
return chan
def set_msg_transport(
self,
stream: trio.SocketStream,
type_key: Optional[tuple[str, str]] = None,
type_key: tuple[str, str]|None = None,
) -> MsgTransport:
type_key = type_key or self._transport_key
self.msgstream = get_msg_transport(type_key)(stream)
return self.msgstream
self._transport = get_msg_transport(type_key)(stream)
return self._transport
def __repr__(self) -> str:
if self.msgstream:
return repr(
self.msgstream.stream.socket._sock).replace( # type: ignore
"socket.socket", "Channel")
return object.__repr__(self)
if not self._transport:
return '<Channel with inactive transport?>'
return repr(
self._transport.stream.socket._sock
).replace( # type: ignore
"socket.socket",
"Channel",
)
@property
def laddr(self) -> Optional[tuple[str, int]]:
return self.msgstream.laddr if self.msgstream else None
def laddr(self) -> tuple[str, int]|None:
return self._transport.laddr if self._transport else None
@property
def raddr(self) -> Optional[tuple[str, int]]:
return self.msgstream.raddr if self.msgstream else None
def raddr(self) -> tuple[str, int]|None:
return self._transport.raddr if self._transport else None
async def connect(
self,
@ -356,26 +390,42 @@ class Channel:
*destaddr,
**kwargs
)
msgstream = self.set_msg_transport(stream)
transport = self.set_msg_transport(stream)
log.transport(
f'Opened channel[{type(msgstream)}]: {self.laddr} -> {self.raddr}'
f'Opened channel[{type(transport)}]: {self.laddr} -> {self.raddr}'
)
return msgstream
return transport
async def send(self, item: Any) -> None:
async def send(
self,
payload: Any,
log.transport(f"send `{item}`") # type: ignore
assert self.msgstream
# hide_tb: bool = False,
await self.msgstream.send(item)
) -> None:
'''
Send a coded msg-blob over the transport.
'''
# __tracebackhide__: bool = hide_tb
log.transport(
'=> send IPC msg:\n\n'
f'{pformat(payload)}\n'
) # type: ignore
assert self._transport
await self._transport.send(
payload,
# hide_tb=hide_tb,
)
async def recv(self) -> Any:
assert self.msgstream
return await self.msgstream.recv()
assert self._transport
return await self._transport.recv()
# try:
# return await self.msgstream.recv()
# return await self._transport.recv()
# except trio.BrokenResourceError:
# if self._autorecon:
# await self._reconnect()
@ -388,8 +438,8 @@ class Channel:
f'Closing channel to {self.uid} '
f'{self.laddr} -> {self.raddr}'
)
assert self.msgstream
await self.msgstream.stream.aclose()
assert self._transport
await self._transport.stream.aclose()
self._closed = True
async def __aenter__(self):
@ -440,16 +490,16 @@ class Channel:
Async iterate items from underlying stream.
'''
assert self.msgstream
assert self._transport
while True:
try:
async for item in self.msgstream:
async for item in self._transport:
yield item
# sent = yield item
# if sent is not None:
# # optimization, passing None through all the
# # time is pointless
# await self.msgstream.send(sent)
# await self._transport.send(sent)
except trio.BrokenResourceError:
# if not self._autorecon:
@ -462,12 +512,14 @@ class Channel:
# continue
def connected(self) -> bool:
return self.msgstream.connected() if self.msgstream else False
return self._transport.connected() if self._transport else False
@asynccontextmanager
@acm
async def _connect_chan(
host: str, port: int
host: str,
port: int
) -> typing.AsyncGenerator[Channel, None]:
'''
Create and connect a channel with disconnect on context manager

View File

@ -0,0 +1,151 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Multiaddress parser and utils according to the spec(s) defined by
`libp2p` and used in dependent projects such as `ipfs`:
- https://docs.libp2p.io/concepts/fundamentals/addressing/
- https://github.com/libp2p/specs/blob/master/addressing/README.md
'''
from typing import Iterator
from bidict import bidict
# TODO: see if we can leverage libp2p ecosys projects instead of
# rolling our own (parser) impls of the above addressing specs:
# - https://github.com/libp2p/py-libp2p
# - https://docs.libp2p.io/concepts/nat/circuit-relay/#relay-addresses
# prots: bidict[int, str] = bidict({
prots: bidict[int, str] = {
'ipv4': 3,
'ipv6': 3,
'wg': 3,
'tcp': 4,
'udp': 4,
# TODO: support the next-gen shite Bo
# 'quic': 4,
# 'ssh': 7, # via rsyscall bootstrapping
}
prot_params: dict[str, tuple[str]] = {
'ipv4': ('addr',),
'ipv6': ('addr',),
'wg': ('addr', 'port', 'pubkey'),
'tcp': ('port',),
'udp': ('port',),
# 'quic': ('port',),
# 'ssh': ('port',),
}
def iter_prot_layers(
multiaddr: str,
) -> Iterator[
tuple[
int,
list[str]
]
]:
'''
Unpack a libp2p style "multiaddress" into multiple "segments"
for each "layer" of the protocol stack (in OSI terms).
'''
tokens: list[str] = multiaddr.split('/')
root, tokens = tokens[0], tokens[1:]
assert not root # there is a root '/' on LHS
itokens = iter(tokens)
prot: str | None = None
params: list[str] = []
for token in itokens:
# every prot path should start with a known
# key-str.
if token in prots:
if prot is None:
prot: str = token
else:
yield prot, params
prot = token
params = []
elif token not in prots:
params.append(token)
else:
yield prot, params
def parse_maddr(
multiaddr: str,
) -> dict[str, str | int | dict]:
'''
Parse a libp2p style "multiaddress" into its distinct protocol
segments where each segment is of the form:
`../<protocol>/<param0>/<param1>/../<paramN>`
and is loaded into a (order preserving) `layers: dict[str,
dict[str, Any]` which holds each protocol-layer-segment of the
original `str` path as a separate entry according to its approx
OSI "layer number".
Any `paramN` in the path must be distinctly defined by a str-token in the
(module global) `prot_params` table.
For e.g. wireguard, which requires an address, port number and public key,
the protocol params are specified as the entry:
'wg': ('addr', 'port', 'pubkey'),
and are thus parsed from a maddr in that order:
`'/wg/1.1.1.1/51820/<pubkey>'`
'''
layers: dict[str, str | int | dict] = {}
for (
prot_key,
params,
) in iter_prot_layers(multiaddr):
layer: int = prots[prot_key] # OSI layer used for sorting
ep: dict[str, int | str] = {'layer': layer}
layers[prot_key] = ep
# TODO; validation and resolving of names:
# - each param via a validator provided as part of the
# prot_params def? (also see `"port"` case below..)
# - do a resolv step that will check addrs against
# any loaded network.resolv: dict[str, str]
rparams: list = list(reversed(params))
for key in prot_params[prot_key]:
val: str | int = rparams.pop()
# TODO: UGHH, dunno what we should do for validation
# here, put it in the params spec somehow?
if key == 'port':
val = int(val)
ep[key] = val
return layers
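
For example (assuming the parser lands as `tractor._multiaddr`; the address below is arbitrary):

    from tractor._multiaddr import parse_maddr

    layers = parse_maddr('/ipv4/127.0.0.1/tcp/1616')
    assert layers == {
        'ipv4': {'layer': 3, 'addr': '127.0.0.1'},
        'tcp': {'layer': 4, 'port': 1616},
    }
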

View File

@ -15,38 +15,46 @@
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Memory boundary "Portals": an API for structured
concurrency linked tasks running in disparate memory domains.
Memory "portal" construct.
"Memory portals" are both an API and set of IPC wrapping primitives
for managing structured concurrency "cancel-scope linked" tasks
running in disparate virtual memory domains - at least in different
OS processes, possibly on different (hardware) hosts.
'''
from __future__ import annotations
from contextlib import asynccontextmanager as acm
import importlib
import inspect
from typing import (
Any, Optional,
Callable, AsyncGenerator,
Type,
Any,
Callable,
AsyncGenerator,
# Type,
)
from functools import partial
from dataclasses import dataclass
from pprint import pformat
import warnings
import trio
from async_generator import asynccontextmanager
from .trionics import maybe_open_nursery
from ._state import current_actor
from ._state import (
current_actor,
)
from ._ipc import Channel
from .log import get_logger
from .msg import NamespacePath
from ._exceptions import (
unpack_error,
NoResult,
ContextCancelled,
)
from ._context import (
Context,
open_context_from_portal,
)
from ._streaming import (
Context,
MsgStream,
)
@ -54,34 +62,47 @@ from ._streaming import (
log = get_logger(__name__)
# TODO: rename to `unwrap_result()` and use
# `._raise_from_no_key_in_msg()` (after tweak to
# accept a `chan: Channel` arg) in key block!
def _unwrap_msg(
msg: dict[str, Any],
channel: Channel
channel: Channel,
hide_tb: bool = True,
) -> Any:
__tracebackhide__ = True
'''
Unwrap a final result from a `{return: <Any>}` IPC msg.
'''
__tracebackhide__: bool = hide_tb
try:
return msg['return']
except KeyError:
except KeyError as ke:
# internal error should never get here
assert msg.get('cid'), "Received internal error at portal?"
raise unpack_error(msg, channel) from None
assert msg.get('cid'), (
"Received internal error at portal?"
)
class MessagingError(Exception):
'Some kind of unexpected SC messaging dialog issue'
raise unpack_error(
msg,
channel
) from ke
class Portal:
'''
A 'portal' to a(n) (remote) ``Actor``.
A 'portal' to a memory-domain-separated `Actor`.
A portal is "opened" (and eventually closed) by one side of an
inter-actor communication context. The side which opens the portal
is equivalent to a "caller" in function parlance and usually is
either the called actor's parent (in process tree hierarchy terms)
or a client interested in scheduling work to be done remotely in a
far process.
process which has a separate (virtual) memory domain.
The portal api allows the "caller" actor to invoke remote routines
and receive results through an underlying ``tractor.Channel`` as
@ -91,22 +112,34 @@ class Portal:
like having a "portal" between the separate actor memory spaces.
'''
# the timeout for a remote cancel request sent to
# a(n) (peer) actor.
cancel_timeout = 0.5
# global timeout for remote cancel requests sent to
# connected (peer) actors.
cancel_timeout: float = 0.5
def __init__(self, channel: Channel) -> None:
self.channel = channel
self.chan = channel
# during the portal's lifetime
self._result_msg: Optional[dict] = None
self._result_msg: dict|None = None
# When set to a ``Context`` (when _submit_for_result is called)
# it is expected that ``result()`` will be awaited at some
# point.
self._expect_result: Optional[Context] = None
self._expect_result: Context | None = None
self._streams: set[MsgStream] = set()
self.actor = current_actor()
@property
def channel(self) -> Channel:
'''
Proxy to legacy attr name..
Consider the shorter `Portal.chan` instead of `.channel` ;)
'''
log.debug(
'Consider the shorter `Portal.chan` instead of `.channel` ;)'
)
return self.chan
async def _submit_for_result(
self,
ns: str,
@ -114,14 +147,14 @@ class Portal:
**kwargs
) -> None:
assert self._expect_result is None, \
"A pending main result has already been submitted"
assert self._expect_result is None, (
"A pending main result has already been submitted"
)
self._expect_result = await self.actor.start_remote_task(
self.channel,
ns,
func,
kwargs
nsf=NamespacePath(f'{ns}:{func}'),
kwargs=kwargs
)
async def _return_once(
@ -131,7 +164,7 @@ class Portal:
) -> dict[str, Any]:
assert ctx._remote_func_type == 'asyncfunc' # single response
msg = await ctx._recv_chan.receive()
msg: dict = await ctx._recv_chan.receive()
return msg
async def result(self) -> Any:
@ -162,7 +195,10 @@ class Portal:
self._expect_result
)
return _unwrap_msg(self._result_msg, self.channel)
return _unwrap_msg(
self._result_msg,
self.channel,
)
async def _cancel_streams(self):
# terminate all locally running async generator
@ -193,30 +229,57 @@ class Portal:
) -> bool:
'''
Cancel the actor on the other end of this portal.
Cancel the actor runtime (and thus process) on the far
end of this portal.
**NOTE** THIS CANCELS THE ENTIRE RUNTIME AND THE
SUBPROCESS, it DOES NOT just cancel the remote task. If you
want to have a handle to cancel a remote ``trio.Task`` look
at `.open_context()` and the definition of
`._context.Context.cancel()` which CAN be used for this
purpose.
'''
if not self.channel.connected():
log.cancel("This channel is already closed can't cancel")
chan: Channel = self.channel
if not chan.connected():
log.runtime(
'This channel is already closed, skipping cancel request..'
)
return False
reminfo: str = (
f'`Portal.cancel_actor()` => {self.channel.uid}\n'
f' |_{chan}\n'
)
log.cancel(
f"Sending actor cancel request to {self.channel.uid} on "
f"{self.channel}")
self.channel._cancel_called = True
f'Sending runtime `.cancel()` request to peer\n\n'
f'{reminfo}'
)
self.channel._cancel_called: bool = True
try:
# send cancel cmd - might not get response
# XXX: sure would be nice to make this work with a proper shield
with trio.move_on_after(timeout or self.cancel_timeout) as cs:
cs.shield = True
await self.run_from_ns('self', 'cancel')
# XXX: sure would be nice to make this work with
# a proper shield
with trio.move_on_after(
timeout
or
self.cancel_timeout
) as cs:
cs.shield: bool = True
await self.run_from_ns(
'self',
'cancel',
)
return True
if cs.cancelled_caught:
log.cancel(f"May have failed to cancel {self.channel.uid}")
# may timeout and we never get an ack (obvi racy)
# but that doesn't mean it wasn't cancelled.
log.debug(
'May have failed to cancel peer?\n'
f'{reminfo}'
)
# if we get here some weird cancellation case happened
return False
@ -225,9 +288,11 @@ class Portal:
trio.ClosedResourceError,
trio.BrokenResourceError,
):
log.cancel(
f"{self.channel} for {self.channel.uid} was already "
"closed or broken?")
log.debug(
'IPC chan for actor already closed or broken?\n\n'
f'{self.channel.uid}\n'
f' |_{self.channel}\n'
)
return False
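To make the scoping distinction above concrete, here is a minimal illustrative sketch (not part of this changeset; it assumes tractor's public `open_nursery()`/`open_context()` API and a hypothetical `sleep_forever` target): `Context.cancel()` only cancels the single remote task behind a context, whereas `Portal.cancel_actor()` tears down the peer's entire runtime and process.

import trio
import tractor


@tractor.context
async def sleep_forever(ctx: tractor.Context) -> None:
    # hypothetical remote task: ack the context open then idle until cancelled
    await ctx.started()
    await trio.sleep_forever()


async def main():
    async with tractor.open_nursery() as an:
        portal = await an.start_actor('sleeper', enable_modules=[__name__])

        # cancel ONLY the single remote `trio.Task` behind this context;
        # the subactor's runtime (and process) keeps running.
        async with portal.open_context(sleep_forever) as (ctx, _first):
            await ctx.cancel()

        # cancel the ENTIRE actor runtime and thus its subprocess.
        await portal.cancel_actor()


if __name__ == '__main__':
    trio.run(main)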
async def run_from_ns(
@ -246,27 +311,33 @@ class Portal:
Note::
A special namespace `self` can be used to invoke `Actor`
instance methods in the remote runtime. Currently this
should only be used solely for ``tractor`` runtime
internals.
A special namespace `self` can be used to invoke `Actor`
instance methods in the remote runtime. Currently this
should only ever be used for `Actor` (method) runtime
internals!
'''
nsf = NamespacePath(
f'{namespace_path}:{function_name}'
)
ctx = await self.actor.start_remote_task(
self.channel,
namespace_path,
function_name,
kwargs,
chan=self.channel,
nsf=nsf,
kwargs=kwargs,
)
ctx._portal = self
msg = await self._return_once(ctx)
return _unwrap_msg(msg, self.channel)
return _unwrap_msg(
msg,
self.channel,
)
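As a quick hedged sketch (assumed usage, not from this diff): the special `'self'` namespace is exactly what the runtime-internal cancel request above resolves to, while user code addresses a plain module-level async function by its import path; the `mypkg.maths:add` target is hypothetical.

import tractor


async def demo(portal: tractor.Portal) -> None:
    # runtime-internal: 'self' maps to `Actor` methods, which is what
    # `Portal.cancel_actor()` above ends up invoking.
    await portal.run_from_ns('self', 'cancel')

    # user code instead names a module-level async function by its
    # import path + name (hypothetical module/function shown here).
    await portal.run_from_ns('mypkg.maths', 'add', x=1, y=2)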
async def run(
self,
func: str,
fn_name: Optional[str] = None,
fn_name: str|None = None,
**kwargs
) -> Any:
'''
Submit a remote function to be scheduled and run by actor, in
@ -285,8 +356,9 @@ class Portal:
DeprecationWarning,
stacklevel=2,
)
fn_mod_path = func
fn_mod_path: str = func
assert isinstance(fn_name, str)
nsf = NamespacePath(f'{fn_mod_path}:{fn_name}')
else: # function reference was passed directly
if (
@ -299,13 +371,12 @@ class Portal:
raise TypeError(
f'{func} must be a non-streaming async function!')
fn_mod_path, fn_name = NamespacePath.from_ref(func).to_tuple()
nsf = NamespacePath.from_ref(func)
ctx = await self.actor.start_remote_task(
self.channel,
fn_mod_path,
fn_name,
kwargs,
nsf=nsf,
kwargs=kwargs,
)
ctx._portal = self
return _unwrap_msg(
@ -313,7 +384,7 @@ class Portal:
self.channel,
)
@asynccontextmanager
@acm
async def open_stream_from(
self,
async_gen_func: Callable, # typing: ignore
@ -329,13 +400,10 @@ class Portal:
raise TypeError(
f'{async_gen_func} must be an async generator function!')
fn_mod_path, fn_name = NamespacePath.from_ref(
async_gen_func).to_tuple()
ctx = await self.actor.start_remote_task(
ctx: Context = await self.actor.start_remote_task(
self.channel,
fn_mod_path,
fn_name,
kwargs
nsf=NamespacePath.from_ref(async_gen_func),
kwargs=kwargs,
)
ctx._portal = self
@ -345,7 +413,8 @@ class Portal:
try:
# deliver receive only stream
async with MsgStream(
ctx, ctx._recv_chan,
ctx=ctx,
rx_chan=ctx._recv_chan,
) as rchan:
self._streams.add(rchan)
yield rchan
@ -372,175 +441,12 @@ class Portal:
# await recv_chan.aclose()
self._streams.remove(rchan)
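For reference, a short usage sketch of consuming a remote async generator through `Portal.open_stream_from()` (assumed from tractor's documented API; the `stream_squares` target and actor name are hypothetical):

import trio
import tractor


async def stream_squares(limit: int):
    # hypothetical async-gen target which runs inside the subactor
    for i in range(limit):
        yield i ** 2
        await trio.sleep(0)


async def main():
    async with tractor.open_nursery() as an:
        portal = await an.start_actor('streamer', enable_modules=[__name__])

        async with portal.open_stream_from(stream_squares, limit=5) as stream:
            async for sq in stream:
                print(sq)  # 0, 1, 4, 9, 16

        await portal.cancel_actor()


if __name__ == '__main__':
    trio.run(main)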
@asynccontextmanager
async def open_context(
self,
func: Callable,
**kwargs,
) -> AsyncGenerator[tuple[Context, Any], None]:
'''
Open an inter-actor task context.
This is a synchronous API which allows for deterministic
setup/teardown of a remote task. The yielded ``Context`` further
allows for opening bidirectional streams, explicit cancellation
and synchronized final result collection. See ``tractor.Context``.
'''
# conduct target func method structural checks
if not inspect.iscoroutinefunction(func) and (
getattr(func, '_tractor_contex_function', False)
):
raise TypeError(
f'{func} must be an async generator function!')
fn_mod_path, fn_name = NamespacePath.from_ref(func).to_tuple()
ctx = await self.actor.start_remote_task(
self.channel,
fn_mod_path,
fn_name,
kwargs
)
assert ctx._remote_func_type == 'context'
msg = await ctx._recv_chan.receive()
try:
# the "first" value here is delivered by the callee's
# ``Context.started()`` call.
first = msg['started']
ctx._started_called = True
except KeyError:
assert msg.get('cid'), ("Received internal error at context?")
if msg.get('error'):
# raise kerr from unpack_error(msg, self.channel)
raise unpack_error(msg, self.channel) from None
else:
raise MessagingError(
f'Context for {ctx.cid} was expecting a `started` message'
f' but received a non-error msg:\n{pformat(msg)}'
)
_err: Optional[BaseException] = None
ctx._portal = self
uid = self.channel.uid
cid = ctx.cid
etype: Optional[Type[BaseException]] = None
# deliver context instance and .started() msg value in open tuple.
try:
async with trio.open_nursery() as scope_nursery:
ctx._scope_nursery = scope_nursery
# do we need this?
# await trio.lowlevel.checkpoint()
yield ctx, first
except ContextCancelled as err:
_err = err
if not ctx._cancel_called:
# context was cancelled at the far end but was
# not part of this end requesting that cancel
# so raise for the local task to respond and handle.
raise
# if the context was cancelled by client code
# then we don't need to raise since user code
# is expecting this and the block should exit.
else:
log.debug(f'Context {ctx} cancelled gracefully')
except (
BaseException,
# more specifically, we need to handle these but not
# sure it's worth being pedantic:
# Exception,
# trio.Cancelled,
# KeyboardInterrupt,
) as err:
etype = type(err)
# the context cancels itself on any cancel
# causing error.
if ctx.chan.connected():
log.cancel(
'Context cancelled for task, sending cancel request..\n'
f'task:{cid}\n'
f'actor:{uid}'
)
await ctx.cancel()
else:
log.warning(
'IPC connection for context is broken?\n'
f'task:{cid}\n'
f'actor:{uid}'
)
raise
finally:
# in the case where a runtime nursery (due to internal bug)
# or a remote actor transmits an error we want to be
# sure we get the error the underlying feeder mem chan.
# if it's not raised here it *should* be raised from the
# msg loop nursery right?
if ctx.chan.connected():
log.info(
'Waiting on final context-task result for\n'
f'task: {cid}\n'
f'actor: {uid}'
)
result = await ctx.result()
log.runtime(
f'Context {fn_name} returned '
f'value from callee `{result}`'
)
# though it should be impossible for any tasks
# operating *in* this scope to have survived
# we tear down the runtime feeder chan last
# to avoid premature stream clobbers.
if ctx._recv_chan is not None:
# should we encapsulate this in the context api?
await ctx._recv_chan.aclose()
if etype:
if ctx._cancel_called:
log.cancel(
f'Context {fn_name} cancelled by caller with\n{etype}'
)
elif _err is not None:
log.cancel(
f'Context for task cancelled by callee with {etype}\n'
f'target: `{fn_name}`\n'
f'task:{cid}\n'
f'actor:{uid}'
)
# XXX: (MEGA IMPORTANT) if this is a root opened process we
# wait for any immediate child in debug before popping the
# context from the runtime msg loop otherwise inside
# ``Actor._push_result()`` the msg will be discarded and in
# the case where that msg is global debugger unlock (via
# a "stop" msg for a stream), this can result in a deadlock
# where the root is waiting on the lock to clear but the
# child has already cleared it and clobbered IPC.
from ._debug import maybe_wait_for_debugger
await maybe_wait_for_debugger()
# remove the context from runtime tracking
self.actor._contexts.pop(
(self.channel.uid, ctx.cid),
None,
)
# NOTE: impl is found in `._context` mod to make
# reading/grokking the details simpler code-org-wise. This
# method does not have to be used over that `@acm` module func
# directly, it is for convenience and from the original API
# design.
open_context = open_context_from_portal
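For reference, a hedged usage sketch of the aliased `@acm`: the `.started()` handshake value is delivered as the second element of the yielded tuple and a bidirectional `MsgStream` can then be opened from either side; the `echo_server` target and actor name below are hypothetical.

import trio
import tractor


@tractor.context
async def echo_server(ctx: tractor.Context) -> None:
    # hypothetical callee: ack the context open then echo every msg back
    await ctx.started('ready')
    async with ctx.open_stream() as stream:
        async for msg in stream:
            await stream.send(msg)


async def main():
    async with tractor.open_nursery() as an:
        portal = await an.start_actor('echoer', enable_modules=[__name__])

        async with portal.open_context(echo_server) as (ctx, first):
            assert first == 'ready'
            async with ctx.open_stream() as stream:
                await stream.send('ping')
                assert await stream.receive() == 'ping'

        await portal.cancel_actor()


if __name__ == '__main__':
    trio.run(main)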
@dataclass
@ -555,7 +461,12 @@ class LocalPortal:
actor: 'Actor' # type: ignore # noqa
channel: Channel
async def run_from_ns(self, ns: str, func_name: str, **kwargs) -> Any:
async def run_from_ns(
self,
ns: str,
func_name: str,
**kwargs,
) -> Any:
'''
Run a requested local function from a namespace path and
return it's result.
@ -566,11 +477,11 @@ class LocalPortal:
return await func(**kwargs)
@asynccontextmanager
@acm
async def open_portal(
channel: Channel,
nursery: Optional[trio.Nursery] = None,
nursery: trio.Nursery|None = None,
start_msg_loop: bool = True,
shield: bool = False,
@ -595,7 +506,7 @@ async def open_portal(
if channel.uid is None:
await actor._do_handshake(channel)
msg_loop_cs: Optional[trio.CancelScope] = None
msg_loop_cs: trio.CancelScope|None = None
if start_msg_loop:
from ._runtime import process_messages
msg_loop_cs = await nursery.start(


@ -25,19 +25,19 @@ import logging
import signal
import sys
import os
import typing
import warnings
from exceptiongroup import BaseExceptionGroup
import trio
from ._runtime import (
Actor,
Arbiter,
# TODO: rename and make a non-actor subtype?
# Arbiter as Registry,
async_main,
)
from . import _debug
from .devx import _debug
from . import _spawn
from . import _state
from . import log
@ -46,8 +46,14 @@ from ._exceptions import is_multi_cancelled
# set at startup and after forks
_default_arbiter_host: str = '127.0.0.1'
_default_arbiter_port: int = 1616
_default_host: str = '127.0.0.1'
_default_port: int = 1616
# default registry always on localhost
_default_lo_addrs: list[tuple[str, int]] = [(
_default_host,
_default_port,
)]
logger = log.get_logger('tractor')
@ -58,38 +64,54 @@ async def open_root_actor(
*,
# defaults are above
arbiter_addr: tuple[str, int] | None = None,
registry_addrs: list[tuple[str, int]]|None = None,
# defaults are above
registry_addr: tuple[str, int] | None = None,
arbiter_addr: tuple[str, int]|None = None,
name: str | None = 'root',
name: str|None = 'root',
# either the `multiprocessing` start method:
# https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
# OR `trio` (the new default).
start_method: _spawn.SpawnMethodKey | None = None,
start_method: _spawn.SpawnMethodKey|None = None,
# enables the multi-process debugger support
debug_mode: bool = False,
# internal logging
loglevel: str | None = None,
loglevel: str|None = None,
enable_modules: list | None = None,
rpc_module_paths: list | None = None,
enable_modules: list|None = None,
rpc_module_paths: list|None = None,
) -> typing.Any:
# NOTE: allow caller to ensure that only one registry exists
# and that this call creates it.
ensure_registry: bool = False,
) -> Actor:
'''
Runtime init entry point for ``tractor``.
'''
# TODO: stick this in a `@cm` defined in `devx._debug`?
#
# Override the global debugger hook to make it play nice with
# ``trio``, see much discussion in:
# https://github.com/python-trio/trio/issues/1155#issuecomment-742964018
builtin_bp_handler = sys.breakpointhook
orig_bp_path: str | None = os.environ.get('PYTHONBREAKPOINT', None)
os.environ['PYTHONBREAKPOINT'] = 'tractor._debug._set_trace'
if (
await _debug.maybe_init_greenback(
raise_not_found=False,
)
):
builtin_bp_handler = sys.breakpointhook
orig_bp_path: str|None = os.environ.get(
'PYTHONBREAKPOINT',
None,
)
os.environ['PYTHONBREAKPOINT'] = (
'tractor.devx._debug.pause_from_sync'
)
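For context, the stdlib mechanics being leaned on here (a generic sketch, not tractor code): `breakpoint()` dispatches through `sys.breakpointhook`, and the default hook lazily imports whatever dotted path `PYTHONBREAKPOINT` names, which is why both values are stashed and later restored in the teardown further below.

import os
import sys


def my_repl_hook(*args, **kwargs):
    # stand-in for a custom debugger entry point
    print('breakpoint() was re-routed here')


# route the default hook through the env var (resolved on each
# `breakpoint()` call by the default `sys.breakpointhook`)..
os.environ['PYTHONBREAKPOINT'] = f'{__name__}.my_repl_hook'

# ..or swap the hook object directly (this is what gets restored later
# from the stashed `builtin_bp_handler` reference).
sys.breakpointhook = my_repl_hook

breakpoint()  # -> 'breakpoint() was re-routed here'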
# attempt to retrieve ``trio``'s sigint handler and stash it
# on our debugger lock state.
@ -99,7 +121,11 @@ async def open_root_actor(
_state._runtime_vars['_is_root'] = True
# caps based rpc list
enable_modules = enable_modules or []
enable_modules = (
enable_modules
or
[]
)
if rpc_module_paths:
warnings.warn(
@ -115,29 +141,34 @@ async def open_root_actor(
if arbiter_addr is not None:
warnings.warn(
'`arbiter_addr` is now deprecated and has been renamed to'
'`registry_addr`.\nUse that instead..',
'`arbiter_addr` is now deprecated\n'
'Use `registry_addrs: list[tuple]` instead..',
DeprecationWarning,
stacklevel=2,
)
registry_addrs = [arbiter_addr]
registry_addr = (host, port) = (
registry_addr
or arbiter_addr
or (
_default_arbiter_host,
_default_arbiter_port,
)
registry_addrs: list[tuple[str, int]] = (
registry_addrs
or
_default_lo_addrs
)
assert registry_addrs
loglevel = (loglevel or log._default_loglevel).upper()
loglevel = (
loglevel
or log._default_loglevel
).upper()
if debug_mode and _spawn._spawn_method == 'trio':
if (
debug_mode
and _spawn._spawn_method == 'trio'
):
_state._runtime_vars['_debug_mode'] = True
# expose internal debug module to every actor allowing
# for use of ``await tractor.breakpoint()``
enable_modules.append('tractor._debug')
# expose internal debug module to every actor allowing for
# use of ``await tractor.pause()``
enable_modules.append('tractor.devx._debug')
# if debug mode get's enabled *at least* use that level of
# logging for some informative console prompts.
@ -155,75 +186,146 @@ async def open_root_actor(
"Debug mode is only supported for the `trio` backend!"
)
log.get_console_log(loglevel)
assert loglevel
_log = log.get_console_log(loglevel)
assert _log
try:
# make a temporary connection to see if an arbiter exists,
# if one can't be made quickly we assume none exists.
arbiter_found = False
# TODO: factor this into `.devx._stackscope`!!
if debug_mode:
try:
logger.info('Enabling `stackscope` traces on SIGUSR1')
from .devx import enable_stack_on_sig
enable_stack_on_sig()
except ImportError:
logger.warning(
'`stackscope` not installed for use in debug mode!'
)
# TODO: this connect-and-bail forces us to have to carefully
# rewrap TCP 104-connection-reset errors as EOF so as to avoid
# propagating cancel-causing errors to the channel-msg loop
# machinery. Likely it would be better to eventually have
# a "discovery" protocol with basic handshake instead.
with trio.move_on_after(1):
async with _connect_chan(host, port):
arbiter_found = True
# closed into below ping task-func
ponged_addrs: list[tuple[str, int]] = []
except OSError:
# TODO: make this a "discovery" log level?
logger.warning(f"No actor registry found @ {host}:{port}")
async def ping_tpt_socket(
addr: tuple[str, int],
timeout: float = 1,
) -> None:
'''
Attempt temporary connection to see if a registry is
listening at the requested address by a transport layer
ping.
# create a local actor and start up its main routine/task
if arbiter_found:
If a connection can't be made quickly we assume no
server is listening at that addr.
'''
try:
# TODO: this connect-and-bail forces us to have to
# carefully rewrap TCP 104-connection-reset errors as
# EOF so as to avoid propagating cancel-causing errors
# to the channel-msg loop machinery. Likely it would
# be better to eventually have a "discovery" protocol
# with basic handshake instead?
with trio.move_on_after(timeout):
async with _connect_chan(*addr):
ponged_addrs.append(addr)
except OSError:
# TODO: make this a "discovery" log level?
logger.warning(f'No actor registry found @ {addr}')
async with trio.open_nursery() as tn:
for addr in registry_addrs:
tn.start_soon(
ping_tpt_socket,
tuple(addr), # TODO: just drop this requirement?
)
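The same connect-and-bail probe can be sketched with plain `trio` primitives (generic code, no tractor internals assumed), which may help clarify the concurrent ping fan-out above:

import trio


async def probe_addrs(
    addrs: list[tuple[str, int]],
    timeout: float = 1,
) -> list[tuple[str, int]]:
    # collect every addr which accepted a TCP connection within `timeout`
    ponged: list[tuple[str, int]] = []

    async def ping(addr: tuple[str, int]) -> None:
        host, port = addr
        with trio.move_on_after(timeout):
            try:
                stream = await trio.open_tcp_stream(host, port)
                await stream.aclose()
                ponged.append(addr)
            except OSError:
                pass  # refused/reset: assume no registry is listening

    async with trio.open_nursery() as tn:
        for addr in addrs:
            tn.start_soon(ping, addr)

    return ponged


# e.g. `trio.run(probe_addrs, [('127.0.0.1', 1616)])`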
trans_bind_addrs: list[tuple[str, int]] = []
# Create a new local root-actor instance which IS NOT THE
# REGISTRAR
if ponged_addrs:
if ensure_registry:
raise RuntimeError(
f'Failed to open `{name}`@{ponged_addrs}: '
'registry socket(s) already bound'
)
# we were able to connect to an arbiter
logger.info(f"Arbiter seems to exist @ {host}:{port}")
logger.info(
f'Registry(s) seem(s) to exist @ {ponged_addrs}'
)
actor = Actor(
name or 'anonymous',
arbiter_addr=registry_addr,
name=name or 'anonymous',
registry_addrs=ponged_addrs,
loglevel=loglevel,
enable_modules=enable_modules,
)
host, port = (host, 0)
# DO NOT use the registry_addrs as the transport server
# addrs for this new non-registrar, root-actor.
for host, port in ponged_addrs:
# NOTE: zero triggers dynamic OS port allocation
trans_bind_addrs.append((host, 0))
# Start this local actor as the "registrar", aka a regular
# actor who manages the local registry of "mailboxes" of
# other process-tree-local sub-actors.
else:
# start this local actor as the arbiter (aka a regular actor who
# manages the local registry of "mailboxes")
# Note that if the current actor is the arbiter it is desirable
# for it to stay up indefinitely until a re-election process has
# taken place - which is not implemented yet FYI).
# NOTE that if the current actor IS THE REGISTRAR, the
# following init steps are taken:
# - the transport layer server is bound to each (host, port)
# pair defined in provided registry_addrs, or the default.
trans_bind_addrs = registry_addrs
# - it is normally desirable for any registrar to stay up
# indefinitely until either all registered (child/sub)
# actors are terminated (via SC supervision) or,
# a re-election process has taken place.
# NOTE: all of ^ which is not implemented yet - see:
# https://github.com/goodboy/tractor/issues/216
# https://github.com/goodboy/tractor/pull/348
# https://github.com/goodboy/tractor/issues/296
actor = Arbiter(
name or 'arbiter',
arbiter_addr=registry_addr,
name or 'registrar',
registry_addrs=registry_addrs,
loglevel=loglevel,
enable_modules=enable_modules,
)
# Start up main task set via core actor-runtime nurseries.
try:
# assign process-local actor
_state._current_actor = actor
# start local channel-server and fake the portal API
# NOTE: this won't block since we provide the nursery
logger.info(f"Starting local {actor} @ {host}:{port}")
ml_addrs_str: str = '\n'.join(
f'@{addr}' for addr in trans_bind_addrs
)
logger.info(
f'Starting local {actor.uid} on the following transport addrs:\n'
f'{ml_addrs_str}'
)
# start the actor runtime in a new task
async with trio.open_nursery() as nursery:
# ``_runtime.async_main()`` creates an internal nursery and
# thus blocks here until the entire underlying actor tree has
# terminated thereby conducting structured concurrency.
# ``_runtime.async_main()`` creates an internal nursery
# and blocks here until any underlying actor(-process)
# tree has terminated thereby conducting so called
# "end-to-end" structured concurrency throughout an
# entire hierarchical python sub-process set; all
# "actor runtime" primitives are SC-compat and thus all
# transitively spawned actors/processes must be as
# well.
await nursery.start(
partial(
async_main,
actor,
accept_addr=(host, port),
accept_addrs=trans_bind_addrs,
parent_addr=None
)
)
@ -235,12 +337,16 @@ async def open_root_actor(
BaseExceptionGroup,
) as err:
entered = await _debug._maybe_enter_pm(err)
entered: bool = await _debug._maybe_enter_pm(err)
if not entered and not is_multi_cancelled(err):
logger.exception("Root actor crashed:")
if (
not entered
and not is_multi_cancelled(err)
):
logger.exception('Root actor crashed:\n')
# always re-raise
# ALWAYS re-raise any error bubbled up from the
# runtime!
raise
finally:
@ -253,12 +359,15 @@ async def open_root_actor(
# for an in nurseries:
# tempn.start_soon(an.exited.wait)
logger.cancel("Shutting down root actor")
await actor.cancel()
logger.info(
'Closing down root actor'
)
await actor.cancel(None) # self cancel
finally:
_state._current_actor = None
_state._last_actor_terminated = actor
# restore breakpoint hook state
# restore built-in `breakpoint()` hook state
sys.breakpointhook = builtin_bp_handler
if orig_bp_path is not None:
os.environ['PYTHONBREAKPOINT'] = orig_bp_path
@ -274,10 +383,7 @@ def run_daemon(
# runtime kwargs
name: str | None = 'root',
registry_addr: tuple[str, int] = (
_default_arbiter_host,
_default_arbiter_port,
),
registry_addrs: list[tuple[str, int]] = _default_lo_addrs,
start_method: str | None = None,
debug_mode: bool = False,
@ -301,7 +407,7 @@ def run_daemon(
async def _main():
async with open_root_actor(
registry_addr=registry_addr,
registry_addrs=registry_addrs,
name=name,
start_method=start_method,
debug_mode=debug_mode,

1092  tractor/_rpc.py 100644
File diff suppressed because it is too large

File diff suppressed because it is too large

833  tractor/_shm.py 100644

@ -0,0 +1,833 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
SC friendly shared memory management geared at real-time
processing.
Support for ``numpy`` compatible array-buffers is provided but is
considered optional within the context of this runtime-library.
"""
from __future__ import annotations
from sys import byteorder
import time
from typing import Optional
from multiprocessing import shared_memory as shm
from multiprocessing.shared_memory import (
SharedMemory,
ShareableList,
)
from msgspec import Struct
import tractor
from .log import get_logger
_USE_POSIX = getattr(shm, '_USE_POSIX', False)
if _USE_POSIX:
from _posixshmem import shm_unlink
try:
import numpy as np
from numpy.lib import recfunctions as rfn
# import nptyping
except ImportError:
pass
log = get_logger(__name__)
def disable_mantracker():
'''
Disable all ``multiprocessing`` "resource tracking" machinery since
it's an absolute multi-threaded mess of non-SC madness.
'''
from multiprocessing import resource_tracker as mantracker
# Tell the "resource tracker" thing to fuck off.
class ManTracker(mantracker.ResourceTracker):
def register(self, name, rtype):
pass
def unregister(self, name, rtype):
pass
def ensure_running(self):
pass
# "know your land and know your prey"
# https://www.dailymotion.com/video/x6ozzco
mantracker._resource_tracker = ManTracker()
mantracker.register = mantracker._resource_tracker.register
mantracker.ensure_running = mantracker._resource_tracker.ensure_running
mantracker.unregister = mantracker._resource_tracker.unregister
mantracker.getfd = mantracker._resource_tracker.getfd
disable_mantracker()
class SharedInt:
'''
Wrapper around a single entry shared memory array which
holds an ``int`` value used as an index counter.
'''
def __init__(
self,
shm: SharedMemory,
) -> None:
self._shm = shm
@property
def value(self) -> int:
return int.from_bytes(self._shm.buf, byteorder)
@value.setter
def value(self, value) -> None:
self._shm.buf[:] = value.to_bytes(self._shm.size, byteorder)
def destroy(self) -> None:
if _USE_POSIX:
# We manually unlink to bypass all the "resource tracker"
# nonsense meant for non-SC systems.
name = self._shm.name
try:
shm_unlink(name)
except FileNotFoundError:
# might be a teardown race here?
log.warning(f'Shm for {name} already unlinked?')
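A tiny usage sketch for `SharedInt` (assuming the class exactly as defined above); the manual `.destroy()` replaces the stdlib resource-tracker unlinking that gets disabled earlier in this module.

from multiprocessing.shared_memory import SharedMemory

seg = SharedMemory(name='demo_counter', create=True, size=4)
counter = SharedInt(shm=seg)

counter.value = 42           # serialized into the raw shm buffer
assert counter.value == 42   # any process attached to 'demo_counter' sees this

seg.close()                  # drop this process' mapping
counter.destroy()            # POSIX unlink, bypassing the resource tracker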
class NDToken(Struct, frozen=True):
'''
Internal representation of a shared memory ``numpy`` array "token"
which can be used to key and load a system (OS) wide shm entry
and correctly read the array by type signature.
This type is msg safe.
'''
shm_name: str # this serves as a "key" value
shm_first_index_name: str
shm_last_index_name: str
dtype_descr: tuple
size: int # in struct-array index / row terms
# TODO: use nptyping here on dtypes
@property
def dtype(self) -> list[tuple[str, str, tuple[int, ...]]]:
return np.dtype(
list(
map(tuple, self.dtype_descr)
)
).descr
def as_msg(self):
return self.to_dict()
@classmethod
def from_msg(cls, msg: dict) -> NDToken:
if isinstance(msg, NDToken):
return msg
# TODO: native struct decoding
# return _token_dec.decode(msg)
msg['dtype_descr'] = tuple(map(tuple, msg['dtype_descr']))
return NDToken(**msg)
# _token_dec = msgspec.msgpack.Decoder(NDToken)
# TODO: this api?
# _known_tokens = tractor.ActorVar('_shm_tokens', {})
# _known_tokens = tractor.ContextStack('_known_tokens', )
# _known_tokens = trio.RunVar('shms', {})
# TODO: this should maybe be provided via
# a `.trionics.maybe_open_context()` wrapper factory?
# process-local store of keys to tokens
_known_tokens: dict[str, NDToken] = {}
def get_shm_token(key: str) -> NDToken | None:
'''
Convenience func to check if a token
for the provided key is known by this process.
Returns either the ``numpy`` token or a string for a shared list.
'''
return _known_tokens.get(key)
def _make_token(
key: str,
size: int,
dtype: np.dtype,
) -> NDToken:
'''
Create a serializable token that can be used
to access a shared array.
'''
return NDToken(
shm_name=key,
shm_first_index_name=key + "_first",
shm_last_index_name=key + "_last",
dtype_descr=tuple(np.dtype(dtype).descr),
size=size,
)
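A small sketch of what `_make_token()` produces for a structured dtype (the field names here are purely illustrative):

import numpy as np

ohlc_like = np.dtype([
    ('index', '<i8'),
    ('open',  '<f8'),
    ('close', '<f8'),
])

token = _make_token(key='demo.ohlc', size=1024, dtype=ohlc_like)

assert token.shm_name == 'demo.ohlc'
assert token.shm_first_index_name == 'demo.ohlc_first'
assert token.shm_last_index_name == 'demo.ohlc_last'
assert np.dtype(token.dtype) == ohlc_like  # dtype round-trips via `.descr`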
class ShmArray:
'''
A shared memory ``numpy.ndarray`` API.
An underlying shared memory buffer is allocated based on
a user specified ``numpy.ndarray``. This fixed size array
can be read and written to by pushing data both onto the "front"
or "back" of a set index range. The indexes for the "first" and
"last" index are themselves stored in shared memory (accessed via
``SharedInt`` interfaces) values such that multiple processes can
interact with the same array using a synchronized-index.
'''
def __init__(
self,
shmarr: np.ndarray,
first: SharedInt,
last: SharedInt,
shm: SharedMemory,
# readonly: bool = True,
) -> None:
self._array = shmarr
# indexes for first and last indices corresponding
# to filled data
self._first = first
self._last = last
self._len = len(shmarr)
self._shm = shm
self._post_init: bool = False
# pushing data does not write the index (aka primary key)
self._write_fields: list[str] | None = None
dtype = shmarr.dtype
if dtype.fields:
self._write_fields = list(shmarr.dtype.fields.keys())[1:]
# TODO: ringbuf api?
@property
def _token(self) -> NDToken:
return NDToken(
shm_name=self._shm.name,
shm_first_index_name=self._first._shm.name,
shm_last_index_name=self._last._shm.name,
dtype_descr=tuple(self._array.dtype.descr),
size=self._len,
)
@property
def token(self) -> dict:
"""Shared memory token that can be serialized and used by
another process to attach to this array.
"""
return self._token.as_msg()
@property
def index(self) -> int:
return self._last.value % self._len
@property
def array(self) -> np.ndarray:
'''
Return an up-to-date ``np.ndarray`` view of the
so-far-written data to the underlying shm buffer.
'''
a = self._array[self._first.value:self._last.value]
# first, last = self._first.value, self._last.value
# a = self._array[first:last]
# TODO: eventually comment this once we've not seen it in the
# wild in a long time..
# XXX: race where first/last indexes cause a reader
# to load an empty array..
if len(a) == 0 and self._post_init:
raise RuntimeError('Empty array race condition hit!?')
# breakpoint()
return a
def ustruct(
self,
fields: Optional[list[str]] = None,
# type that all field values will be cast to
# in the returned view.
common_dtype: np.dtype = float,
) -> np.ndarray:
array = self._array
if fields:
selection = array[fields]
# fcount = len(fields)
else:
selection = array
# fcount = len(array.dtype.fields)
# XXX: manual ``.view()`` attempt that also doesn't work.
# uview = selection.view(
# dtype='<f16',
# ).reshape(-1, 4, order='A')
# assert len(selection) == len(uview)
u = rfn.structured_to_unstructured(
selection,
# dtype=float,
copy=True,
)
# unstruct = np.ndarray(u.shape, dtype=a.dtype, buffer=shm.buf)
# array[:] = a[:]
return u
# return ShmArray(
# shmarr=u,
# first=self._first,
# last=self._last,
# shm=self._shm
# )
def last(
self,
length: int = 1,
) -> np.ndarray:
'''
Return the last ``length``'s worth of ("row") entries from the
array.
'''
return self.array[-length:]
def push(
self,
data: np.ndarray,
field_map: Optional[dict[str, str]] = None,
prepend: bool = False,
update_first: bool = True,
start: int | None = None,
) -> int:
'''
Ring buffer like "push" to append data
into the buffer and return updated "last" index.
NB: no actual ring logic yet to give a "loop around" on overflow
condition, lel.
'''
length = len(data)
if prepend:
index = (start or self._first.value) - length
if index < 0:
raise ValueError(
f'Array size of {self._len} was overrun during prepend.\n'
f'You have passed {abs(index)} too many datums.'
)
else:
index = start if start is not None else self._last.value
end = index + length
if field_map:
src_names, dst_names = zip(*field_map.items())
else:
dst_names = src_names = self._write_fields
try:
self._array[
list(dst_names)
][index:end] = data[list(src_names)][:]
# NOTE: there was a race here between updating
# the first and last indices and when the next reader
# tries to access ``.array`` (which due to the index
# overlap will be empty). Pretty sure we've fixed it now
# but leaving this here as a reminder.
if (
prepend
and update_first
and length
):
assert index < self._first.value
if (
index < self._first.value
and update_first
):
assert prepend, 'prepend=True not passed but index decreased?'
self._first.value = index
elif not prepend:
self._last.value = end
self._post_init = True
return end
except ValueError as err:
if field_map:
raise
# should raise if diff detected
self.diff_err_fields(data)
raise err
def diff_err_fields(
self,
data: np.ndarray,
) -> None:
# reraise with any field discrepancy
our_fields, their_fields = (
set(self._array.dtype.fields),
set(data.dtype.fields),
)
only_in_ours = our_fields - their_fields
only_in_theirs = their_fields - our_fields
if only_in_ours:
raise TypeError(
f"Input array is missing field(s): {only_in_ours}"
)
elif only_in_theirs:
raise TypeError(
f"Input array has unknown field(s): {only_in_theirs}"
)
# TODO: support "silent" prepends that don't update ._first.value?
def prepend(
self,
data: np.ndarray,
) -> int:
end = self.push(data, prepend=True)
assert end
def close(self) -> None:
self._first._shm.close()
self._last._shm.close()
self._shm.close()
def destroy(self) -> None:
if _USE_POSIX:
# We manually unlink to bypass all the "resource tracker"
# nonsense meant for non-SC systems.
shm_unlink(self._shm.name)
self._first.destroy()
self._last.destroy()
def flush(self) -> None:
# TODO: flush to storage backend like markestore?
...
def open_shm_ndarray(
size: int,
key: str | None = None,
dtype: np.dtype | None = None,
append_start_index: int | None = None,
readonly: bool = False,
) -> ShmArray:
'''
Open a shared memory ``numpy`` array using the standard library.
This call unlinks (aka permanently destroys) the buffer on teardown
and thus should be used from the parent-most accessor (process).
'''
# create new shared mem segment for which we
# have write permission
a = np.zeros(size, dtype=dtype)
a['index'] = np.arange(len(a))
shm = SharedMemory(
name=key,
create=True,
size=a.nbytes
)
array = np.ndarray(
a.shape,
dtype=a.dtype,
buffer=shm.buf
)
array[:] = a[:]
array.setflags(write=int(not readonly))
token = _make_token(
key=key,
size=size,
dtype=dtype,
)
# create single entry arrays for storing the first and last indices
first = SharedInt(
shm=SharedMemory(
name=token.shm_first_index_name,
create=True,
size=4, # std int
)
)
last = SharedInt(
shm=SharedMemory(
name=token.shm_last_index_name,
create=True,
size=4, # std int
)
)
# Start the "real-time" append-updated (or "pushed-to") section
# after some start index: ``append_start_index``. This allows appending
# from a start point in the array which isn't the 0 index and looks
# something like,
# -------------------------
# | | i
# _________________________
# <-------------> <------->
# history real-time
#
# Once fully "prepended", the history section will leave the
# ``ShmArray._start.value: int = 0`` and the yet-to-be written
# real-time section will start at ``ShmArray.index: int``.
# this sets the index to nearly 2/3rds into the length of
# the buffer leaving at least a "days worth of second samples"
# for the real-time section.
if append_start_index is None:
append_start_index = round(size * 0.616)
last.value = first.value = append_start_index
shmarr = ShmArray(
array,
first,
last,
shm,
)
assert shmarr._token == token
_known_tokens[key] = shmarr.token
# "unlink" created shm on process teardown by
# pushing teardown calls onto actor context stack
stack = tractor.current_actor().lifetime_stack
stack.callback(shmarr.close)
stack.callback(shmarr.destroy)
return shmarr
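To show the mechanics the helper above assembles, here is a hedged, standalone sketch which wires up the same pieces by hand (skipping the actor `lifetime_stack` hooks and token caching) and then exercises `.push()`/`.array`; all names are illustrative.

import numpy as np
from multiprocessing.shared_memory import SharedMemory

dtype = np.dtype([('index', '<i8'), ('value', '<f8')])
size = 64

# backing segments: the struct-array buffer plus the two shared counters
data_shm = SharedMemory(name='demo.vals', create=True, size=int(dtype.itemsize * size))
first = SharedInt(shm=SharedMemory(name='demo.vals_first', create=True, size=4))
last = SharedInt(shm=SharedMemory(name='demo.vals_last', create=True, size=4))
first.value = last.value = 0   # start the append section at index 0

arr = np.ndarray((size,), dtype=dtype, buffer=data_shm.buf)
arr['index'] = np.arange(size)

shma = ShmArray(arr, first, last, data_shm)

rows = np.zeros(3, dtype=dtype)
rows['value'] = [1.5, 2.5, 3.5]

shma.push(rows)              # appends at the current "last" index
print(shma.array['value'])   # -> [1.5 2.5 3.5]

shma.close()
shma.destroy()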
def attach_shm_ndarray(
token: tuple[str, str, tuple[str, str]],
readonly: bool = True,
) -> ShmArray:
'''
Attach to an existing shared memory array previously
created by another process using ``open_shared_array``.
No new shared mem is allocated but wrapper types for read/write
access are constructed.
'''
token = NDToken.from_msg(token)
key = token.shm_name
if key in _known_tokens:
assert NDToken.from_msg(_known_tokens[key]) == token, "WTF"
# XXX: ugh, looks like due to the ``shm_open()`` C api we can't
# actually place files in a subdir, see discussion here:
# https://stackoverflow.com/a/11103289
# attach to array buffer and view as per dtype
_err: Optional[Exception] = None
for _ in range(3):
try:
shm = SharedMemory(
name=key,
create=False,
)
break
except OSError as oserr:
_err = oserr
time.sleep(0.1)
else:
if _err:
raise _err
shmarr = np.ndarray(
(token.size,),
dtype=token.dtype,
buffer=shm.buf
)
shmarr.setflags(write=int(not readonly))
first = SharedInt(
shm=SharedMemory(
name=token.shm_first_index_name,
create=False,
size=4, # std int
),
)
last = SharedInt(
shm=SharedMemory(
name=token.shm_last_index_name,
create=False,
size=4, # std int
),
)
# make sure we can read
first.value
sha = ShmArray(
shmarr,
first,
last,
shm,
)
# read test
sha.array
# Stash key -> token knowledge for future queries
# via `maybe_open_shm_array()` but only after we know
# we can attach.
if key not in _known_tokens:
_known_tokens[key] = token
# "close" attached shm on actor teardown
tractor.current_actor().lifetime_stack.callback(sha.close)
return sha
def maybe_open_shm_ndarray(
key: str, # unique identifier for segment
size: int,
dtype: np.dtype | None = None,
append_start_index: int = 0,
readonly: bool = True,
) -> tuple[ShmArray, bool]:
'''
Attempt to attach to a shared memory block using a "key" lookup
to registered blocks in the users overall "system" registry
(presumes you don't have the block's explicit token).
This function is meant to solve the problem of discovering whether
a shared array token has been allocated or discovered by the actor
running in **this** process. Systems where multiple actors may seek
to access a common block can use this function to attempt to acquire
a token as discovered by the actors who have previously stored
a "key" -> ``NDToken`` map in an actor local (aka python global)
variable.
If you know the explicit ``NDToken`` for your memory segment instead
use ``attach_shm_array``.
'''
try:
# see if we already know this key
token = _known_tokens[key]
return (
attach_shm_ndarray(
token=token,
readonly=readonly,
),
False, # not newly opened
)
except KeyError:
log.warning(f"Could not find {key} in shms cache")
if dtype:
token = _make_token(
key,
size=size,
dtype=dtype,
)
else:
try:
return (
attach_shm_ndarray(
token=token,
readonly=readonly,
),
False,
)
except FileNotFoundError:
log.warning(f"Could not attach to shm with token {token}")
# This actor does not know about memory
# associated with the provided "key".
# Attempt to open a block and expect
# to fail if a block has been allocated
# on the OS by someone else.
return (
open_shm_ndarray(
key=key,
size=size,
dtype=dtype,
append_start_index=append_start_index,
readonly=readonly,
),
True,
)
class ShmList(ShareableList):
'''
Carbon copy of ``.shared_memory.ShareableList`` with a few
enhancements:
- readonly mode via instance var flag `._readonly: bool`
- ``.__getitem__()`` accepts ``slice`` inputs
- exposes the underlying buffer "name" as a ``.key: str``
'''
def __init__(
self,
sequence: list | None = None,
*,
name: str | None = None,
readonly: bool = True
) -> None:
self._readonly = readonly
self._key = name
return super().__init__(
sequence=sequence,
name=name,
)
@property
def key(self) -> str:
return self._key
@property
def readonly(self) -> bool:
return self._readonly
def __setitem__(
self,
position,
value,
) -> None:
# mimic ``numpy`` error
if self._readonly:
raise ValueError('assignment destination is read-only')
return super().__setitem__(position, value)
def __getitem__(
self,
indexish,
) -> list:
# NOTE: this is a non-writeable view (copy?) of the buffer
# in a new list instance.
if isinstance(indexish, slice):
return list(self)[indexish]
return super().__getitem__(indexish)
# TODO: should we offer a `.array` and `.push()` equivalent
# to the `ShmArray`?
# currently we have the following limitations:
# - can't write slices of input using traditional slice-assign
# syntax due to the ``ShareableList.__setitem__()`` implementation.
# - ``list(shmlist)`` returns a non-mutable copy instead of
# a writeable view which would be handier numpy-style ops.
def open_shm_list(
key: str,
sequence: list | None = None,
size: int = int(2 ** 10),
dtype: float | int | bool | str | bytes | None = float,
readonly: bool = True,
) -> ShmList:
if sequence is None:
default = {
float: 0.,
int: 0,
bool: True,
str: 'doggy',
None: None,
}[dtype]
sequence = [default] * size
shml = ShmList(
sequence=sequence,
name=key,
readonly=readonly,
)
# "close" attached shm on actor teardown
try:
actor = tractor.current_actor()
actor.lifetime_stack.callback(shml.shm.close)
actor.lifetime_stack.callback(shml.shm.unlink)
except RuntimeError:
log.warning('tractor runtime not active, skipping teardown steps')
return shml
def attach_shm_list(
key: str,
readonly: bool = False,
) -> ShmList:
return ShmList(
name=key,
readonly=readonly,
)
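And a quick sketch for the list variant (assumed usage); outside a running actor the teardown hooks are skipped with a warning so unlinking would be manual.

# writer side: allocate and fill a small float list
shl = open_shm_list(
    key='demo.shm.list',
    size=8,
    dtype=float,
    readonly=False,
)
shl[0] = 3.14

# reader side: attach by key (note `attach_shm_list()` defaults to
# `readonly=False` as defined above)
ro = attach_shm_list(key='demo.shm.list', readonly=True)
assert ro[0] == 3.14
assert ro[:2] == [3.14, 0.0]   # slice support added by `ShmList.__getitem__()`
# ro[0] = 1  # would raise ValueError: assignment destination is read-only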


@ -19,6 +19,7 @@ Machinery for actor process spawning using multiple backends.
"""
from __future__ import annotations
import multiprocessing as mp
import sys
import platform
from typing import (
@ -30,30 +31,28 @@ from typing import (
TYPE_CHECKING,
)
from exceptiongroup import BaseExceptionGroup
import trio
from trio_typing import TaskStatus
from trio import TaskStatus
from ._debug import (
from tractor.devx import (
maybe_wait_for_debugger,
acquire_debug_lock,
)
from ._state import (
from tractor._state import (
current_actor,
is_main_process,
is_root_process,
debug_mode,
)
from .log import get_logger
from ._portal import Portal
from ._runtime import Actor
from ._entry import _mp_main
from ._exceptions import ActorFailure
from tractor.log import get_logger
from tractor._portal import Portal
from tractor._runtime import Actor
from tractor._entry import _mp_main
from tractor._exceptions import ActorFailure
if TYPE_CHECKING:
from ._supervise import ActorNursery
import multiprocessing as mp
ProcessType = TypeVar('ProcessType', mp.Process, trio.Process)
log = get_logger('tractor')
@ -70,7 +69,6 @@ _spawn_method: SpawnMethodKey = 'trio'
if platform.system() == 'Windows':
import multiprocessing as mp
_ctx = mp.get_context("spawn")
async def proc_waiter(proc: mp.Process) -> None:
@ -145,7 +143,7 @@ async def exhaust_portal(
# XXX: streams should never be reaped here since they should
# always be established and shutdown using a context manager api
final = await portal.result()
final: Any = await portal.result()
except (
Exception,
@ -153,13 +151,23 @@ async def exhaust_portal(
) as err:
# we reraise in the parent task via a ``BaseExceptionGroup``
return err
except trio.Cancelled as err:
# lol, of course we need this too ;P
# TODO: merge with above?
log.warning(f"Cancelled result waiter for {portal.actor.uid}")
log.warning(
'Cancelled portal result waiter task:\n'
f'uid: {portal.channel.uid}\n'
f'error: {err}\n'
)
return err
else:
log.debug(f"Returning final result: {final}")
log.debug(
f'Returning final result from portal:\n'
f'uid: {portal.channel.uid}\n'
f'result: {final}\n'
)
return final
@ -171,41 +179,71 @@ async def cancel_on_completion(
) -> None:
'''
Cancel actor gracefully once it's "main" portal's
Cancel actor gracefully once its "main" portal's
result arrives.
Should only be called for actors spawned with `run_in_actor()`.
Should only be called for actors spawned via the
`Portal.run_in_actor()` API.
=> and really this API will be deprecated and should be
re-implemented as a `.hilevel.one_shot_task_nursery()`..)
'''
# if this call errors we store the exception for later
# in ``errors`` which will be reraised inside
# an exception group and we still send out a cancel request
result = await exhaust_portal(portal, actor)
result: Any|Exception = await exhaust_portal(portal, actor)
if isinstance(result, Exception):
errors[actor.uid] = result
log.warning(
f"Cancelling {portal.channel.uid} after error {result}"
errors[actor.uid]: Exception = result
log.cancel(
'Cancelling subactor runtime due to error:\n\n'
f'Portal.cancel_actor() => {portal.channel.uid}\n\n'
f'error: {result}\n'
)
else:
log.runtime(
f"Cancelling {portal.channel.uid} gracefully "
f"after result {result}")
'Cancelling subactor gracefully:\n\n'
f'Portal.cancel_actor() => {portal.channel.uid}\n\n'
f'result: {result}\n'
)
# cancel the process now that we have a final result
await portal.cancel_actor()
async def do_hard_kill(
async def hard_kill(
proc: trio.Process,
terminate_after: int = 3,
terminate_after: int = 1.6,
# NOTE: for mucking with `.pause()`-ing inside the runtime
# whilst also hacking on it XD
# terminate_after: int = 99999,
) -> None:
'''
Un-gracefully terminate an OS level `trio.Process` after timeout.
Used in 2 main cases:
- "unknown remote runtime state": a hanging/stalled actor that
isn't responding after sending a (graceful) runtime cancel
request via an IPC msg.
- "cancelled during spawn": a process who's actor runtime was
cancelled before full startup completed (such that
cancel-request-handling machinery was never fully
initialized) and thus a "cancel request msg" is never going
to be handled.
'''
log.cancel(
'Terminating sub-proc:\n'
f'|_{proc}\n'
)
# NOTE: this timeout used to do nothing since we were shielding
# the ``.wait()`` inside ``new_proc()`` which will pretty much
# never release until the process exits, now it acts as
# a hard-kill time ultimatum.
log.debug(f"Terminating {proc}")
with trio.move_on_after(terminate_after) as cs:
# NOTE: code below was copied verbatim from the now deprecated
@ -216,6 +254,9 @@ async def do_hard_kill(
# and wait for it to exit. If cancelled, kills the process and
# waits for it to finish exiting before propagating the
# cancellation.
#
# This code was originally triggered by ``proc.__aexit__()``
# but now must be called manually.
with trio.CancelScope(shield=True):
if proc.stdin is not None:
await proc.stdin.aclose()
@ -231,15 +272,25 @@ async def do_hard_kill(
with trio.CancelScope(shield=True):
await proc.wait()
# XXX NOTE XXX: zombie squad dispatch:
# (should ideally never, but) If we do get here it means
# graceful termination of a process failed and we need to
# resort to OS level signalling to interrupt and cancel the
# (presumably stalled or hung) actor. Since we never allow
# zombies (as a feature) we ask the OS to send in the
# removal squad as the last resort.
if cs.cancelled_caught:
# XXX: should pretty much never get here unless we have
# to move the bits from ``proc.__aexit__()`` out and
# into here.
log.critical(f"#ZOMBIE_LORD_IS_HERE: {proc}")
# TODO: toss in the skynet-logo face as ascii art?
log.critical(
# 'Well, the #ZOMBIE_LORD_IS_HERE# to collect\n'
'#T-800 deployed to collect zombie B0\n'
f'|\n'
f'|_{proc}\n'
)
proc.kill()
async def soft_wait(
async def soft_kill(
proc: ProcessType,
wait_func: Callable[
@ -249,14 +300,26 @@ async def soft_wait(
portal: Portal,
) -> None:
# Wait for proc termination but **don't yet** call
# ``trio.Process.__aexit__()`` (it tears down stdio
# which will kill any waiting remote pdb trace).
# This is a "soft" (cancellable) join/reap.
uid = portal.channel.uid
'''
Wait for proc termination but **don't yet** teardown
std-streams since it will clobber any ongoing pdb REPL
session.
This is our "soft"/graceful, and thus itself also cancellable,
join/reap on an actor-runtime-in-process shutdown; it is
**not** the same as a "hard kill" via an OS signal (for that
see `.hard_kill()`).
'''
uid: tuple[str, str] = portal.channel.uid
try:
log.cancel(f'Soft waiting on actor:\n{uid}')
log.cancel(
'Soft killing sub-actor via `Portal.cancel_actor()`\n'
f'|_{proc}\n'
)
# wait on sub-proc to signal termination
await wait_func(proc)
except trio.Cancelled:
# if cancelled during a soft wait, cancel the child
# actor before entering the hard reap sequence
@ -268,22 +331,29 @@ async def soft_wait(
async def cancel_on_proc_deth():
'''
Cancel the actor cancel request if we detect that
that the process terminated.
"Cancel-the-cancel" request: if we detect that the
underlying sub-process exited prior to
a `Portal.cancel_actor()` call completing.
'''
await wait_func(proc)
n.cancel_scope.cancel()
# start a task to wait on the termination of the
# process by itself waiting on a (caller provided) wait
# function which should unblock when the target process
# has terminated.
n.start_soon(cancel_on_proc_deth)
# send the actor-runtime a cancel request.
await portal.cancel_actor()
if proc.poll() is None: # type: ignore
log.warning(
'Actor still alive after cancel request:\n'
f'{uid}'
'Subactor still alive after cancel request?\n\n'
f'uid: {uid}\n'
f'|_{proc}\n'
)
n.cancel_scope.cancel()
raise
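The soft-then-hard escalation these two helpers implement can be sketched with plain `trio` (generic code which drops all the debugger/logging plumbing):

import trio


async def reap(proc: trio.Process, grace: float = 1.6) -> int | None:
    # "soft": give the child `grace` seconds to exit on its own (in the
    # real helpers this window also covers the remote
    # `Portal.cancel_actor()` request).
    with trio.move_on_after(grace) as cs:
        await proc.wait()

    # "hard": deadline hit, dispatch the zombie squad.
    if cs.cancelled_caught:
        proc.kill()
        with trio.CancelScope(shield=True):
            await proc.wait()

    return proc.returncode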
@ -295,7 +365,7 @@ async def new_proc(
errors: dict[tuple[str, str], Exception],
# passed through to actor main
bind_addr: tuple[str, int],
bind_addrs: list[tuple[str, int]],
parent_addr: tuple[str, int],
_runtime_vars: dict[str, Any], # serialized and sent to _child
@ -307,7 +377,7 @@ async def new_proc(
) -> None:
# lookup backend spawning target
target = _methods[_spawn_method]
target: Callable = _methods[_spawn_method]
# mark the new actor with the global spawn method
subactor._spawn_method = _spawn_method
@ -317,7 +387,7 @@ async def new_proc(
actor_nursery,
subactor,
errors,
bind_addr,
bind_addrs,
parent_addr,
_runtime_vars, # run time vars
infect_asyncio=infect_asyncio,
@ -332,7 +402,7 @@ async def trio_proc(
errors: dict[tuple[str, str], Exception],
# passed through to actor main
bind_addr: tuple[str, int],
bind_addrs: list[tuple[str, int]],
parent_addr: tuple[str, int],
_runtime_vars: dict[str, Any], # serialized and sent to _child
*,
@ -375,19 +445,22 @@ async def trio_proc(
spawn_cmd.append("--asyncio")
cancelled_during_spawn: bool = False
proc: trio.Process | None = None
proc: trio.Process|None = None
try:
try:
# TODO: needs ``trio_typing`` patch?
proc = await trio.lowlevel.open_process(spawn_cmd)
log.runtime(f"Started {proc}")
log.runtime(
'Started new sub-proc\n'
f'|_{proc}\n'
)
# wait for actor to spawn and connect back to us
# channel should have handshake completed by the
# local actor by the time we get a ref to it
event, chan = await actor_nursery._actor.wait_for_peer(
subactor.uid)
subactor.uid
)
except trio.Cancelled:
cancelled_during_spawn = True
@ -418,12 +491,11 @@ async def trio_proc(
# send additional init params
await chan.send({
"_parent_main_data": subactor._parent_main_data,
"enable_modules": subactor.enable_modules,
"_arb_addr": subactor._arb_addr,
"bind_host": bind_addr[0],
"bind_port": bind_addr[1],
"_runtime_vars": _runtime_vars,
'_parent_main_data': subactor._parent_main_data,
'enable_modules': subactor.enable_modules,
'reg_addrs': subactor.reg_addrs,
'bind_addrs': bind_addrs,
'_runtime_vars': _runtime_vars,
})
# track subactor in current nursery
@ -449,7 +521,7 @@ async def trio_proc(
# This is a "soft" (cancellable) join/reap which
# will remote cancel the actor on a ``trio.Cancelled``
# condition.
await soft_wait(
await soft_kill(
proc,
trio.Process.wait,
portal
@ -457,9 +529,10 @@ async def trio_proc(
# cancel result waiter that may have been spawned in
# tandem if not done already
log.warning(
"Cancelling existing result waiter task for "
f"{subactor.uid}")
log.cancel(
'Cancelling existing result waiter task for '
f'{subactor.uid}'
)
nursery.cancel_scope.cancel()
finally:
@ -477,22 +550,40 @@ async def trio_proc(
with trio.move_on_after(0.5):
await proc.wait()
if is_root_process():
# TODO: solve the following issue where we need
# to do a similar wait like this but in an
# "intermediary" parent actor that itself isn't
# in debug but has a child that is, and we need
# to hold off on relaying SIGINT until that child
# is complete.
# https://github.com/goodboy/tractor/issues/320
await maybe_wait_for_debugger(
child_in_debug=_runtime_vars.get(
'_debug_mode', False),
)
await maybe_wait_for_debugger(
child_in_debug=_runtime_vars.get(
'_debug_mode', False
),
header_msg=(
'Delaying subproc reaper while debugger locked..\n'
),
# TODO: need a diff value then default?
# poll_steps=9999999,
)
# TODO: solve the following issue where we need
# to do a similar wait like this but in an
# "intermediary" parent actor that itself isn't
# in debug but has a child that is, and we need
# to hold off on relaying SIGINT until that child
# is complete.
# https://github.com/goodboy/tractor/issues/320
# -[ ] we need to handle non-root parent-actors specially
# by somehow determining if a child is in debug and then
# avoiding cancel/kill of said child by this
# (intermediary) parent until such a time as the root says
# the pdb lock is released and we are good to tear down
# (our children)..
#
# -[ ] so maybe something like this where we try to
# acquire the lock and get notified of who has it,
# check that uid against our known children?
# this_uid: tuple[str, str] = current_actor().uid
# await acquire_debug_lock(this_uid)
if proc.poll() is None:
log.cancel(f"Attempting to hard kill {proc}")
await do_hard_kill(proc)
await hard_kill(proc)
log.debug(f"Joined {proc}")
else:
@ -510,7 +601,7 @@ async def mp_proc(
subactor: Actor,
errors: dict[tuple[str, str], Exception],
# passed through to actor main
bind_addr: tuple[str, int],
bind_addrs: list[tuple[str, int]],
parent_addr: tuple[str, int],
_runtime_vars: dict[str, Any], # serialized and sent to _child
*,
@ -568,7 +659,7 @@ async def mp_proc(
target=_mp_main,
args=(
subactor,
bind_addr,
bind_addrs,
fs_info,
_spawn_method,
parent_addr,
@ -636,7 +727,7 @@ async def mp_proc(
# This is a "soft" (cancellable) join/reap which
# will remote cancel the actor on a ``trio.Cancelled``
# condition.
await soft_wait(
await soft_kill(
proc,
proc_waiter,
portal


@ -18,44 +18,88 @@
Per process state
"""
from __future__ import annotations
from typing import (
Optional,
Any,
TYPE_CHECKING,
)
import trio
from ._exceptions import NoRuntime
if TYPE_CHECKING:
from ._runtime import Actor
_current_actor: Optional['Actor'] = None # type: ignore # noqa
_current_actor: Actor|None = None # type: ignore # noqa
_last_actor_terminated: Actor|None = None
_runtime_vars: dict[str, Any] = {
'_debug_mode': False,
'_is_root': False,
'_root_mailbox': (None, None)
'_root_mailbox': (None, None),
'_registry_addrs': [],
}
def current_actor(err_on_no_runtime: bool = True) -> 'Actor': # type: ignore # noqa
"""Get the process-local actor instance.
"""
if _current_actor is None and err_on_no_runtime:
raise NoRuntime("No local actor has been initialized yet")
def last_actor() -> Actor|None:
'''
Try to return last active `Actor` singleton
for this process.
For case where runtime already exited but someone is asking
about the "last" actor probably to get its `.uid: tuple`.
'''
return _last_actor_terminated
def current_actor(
err_on_no_runtime: bool = True,
) -> Actor:
'''
Get the process-local actor instance.
'''
if (
err_on_no_runtime
and _current_actor is None
):
msg: str = 'No local actor has been initialized yet'
from ._exceptions import NoRuntime
if last := last_actor():
msg += (
f'Apparently the last active actor was\n'
f'|_{last}\n'
f'|_{last.uid}\n'
)
# no actor runtime has (as of yet) ever been started for
# this process.
else:
msg += (
'No last actor found?\n'
'Did you forget to open one of:\n\n'
'- `tractor.open_root_actor()`\n'
'- `tractor.open_nursery()`\n'
)
raise NoRuntime(msg)
return _current_actor
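In (assumed) usage terms the two call forms behave like so; `NoRuntime` is presumed to subclass `RuntimeError`:

import tractor

# raising form: errors with the more detailed message built above when
# no runtime was ever started in this process.
try:
    actor = tractor.current_actor()
except RuntimeError:  # `NoRuntime` (assumed `RuntimeError` subclass)
    actor = None

# non-raising form: simply hands back `None` when the runtime is absent.
maybe_actor = tractor.current_actor(err_on_no_runtime=False)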
def is_main_process() -> bool:
"""Bool determining if this actor is running in the top-most process.
"""
'''
Bool determining if this actor is running in the top-most process.
'''
import multiprocessing as mp
return mp.current_process().name == 'MainProcess'
def debug_mode() -> bool:
"""Bool determining if "debug mode" is on which enables
'''
Bool determining if "debug mode" is on which enables
remote subactor pdb entry on crashes.
"""
'''
return bool(_runtime_vars['_debug_mode'])


@ -14,31 +14,38 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
'''
Message stream types and APIs.
"""
The machinery and types behind ``Context.open_stream()``
'''
from __future__ import annotations
from contextlib import asynccontextmanager as acm
import inspect
from contextlib import asynccontextmanager
from dataclasses import dataclass
from pprint import pformat
from typing import (
Any,
Optional,
Callable,
AsyncGenerator,
AsyncIterator
AsyncIterator,
TYPE_CHECKING,
)
import warnings
import trio
from ._ipc import Channel
from ._exceptions import unpack_error, ContextCancelled
from ._state import current_actor
from ._exceptions import (
_raise_from_no_key_in_msg,
ContextCancelled,
)
from .log import get_logger
from .trionics import broadcast_receiver, BroadcastReceiver
from .trionics import (
broadcast_receiver,
BroadcastReceiver,
)
if TYPE_CHECKING:
from ._context import Context
log = get_logger(__name__)
@ -49,7 +56,6 @@ log = get_logger(__name__)
# messages? class ReceiveChannel(AsyncResource, Generic[ReceiveType]):
# - use __slots__ on ``Context``?
class MsgStream(trio.abc.Channel):
'''
A bidirectional message stream for receiving logically sequenced
@ -70,9 +76,9 @@ class MsgStream(trio.abc.Channel):
'''
def __init__(
self,
ctx: 'Context', # typing: ignore # noqa
ctx: Context, # typing: ignore # noqa
rx_chan: trio.MemoryReceiveChannel,
_broadcaster: Optional[BroadcastReceiver] = None,
_broadcaster: BroadcastReceiver | None = None,
) -> None:
self._ctx = ctx
@ -80,122 +86,248 @@ class MsgStream(trio.abc.Channel):
self._broadcaster = _broadcaster
# flag to denote end of stream
self._eoc: bool = False
self._closed: bool = False
self._eoc: bool|trio.EndOfChannel = False
self._closed: bool|trio.ClosedResourceError = False
# delegate directly to underlying mem channel
def receive_nowait(self):
msg = self._rx_chan.receive_nowait()
return msg['yield']
def receive_nowait(
self,
allow_msg_keys: list[str] = ['yield'],
):
msg: dict = self._rx_chan.receive_nowait()
for (
i,
key,
) in enumerate(allow_msg_keys):
try:
return msg[key]
except KeyError as kerr:
if i < (len(allow_msg_keys) - 1):
continue
async def receive(self):
'''Async receive a single msg from the IPC transport, the next
in sequence for this stream.
_raise_from_no_key_in_msg(
ctx=self._ctx,
msg=msg,
src_err=kerr,
log=log,
expect_key=key,
stream=self,
)
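A small hedged helper sketch showing the non-blocking nature of `.receive_nowait()`: it only pulls msgs already queued on the feeder mem-chan and lets `trio.WouldBlock` signal an empty buffer.

import trio


def drain_ready(stream: MsgStream) -> list:
    # pull every already-buffered payload without ever blocking
    ready: list = []
    while True:
        try:
            ready.append(stream.receive_nowait())
        except trio.WouldBlock:
            return ready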
async def receive(
self,
hide_tb: bool = True,
):
'''
Receive a single msg from the IPC transport, the next in
sequence sent by the far end task (possibly in order as
determined by the underlying protocol).
'''
__tracebackhide__: bool = hide_tb
# NOTE: `trio.ReceiveChannel` implements
# EOC handling as follows (aka uses it
# to gracefully exit async for loops):
#
# async def __anext__(self) -> ReceiveType:
# try:
# return await self.receive()
# except trio.EndOfChannel:
# raise StopAsyncIteration
#
# see ``.aclose()`` for notes on the old behaviour prior to
# introducing this
if self._eoc:
raise trio.EndOfChannel
raise self._eoc
if self._closed:
raise trio.ClosedResourceError('This stream was closed')
raise self._closed
src_err: Exception|None = None # orig tb
try:
msg = await self._rx_chan.receive()
return msg['yield']
try:
msg = await self._rx_chan.receive()
return msg['yield']
except KeyError as err:
# internal error should never get here
assert msg.get('cid'), ("Received internal error at portal?")
except KeyError as kerr:
src_err = kerr
# TODO: handle 2 cases with 3.10 match syntax
# - 'stop'
# - 'error'
# possibly just handle msg['stop'] here!
if self._closed:
raise trio.ClosedResourceError('This stream was closed')
if msg.get('stop') or self._eoc:
log.debug(f"{self} was stopped at remote end")
# XXX: important to set so that a new ``.receive()``
# call (likely by another task using a broadcast receiver)
# doesn't accidentally pull the ``return`` message
# value out of the underlying feed mem chan!
self._eoc = True
# # when the send is closed we assume the stream has
# # terminated and signal this local iterator to stop
# await self.aclose()
# XXX: this causes ``ReceiveChannel.__anext__()`` to
# raise a ``StopAsyncIteration`` **and** in our catch
# block below it will trigger ``.aclose()``.
raise trio.EndOfChannel from err
# TODO: test that shows stream raising an expected error!!!
elif msg.get('error'):
# raise the error message
raise unpack_error(msg, self._ctx.chan)
else:
raise
# NOTE: may raise any of the below error types
# including EoC when a 'stop' msg is found.
_raise_from_no_key_in_msg(
ctx=self._ctx,
msg=msg,
src_err=kerr,
log=log,
expect_key='yield',
stream=self,
)
# XXX: the stream terminates on either of:
# - via `self._rx_chan.receive()` raising after manual closure
# by the rpc-runtime OR,
# - via a received `{'stop': ...}` msg from remote side.
# |_ NOTE: previously this was triggered by calling
# ``._rx_chan.aclose()`` on the send side of the channel inside
# `Actor._push_result()`, but now the 'stop' message handling
# has been put just above inside `_raise_from_no_key_in_msg()`.
except (
trio.ClosedResourceError, # by self._rx_chan
trio.EndOfChannel, # by self._rx_chan or `stop` msg from far end
):
# XXX: we close the stream on any of these error conditions:
# a ``ClosedResourceError`` indicates that the internal
# feeder memory receive channel was closed likely by the
# runtime after the associated transport-channel
# disconnected or broke.
# an ``EndOfChannel`` indicates either the internal recv
# memchan exhausted **or** we raised it just above after
# receiving a `stop` message from the far end of the stream.
# Previously this was triggered by calling ``.aclose()`` on
# the send side of the channel inside
# ``Actor._push_result()`` (should still be commented code
# there - which should eventually get removed), but now the
# 'stop' message handling has been put just above.
trio.EndOfChannel,
) as eoc:
src_err = eoc
self._eoc = eoc
# TODO: Locally, we want to close this stream gracefully, by
# terminating any local consumers tasks deterministically.
# One we have broadcast support, we **don't** want to be
# Once we have broadcast support, we **don't** want to be
# closing this stream and not flushing a final value to
# remaining (clone) consumers who may not have been
# scheduled to receive it yet.
# try:
# maybe_err_msg_or_res: dict = self._rx_chan.receive_nowait()
# if maybe_err_msg_or_res:
# log.warning(
# 'Discarding un-processed msg:\n'
# f'{maybe_err_msg_or_res}'
# )
# except trio.WouldBlock:
# # no queued msgs that might be another remote
# # error, so just raise the original EoC
# pass
# when the send is closed we assume the stream has
# terminated and signal this local iterator to stop
await self.aclose()
# raise eoc
raise # propagate
# a ``ClosedResourceError`` indicates that the internal
# feeder memory receive channel was closed likely by the
# runtime after the associated transport-channel
# disconnected or broke.
except trio.ClosedResourceError as cre: # by self._rx_chan.receive()
src_err = cre
log.warning(
'`Context._rx_chan` was already closed?'
)
self._closed = cre
async def aclose(self):
# when the send is closed we assume the stream has
# terminated and signal this local iterator to stop
drained: list[Exception|dict] = await self.aclose()
if drained:
# from .devx import pause
# await pause()
log.warning(
'Drained context msgs during closure:\n'
f'{drained}'
)
# TODO: pass these to the `._ctx._drained_msgs: deque`
# and then iterate them as part of any `.result()` call?
# NOTE XXX: if the context was cancelled or remote-errored
# but we received the stream close msg first, we
# probably want to instead raise the remote error
# over the end-of-stream connection error since likely
# the remote error was the source cause?
ctx: Context = self._ctx
ctx.maybe_raise(
raise_ctxc_from_self_call=True,
)
# propagate any error but hide low-level frame details
# from the caller by default for debug noise reduction.
if (
hide_tb
# XXX NOTE XXX don't reraise on certain
# stream-specific internal error types like,
#
# - `trio.EoC` since we want to use the exact instance
# to ensure that it is the error that bubbles upward
# for silent absorption by `Context.open_stream()`.
and not self._eoc
# - `RemoteActorError` (or `ContextCancelled`) if it gets
# raised from `_raise_from_no_key_in_msg()` since we
# want the same (as the above bullet) for any
# `.open_context()` block bubbled error raised by
# any nearby ctx API remote-failures.
# and not isinstance(src_err, RemoteActorError)
):
raise type(src_err)(*src_err.args) from src_err
else:
raise src_err
async def aclose(self) -> list[Exception|dict]:
'''
Cancel associated remote actor task and local memory channel on
close.
Notes:
- REMEMBER that this is also called by `.__aexit__()` so
careful consideration must be made to handle whatever
internal state is mutated, particularly in terms of
draining IPC msgs!
- more or less we try to maintain adherence to trio's `.aclose()` semantics:
https://trio.readthedocs.io/en/stable/reference-io.html#trio.abc.AsyncResource.aclose
'''
# XXX: keep proper adherence to trio's `.aclose()` semantics:
# https://trio.readthedocs.io/en/stable/reference-io.html#trio.abc.AsyncResource.aclose
rx_chan = self._rx_chan
if rx_chan._closed:
log.cancel(f"{self} is already closed")
# rx_chan = self._rx_chan
# XXX NOTE XXX
# it's SUPER IMPORTANT that we ensure we don't DOUBLE
# DRAIN msgs on closure so we avoid getting stuck hanging on
# the `._rx_chan` since we call this method on
# `.__aexit__()` as well!!!
# => SO ENSURE WE CATCH ALL TERMINATION STATES in this
# block including the EoC..
if self.closed:
# this stream has already been closed so silently succeed as
# per ``trio.AsyncResource`` semantics.
# https://trio.readthedocs.io/en/stable/reference-io.html#trio.abc.AsyncResource.aclose
return
return []
self._eoc = True
ctx: Context = self._ctx
drained: list[Exception|dict] = []
while not drained:
try:
maybe_final_msg = self.receive_nowait(
allow_msg_keys=['yield', 'return'],
)
if maybe_final_msg:
log.debug(
'Drained un-processed stream msg:\n'
f'{pformat(maybe_final_msg)}'
)
# TODO: inject into parent `Context` buf?
drained.append(maybe_final_msg)
# NOTE: we only need these handlers due to the
# `.receive_nowait()` call above which may re-raise
# one of these errors on a msg key error!
except trio.WouldBlock as be:
drained.append(be)
break
except trio.EndOfChannel as eoc:
self._eoc: Exception = eoc
drained.append(eoc)
break
except trio.ClosedResourceError as cre:
self._closed = cre
drained.append(cre)
break
except ContextCancelled as ctxc:
# log.exception('GOT CTXC')
log.cancel(
'Context was cancelled during stream closure:\n'
f'canceller: {ctxc.canceller}\n'
f'{pformat(ctxc.msgdata)}'
)
break
# NOTE: this is super subtle IPC messaging stuff:
# Relay stop iteration to far end **iff** we're
@ -226,26 +358,40 @@ class MsgStream(trio.abc.Channel):
except (
trio.BrokenResourceError,
trio.ClosedResourceError
):
) as re:
# the underlying channel may already have been pulled
# in which case our stop message is meaningless since
# it can't traverse the transport.
ctx = self._ctx
log.warning(
f'Stream was already destroyed?\n'
f'actor: {ctx.chan.uid}\n'
f'ctx id: {ctx.cid}'
)
drained.append(re)
self._closed = re
self._closed = True
# if caught_eoc:
# # from .devx import _debug
# # await _debug.pause()
# with trio.CancelScope(shield=True):
# await rx_chan.aclose()
# Do we close the local mem chan ``self._rx_chan`` ??!?
# NO, DEFINITELY NOT if we're a bi-dir ``MsgStream``!
# BECAUSE this same core-msg-loop mem recv-chan is used to deliver
# the potential final result from the surrounding inter-actor
# `Context` so we don't want to close it until that context has
# run to completion.
if not self._eoc:
log.cancel(
'Stream closed before it received an EoC?\n'
'Setting eoc manually..\n..'
)
self._eoc: bool = trio.EndOfChannel(
f'Context stream closed by {self._ctx.side}\n'
f'|_{self}\n'
)
# ?XXX WAIT, why do we not close the local mem chan `._rx_chan` XXX?
# => NO, DEFINITELY NOT! <=
# if we're a bi-dir ``MsgStream`` BECAUSE this same
# core-msg-loop mem recv-chan is used to deliver the
# potential final result from the surrounding inter-actor
# `Context` so we don't want to close it until that
# context has run to completion.
# XXX: Notes on old behaviour:
# await rx_chan.aclose()
@ -274,8 +420,28 @@ class MsgStream(trio.abc.Channel):
# runtime's closure of ``rx_chan`` in the case where we may
# still need to consume msgs that are "in transit" from the far
# end (eg. for ``Context.result()``).
# self._closed = True
return drained
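For reference, a minimal sketch (hypothetical consumer code, not part of this diff) of what the drain-on-close behaviour looks like from the caller side; `stream` is assumed to be an open `MsgStream` obtained via `Context.open_stream()`:
    drained: list = await stream.aclose()
    for msg in drained:
        print(f'unprocessed msg drained on close: {msg!r}')
    # a second close is a no-op and returns an empty list per the
    # `.closed` early-return guard
    assert await stream.aclose() == []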
@asynccontextmanager
@property
def closed(self) -> bool:
rxc: bool = self._rx_chan._closed
_closed: bool|Exception = self._closed
_eoc: bool|trio.EndOfChannel = self._eoc
if rxc or _closed or _eoc:
log.runtime(
f'`MsgStream` is already closed\n'
f'{self}\n'
f' |_cid: {self._ctx.cid}\n'
f' |_rx_chan._closed: {type(rxc)} = {rxc}\n'
f' |_closed: {type(_closed)} = {_closed}\n'
f' |_eoc: {type(_eoc)} = {_eoc}'
)
return True
return False
@acm
async def subscribe(
self,
@ -329,386 +495,50 @@ class MsgStream(trio.abc.Channel):
async def send(
self,
data: Any
data: Any,
hide_tb: bool = True,
) -> None:
'''
Send a message over this stream to the far end.
'''
if self._ctx._error:
raise self._ctx._error # from None
__tracebackhide__: bool = hide_tb
# raise any already known error immediately
self._ctx.maybe_raise()
if self._eoc:
raise self._eoc
if self._closed:
raise trio.ClosedResourceError('This stream was already closed')
raise self._closed
await self._ctx.chan.send({'yield': data, 'cid': self._ctx.cid})
@dataclass
class Context:
'''
An inter-actor, ``trio`` task communication context.
NB: This class should never be instantiated directly, it is delivered
by either runtime machinery to a remotely started task or by entering
``Portal.open_context()``.
Allows maintaining task or protocol specific state between
2 communicating actor tasks. A unique context is created on the
callee side/end for every request to a remote actor from a portal.
A context can be cancelled and (possibly eventually restarted) from
either side of the underlying IPC channel, open task oriented
message streams and acts as an IPC aware inter-actor-task cancel
scope.
'''
chan: Channel
cid: str
# these are the "feeder" channels for delivering
# message values to the local task from the runtime
# msg processing loop.
_recv_chan: trio.MemoryReceiveChannel
_send_chan: trio.MemorySendChannel
_remote_func_type: Optional[str] = None
# only set on the caller side
_portal: Optional['Portal'] = None # type: ignore # noqa
_result: Optional[Any] = False
_error: Optional[BaseException] = None
# status flags
_cancel_called: bool = False
_cancel_msg: Optional[str] = None
_enter_debugger_on_cancel: bool = True
_started_called: bool = False
_started_received: bool = False
_stream_opened: bool = False
# only set on the callee side
_scope_nursery: Optional[trio.Nursery] = None
_backpressure: bool = False
async def send_yield(self, data: Any) -> None:
warnings.warn(
"`Context.send_yield()` is now deprecated. "
"Use ``MessageStream.send()``. ",
DeprecationWarning,
stacklevel=2,
)
await self.chan.send({'yield': data, 'cid': self.cid})
async def send_stop(self) -> None:
await self.chan.send({'stop': True, 'cid': self.cid})
async def _maybe_raise_from_remote_msg(
self,
msg: dict[str, Any],
) -> None:
'''
(Maybe) unpack and raise a msg error into the local scope
nursery for this context.
Acts as a form of "relay" for a remote error raised
in the corresponding remote callee task.
'''
error = msg.get('error')
if error:
# If this is an error message from a context opened by
# ``Portal.open_context()`` we want to interrupt any ongoing
# (child) tasks within that context to be notified of the remote
# error relayed here.
#
# The reason we may want to raise the remote error immediately
# is that there is no guarantee the associated local task(s)
# will attempt to read from any locally opened stream any time
# soon.
#
# NOTE: this only applies when
# ``Portal.open_context()`` has been called since it is assumed
# (currently) that other portal APIs (``Portal.run()``,
# ``.run_in_actor()``) do their own error checking at the point
# of the call and result processing.
log.error(
f'Remote context error for {self.chan.uid}:{self.cid}:\n'
f'{msg["error"]["tb_str"]}'
try:
await self._ctx.chan.send(
payload={
'yield': data,
'cid': self._ctx.cid,
},
# hide_tb=hide_tb,
)
error = unpack_error(msg, self.chan)
if (
isinstance(error, ContextCancelled) and
self._cancel_called
):
# this is an expected cancel request response message
# and we don't need to raise it in scope since it will
# potentially override a real error
return
self._error = error
# TODO: tempted to **not** do this by-reraising in a
# nursery and instead cancel a surrounding scope, detect
# the cancellation, then lookup the error that was set?
if self._scope_nursery:
async def raiser():
raise self._error from None
# from trio.testing import wait_all_tasks_blocked
# await wait_all_tasks_blocked()
if not self._scope_nursery._closed: # type: ignore
self._scope_nursery.start_soon(raiser)
async def cancel(
self,
msg: Optional[str] = None,
) -> None:
'''
Cancel this inter-actor-task context.
Request that the far side cancel its current linked context,
Timeout quickly in an attempt to sidestep 2-generals...
'''
side = 'caller' if self._portal else 'callee'
if msg:
assert side == 'callee', 'Only callee side can provide cancel msg'
log.cancel(f'Cancelling {side} side of context to {self.chan.uid}')
self._cancel_called = True
if side == 'caller':
if not self._portal:
raise RuntimeError(
"No portal found, this is likely a callee side context"
)
cid = self.cid
with trio.move_on_after(0.5) as cs:
cs.shield = True
log.cancel(
f"Cancelling stream {cid} to "
f"{self._portal.channel.uid}")
# NOTE: we're telling the far end actor to cancel a task
# corresponding to *this actor*. The far end local channel
# instance is passed to `Actor._cancel_task()` implicitly.
await self._portal.run_from_ns('self', '_cancel_task', cid=cid)
if cs.cancelled_caught:
# XXX: there's no way to know if the remote task was indeed
# cancelled in the case where the connection is broken or
# some other network error occurred.
# if not self._portal.channel.connected():
if not self.chan.connected():
log.cancel(
"May have failed to cancel remote task "
f"{cid} for {self._portal.channel.uid}")
else:
log.cancel(
"Timed out on cancelling remote task "
f"{cid} for {self._portal.channel.uid}")
# callee side remote task
else:
self._cancel_msg = msg
# TODO: should we have an explicit cancel message
# or is relaying the local `trio.Cancelled` as an
# {'error': trio.Cancelled, cid: "blah"} enough?
# This probably gets into the discussion in
# https://github.com/goodboy/tractor/issues/36
assert self._scope_nursery
self._scope_nursery.cancel_scope.cancel()
if self._recv_chan:
await self._recv_chan.aclose()
@asynccontextmanager
async def open_stream(
self,
backpressure: Optional[bool] = True,
msg_buffer_size: Optional[int] = None,
) -> AsyncGenerator[MsgStream, None]:
'''
Open a ``MsgStream``, a bi-directional stream connected to the
cross-actor (far end) task for this ``Context``.
This context manager must be entered on both the caller and
callee for the stream to logically be considered "connected".
A ``MsgStream`` is currently "one-shot" use, meaning if you
close it you can not "re-open" it for streaming and instead you
must re-establish a new surrounding ``Context`` using
``Portal.open_context()``. In the future this may change but
currently there seems to be no obvious reason to support
"re-opening":
- pausing a stream can be done with a message.
- task errors will normally require a restart of the entire
scope of the inter-actor task context due to the nature of
``trio``'s cancellation system.
'''
actor = current_actor()
# here we create a mem chan that corresponds to the
# far end caller / callee.
# Likewise if the surrounding context has been cancelled we error here
# since it likely means the surrounding block was exited or
# killed
if self._cancel_called:
task = trio.lowlevel.current_task().name
raise ContextCancelled(
f'Context around {actor.uid[0]}:{task} was already cancelled!'
)
if not self._portal and not self._started_called:
raise RuntimeError(
'Context.started()` must be called before opening a stream'
)
# NOTE: in one way streaming this only happens on the
# caller side inside `Actor.start_remote_task()` so if you try
# to send a stop from the caller to the callee in the
# single-direction-stream case you'll get a lookup error
# currently.
ctx = actor.get_context(
self.chan,
self.cid,
msg_buffer_size=msg_buffer_size,
)
ctx._backpressure = backpressure
assert ctx is self
# XXX: If the underlying channel feeder receive mem chan has
# been closed then likely client code has already exited
# a ``.open_stream()`` block prior or there was some other
# unanticipated error or cancellation from ``trio``.
if ctx._recv_chan._closed:
raise trio.ClosedResourceError(
'The underlying channel for this stream was already closed!?')
async with MsgStream(
ctx=self,
rx_chan=ctx._recv_chan,
) as stream:
if self._portal:
self._portal._streams.add(stream)
try:
self._stream_opened = True
# XXX: do we need this?
# ensure we aren't cancelled before yielding the stream
# await trio.lowlevel.checkpoint()
yield stream
# NOTE: Make the stream "one-shot use". On exit, signal
# ``trio.EndOfChannel``/``StopAsyncIteration`` to the
# far end.
await stream.aclose()
finally:
if self._portal:
try:
self._portal._streams.remove(stream)
except KeyError:
log.warning(
f'Stream was already destroyed?\n'
f'actor: {self.chan.uid}\n'
f'ctx id: {self.cid}'
)
async def result(self) -> Any:
'''
From a caller side, wait for and return the final result from
the callee side task.
'''
assert self._portal, "Context.result() can not be called from callee!"
assert self._recv_chan
if self._result is False:
if not self._recv_chan._closed: # type: ignore
# wait for a final context result consuming
# and discarding any bi dir stream msgs still
# in transit from the far end.
while True:
msg = await self._recv_chan.receive()
try:
self._result = msg['return']
break
except KeyError as msgerr:
if 'yield' in msg:
# far end task is still streaming to us so discard
log.warning(f'Discarding stream delivered {msg}')
continue
elif 'stop' in msg:
log.debug('Remote stream terminated')
continue
# internal error should never get here
assert msg.get('cid'), (
"Received internal error at portal?")
raise unpack_error(
msg, self._portal.channel
) from msgerr
return self._result
async def started(
self,
value: Optional[Any] = None
) -> None:
'''
Indicate to calling actor's task that this linked context
has started and send ``value`` to the other side.
On the calling side ``value`` is the second item delivered
in the tuple returned by ``Portal.open_context()``.
'''
if self._portal:
raise RuntimeError(
f"Caller side context {self} can not call started!")
elif self._started_called:
raise RuntimeError(
f"called 'started' twice on context with {self.chan.uid}")
await self.chan.send({'started': value, 'cid': self.cid})
self._started_called = True
# TODO: do we need a restart api?
# async def restart(self) -> None:
# pass
except (
trio.ClosedResourceError,
trio.BrokenResourceError,
BrokenPipeError,
) as trans_err:
if hide_tb:
raise type(trans_err)(
*trans_err.args
) from trans_err
else:
raise
def stream(func: Callable) -> Callable:
"""Mark an async function as a streaming routine with ``@stream``.
'''
Mark an async function as a streaming routine with ``@stream``.
"""
# annotate
'''
# TODO: apply whatever solution ``mypy`` ends up picking for this:
# https://github.com/python/mypy/issues/2087#issuecomment-769266912
func._tractor_stream_function = True # type: ignore
@ -734,22 +564,3 @@ def stream(func: Callable) -> Callable:
"(Or ``to_trio`` if using ``asyncio`` in guest mode)."
)
return func
def context(func: Callable) -> Callable:
"""Mark an async function as a streaming routine with ``@context``.
"""
# annotate
# TODO: apply whatever solution ``mypy`` ends up picking for this:
# https://github.com/python/mypy/issues/2087#issuecomment-769266912
func._tractor_context_function = True # type: ignore
sig = inspect.signature(func)
params = sig.parameters
if 'ctx' not in params:
raise TypeError(
"The first argument to the context function "
f"{func.__name__} must be `ctx: tractor.Context`"
)
return func
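As a reminder of how these pieces compose, here is a rough caller-side usage sketch of the ctx/stream API touched throughout this file's diff; `portal` and the `echo_forever` context-endpoint are hypothetical stand-ins:
    async with portal.open_context(echo_forever) as (ctx, first):
        async with ctx.open_stream() as stream:
            await stream.send('ping')
            async for msg in stream:
                print(f'got: {msg}')
                break
        # exiting the inner block calls `MsgStream.aclose()` which
        # drains any final in-transit msgs as per the changes above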

View File

@ -21,22 +21,22 @@
from contextlib import asynccontextmanager as acm
from functools import partial
import inspect
from typing import (
Optional,
TYPE_CHECKING,
)
from pprint import pformat
from typing import TYPE_CHECKING
import typing
import warnings
from exceptiongroup import BaseExceptionGroup
import trio
from ._debug import maybe_wait_for_debugger
from .devx._debug import maybe_wait_for_debugger
from ._state import current_actor, is_main_process
from .log import get_logger, get_loglevel
from ._runtime import Actor
from ._portal import Portal
from ._exceptions import is_multi_cancelled
from ._exceptions import (
is_multi_cancelled,
ContextCancelled,
)
from ._root import open_root_actor
from . import _state
from . import _spawn
@ -94,7 +94,7 @@ class ActorNursery:
tuple[
Actor,
trio.Process | mp.Process,
Optional[Portal],
Portal | None,
]
] = {}
# portals spawned with ``run_in_actor()`` are
@ -106,16 +106,24 @@ class ActorNursery:
self.errors = errors
self.exited = trio.Event()
# NOTE: when no explicit call is made to
# `.open_root_actor()` by application code,
# `.open_nursery()` will implicitly call it to start the
# actor-tree runtime. In this case we mark ourselves as
# such so that runtime components can be aware for logging
# and syncing purposes to any actor opened nurseries.
self._implicit_runtime_started: bool = False
async def start_actor(
self,
name: str,
*,
bind_addr: tuple[str, int] = _default_bind_addr,
bind_addrs: list[tuple[str, int]] = [_default_bind_addr],
rpc_module_paths: list[str] | None = None,
enable_modules: list[str] | None = None,
loglevel: str | None = None, # set log level per subactor
nursery: trio.Nursery | None = None,
debug_mode: Optional[bool] | None = None,
debug_mode: bool | None = None,
infect_asyncio: bool = False,
) -> Portal:
'''
@ -150,14 +158,16 @@ class ActorNursery:
# modules allowed to invoked funcs from
enable_modules=enable_modules,
loglevel=loglevel,
arbiter_addr=current_actor()._arb_addr,
# verbatim relay this actor's registrar addresses
registry_addrs=current_actor().reg_addrs,
)
parent_addr = self._actor.accept_addr
assert parent_addr
# start a task to spawn a process
# blocks until process has been started and a portal setup
nursery = nursery or self._da_nursery
nursery: trio.Nursery = nursery or self._da_nursery
# XXX: the type ignore is actually due to a `mypy` bug
return await nursery.start( # type: ignore
@ -167,7 +177,7 @@ class ActorNursery:
self,
subactor,
self.errors,
bind_addr,
bind_addrs,
parent_addr,
_rtv, # run time vars
infect_asyncio=infect_asyncio,
@ -180,8 +190,8 @@ class ActorNursery:
fn: typing.Callable,
*,
name: Optional[str] = None,
bind_addr: tuple[str, int] = _default_bind_addr,
name: str | None = None,
bind_addrs: tuple[str, int] = [_default_bind_addr],
rpc_module_paths: list[str] | None = None,
enable_modules: list[str] | None = None,
loglevel: str | None = None, # set log level per subactor
@ -190,14 +200,16 @@ class ActorNursery:
**kwargs, # explicit args to ``fn``
) -> Portal:
"""Spawn a new actor, run a lone task, then terminate the actor and
'''
Spawn a new actor, run a lone task, then terminate the actor and
return its result.
Actors spawned using this method are kept alive at nursery teardown
until the task spawned by executing ``fn`` completes, at which point
the actor is terminated.
"""
mod_path = fn.__module__
'''
mod_path: str = fn.__module__
if name is None:
# use the explicit function name if not provided
@ -208,7 +220,7 @@ class ActorNursery:
enable_modules=[mod_path] + (
enable_modules or rpc_module_paths or []
),
bind_addr=bind_addr,
bind_addrs=bind_addrs,
loglevel=loglevel,
# use the run_in_actor nursery
nursery=self._ria_nursery,
@ -232,21 +244,37 @@ class ActorNursery:
)
return portal
async def cancel(self, hard_kill: bool = False) -> None:
"""Cancel this nursery by instructing each subactor to cancel
async def cancel(
self,
hard_kill: bool = False,
) -> None:
'''
Cancel this nursery by instructing each subactor to cancel
itself and wait for all subactors to terminate.
If ``hard_kill`` is set to ``True`` then kill the processes
directly without any far end graceful ``trio`` cancellation.
"""
'''
self.cancelled = True
log.cancel(f"Cancelling nursery in {self._actor.uid}")
# TODO: impl a repr for spawn more compact
# then `._children`..
children: dict = self._children
child_count: int = len(children)
msg: str = f'Cancelling actor nursery with {child_count} children\n'
with trio.move_on_after(3) as cs:
async with trio.open_nursery() as tn:
async with trio.open_nursery() as nursery:
for subactor, proc, portal in self._children.values():
subactor: Actor
proc: trio.Process
portal: Portal
for (
subactor,
proc,
portal,
) in children.values():
# TODO: are we ever even going to use this or
# is the spawning backend responsible for such
@ -258,12 +286,13 @@ class ActorNursery:
if portal is None: # actor hasn't fully spawned yet
event = self._actor._peer_connected[subactor.uid]
log.warning(
f"{subactor.uid} wasn't finished spawning?")
f"{subactor.uid} never 't finished spawning?"
)
await event.wait()
# channel/portal should now be up
_, _, portal = self._children[subactor.uid]
_, _, portal = children[subactor.uid]
# XXX should be impossible to get here
# unless method was called from within
@ -280,14 +309,24 @@ class ActorNursery:
# spawn cancel tasks for each sub-actor
assert portal
if portal.channel.connected():
nursery.start_soon(portal.cancel_actor)
tn.start_soon(portal.cancel_actor)
log.cancel(msg)
# if we cancelled the cancel (we hung cancelling remote actors)
# then hard kill all sub-processes
if cs.cancelled_caught:
log.error(
f"Failed to cancel {self}\nHard killing process tree!")
for subactor, proc, portal in self._children.values():
f'Failed to cancel {self}?\n'
'Hard killing underlying subprocess tree!\n'
)
subactor: Actor
proc: trio.Process
portal: Portal
for (
subactor,
proc,
portal,
) in children.values():
log.warning(f"Hard killing process {proc}")
proc.terminate()
@ -327,7 +366,7 @@ async def _open_and_supervise_one_cancels_all_nursery(
# the above "daemon actor" nursery will be notified.
async with trio.open_nursery() as ria_nursery:
anursery = ActorNursery(
an = ActorNursery(
actor,
ria_nursery,
da_nursery,
@ -336,16 +375,16 @@ async def _open_and_supervise_one_cancels_all_nursery(
try:
# spawning of actors happens in the caller's scope
# after we yield upwards
yield anursery
yield an
# When we didn't error in the caller's scope,
# signal all process-monitor-tasks to conduct
# the "hard join phase".
log.runtime(
f"Waiting on subactors {anursery._children} "
"to complete"
'Waiting on subactors to complete:\n'
f'{pformat(an._children)}\n'
)
anursery._join_procs.set()
an._join_procs.set()
except BaseException as inner_err:
errors[actor.uid] = inner_err
@ -357,37 +396,60 @@ async def _open_and_supervise_one_cancels_all_nursery(
# Instead try to wait for pdb to be released before
# tearing down.
await maybe_wait_for_debugger(
child_in_debug=anursery._at_least_one_child_in_debug
child_in_debug=an._at_least_one_child_in_debug
)
# if the caller's scope errored then we activate our
# one-cancels-all supervisor strategy (don't
# worry more are coming).
anursery._join_procs.set()
an._join_procs.set()
# XXX: hypothetically an error could be
# raised and then a cancel signal shows up
# XXX NOTE XXX: hypothetically an error could
# be raised and then a cancel signal shows up
# slightly after in which case the `else:`
# block here might not complete? For now,
# shield both.
with trio.CancelScope(shield=True):
etype = type(inner_err)
etype: type = type(inner_err)
if etype in (
trio.Cancelled,
KeyboardInterrupt
KeyboardInterrupt,
) or (
is_multi_cancelled(inner_err)
):
log.cancel(
f"Nursery for {current_actor().uid} "
f"was cancelled with {etype}")
f'Actor-nursery cancelled by {etype}\n\n'
f'{current_actor().uid}\n'
f' |_{an}\n\n'
# TODO: show tb str?
# f'{tb_str}'
)
elif etype in {
ContextCancelled,
}:
log.cancel(
'Actor-nursery caught remote cancellation\n\n'
f'{inner_err.tb_str}'
)
else:
log.exception(
f"Nursery for {current_actor().uid} "
f"errored with")
'Nursery errored with:\n'
# TODO: same thing as in
# `._invoke()` to compute how to
# place this div-line in the
# middle of the above msg
# content..
# -[ ] prolly helper-func it too
# in our `.log` module..
# '------ - ------'
)
# cancel all subactors
await anursery.cancel()
await an.cancel()
# ria_nursery scope end
@ -408,18 +470,22 @@ async def _open_and_supervise_one_cancels_all_nursery(
# XXX: yet another guard before allowing the cancel
# sequence in case a (single) child is in debug.
await maybe_wait_for_debugger(
child_in_debug=anursery._at_least_one_child_in_debug
child_in_debug=an._at_least_one_child_in_debug
)
# If actor-local error was raised while waiting on
# ".run_in_actor()" actors then we also want to cancel all
# remaining sub-actors (due to our lone strategy:
# one-cancels-all).
log.cancel(f"Nursery cancelling due to {err}")
if anursery._children:
if an._children:
log.cancel(
'Actor-nursery cancelling due error type:\n'
f'{err}\n'
)
with trio.CancelScope(shield=True):
await anursery.cancel()
await an.cancel()
raise
finally:
# No errors were raised while awaiting ".run_in_actor()"
# actors but those actors may have returned remote errors as
@ -428,9 +494,9 @@ async def _open_and_supervise_one_cancels_all_nursery(
# collected in ``errors`` so cancel all actors, summarize
# all errors and re-raise.
if errors:
if anursery._children:
if an._children:
with trio.CancelScope(shield=True):
await anursery.cancel()
await an.cancel()
# use `BaseExceptionGroup` as needed
if len(errors) > 1:
@ -465,19 +531,20 @@ async def open_nursery(
which cancellation scopes correspond to each spawned subactor set.
'''
implicit_runtime = False
actor = current_actor(err_on_no_runtime=False)
implicit_runtime: bool = False
actor: Actor = current_actor(err_on_no_runtime=False)
an: ActorNursery|None = None
try:
if actor is None and is_main_process():
if (
actor is None
and is_main_process()
):
# if we are the parent process start the
# actor runtime implicitly
log.info("Starting actor runtime!")
# mark us for teardown on exit
implicit_runtime = True
implicit_runtime: bool = True
async with open_root_actor(**kwargs) as actor:
assert actor is current_actor()
@ -485,24 +552,42 @@ async def open_nursery(
try:
async with _open_and_supervise_one_cancels_all_nursery(
actor
) as anursery:
yield anursery
) as an:
# NOTE: mark this nursery as having
# implicitly started the root actor so
# that `._runtime` machinery can avoid
# certain teardown synchronization
# blocking/waits and any associated (warn)
# logging when it's known that this
# nursery shouldn't be exited before the
# root actor is.
an._implicit_runtime_started = True
yield an
finally:
anursery.exited.set()
# XXX: this event will be set after the root actor
# runtime is already torn down, so we want to
# avoid any blocking on it.
an.exited.set()
else: # sub-nursery case
try:
async with _open_and_supervise_one_cancels_all_nursery(
actor
) as anursery:
yield anursery
) as an:
yield an
finally:
anursery.exited.set()
an.exited.set()
finally:
log.debug("Nursery teardown complete")
msg: str = (
'Actor-nursery exited\n'
f'|_{an}\n'
)
# shutdown runtime if it was started
if implicit_runtime:
log.info("Shutting down actor tree")
msg += '=> Shutting down actor runtime <=\n'
log.info(msg)

View File

@ -0,0 +1,74 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Various helpers/utils for auditing your `tractor` app and/or the
core runtime.
'''
from contextlib import asynccontextmanager as acm
import pathlib
import tractor
from .pytest import (
tractor_test as tractor_test
)
def repodir() -> pathlib.Path:
'''
Return the abspath to the repo directory.
'''
# 3 parents up to step out of the pkg dirs to the repo root
return pathlib.Path(
__file__
# 3 .parents bc:
# <._testing-pkg>.<tractor-pkg>.<git-repo-dir>
# /$HOME/../<tractor-repo-dir>/tractor/_testing/__init__.py
).parent.parent.parent.absolute()
def examples_dir() -> pathlib.Path:
'''
Return the abspath to the examples directory as `pathlib.Path`.
'''
return repodir() / 'examples'
@acm
async def expect_ctxc(
yay: bool,
reraise: bool = False,
) -> None:
'''
Small acm to catch `ContextCancelled` errors when expected
below it in an `async with ()` block.
'''
if yay:
try:
yield
raise RuntimeError('Never raised ctxc?')
except tractor.ContextCancelled:
if reraise:
raise
else:
return
else:
yield
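A hypothetical usage sketch, e.g. inside an async test body; `trigger_remote_cancel()` and `do_normal_ipc_stuff()` are stand-in names, not part of this module:
    # when a ctxc is expected the block must raise it (else we error)
    async with expect_ctxc(yay=True):
        await trigger_remote_cancel()

    # when not expected the acm is just a passthrough
    async with expect_ctxc(yay=False):
        await do_normal_ipc_stuff()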

View File

@ -0,0 +1,113 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
`pytest` utils helpers and plugins for testing `tractor`'s runtime
and applications.
'''
from functools import (
partial,
wraps,
)
import inspect
import platform
import tractor
import trio
def tractor_test(fn):
'''
Decorator for async test funcs to present them as "native"
looking sync funcs runnable by `pytest` using `trio.run()`.
Use:
@tractor_test
async def test_whatever():
await ...
If fixtures:
- ``reg_addr`` (a socket addr tuple where arbiter is listening)
- ``loglevel`` (logging level passed to tractor internals)
- ``start_method`` (subprocess spawning backend)
are defined in the `pytest` fixture space they will be automatically
injected to tests declaring these funcargs.
'''
@wraps(fn)
def wrapper(
*args,
loglevel=None,
reg_addr=None,
start_method: str|None = None,
debug_mode: bool = False,
**kwargs
):
# __tracebackhide__ = True
# NOTE: inject any test func declared fixture
# names by manually checking!
if 'reg_addr' in inspect.signature(fn).parameters:
# injects test suite fixture value to test as well
# as `run()`
kwargs['reg_addr'] = reg_addr
if 'loglevel' in inspect.signature(fn).parameters:
# allows test suites to define a 'loglevel' fixture
# that activates the internal logging
kwargs['loglevel'] = loglevel
if start_method is None:
if platform.system() == "Windows":
start_method = 'trio'
if 'start_method' in inspect.signature(fn).parameters:
# set of subprocess spawning backends
kwargs['start_method'] = start_method
if 'debug_mode' in inspect.signature(fn).parameters:
# set of subprocess spawning backends
kwargs['debug_mode'] = debug_mode
if kwargs:
# use explicit root actor start
async def _main():
async with tractor.open_root_actor(
# **kwargs,
registry_addrs=[reg_addr] if reg_addr else None,
loglevel=loglevel,
start_method=start_method,
# TODO: only enable when pytest is passed --pdb
debug_mode=debug_mode,
):
await fn(*args, **kwargs)
main = _main
else:
# use implicit root actor start
main = partial(fn, *args, **kwargs)
return trio.run(main)
return wrapper
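A rough sketch of how a test suite might use the decorator; the `reg_addr` and `loglevel` funcargs are assumed to be provided by same-named `pytest` fixtures:
    import tractor
    from tractor._testing import tractor_test

    @tractor_test
    async def test_spawn_and_cancel(
        reg_addr,   # injected from the same-named pytest fixture
        loglevel,
    ):
        async with tractor.open_nursery() as an:
            portal = await an.start_actor(
                'child',
                enable_modules=[__name__],
            )
            await portal.cancel_actor()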

View File

@ -0,0 +1,37 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Runtime "developer experience" utils and addons to aid our
(advanced) users and core devs in building distributed applications
and working with/on the actor runtime.
"""
from ._debug import (
maybe_wait_for_debugger as maybe_wait_for_debugger,
acquire_debug_lock as acquire_debug_lock,
breakpoint as breakpoint,
pause as pause,
pause_from_sync as pause_from_sync,
shield_sigint_handler as shield_sigint_handler,
MultiActorPdb as MultiActorPdb,
open_crash_handler as open_crash_handler,
maybe_open_crash_handler as maybe_open_crash_handler,
post_mortem as post_mortem,
)
from ._stackscope import (
enable_stack_on_sig as enable_stack_on_sig,
)

File diff suppressed because it is too large

View File

@ -0,0 +1,84 @@
# tractor: structured concurrent "actors".
# Copyright eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
The fundamental cross process SC abstraction: an inter-actor,
cancel-scope linked task "context".
A ``Context`` is very similar to the ``trio.Nursery.cancel_scope`` built
into each ``trio.Nursery`` except it links the lifetimes of memory space
disjoint, parallel executing tasks in separate actors.
'''
from signal import (
signal,
SIGUSR1,
)
import trio
@trio.lowlevel.disable_ki_protection
def dump_task_tree() -> None:
import stackscope
from tractor.log import get_console_log
tree_str: str = str(
stackscope.extract(
trio.lowlevel.current_root_task(),
recurse_child_tasks=True
)
)
log = get_console_log('cancel')
log.pdb(
f'Dumping `stackscope` tree:\n\n'
f'{tree_str}\n'
)
# import logging
# try:
# with open("/dev/tty", "w") as tty:
# tty.write(tree_str)
# except BaseException:
# logging.getLogger(
# "task_tree"
# ).exception("Error printing task tree")
def signal_handler(sig: int, frame: object) -> None:
import traceback
try:
trio.lowlevel.current_trio_token(
).run_sync_soon(dump_task_tree)
except RuntimeError:
# not in async context -- print a normal traceback
traceback.print_stack()
def enable_stack_on_sig(
sig: int = SIGUSR1
) -> None:
'''
Enable `stackscope` tracing on reception of a signal; by
default this is SIGUSR1.
'''
signal(
sig,
signal_handler,
)
# NOTE: the above can be triggered from
# a (xonsh) shell using:
# kill -SIGUSR1 @$(pgrep -f '<cmd>')
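A minimal sketch of opting in from an app's entrypoint (assumes the `tractor.devx` re-export added in this change set); once running, dump the task tree with `kill -SIGUSR1 <pid>`:
    import trio
    from tractor.devx import enable_stack_on_sig

    async def main() -> None:
        enable_stack_on_sig()        # install the SIGUSR1 handler
        await trio.sleep_forever()   # app code would normally go here

    if __name__ == '__main__':
        trio.run(main)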

129
tractor/devx/cli.py 100644
View File

@ -0,0 +1,129 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
CLI framework extensions for hacking on the actor runtime.
Currently popular frameworks supported are:
- `typer` via the `@callback` API
"""
from __future__ import annotations
from typing import (
Any,
Callable,
)
from typing_extensions import Annotated
import typer
_runtime_vars: dict[str, Any] = {}
def load_runtime_vars(
ctx: typer.Context,
callback: Callable,
pdb: bool = False, # --pdb
ll: Annotated[
str,
typer.Option(
'--loglevel',
'-l',
help='BigD logging level',
),
] = 'cancel', # -l info
):
'''
Maybe engage crash handling with `pdbp` when code inside
a `typer` CLI endpoint cmd raises.
To use this callback simply take your `app = typer.Typer()` instance
and decorate this function with it like so:
.. code:: python
from tractor.devx import cli
app = typer.Typer()
# manual decoration to hook into `click`'s context system!
cli.load_runtime_vars = app.callback(
invoke_without_command=True,
)
And then you can use the now augmented `click` CLI context as so,
.. code:: python
@app.command(
context_settings={
"allow_extra_args": True,
"ignore_unknown_options": True,
}
)
def my_cli_cmd(
ctx: typer.Context,
):
rtvars: dict = ctx.runtime_vars
pdb: bool = rtvars['pdb']
with tractor.devx.cli.maybe_open_crash_handler(pdb=pdb):
trio.run(
partial(
my_tractor_main_task_func,
debug_mode=pdb,
loglevel=rtvars['ll'],
)
)
which will enable log level and debug mode globally for the entire
`tractor` + `trio` runtime thereafter!
Bo
'''
global _runtime_vars
_runtime_vars |= {
'pdb': pdb,
'll': ll,
}
ctx.runtime_vars: dict[str, Any] = _runtime_vars
print(
f'`typer` sub-cmd: {ctx.invoked_subcommand}\n'
f'`tractor` runtime vars: {_runtime_vars}'
)
# XXX NOTE XXX: hackzone.. if no sub-cmd is specified (the
# default if the user just invokes `bigd`) then we simply
# invoke the sole `_bigd()` cmd passing in the "parent"
# typer.Context directly to that call since we're treating it
# as a "non sub-command" or wtv..
# TODO: ideally typer would have some kinda built-in way to get
# this behaviour without having to construct and manually
# invoke our own cmd..
if (
ctx.invoked_subcommand is None
or ctx.invoked_subcommand == callback.__name__
):
cmd: typer.core.TyperCommand = typer.core.TyperCommand(
name='bigd',
callback=callback,
)
ctx.params = {'ctx': ctx}
cmd.invoke(ctx)

View File

@ -31,13 +31,13 @@ from typing import (
Callable,
)
from functools import partial
from async_generator import aclosing
from contextlib import aclosing
import trio
import wrapt
from ..log import get_logger
from .._streaming import Context
from .._context import Context
__all__ = ['pub']
@ -148,7 +148,8 @@ def pub(
*,
tasks: set[str] = set(),
):
"""Publisher async generator decorator.
'''
Publisher async generator decorator.
A publisher can be called multiple times from different actors but
will only spawn a finite set of internal tasks to stream values to
@ -227,7 +228,8 @@ def pub(
running in a single actor to stream data to an arbitrary number of
subscribers. If you are ok to have a new task running for every call
to ``pub_service()`` then probably don't need this.
"""
'''
global _pubtask2lock
# handle the decorator not called with () case

View File

@ -48,12 +48,15 @@ LOG_FORMAT = (
DATE_FORMAT = '%b %d %H:%M:%S'
LEVELS = {
LEVELS: dict[str, int] = {
'TRANSPORT': 5,
'RUNTIME': 15,
'CANCEL': 16,
'PDB': 500,
}
# _custom_levels: set[str] = {
# lvlname.lower for lvlname in LEVELS.keys()
# }
STD_PALETTE = {
'CRITICAL': 'red',
@ -82,6 +85,10 @@ class StackLevelAdapter(logging.LoggerAdapter):
msg: str,
) -> None:
'''
IPC level msg-ing.
'''
return self.log(5, msg)
def runtime(
@ -94,22 +101,57 @@ class StackLevelAdapter(logging.LoggerAdapter):
self,
msg: str,
) -> None:
return self.log(16, msg)
'''
Cancellation logging, mostly for runtime reporting.
'''
return self.log(
level=16,
msg=msg,
# stacklevel=4,
)
def pdb(
self,
msg: str,
) -> None:
'''
Debugger logging.
'''
return self.log(500, msg)
def log(self, level, msg, *args, **kwargs):
"""
def log(
self,
level,
msg,
*args,
**kwargs,
):
'''
Delegate a log call to the underlying logger, after adding
contextual information from this adapter instance.
"""
'''
if self.isEnabledFor(level):
stacklevel: int = 3
if (
level in LEVELS.values()
# or level in _custom_levels
):
stacklevel: int = 4
# msg, kwargs = self.process(msg, kwargs)
self._log(level, msg, args, **kwargs)
self._log(
level=level,
msg=msg,
args=args,
# NOTE: not sure how this worked before, but it
# seems with our custom level methods defined above
# we do indeed (now) require another stack level??
stacklevel=stacklevel,
**kwargs,
)
# LOL, the stdlib doesn't allow passing through ``stacklevel``..
def _log(
@ -122,12 +164,15 @@ class StackLevelAdapter(logging.LoggerAdapter):
stack_info=False,
# XXX: bit we added to show fileinfo from actual caller.
# this level then ``.log()`` then finally the caller's level..
stacklevel=3,
# - this level
# - then ``.log()``
# - then finally the caller's level..
stacklevel=4,
):
"""
'''
Low-level log implementation, proxied to allow nested logger adapters.
"""
'''
return self.logger._log(
level,
msg,
@ -181,15 +226,39 @@ def get_logger(
'''
log = rlog = logging.getLogger(_root_name)
if name and name != _proj_name:
if (
name
and name != _proj_name
):
# handling for modules that use ``get_logger(__name__)`` to
# avoid duplicate project-package token in msg output
rname, _, tail = name.partition('.')
if rname == _root_name:
name = tail
# NOTE: for handling for modules that use ``get_logger(__name__)``
# we make the following stylistic choice:
# - always avoid duplicate project-package token
# in msg output: i.e. tractor.tractor _ipc.py in header
# looks ridiculous XD
# - never show the leaf module name in the {name} part
# since in python the {filename} is always this same
# module-file.
sub_name: None | str = None
rname, _, sub_name = name.partition('.')
pkgpath, _, modfilename = sub_name.rpartition('.')
# NOTE: for tractor itself never include the last level
# module key in the name such that something like: eg.
# 'tractor.trionics._broadcast` only includes the first
# 2 tokens in the (coloured) name part.
if rname == 'tractor':
sub_name = pkgpath
if _root_name in sub_name:
duplicate, _, sub_name = sub_name.partition('.')
if not sub_name:
log = rlog
else:
log = rlog.getChild(sub_name)
log = rlog.getChild(name)
log.level = rlog.level
# add our actor-task aware adapter which will dynamically look up
@ -220,11 +289,19 @@ def get_console_log(
if not level:
return log
log.setLevel(level.upper() if not isinstance(level, int) else level)
log.setLevel(
level.upper()
if not isinstance(level, int)
else level
)
if not any(
handler.stream == sys.stderr # type: ignore
for handler in logger.handlers if getattr(handler, 'stream', None)
for handler in logger.handlers if getattr(
handler,
'stream',
None,
)
):
handler = logging.StreamHandler()
formatter = colorlog.ColoredFormatter(
@ -242,3 +319,7 @@ def get_console_log(
def get_loglevel() -> str:
return _default_loglevel
# global module logger for tractor itself
log = get_logger('tractor')

View File

@ -0,0 +1,26 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Built-in messaging patterns, types, APIs and helpers.
'''
from .ptr import (
NamespacePath as NamespacePath,
)
from .types import (
Struct as Struct,
)

View File

@ -15,7 +15,7 @@
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Built-in messaging patterns, types, APIs and helpers.
IPC-compat cross-mem-boundary object pointer.
'''
@ -43,38 +43,92 @@ Built-in messaging patterns, types, APIs and helpers.
# - https://github.com/msgpack/msgpack-python#packingunpacking-of-custom-data-type
from __future__ import annotations
from inspect import (
isfunction,
ismethod,
)
from pkgutil import resolve_name
class NamespacePath(str):
'''
A serializeable description of a (function) Python object location
described by the target's module path and namespace key meant as
a message-native "packet" to allow actors to point-and-load objects
by absolute reference.
A serializable `str`-subtype implementing a "namespace
pointer" to any Python object reference (like a function)
using the same format as the built-in `pkgutil.resolve_name()`
system.
A value describes a target's module-path and namespace-key
separated by a ':' and thus can be easily used as
an IPC-message-native reference-type allowing memory-isolated
actors to point-and-load objects via a minimal `str` value.
'''
_ref: object = None
_ref: object | type | None = None
def load_ref(self) -> object:
# TODO: support providing the ns instance in
# order to support 'self.<meth>` style to make
# `Portal.run_from_ns()` work!
# _ns: ModuleType|type|None = None
def load_ref(self) -> object | type:
if self._ref is None:
self._ref = resolve_name(self)
return self._ref
def to_tuple(
self,
@staticmethod
def _mk_fqnp(ref: type | object) -> tuple[str, str]:
'''
Generate a minimal ``str`` pair which describes a Python
object's namespace path and object/type name.
) -> tuple[str, str]:
ref = self.load_ref()
return ref.__module__, getattr(ref, '__name__', '')
In more precise terms something like:
- 'py.namespace.path:object_name',
- eg.'tractor.msg:NamespacePath' will be the ``str`` form
of THIS type XD
'''
if (
isfunction(ref)
):
name: str = getattr(ref, '__name__')
elif ismethod(ref):
# build out the path manually i guess..?
# TODO: better way?
name: str = '.'.join([
type(ref.__self__).__name__,
ref.__func__.__name__,
])
else: # object or other?
# isinstance(ref, object)
# and not isfunction(ref)
name: str = type(ref).__name__
# fully qualified namespace path, tuple.
fqnp: tuple[str, str] = (
ref.__module__,
name,
)
return fqnp
@classmethod
def from_ref(
cls,
ref,
ref: type | object,
) -> NamespacePath:
return cls(':'.join(
(ref.__module__,
getattr(ref, '__name__', ''))
))
fqnp: tuple[str, str] = cls._mk_fqnp(ref)
return cls(':'.join(fqnp))
def to_tuple(
self,
# TODO: could this work re `self:<meth>` case from above?
# load_ref: bool = True,
) -> tuple[str, str]:
return self._mk_fqnp(
self.load_ref()
)
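A minimal round-trip sketch using a stdlib function as the target ref:
    from textwrap import dedent  # any plain Python function works

    nsp = NamespacePath.from_ref(dedent)
    assert str(nsp) == 'textwrap:dedent'            # '<module-path>:<name>'
    assert nsp.load_ref() is dedent                  # resolved lazily, then cached
    assert nsp.to_tuple() == ('textwrap', 'dedent')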

View File

@ -0,0 +1,270 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Extensions to built-in or (heavily used but 3rd party) friend-lib
types.
'''
from __future__ import annotations
from collections import UserList
from pprint import (
saferepr,
)
from typing import (
Any,
Iterator,
)
from msgspec import (
msgpack,
Struct as _Struct,
structs,
)
# TODO: auto-gen type sig for input func both for
# type-msgs and logging of RPC tasks?
# taken and modified from:
# https://stackoverflow.com/a/57110117
# import inspect
# from typing import List
# def my_function(input_1: str, input_2: int) -> list[int]:
# pass
# def types_of(func):
# specs = inspect.getfullargspec(func)
# return_type = specs.annotations['return']
# input_types = [t.__name__ for s, t in specs.annotations.items() if s != 'return']
# return f'{func.__name__}({": ".join(input_types)}) -> {return_type}'
# types_of(my_function)
class DiffDump(UserList):
'''
Very simple list delegator that repr() dumps (presumed) tuple
elements of the form `tuple[str, Any, Any]` in a nice
multi-line readable form for analyzing `Struct` diffs.
'''
def __repr__(self) -> str:
if not len(self):
return super().__repr__()
# format by displaying item pair's ``repr()`` on multiple,
# indented lines such that they are more easily visually
# comparable when printed to console.
repstr: str = '[\n'
for k, left, right in self:
repstr += (
f'({k},\n'
f'\t{repr(left)},\n'
f'\t{repr(right)},\n'
')\n'
)
repstr += ']\n'
return repstr
class Struct(
_Struct,
# https://jcristharif.com/msgspec/structs.html#tagged-unions
# tag='pikerstruct',
# tag=True,
):
'''
A "human friendlier" (aka repl buddy) struct subtype.
'''
def _sin_props(self) -> Iterator[
tuple[
structs.FieldInfo,
str,
Any,
]
]:
'''
Iterate over all non-@property fields of this struct.
'''
fi: structs.FieldInfo
for fi in structs.fields(self):
key: str = fi.name
val: Any = getattr(self, key)
yield fi, key, val
def to_dict(
self,
include_non_members: bool = True,
) -> dict:
'''
Like it sounds.. direct delegation to:
https://jcristharif.com/msgspec/api.html#msgspec.structs.asdict
BUT, by default we pop all non-member fields (aka those not
defined as struct fields).
'''
asdict: dict = structs.asdict(self)
if include_non_members:
return asdict
# only return a dict of the struct members
# which were provided as input, NOT anything
# added as type-defined `@property` methods!
sin_props: dict = {}
fi: structs.FieldInfo
for fi, k, v in self._sin_props():
sin_props[k] = asdict[k]
return sin_props
def pformat(
self,
field_indent: int = 2,
indent: int = 0,
) -> str:
'''
Recursion-safe `pprint.pformat()` style formatting of
a `msgspec.Struct` for sane reading by a human using a REPL.
'''
# global whitespace indent
ws: str = ' '*indent
# field whitespace indent
field_ws: str = ' '*(field_indent + indent)
# qtn: str = ws + self.__class__.__qualname__
qtn: str = self.__class__.__qualname__
obj_str: str = '' # accumulator
fi: structs.FieldInfo
k: str
v: Any
for fi, k, v in self._sin_props():
# TODO: how can we prefer `Literal['option1', 'option2,
# ..]` over .__name__ == `Literal` but still get only the
# latter for simple types like `str | int | None` etc..?
ft: type = fi.type
typ_name: str = getattr(ft, '__name__', str(ft))
# recurse to get sub-struct's `.pformat()` output Bo
if isinstance(v, Struct):
val_str: str = v.pformat(
indent=field_indent + indent,
field_indent=indent + field_indent,
)
else: # the `pprint` recursion-safe format:
# https://docs.python.org/3.11/library/pprint.html#pprint.saferepr
val_str: str = saferepr(v)
# TODO: LOLOL use `textwrap.indent()` instead dawwwwwg!
obj_str += (field_ws + f'{k}: {typ_name} = {val_str},\n')
return (
f'{qtn}(\n'
f'{obj_str}'
f'{ws})'
)
# TODO: use a pprint.PrettyPrinter instance around ONLY rendering
# inside a known tty?
# def __repr__(self) -> str:
# ...
# __str__ = __repr__ = pformat
__repr__ = pformat
def copy(
self,
update: dict | None = None,
) -> Struct:
'''
Validate-typecast all self defined fields, return a copy of
us with all such fields.
NOTE: This is kinda like the default behaviour in
`pydantic.BaseModel` except a copy of the object is
returned, making it compatible with `frozen=True`.
'''
if update:
for k, v in update.items():
setattr(self, k, v)
# NOTE: roundtrip serialize to validate
# - encode to msgpack binary format,
# - decode that back to a struct.
return msgpack.Decoder(type=type(self)).decode(
msgpack.Encoder().encode(self)
)
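A quick sketch of the copy-with-validation behaviour using a hypothetical `Point` subtype (not defined in this module):
    class Point(Struct):
        x: int
        y: int

    p = Point(x=1, y=2)
    p2 = p.copy(update={'y': 3})
    assert (p2.x, p2.y) == (1, 3)
    assert p2 is not p   # a new, round-trip-validated instance is returned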
def typecast(
self,
# TODO: allow only casting a named subset?
# fields: set[str] | None = None,
) -> None:
'''
Cast all fields using their declared type annotations
(kinda like what `pydantic` does by default).
NOTE: this of course won't work on frozen types, use
``.copy()`` above in such cases.
'''
# https://jcristharif.com/msgspec/api.html#msgspec.structs.fields
fi: structs.FieldInfo
for fi in structs.fields(self):
setattr(
self,
fi.name,
fi.type(getattr(self, fi.name)),
)
def __sub__(
self,
other: Struct,
) -> DiffDump[tuple[str, Any, Any]]:
'''
Compare fields/items key-wise and return a ``DiffDump``
for easy visual REPL comparison B)
'''
diffs: DiffDump[tuple[str, Any, Any]] = DiffDump()
for fi in structs.fields(self):
attr_name: str = fi.name
ours: Any = getattr(self, attr_name)
theirs: Any = getattr(other, attr_name)
if ours != theirs:
diffs.append((
attr_name,
ours,
theirs,
))
return diffs
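# A minimal usage sketch (hypothetical `Point` type, assuming `msgspec` is
# installed) of the `Struct` subtype defined above: REPL-friendly rendering,
# member-only dict dumps, round-trip validating copies and field-wise diffing.

class Point(Struct):
    x: int
    y: int
    units: str = 'px'

p = Point(x=1, y=2)
print(p.pformat())                            # multi-line, recursion-safe repr
print(p.to_dict(include_non_members=False))   # only the declared struct fields

p2 = p.copy(update={'y': 3})                  # re-validates via a msgpack round-trip
for field, ours, theirs in (p - p2):          # `__sub__` -> `DiffDump` of changed fields
    print(field, ours, theirs)                # -> y 2 3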

View File

@ -28,16 +28,19 @@ from typing import (
Callable,
AsyncIterator,
Awaitable,
Optional,
)
import trio
from outcome import Error
from .log import get_logger
from ._state import current_actor
from ._exceptions import AsyncioCancelled
from .trionics._broadcast import (
from tractor.log import get_logger
from tractor._state import (
current_actor,
debug_mode,
)
from tractor.devx import _debug
from tractor._exceptions import AsyncioCancelled
from tractor.trionics._broadcast import (
broadcast_receiver,
BroadcastReceiver,
)
@ -65,9 +68,9 @@ class LinkedTaskChannel(trio.abc.Channel):
_trio_exited: bool = False
# set after ``asyncio.create_task()``
_aio_task: Optional[asyncio.Task] = None
_aio_err: Optional[BaseException] = None
_broadcaster: Optional[BroadcastReceiver] = None
_aio_task: asyncio.Task|None = None
_aio_err: BaseException|None = None
_broadcaster: BroadcastReceiver|None = None
async def aclose(self) -> None:
await self._from_aio.aclose()
@ -159,7 +162,9 @@ def _run_asyncio_task(
'''
__tracebackhide__ = True
if not current_actor().is_infected_aio():
raise RuntimeError("`infect_asyncio` mode is not enabled!?")
raise RuntimeError(
"`infect_asyncio` mode is not enabled!?"
)
# ITC (inter task comms), these channel/queue names are mostly from
# ``asyncio``'s perspective.
@ -188,7 +193,7 @@ def _run_asyncio_task(
cancel_scope = trio.CancelScope()
aio_task_complete = trio.Event()
aio_err: Optional[BaseException] = None
aio_err: BaseException|None = None
chan = LinkedTaskChannel(
aio_q, # asyncio.Queue
@ -217,7 +222,14 @@ def _run_asyncio_task(
try:
result = await coro
except BaseException as aio_err:
log.exception('asyncio task errored')
if isinstance(aio_err, CancelledError):
log.runtime(
'`asyncio` task was cancelled..\n'
)
else:
log.exception(
'`asyncio` task errored\n'
)
chan._aio_err = aio_err
raise
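# Editor's sketch (standalone, hypothetical): the branch added above just keeps
# expected `asyncio` cancellations at a low log level while real crashes still
# surface a full traceback; roughly the same shape as:
import asyncio

async def watched(coro):
    try:
        return await coro
    except asyncio.CancelledError:
        print('`asyncio` task was cancelled..')    # quiet, expected path
        raise
    except BaseException as err:
        print(f'`asyncio` task errored: {err!r}')  # `log.exception()` upstream
        raise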
@ -247,7 +259,7 @@ def _run_asyncio_task(
if not inspect.isawaitable(coro):
raise TypeError(f"No support for invoking {coro}")
task = asyncio.create_task(
task: asyncio.Task = asyncio.create_task(
wait_on_coro_final_result(
to_trio,
coro,
@ -256,6 +268,18 @@ def _run_asyncio_task(
)
chan._aio_task = task
# XXX TODO XXX get this actually workin.. XD
# maybe setup `greenback` for `asyncio`-side task REPLing
if (
debug_mode()
and
(greenback := _debug.maybe_import_greenback(
force_reload=True,
raise_not_found=False,
))
):
greenback.bestow_portal(task)
def cancel_trio(task: asyncio.Task) -> None:
'''
Cancel the calling ``trio`` task on error.
@ -263,7 +287,7 @@ def _run_asyncio_task(
'''
nonlocal chan
aio_err = chan._aio_err
task_err: Optional[BaseException] = None
task_err: BaseException|None = None
# only to avoid ``asyncio`` complaining about uncaptured
# task exceptions
@ -272,12 +296,22 @@ def _run_asyncio_task(
except BaseException as terr:
task_err = terr
msg: str = (
'Infected `asyncio` task {etype_str}\n'
f'|_{task}\n'
)
if isinstance(terr, CancelledError):
log.cancel(f'`asyncio` task cancelled: {task.get_name()}')
log.cancel(
msg.format(etype_str='cancelled')
)
else:
log.exception(f'`asyncio` task: {task.get_name()} errored')
log.exception(
msg.format(etype_str='errored')
)
assert type(terr) is type(aio_err), 'Asyncio task error mismatch?'
assert type(terr) is type(aio_err), (
'`asyncio` task error mismatch?!?'
)
if aio_err is not None:
# XXX: uhh is this true?
@ -290,18 +324,22 @@ def _run_asyncio_task(
# We might want to change this in the future though.
from_aio.close()
if type(aio_err) is CancelledError:
log.cancel("infected task was cancelled")
# TODO: show that the cancellation originated
# from the ``trio`` side? right?
# if cancel_scope.cancelled:
# raise aio_err from err
elif task_err is None:
if task_err is None:
assert aio_err
aio_err.with_traceback(aio_err.__traceback__)
log.error('infected task errored')
# log.error(
# 'infected task errored'
# )
# TODO: show that the cancellation originated
# from the ``trio`` side? right?
# elif type(aio_err) is CancelledError:
# log.cancel(
# 'infected task was cancelled'
# )
# if cancel_scope.cancelled:
# raise aio_err from err
# XXX: always cancel the scope on error
# in case the trio task is blocking
@ -329,11 +367,11 @@ async def translate_aio_errors(
'''
trio_task = trio.lowlevel.current_task()
aio_err: Optional[BaseException] = None
aio_err: BaseException|None = None
# TODO: make this a channel method?
def maybe_raise_aio_err(
err: Optional[Exception] = None
err: Exception|None = None
) -> None:
aio_err = chan._aio_err
if (
@ -511,6 +549,16 @@ def run_as_asyncio_guest(
loop = asyncio.get_running_loop()
trio_done_fut = asyncio.Future()
if debug_mode():
# XXX make it obvi we know this isn't supported yet!
log.error(
'Attempting to enter unsupported `greenback` init '
'from `asyncio` task..'
)
await _debug.maybe_init_greenback(
force_reload=True,
)
def trio_done_callback(main_outcome):
if isinstance(main_outcome, Error):

View File

@ -19,22 +19,13 @@ Sugary patterns for trio + tractor designs.
'''
from ._mngrs import (
gather_contexts,
maybe_open_context,
maybe_open_nursery,
gather_contexts as gather_contexts,
maybe_open_context as maybe_open_context,
maybe_open_nursery as maybe_open_nursery,
)
from ._broadcast import (
broadcast_receiver,
BroadcastReceiver,
Lagged,
AsyncReceiver as AsyncReceiver,
broadcast_receiver as broadcast_receiver,
BroadcastReceiver as BroadcastReceiver,
Lagged as Lagged,
)
__all__ = [
'gather_contexts',
'broadcast_receiver',
'BroadcastReceiver',
'Lagged',
'maybe_open_context',
'maybe_open_nursery',
]
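# Editor's note (assumption, not stated in the diff): the `name as name` import
# form is the standard way to mark *explicit* re-exports for strict type
# checkers (e.g. mypy's implicit-reexport check / pyright), which also makes the
# old `__all__` list redundant. Downstream usage is unchanged:
from tractor.trionics import (
    broadcast_receiver,
    gather_contexts,
    maybe_open_context,
    maybe_open_nursery,
)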

View File

@ -25,8 +25,15 @@ from collections import deque
from contextlib import asynccontextmanager
from functools import partial
from operator import ne
from typing import Optional, Callable, Awaitable, Any, AsyncIterator, Protocol
from typing import Generic, TypeVar
from typing import (
Callable,
Awaitable,
Any,
AsyncIterator,
Protocol,
Generic,
TypeVar,
)
import trio
from trio._core._run import Task
@ -37,6 +44,11 @@ from tractor.log import get_logger
log = get_logger(__name__)
# TODO: use new type-vars syntax from 3.12
# https://realpython.com/python312-new-features/#dedicated-type-variable-syntax
# https://docs.python.org/3/whatsnew/3.12.html#whatsnew312-pep695
# https://docs.python.org/3/reference/simple_stmts.html#type
#
# A regular invariant generic type
T = TypeVar("T")
@ -102,7 +114,7 @@ class BroadcastState(Struct):
# broadcast event to wake up all sleeping consumer tasks
# on a newly produced value from the sender.
recv_ready: Optional[tuple[int, trio.Event]] = None
recv_ready: tuple[int, trio.Event]|None = None
# if a ``trio.EndOfChannel`` is received on any
# consumer all consumers should be placed in this state
@ -156,7 +168,7 @@ class BroadcastReceiver(ReceiveChannel):
rx_chan: AsyncReceiver,
state: BroadcastState,
receive_afunc: Optional[Callable[[], Awaitable[Any]]] = None,
receive_afunc: Callable[[], Awaitable[Any]]|None = None,
raise_on_lag: bool = True,
) -> None:
@ -444,7 +456,7 @@ def broadcast_receiver(
recv_chan: AsyncReceiver,
max_buffer_size: int,
receive_afunc: Optional[Callable[[], Awaitable[Any]]] = None,
receive_afunc: Callable[[], Awaitable[Any]]|None = None,
raise_on_lag: bool = True,
) -> BroadcastReceiver:

View File

@ -33,10 +33,9 @@ from typing import (
)
import trio
from trio_typing import TaskStatus
from .._state import current_actor
from ..log import get_logger
from tractor._state import current_actor
from tractor.log import get_logger
log = get_logger(__name__)
@ -70,6 +69,7 @@ async def _enter_and_wait(
unwrapped: dict[int, T],
all_entered: trio.Event,
parent_exit: trio.Event,
seed: int,
) -> None:
'''
@ -80,7 +80,10 @@ async def _enter_and_wait(
async with mngr as value:
unwrapped[id(mngr)] = value
if all(unwrapped.values()):
if all(
val != seed
for val in unwrapped.values()
):
all_entered.set()
await parent_exit.wait()
@ -91,7 +94,13 @@ async def gather_contexts(
mngrs: Sequence[AsyncContextManager[T]],
) -> AsyncGenerator[tuple[Optional[T], ...], None]:
) -> AsyncGenerator[
tuple[
T | None,
...
],
None,
]:
'''
Concurrently enter a sequence of async context managers, each in
a separate ``trio`` task and deliver the unwrapped values in the
@ -104,7 +113,11 @@ async def gather_contexts(
entered and exited, and cancellation just works.
'''
unwrapped: dict[int, Optional[T]] = {}.fromkeys(id(mngr) for mngr in mngrs)
seed: int = id(mngrs)
unwrapped: dict[int, T | None] = {}.fromkeys(
(id(mngr) for mngr in mngrs),
seed,
)
all_entered = trio.Event()
parent_exit = trio.Event()
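# Editor's sketch (standalone, hypothetical values): why the `seed` sentinel
# above matters. A manager may legitimately yield a falsy value (None, 0, ''),
# so "has every manager entered yet?" can't be answered with `all(values)`;
# instead every slot is pre-filled with a value no manager can ever yield and
# we check that each slot has moved off that sentinel.
sentinel = object()
unwrapped = {1: sentinel, 2: sentinel}

unwrapped[1] = None   # first mngr entered, yielding a legit `None`
assert not all(v is not sentinel for v in unwrapped.values())  # correctly "not all entered"

unwrapped[2] = 0      # second mngr entered, yielding a falsy 0
assert not all(unwrapped.values())                              # naive check would block forever
assert all(v is not sentinel for v in unwrapped.values())       # sentinel check: all entered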
@ -116,8 +129,9 @@ async def gather_contexts(
if not mngrs:
raise ValueError(
'input mngrs is empty?\n'
'Did try to use inline generator syntax?'
'`.trionics.gather_contexts()` input mngrs is empty?\n'
'Did you try to use inline generator syntax?\n'
'Use a non-lazy iterator or sequence type instead!'
)
async with trio.open_nursery() as n:
@ -128,6 +142,7 @@ async def gather_contexts(
unwrapped,
all_entered,
parent_exit,
seed,
)
# deliver control once all managers have started up
@ -168,7 +183,7 @@ class _Cache:
cls,
mng,
ctx_key: tuple,
task_status: TaskStatus[T] = trio.TASK_STATUS_IGNORED,
task_status: trio.TaskStatus[T] = trio.TASK_STATUS_IGNORED,
) -> None:
async with mng as value:
@ -209,6 +224,7 @@ async def maybe_open_context(
# yielded output
yielded: Any = None
lock_registered: bool = False
# Lock resource acquisition around task racing / ``trio``'s
# scheduler protocol.
@ -216,6 +232,7 @@ async def maybe_open_context(
# to allow re-entrant use cases where one `maybe_open_context()`
# wrapped factor may want to call into another.
lock = _Cache.locks.setdefault(fid, trio.Lock())
lock_registered: bool = True
await lock.acquire()
# XXX: one singleton nursery per actor and we want to
@ -237,7 +254,7 @@ async def maybe_open_context(
yielded = _Cache.values[ctx_key]
except KeyError:
log.info(f'Allocating new {acm_func} for {ctx_key}')
log.debug(f'Allocating new {acm_func} for {ctx_key}')
mngr = acm_func(**kwargs)
resources = _Cache.resources
assert not resources.get(ctx_key), f'Resource exists? {ctx_key}'
@ -265,7 +282,7 @@ async def maybe_open_context(
if yielded is not None:
# if no more consumers, teardown the client
if _Cache.users <= 0:
log.info(f'De-allocating resource for {ctx_key}')
log.debug(f'De-allocating resource for {ctx_key}')
# XXX: if we're cancelled the entry may have never
# been entered since the nursery task was killed.
@ -275,4 +292,9 @@ async def maybe_open_context(
_, no_more_users = entry
no_more_users.set()
_Cache.locks.pop(fid)
if lock_registered:
maybe_lock = _Cache.locks.pop(fid, None)
if maybe_lock is None:
log.error(
f'Resource lock for {fid} ALREADY POPPED?'
)
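# Editor's sketch (standalone, hypothetical): the teardown change above swaps an
# unconditional `dict.pop(key)` (KeyError if another exiting user already
# removed the lock) for a guarded `.pop(key, None)` plus an error log, and only
# runs it at all when this call actually registered the lock.
locks: dict[str, object] = {'fid-123': object()}

def teardown(fid: str, lock_registered: bool) -> None:
    if lock_registered:
        maybe_lock = locks.pop(fid, None)
        if maybe_lock is None:
            print(f'Resource lock for {fid} ALREADY POPPED?')

teardown('fid-123', lock_registered=True)   # removes the entry
teardown('fid-123', lock_registered=True)   # second call only logs, no KeyError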