Compare commits

97 Commits

Author SHA1 Message Date
Tyler Goodlet 7f29fd8dcf Let `pack_error()` take a msg injected `cid: str|None` 2024-02-18 17:17:31 -05:00
Tyler Goodlet 7fbada8a15 Add `StreamOverrun.sender: tuple` for better handling
Since it's generally useful to know who is the cause of an overrun (say
bc you want your system to then adjust the writer side to slow tf down)
might as well pack an extra `.sender: tuple[str, str]` actor uid field
which can be relayed through `RemoteActorError` boxing. Add an extra
case for the exc-type to `unpack_error()` to match B)
2024-02-16 15:23:02 -05:00
Tyler Goodlet 286e75d342 Offer `unpack_error(hid_tb: bool)` for `pdbp` REPL config 2024-02-14 16:13:32 -05:00
Tyler Goodlet df641d9d31 Bring in pretty-ified `msgspec.Struct` extension
Originally designed and used throughout `piker`, the subtype adds some
handy pprinting and field diffing extras often handy when viewing struct
types in logging or REPL console interfaces B)

Obvi this rejigs the `tractor.msg` mod into a sub-pkg and moves the
existing namespace obj-pointer stuff into a new `.msg.ptr` sub mod.
2024-01-28 16:33:10 -05:00
Tyler Goodlet 35b0c4bef0 Never mask original `KeyError` in portal-error unwrapper, for now? 2024-01-23 11:14:10 -05:00
Tyler Goodlet c4496f21fc Try allowing multi-pops of `_Cache.locks` for now? 2024-01-23 11:13:07 -05:00
Tyler Goodlet 7e0e627921 Use `import <blah> as blah` over `__all__` in `.trionics` 2024-01-23 11:09:38 -05:00
Tyler Goodlet 28ea8e787a Bump timeout on resource cache test a bitty bit. 2024-01-03 22:27:05 -05:00
Tyler Goodlet 0294455c5e `_root`: drop unused `typing` import 2024-01-02 18:43:43 -05:00
Tyler Goodlet 734bc09b67 Move missing-key-in-msg raiser to `._exceptions`
Since we use basically the exact same set of logic in
`Portal.open_context()` when expecting the first `'started'` msg factor
and generalize `._streaming._raise_from_no_yield_msg()` into a new
`._exceptions._raise_from_no_key_in_msg()` (as per the lingering todo)
which obvi requires a more generalized / optional signature including
a caller specific `log` obj. Obvi call the new func from all the other
modules X)
2024-01-02 18:34:15 -05:00
Tyler Goodlet 0bcdea28a0 Fmt repr as multi-line style call 2024-01-02 11:28:55 -05:00
Tyler Goodlet fdf3a1b01b Only use `greenback` if actor-runtime is up.. 2024-01-02 11:28:02 -05:00
Tyler Goodlet ce7b8a5e18 Drop unused walrus assign of `re` 2024-01-02 11:21:20 -05:00
Tyler Goodlet 00024181cd `StackLevelAdapter._log(stacklevel: int)` for custom levels..
Apparently (and i don't know if this was always broken [i feel like no?]
or is a recent change to stdlib's `logging` stuff) we need to increment the
`stacklevel` input by one for our custom level methods now? Without this
you're going to see the path to the method's-callstack-frame on every
emission instead of to the caller's. I first noticed this when debugging
the workspace layer spawning in `modden.bigd` and then verified it in
other dependent projects..

I guess we should add some tests for this as well XD
2024-01-02 10:38:04 -05:00
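A minimal sketch of the `stacklevel` bump described in 00024181cd, using only the stdlib `logging` module. The `RUNTIME` level, the `Capture` handler and both wrapper functions are hypothetical stand-ins, not `tractor`'s `StackLevelAdapter`:

```python
import logging

RUNTIME = 15  # hypothetical custom level, not taken from the commit
logging.addLevelName(RUNTIME, 'RUNTIME')


class Capture(logging.Handler):
    '''Collect emitted records so their `funcName` can be inspected.'''

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def runtime_naive(log: logging.Logger, msg: str) -> None:
    # no `stacklevel` bump: the record points at this wrapper's frame
    log.log(RUNTIME, msg)


def runtime(log: logging.Logger, msg: str, stacklevel: int = 1) -> None:
    # +1 skips this wrapper so the record points at our caller
    log.log(RUNTIME, msg, stacklevel=stacklevel + 1)


def user_code(log: logging.Logger) -> None:
    runtime_naive(log, 'attributed to the wrapper')
    runtime(log, 'attributed to the caller')


log = logging.getLogger('stacklevel_sketch')
log.setLevel(RUNTIME)
log.propagate = False
capture = Capture()
log.addHandler(capture)

user_code(log)
func_names = [rec.funcName for rec in capture.records]
```

Inspecting `func_names` shows which frame each record was attributed to; only the bumped variant reports the calling function rather than the wrapper.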
Tyler Goodlet 814384848d Use `import <name> as <name>,` style over `__all__` in pkg mod 2024-01-02 10:25:17 -05:00
Tyler Goodlet bea31f6d19 ._child: remove some unused imports.. 2024-01-02 10:24:39 -05:00
Tyler Goodlet 250275d98d Guarding for IPC failures in `._runtime._invoke()`
Took me longer than i wanted to figure out the source of
a failed-response to a remote-cancellation (in this case in `modden`
where a client was cancelling a workspace layer.. but disconnects before
receiving the ack msg) that was triggering an IPC error when sending the
error msg for the cancellation of a `Actor._cancel_task()`, but since
this (non-rpc) `._invoke()` task was trying to send to a now
disconnected canceller it was resulting in a `BrokenPipeError` (or similar)
error.

Now, we except for such IPC errors and only raise them when,
1. the transport `Channel` is for sure up (bc ow what's the point of
   trying to send an error on the thing that caused it..)
2. it's definitely for handling an RPC task

Similarly if the entire main invoke `try:` excepts,
- we only hide the call-stack frame from the debugger (with
  `__tracebackhide__: bool`) if it's an RPC task that has a connected
  channel since we always want to see the frame when debugging internal
  task or IPC failures.
- we don't bother trying to send errors to the context caller (actor)
  when it's a non-RPC request since failures on actor-runtime-internal
  tasks shouldn't really ever be reported remotely, only maybe raised
  locally.

Also some other tidying,
- this properly corrects for the self-cancel case where an RPC context
  is cancelled due to a local (runtime) task calling a method like
  `Actor.cancel_soon()`. We now set our own `.uid` as the
  `ContextCancelled.canceller` value so that other-end tasks know that
  the cancellation was due to a self-cancellation by the actor itself.
  We still need to properly test for this though!
- add a more detailed module doc-str.
- more explicit imports for `trio` core types throughout.
2024-01-02 10:23:45 -05:00
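The two re-raise conditions listed in 250275d98d can be sketched roughly as below. `ToyChannel` and `try_ship_error()` are hypothetical names invented for illustration; the real logic lives in `._runtime._invoke()` and is more involved:

```python
class ToyChannel:
    '''Hypothetical stand-in for an IPC channel, not `tractor.Channel`.'''

    def __init__(self, is_connected: bool, send_fails: bool) -> None:
        self._is_connected = is_connected
        self._send_fails = send_fails

    def connected(self) -> bool:
        return self._is_connected

    def send(self, msg: dict) -> None:
        if self._send_fails:
            raise BrokenPipeError('peer went away')


def try_ship_error(chan: ToyChannel, err_msg: dict, is_rpc: bool) -> bool:
    '''
    Try to send an error msg; return `True` if it was delivered.

    '''
    try:
        chan.send(err_msg)
        return True
    except (BrokenPipeError, ConnectionError):
        # only surface the IPC failure when the transport is still up
        # AND the failed task was serving an RPC request
        if chan.connected() and is_rpc:
            raise
        return False
```

In this sketch a send failure to a disconnected canceller, or from a runtime-internal (non-RPC) task, is swallowed and reported via the `False` return value instead of propagating.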
Tyler Goodlet f415fc43ce `.discovery.get_arbiter()`: add warning around this now deprecated usage 2023-12-11 19:37:45 -05:00
Tyler Goodlet 3f15923537 More thorough hard kill doc strings 2023-12-11 18:17:42 -05:00
Tyler Goodlet 87cd725adb Add `open_root_actor(ensure_registry: bool)`
Allows forcing the opened actor to either obtain the passed registry
addrs or raise a runtime error.
2023-11-07 16:45:24 -05:00
Tyler Goodlet 48accbd28f Fix doc string "its" typo.. 2023-11-06 15:44:21 -05:00
Tyler Goodlet 227c9ea173 Test with `any(portals)` since `gather_contexts()` will return `list[None | tuple]` 2023-11-06 15:43:43 -05:00
Tyler Goodlet d651f3d8e9 Tons of interpeer test cleanup
Drop all the nested `@acm` blocks and defunct comments from initial
validations. Add some todos for cases that are still unclear such as
whether the caller / streamer should have `.cancelled_caught == True` in
its teardown.
2023-10-25 15:21:41 -04:00
Tyler Goodlet ef0cfc4b20 Get inter-peer suite passing with all `Context` state checks!
Definitely needs some cleaning and refinement but this gets us to stage
1 of being pretty frickin correct i'd say 💃
2023-10-23 18:24:23 -04:00
Tyler Goodlet ecb525a2bc Adjust test details where `Context.cancel()` is called
We can now make asserts on `.cancelled_caught` and `_remote_error` vs.
`_local_error`. Expect a runtime error when `Context.open_stream()` is
called AFTER `.cancel()` and the remote `ContextCancelled` hasn't
arrived (yet). Adjust to `'itself'` string in self-cancel case.
2023-10-23 17:49:02 -04:00
Tyler Goodlet b77d123edd Fix `Context.result()` call to be in runtime scope 2023-10-23 17:48:34 -04:00
Tyler Goodlet f4e63465de Tweak `Channel._cancel_called` comment 2023-10-23 17:47:55 -04:00
Tyler Goodlet df31047ecb Be ultra-correct in `Portal.open_context()`
This took way too long to get right but hopefully will give us grok-able
and correct context exit semantics going forward B)

The main fixes were:
- always shielding the `MsgStream.aclose()` call on teardown to avoid
  bubbling a `Cancelled`.
- properly absorbing any `ContextCancelled` in cases due to "self
  cancellation" using the new `Context.canceller` in the logic.
- capturing any error raised by the `Context.result()` call in the
  "normal exit, result received" case and setting it as the
  `Context._local_error` so that self-cancels can be easily measured via
  `Context.cancelled_caught` in same way as remote-error caused
  cancellations.
- extremely detailed comments around all of the cancellation-error cases
  to avoid ever getting confused about the control flow in the future XD
2023-10-23 17:34:28 -04:00
Tyler Goodlet 131674eabd Be mega-pedantic with `ContextCancelled` semantics
As part of extremely detailed inter-peer-actor testing, add much more
granular `Context` cancellation state tracking via the following (new)
fields:
- `.canceller: tuple[str, str]` the uuid of the actor responsible for
  the cancellation condition - always set by
  `Context._maybe_cancel_and_set_remote_error()` and replaces
  `._cancelled_remote` and `.cancel_called_remote`. If set, this value
  should normally always match a value from some `ContextCancelled`
  raised or caught by one side of the context.
- `._local_error` which is always set to the locally raised (and caller
  or callee task's scope-internal) error which caused any
  eventual cancellation/error condition and thus any closure of the
  context's per-task-side-`trio.Nursery`.
- `.cancelled_caught: bool` is now always `True` whenever the local task
  catches (or "silently absorbs") a `ContextCancelled` (a `ctxc`) that
  indeed originated from one of the context's linked tasks or any other
  context which raised its own `ctxc` in the current `.open_context()` scope.
  => whenever there is a case that no `ContextCancelled` was raised
  **in** the `.open_context().__aexit__()` (eg. `ctx.result()` called
  after a call to `ctx.cancel()`), we still consider the context as
  having "caught a cancellation" since the `ctxc` was indeed silently
  handled by the cancel requester; all other error cases are already
  represented by mirroring the state of the `._scope: trio.CancelScope`
  => IOW there should be **no case** where an error is **not raised** in
  the context's scope and `.cancelled_caught: bool == False`, i.e. no
  case where `._scope.cancelled_caught == False and ._local_error is not
  None`!
- always raise any `ctxc` from `.open_stream()` if `._cancel_called ==
  True` - if the cancellation request has not already resulted in
  a `._remote_error: ContextCancelled` we raise a `RuntimeError` to
  indicate improper usage to the guilty side's task code.
- make `._maybe_raise_remote_err()` a sync func and don't raise
  any `ctxc` which is matched against a `.canceller` determined to
  be the current actor, aka a "self cancel", and always set the
  `._local_error` to any such `ctxc`.
- `.side: str` taken from inside `.cancel()` and unused as of now since
  it might be better re-written as a similar `.is_opener() -> bool`?
- drop unused `._started_received: bool`..
- TONS and TONS of detailed comments/docs to attempt to explain all the
  possible cancellation/exit cases and how they should exhibit as either
  silent closes or raises from the `Context` API!

Adjust the `._runtime._invoke()` code to match:
- use `ctx._maybe_raise_remote_err()` in `._invoke()`.
- adjust to new `.canceller` property.
- more type hints.
- better `log.cancel()` msging around self-cancels vs. peer-cancels.
- always set the `._local_error: BaseException` for the "callee" task
  just like `Portal.open_context()` now will do B)

Prior we were raising any `Context._remote_error` directly and doing
(more or less) the same `ContextCancelled` "absorbing" logic (well
kinda) in block; instead delegate to the method.
2023-10-23 16:24:54 -04:00
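A rough sketch of the "self cancel" absorption rule from 131674eabd: a `ContextCancelled` whose `.canceller` matches the current actor is recorded as the local error but not raised, while any other error propagates. `ToyContext` and its constructor are hypothetical, and the exception class here is a local stand-in rather than `tractor.ContextCancelled`:

```python
from __future__ import annotations


class ContextCancelled(Exception):
    '''Local stand-in carrying the uid of the cancelling actor.'''

    def __init__(self, message: str, canceller: tuple[str, str]) -> None:
        super().__init__(message)
        self.canceller = canceller


class ToyContext:
    '''Hypothetical, heavily reduced model of a context's error state.'''

    def __init__(self, our_uid: tuple[str, str]) -> None:
        self.our_uid = our_uid
        self._local_error: BaseException | None = None

    def maybe_raise_remote_err(self, err: Exception) -> None:
        if (
            isinstance(err, ContextCancelled)
            and err.canceller == self.our_uid
        ):
            # self-cancel: remember the error but absorb it silently
            self._local_error = err
            return

        # peer-cancel or any other remote error: always raise
        raise err
```

Keeping the method synchronous, as the commit does, means it can be called from both the opener side and the callee side without a checkpoint in between.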
Tyler Goodlet 5a94e8fb5b Raise a `MessagingError` from the src error on msging edge cases 2023-10-23 14:34:12 -04:00
Tyler Goodlet 0518b3ab04 Move `MessagingError` into `._exceptions` set 2023-10-23 14:17:36 -04:00
Tyler Goodlet 2f0bed3018 Ignore `greenback` import error if not installed 2023-10-19 12:41:15 -04:00
Tyler Goodlet 9da3b63644 Change remaining internals to use `Actor.reg_addrs` 2023-10-19 12:40:37 -04:00
Tyler Goodlet 1d6f55543d Expose per-actor registry addrs via `.reg_addrs`
Since it's handy to be able to debug the *writing* of this instance var
(particularly when checking state passed down to a child in
`Actor._from_parent()`), rename and wrap the underlying
`Actor._reg_addrs` as a settable `@property` and add validation to
the `.setter` for sanity - actor discovery is a critical functionality.

Other tweaks:
- fix `.cancel_soon()` to pass expected argument..
- update internal runtime error message to be simpler and link to GH issues.
- use new `Actor.reg_addrs` throughout core.
2023-10-19 12:38:27 -04:00
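The settable-`@property` pattern from 1d6f55543d might look something like the following. The `Actor` class here is a hypothetical stand-in and the specific validation rules are assumptions; the commit only says that the setter validates its input:

```python
class Actor:
    '''Hypothetical stand-in, not the real `tractor.Actor`.'''

    def __init__(self, registry_addrs: list) -> None:
        self._reg_addrs = []
        self.reg_addrs = registry_addrs  # routed through the setter

    @property
    def reg_addrs(self) -> list:
        return self._reg_addrs

    @reg_addrs.setter
    def reg_addrs(self, addrs: list) -> None:
        # assumed rules: non-empty sequence of (host, port) tuples
        if not addrs:
            raise ValueError('at least one registry address is required')

        for addr in addrs:
            if not (isinstance(addr, tuple) and len(addr) == 2):
                raise ValueError(
                    f'expected a (host, port) tuple, got {addr!r}'
                )

        self._reg_addrs = list(addrs)
```

Because every write goes through the setter, a breakpoint or log line placed there catches all assignments, which is the debugging benefit the commit message calls out.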
Tyler Goodlet a3ed30e62b Get remaining suites passing..
..by ensuring `reg_addr` fixture value passthrough to subactor eps
2023-10-19 11:51:47 -04:00
Tyler Goodlet 42d621bba7 Always dynamically re-read the `._root._default_lo_addrs` value in `find_actor()` 2023-10-18 19:10:04 -04:00
Tyler Goodlet 2e81ccf5b4 Dump `.msgdata` in `RemoteActorError.__repr__()` 2023-10-18 19:09:07 -04:00
Tyler Goodlet 022bf8ce75 Ensure `registry_addrs` is always set to something 2023-10-18 19:08:35 -04:00
Tyler Goodlet 0e9457299c Port all tests to new `reg_addr` fixture name 2023-10-18 15:39:20 -04:00
Tyler Goodlet 6b1ceee19f Type out the full-fledged streaming ex. 2023-10-18 15:36:00 -04:00
Tyler Goodlet 1e689ee701 Rename fixture `arb_addr` -> `reg_addr` and set the session value globally as `._root._default_lo_addrs` 2023-10-18 15:35:35 -04:00
Tyler Goodlet 190845ce1d Add masked super timeout line to `do_hard_kill()` for would-be runtime hackers 2023-10-18 15:29:43 -04:00
Tyler Goodlet 0c74b04c83 Facepalm, `wait_for_actor()` dun take an addr `list`.. 2023-10-18 15:22:54 -04:00
Tyler Goodlet 215fec1d41 Change old `._debug._pause()` name, cherry to #362 re `greenback` 2023-10-18 15:01:04 -04:00
Tyler Goodlet fcc8cee9d3 ._root: set a `_default_lo_addrs` and apply it when not provided by caller 2023-10-18 14:12:58 -04:00
Tyler Goodlet ca3f7a1b6b Add a first serious inter-peer remote cancel suite
Tests that appropriate `Context` exit state, the relay of
a `ContextCancelled` error and its `.canceller: tuple[str, str]` value
are set when an inter-peer cancellation happens via an "out of band"
request method (in this case using `Portal.cancel_actor()`) and that
cancellation is propagated "horizontally" to other peers. Verify that
any such cancellation scenario which also experiences an "error during
`ContextCancelled` handling" DOES NOT result in that further error being
suppressed and that the user's exception bubbles out of the
`Context.open_context()` block(s) appropriately!

Likely more tests to come as well as some factoring of the teardown
state checks where possible.

Pertains to seriously testing the major work landing in #357
2023-10-18 13:59:08 -04:00
Tyler Goodlet 87c1113de4 Always set default reg addr in `find_actor()` if not defined 2023-10-18 13:20:29 -04:00
Tyler Goodlet 43b659dbe4 Tidy/clarify another `._runtime` comment 2023-10-18 13:19:34 -04:00
Tyler Goodlet 63b1488ab6 Get mega-pedantic in `Portal.open_context()`
Specifically in the `.__aexit__()` phase to ensure remote,
runtime-internal, and locally raised error-during-cancelled-handling
exceptions are NEVER masked by a local `ContextCancelled` or any
exception group of `trio.Cancelled`s.

Also adds a ton of details to doc strings including extreme detail
surrounding the `ContextCancelled` raising cases and their processing
inside `.open_context()`'s exception handler blocks.

Details, details:
- internal rename `err`/`_err` stuff to just be `scope_err` since it's
  effectively the error bubbled up from the context's surrounding (and
  cross-actor) "scope".
- always shield `._recv_chan.aclose()` to avoid any `Cancelled` from
  masking the `scope_err` with a runtime related `trio.Cancelled`.
- explicitly catch the specific set of `scope_err: BaseException` that
  we can reasonably expect to handle instead of the catch-all parent
  type including exception groups, cancels and KBIs.
2023-10-18 13:18:29 -04:00
Tyler Goodlet 7eb31f3fea Runtime import `.get_root()` in stdin hijacker to avoid import cycle 2023-10-17 16:52:31 -04:00
Tyler Goodlet 534e5d150d Drop `msg` kwarg from `Context.cancel()`
Well first off, turns out it's never used and generally speaking
doesn't seem to help much with "runtime hacking/debugging"; why would
we need to "fabricate" a msg when `.cancel()` is called to self-cancel?

Also (and since `._maybe_cancel_and_set_remote_error()` now takes an
`error: BaseException` as input and thus expects error-msg unpacking
prior to being called), we now manually set `Context._cancel_msg: dict`
just prior to any remote error assignment - so any case where we would
have fabbed a "cancel msg" near calling `.cancel()`, just do the manual
assign.

In this vein some other subtle changes:
- obviously don't set `._cancel_msg` in `.cancel()` since it's no longer
  an input.
- generally do walrus-style `error := unpack_error()` before applying
  and setting remote error-msg state.
- always raise any `._remote_error` in `.result()` instead of returning
  the exception instance and check before AND after the underlying mem
  chan read.
- add notes/todos around `raise self._remote_error from None` masking of
  (runtime) errors in `._maybe_raise_remote_err()` and use it inside
  `.result()` since we had the inverse duplicate logic there anyway..

Further, this adds and extends a ton of (internal) interface docs and
details comments around the `Context` API including many subtleties
pertaining to calling `._maybe_cancel_and_set_remote_error()`.
2023-10-17 16:50:52 -04:00
Tyler Goodlet e4a6223256 `._exceptions`: typing and error unpacking updates
Bump type annotations to 3.10+ style throughout module as well as fill
out doc strings a bit. Inside `unpack_error()` pop any `error_dict: dict`
and,
- return `None` early if not found,
- versus pass directly as `**error_dict` to the error constructor
  instead of a double field read.
2023-10-16 16:23:30 -04:00
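The early-return and `**error_dict` pass-through described in e4a6223256 can be illustrated with a reduced sketch. The `RemoteActorError` class below is a local stand-in, and the `'error'`, `'tb_str'` and `'type_str'` keys are assumed msg field names used only for the example:

```python
from __future__ import annotations


class RemoteActorError(Exception):
    '''Local stand-in which keeps the raw msg fields it was built from.'''

    def __init__(self, message: str, **msgdata) -> None:
        super().__init__(message)
        self.msgdata = msgdata


def unpack_error(msg: dict) -> RemoteActorError | None:
    # single read of the error payload; bail early when there is none
    error_dict = msg.get('error')
    if error_dict is None:
        return None

    tb_str = error_dict.get('tb_str', '')
    # hand the payload straight to the constructor instead of
    # re-reading each field from `msg`
    return RemoteActorError(
        f'remote task raised:\n{tb_str}',
        **error_dict,
    )


# walrus-style use at a call site
if (error := unpack_error({'error': {'tb_str': 'boom', 'type_str': 'ValueError'}})):
    boxed_type = error.msgdata['type_str']
```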
Tyler Goodlet ab2664da70 Runtime level log on debug REPL exits 2023-10-16 15:46:21 -04:00
Tyler Goodlet ae326cbb9a Ignore kbis in `open_crash_handler()` by default 2023-10-16 15:45:34 -04:00
Tyler Goodlet 07cec02303 Add comments around diff between `C/context` refs 2023-10-16 15:45:02 -04:00
Tyler Goodlet 2fdb8fc25a Factor non-yield stream msg processing into helper
Since both `MsgStream.receive()` and `.receive_nowait()` need the same
raising logic when a non-stream msg arrives (so that maybe an
appropriate IPC translated error can be raised) move the `KeyError`
handler code into a new `._streaming._raise_from_no_yield_msg()` func
and call it from both methods to make the error-interface-raising
symmetrical across both methods.
2023-10-16 15:35:16 -04:00
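As a sketch of the factoring in 2fdb8fc25a, both receive paths below delegate their `KeyError` handling to one helper. `ToyStream`, `EndOfStream` and `RemoteError` are hypothetical, `asyncio` stands in for the real `trio` runtime, and the `'stop'`/`'error'` keys are assumed msg shapes:

```python
import asyncio
from collections import deque


class EndOfStream(Exception):
    '''Hypothetical: the remote side sent a `'stop'` msg.'''


class RemoteError(Exception):
    '''Hypothetical: the remote side sent an `'error'` msg.'''


def raise_from_no_yield_msg(msg: dict, src_err: KeyError) -> None:
    '''Translate a msg lacking a `'yield'` key into a raised error.'''
    if 'stop' in msg:
        raise EndOfStream('remote side closed the stream') from src_err
    if 'error' in msg:
        raise RemoteError(msg['error']) from src_err
    # unknown msg shape: re-raise the original lookup failure
    raise src_err


class ToyStream:
    def __init__(self, msgs: list) -> None:
        self._msgs = deque(msgs)

    def receive_nowait(self):
        msg = self._msgs.popleft()
        try:
            return msg['yield']
        except KeyError as kerr:
            raise_from_no_yield_msg(msg, kerr)

    async def receive(self):
        await asyncio.sleep(0)  # stand-in for waiting on the transport
        msg = self._msgs.popleft()
        try:
            return msg['yield']
        except KeyError as kerr:
            raise_from_no_yield_msg(msg, kerr)
```

With a single helper, the sync and async paths cannot drift apart in which exception they raise for a given msg.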
Tyler Goodlet 6d951c526a Comment all `.pause(shield=True)` attempts again, need to solve cancel scope `.__exit__()` frame hiding issue.. 2023-10-10 09:55:11 -04:00
Tyler Goodlet 575a24adf1 Always raise remote (cancelled) error if set
Previously we weren't raising a remote error if the local scope was
cancelled during a call to `Context.result()` which is problematic if
the caller WAS NOT the requester for said remote cancellation; in that
case we still want a `ContextCancelled` raised with the `.canceller:
str` set to the cancelling actor uid.

Further fix a naming bug where the (seemingly older) `._remote_err` was
being set to such an error instead of `._remote_error` XD
2023-10-10 09:45:49 -04:00
Tyler Goodlet 919e462f88 Write more comprehensive `Portal.cancel_actor()` doc str 2023-10-08 15:57:18 -04:00
Tyler Goodlet a09b8560bb Oof, default reg addrs needs to be in `list[tuple]` form.. 2023-10-07 18:52:37 -04:00
Tyler Goodlet c4cd573b26 Drop pause line from ctx cancel handler block in test 2023-10-07 18:51:59 -04:00
Tyler Goodlet d24a9e158f Msg-ified `ContextCancelled`s sub-error type should always be just, its type.. 2023-10-07 18:51:03 -04:00
Tyler Goodlet 18a1634025 Add shielding support to `.pause()`
Implement it like you'd expect using simply a wrapping
`trio.CancelScope` which is itself shielded by the input `shield: bool`
B)

There's seemingly still some issues with the frame selection when the
REPL engages and not sure how to resolve it yet but at least this does
indeed work for practical purposes. Still needs a test obviously!
2023-10-06 15:49:23 -04:00
Tyler Goodlet 78c0d2b234 Start inter-peer cancellation test mod
Move over relevant test from the "context semantics" test module which
was already verifying peer-caused-`ContextCancelled.canceller: tuple`
error info and propagation during an inter-peer cancellation scenario.

Also begin a more general set of inter-peer cancellation tests starting
with the simplest case where when a peer is cancelled the parent should
NOT get a "muted" `trio.Cancelled` and instead
a `tractor.ContextCancelled` with a `.canceller: tuple` which points to
the sibling actor which requested the peer cancel.
2023-10-06 15:44:26 -04:00
Tyler Goodlet 4314a59327 Add post-mortem catch around failed transport addr binds to aid with runtime debugging 2023-10-03 10:54:46 -04:00
Tyler Goodlet e94f1261b5 Move `maybe_open_crash_handler()` CLI `--pdb`-driven wrapper to debug mod 2023-10-02 18:10:34 -04:00
Tyler Goodlet 86da79a854 Rename to `parse_maddr()` and fill out doc strings 2023-09-29 14:49:18 -04:00
Tyler Goodlet de89e3a9c4 Add libp2p style "multi-address" parser from `piker`
Details are in the module docs; this is a first draft with lotsa room
for refinement and extension.
2023-09-29 14:11:31 -04:00
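For readers unfamiliar with the libp2p "multi-address" format mentioned in de89e3a9c4, here is a deliberately naive sketch. `parse_maddr_sketch()` is a hypothetical helper, not the parser added by the commit, and it assumes every protocol token is followed by exactly one value (which is not true of all real multiaddr protocols):

```python
def parse_maddr_sketch(maddr: str) -> dict:
    '''
    Split a `/proto/value/proto/value` string into a
    `{proto: value}` mapping, preserving layer order.

    '''
    parts = maddr.strip('/').split('/')
    if len(parts) % 2 or not all(parts):
        raise ValueError(f'malformed multi-address: {maddr!r}')

    # even indices are protocol names, odd indices their values
    return dict(zip(parts[0::2], parts[1::2]))


layers = parse_maddr_sketch('/ip4/127.0.0.1/tcp/1616')
```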
Tyler Goodlet 7bed470f5c Start `.devx.cli` extensions for pop CLI frameworks
Starting off with just a `typer` (and thus transitively `click`)
`typer.Typer.callback` hook which allows passthrough of the `--ll
<loglevel: str>` and `--pdb <debug_mode: bool>` flags for use when
building CLIs that use the runtime Bo

Still needs lotsa refinement and obviously better docs but, the doc
string for `load_runtime_vars()` shows how to use the underlying
`.devx._debug.open_crash_handler()` via a wrapper that can be passed the
`--pdb` flag and then enable debug mode throughout the entire actor
system.
2023-09-28 15:36:24 -04:00
Tyler Goodlet fa9a9cfb1d Kick off `.devx` subpkg for our dev tools B)
Where `.devx` is "developer experience", a hopefully broad enough subpkg
name for all the slick stuff planned to augment working on the actor
runtime 💥

Move the `._debug` module into the new subpkg and adjust rest of core
code base to reflect import path change. Also add a new
`.devx._debug.open_crash_handler()` manager for wrapping any sync code
outside a `trio.run()` which is handy for eventual CLI addons for
popular frameworks like `click`/`typer`.
2023-09-28 14:14:50 -04:00
Tyler Goodlet 3d0e95513c Init-support for "multi homed" transports
Since we'd like to eventually allow a diverse set of transport
(protocol) methods and stacks, and a multi-peer discovery system for
distributed actor-tree applications, this reworks all runtime internals
to support multi-homing for any given tree on a logical host. In other
words any actor can now bind its transport server (currently only
unsecured TCP + `msgspec`) to more than one address available in its
(linux) network namespace. Further, registry actors (now dubbed
"registrars" instead of "arbiters") can also similarly bind to multiple
network addresses and provide discovery services to remote actors via
multiple addresses which can now be provided at runtime startup.

Deats:
- adjust `._runtime` internals to use a `list[tuple[str, int]]` (and
  thus pluralized) socket address sequence where applicable for transport
  server socket binds, now exposed via `Actor.accept_addrs`:
  - `Actor.__init__()` now takes a `registry_addrs: list`.
  - `Actor.is_arbiter` -> `.is_registrar`.
  - `._arb_addr` -> `._reg_addrs: list[tuple]`.
  - always reg and de-reg from all registrars in `async_main()`.
  - only set the global runtime var `'_root_mailbox'` to the loopback
    address since normally all in-tree processes should have access to
    it, right?
  - `._serve_forever()` task now takes `listen_sockaddrs: list[tuple]`
- make `open_root_actor()` take a `registry_addrs: list[tuple[str, int]]`
  and defaults when not passed.
- change `ActorNursery.start_..()` methods to take `bind_addrs: list` and
  pass down through the spawning layer(s) via the parent-seed-msg.
- generalize all `._discovery()` APIs to accept `registry_addrs`-like
  inputs and move all relevant subsystems to adopt the "registry" style
  naming instead of "arbiter":
  - make `find_actor()` support batched concurrent portal queries over
    all provided input addresses using `.trionics.gather_contexts()` Bo
  - syntax: move to using `async with <tuples>` 3.9+ style chained
    @acms.
  - a general modernization of the code to a python 3.9+ style.
  - start deprecation and change to "registry" naming / semantics:
    - `._discovery.get_arbiter()` -> `.get_registry()`
2023-09-27 16:25:21 -04:00
Tyler Goodlet ee151b00af Mk `gather_contexts()` support `@acm`s yielding `None`
We were using a `all(<yielded values>)` condition which obviously won't
work if the batched managers yield any non-truthy value. So instead see
seed the `unwrapped: dict` with the `id(mngrs)` and only unblock once all
values have been filled in to be something that is not that value.
2023-09-27 14:05:22 -04:00
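The difference between the old `all(<yielded values>)` check and the sentinel approach in ee151b00af can be shown without any async machinery. This is a simplified sketch; the manager names and the two helper functions are hypothetical:

```python
def all_entered_by_truthiness(unwrapped: dict) -> bool:
    # old-style check: breaks when a manager yields a falsy value
    return all(unwrapped.values())


def all_entered_by_seed(unwrapped: dict, seed: int) -> bool:
    # an entry counts as filled once it no longer holds the seed
    return all(val != seed for val in unwrapped.values())


mngrs = ('mngr_a', 'mngr_b')  # stand-ins for the batched `@acm`s
seed = id(mngrs)
unwrapped = dict.fromkeys(mngrs, seed)

ready_before = all_entered_by_seed(unwrapped, seed)

# both managers "enter" and yield `None`
unwrapped['mngr_a'] = None
unwrapped['mngr_b'] = None

ready_truthy = all_entered_by_truthiness(unwrapped)
ready_seeded = all_entered_by_seed(unwrapped, seed)
```

Comparing `ready_truthy` with `ready_seeded` after both entries are filled shows why the truthiness check would block forever on `None`-yielding managers.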
Tyler Goodlet 22c14e235e Expose `Channel` @ pkg level, drop `_debug.pp()` alias 2023-08-18 10:18:25 -04:00
Tyler Goodlet 1102843087 Teensie tidy up on actor doc string 2023-08-18 10:10:36 -04:00
Tyler Goodlet e03bec5efc Move `.to_asyncio` to modern optional value type annots 2023-07-21 15:08:46 -04:00
Tyler Goodlet bee2c36072 Make `NamespacePath` work on object refs
Detect if the input ref is a non-func (like an `object` instance) in
which case grab its type name using `type()`. Wrap all the name-getting
into a new `_mk_fqpn()` static meth: gets the "fully qualified path
name" and returns path and name in tuple; port other methods to use it.
Refine and update the docs B)
2023-07-12 13:07:30 -04:00
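A sketch of the name-resolution idea from bee2c36072: functions contribute their own name, while any other object contributes the name of its type. `mk_fqpn()` below is a hypothetical free function and the `module:name` path format is an assumption for the example, not necessarily what `NamespacePath._mk_fqpn()` emits:

```python
import json
from fractions import Fraction
from inspect import isfunction


def mk_fqpn(ref: object) -> tuple:
    '''
    Return a `(fully-qualified path, name)` pair for `ref`.

    '''
    if isfunction(ref):
        name = ref.__name__
        mod_name = ref.__module__
    else:
        # non-func object (eg. an instance): fall back to its type
        name = type(ref).__name__
        mod_name = type(ref).__module__

    return f'{mod_name}:{name}', name


func_path, func_name = mk_fqpn(json.dumps)
obj_path, obj_name = mk_fqpn(Fraction(1, 2))
```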
Tyler Goodlet b36b3d522f Map `breakpoint()` built-in to new `.pause_from_sync()` ep 2023-07-07 15:35:52 -04:00
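One stdlib mechanism for remapping the `breakpoint()` built-in, as in b36b3d522f, is `sys.breakpointhook` (the `PYTHONBREAKPOINT` env var is another). Which of these the commit uses is not stated here, so treat the following as an illustration only; `pause_from_sync_sketch()` is a hypothetical stand-in for the real entrypoint:

```python
import sys

calls = []


def pause_from_sync_sketch(*args, **kwargs) -> None:
    # a real hook would engage the debugger REPL here
    calls.append('paused')


orig_hook = sys.breakpointhook
sys.breakpointhook = pause_from_sync_sketch
try:
    breakpoint()  # now dispatches to our hook instead of `pdb`
finally:
    # always restore so later `breakpoint()` calls behave normally
    sys.breakpointhook = orig_hook
```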
Tyler Goodlet 4ace8f6037 Fix frame-selection display on first REPL entry
For whatever reason pdb(p), and in general, will show the frame of the
*next* python instruction/LOC on initial entry (at least using
`.set_trace()`), as such remove the `try/finally` block in the sync
code entrypoint `.pause_from_sync()`, and also since it doesn't seem like
we really need it anyway.

Further, and to this end:
- enable hidden frames support in our default config.
- fix/drop/mask all the frame ref-ing/mangling we had prior since it's no
  longer needed as well as manual `Lock` releasing which seems to work
  already by having the `greenback` spawned task do its normal thing?
- move to no `Union` type annots.
- hide all frames that can add "this is the runtime confusion" to
  traces.
2023-07-07 14:51:44 -04:00
Tyler Goodlet 98a7326c85 ._runtime: log level tweaks, use crit for stale debug lock detection 2023-07-07 14:49:23 -04:00
Tyler Goodlet 46972df041 .log: more correct handling for `get_logger(__name__)` usage 2023-07-07 14:48:37 -04:00
Tyler Goodlet 565d7c3ee5 Add longer "required reading" list B) 2023-07-07 14:47:42 -04:00
Tyler Goodlet ac695a05bf Updates from latest `piker.data._sharedmem` changes 2023-06-22 17:16:17 -04:00
Tyler Goodlet fc56971a2d First proto: use `greenback` for sync func breakpointing
This works now for supporting a new `tractor.pause_from_sync()`
`tractor`-aware-replacement for `Pdb.set_trace()` from sync functions
which are also scheduled from our runtime. Uses `greenback` to do all
the magic of scheduling the bg `tractor._debug._pause()` task and
engaging the normal TTY locking machinery triggered by `await
tractor.breakpoint()`

Further this starts some public API renaming, making a switch to
`tractor.pause()` from `.breakpoint()` which IMO much better expresses
the semantics of the runtime intervention required to suffice
multi-process "breakpointing"; it also is an alternate name for the same
in computer science more generally: https://en.wikipedia.org/wiki/Breakpoint
It also avoids using the same name as the `breakpoint()` built-in which
is important since there **is a lot more going on** when you call our
equivalent API.

Deats of that:
- add deprecation warning for `tractor.breakpoint()`
- add `tractor.pause()` and a shorthand, easier-to-type, alias `.pp()`
  for "pause-point" B)
- add `pause_from_sync()` as the new `breakpoint()`-from-sync-function
  hack which does all the `greenback` stuff for the user.

Still TODO:
- figure out where in the runtime and when to call
  `greenback.ensure_portal()`.
- fix the frame selection issue where
  `trio._core._ki._ki_protection_decorator:wrapper` seems to be always
  shown on REPL start as the selected frame..
2023-06-21 16:08:18 -04:00
Tyler Goodlet ee87cf0e29 Add a debug-mode-breakpoint-causes-hang case!
Only found this by luck more or less (while working on something in
a client project) and it turns out we can actually get to (yet another)
hang state where SIGINT will be ignored by the root actor on teardown..

I've added all the necessary logic flags to reproduce. We obviously need
a follow up bug issue and a test suite to replicate!

It appears as though the following are required based on very light
tinkering:
- infected asyncio mode active
- debug mode active
- the `trio` context must breakpoint *before* `.started()`-ing
- the `asyncio` must **not** error
2023-06-21 14:07:31 -04:00
Tyler Goodlet ebcb275cd8 Add (first-draft) infected-`asyncio` actor task uses debugger example 2023-06-21 14:07:31 -04:00
Tyler Goodlet f745da9fb2 Add `numpy` for testing optional integrated shm API layer 2023-06-15 12:20:20 -04:00
Tyler Goodlet 4f442efbd7 Pass `str` dtype for `use_str` case 2023-06-15 12:20:20 -04:00
Tyler Goodlet f9a84f0732 Allocate size-specced "empty" sequence from default values by type 2023-06-15 12:20:20 -04:00
Tyler Goodlet e0bf964ff0 Mod define `_USE_POSIX`, add a lot of todos 2023-06-15 12:20:20 -04:00
Tyler Goodlet a9fc4c1b91 Parametrize rw test with variable frame sizes
Demonstrates fixed size frame-oriented reads by the child where the
parent only transmits a "read" stream msg on "frame fill events" such
that the child incrementally reads the shm list data (much like in
a real-time-buffered streaming system).
2023-06-15 12:20:20 -04:00
Tyler Goodlet b52ff270c5 Add `ShmList` slice support in `.__getitem__()` 2023-06-15 12:20:20 -04:00
Tyler Goodlet 1713ecd9f8 Rename token type to `NDToken` in the style of `nptyping` 2023-06-15 12:20:20 -04:00
Tyler Goodlet edb82fdd78 Don't require runtime (for now), type annot fixing 2023-06-15 12:20:20 -04:00
Tyler Goodlet 339d787cf8 Add repetitive attach to existing segment test 2023-06-15 12:20:20 -04:00
Tyler Goodlet c32b21b4b1 Add initial readers-writer shm list tests 2023-06-15 12:20:20 -04:00
Tyler Goodlet 71477290fc Add `ShmList` wrapping the stdlib's `ShareableList`
First attempt at getting `multiprocessing.shared_memory.ShareableList`
working; we wrap the stdlib type with a readonly attr and a `.key` for
cross-actor lookup. Also, rename all `numpy` specific routines to have
a `ndarray` suffix in the func names.
2023-06-15 12:20:20 -04:00
Tyler Goodlet 9716d86825 Initial module import from `piker.data._sharemem`
More or less a verbatim copy-paste minus some edgy variable naming and
internal `piker` module imports. There is a bunch of OHLC related
defaults that need to be dropped and we need to adjust to an optional
dependence on `numpy` by supporting shared lists as per the mp docs.
2023-06-15 12:20:20 -04:00
53 changed files with 4000 additions and 6622 deletions

@@ -6,115 +6,47 @@ been an outage) and we want to ensure that despite being in debug mode
 actor tree will eventually be cancelled without leaving any zombies.
 '''
-from contextlib import asynccontextmanager as acm
-from functools import partial
+import trio
 from tractor import (
     open_nursery,
     context,
     Context,
-    ContextCancelled,
     MsgStream,
-    _testing,
 )
-import trio
-import pytest
-async def break_ipc(
+async def break_channel_silently_then_error(
     stream: MsgStream,
-    method: str|None = None,
-    pre_close: bool = False,
-    def_method: str = 'eof',
-) -> None:
-    '''
-    XXX: close the channel right after an error is raised
-    purposely breaking the IPC transport to make sure the parent
-    doesn't get stuck in debug or hang on the connection join.
-    this more or less simulates an infinite msg-receive hang on
-    the other end.
-    '''
-    # close channel via IPC prot msging before
-    # any transport breakage
-    if pre_close:
-        await stream.aclose()
-    method: str = method or def_method
-    print(
-        '#################################\n'
-        'Simulating CHILD-side IPC BREAK!\n'
-        f'method: {method}\n'
-        f'pre `.aclose()`: {pre_close}\n'
-        '#################################\n'
-    )
-    match method:
-        case 'trans_aclose':
-            await stream._ctx.chan.transport.stream.aclose()
-        case 'eof':
-            await stream._ctx.chan.transport.stream.send_eof()
-        case 'msg':
-            await stream._ctx.chan.send(None)
-        # TODO: the actual real-world simulated cases like
-        # transport layer hangs and/or lower layer 2-gens type
-        # scenarios..
-        #
-        # -[ ] already have some issues for this general testing
-        # area:
-        # - https://github.com/goodboy/tractor/issues/97
-        # - https://github.com/goodboy/tractor/issues/124
-        # - PR from @guille:
-        # https://github.com/goodboy/tractor/pull/149
-        # case 'hang':
-        # TODO: framework research:
-        #
-        # - https://github.com/GuoTengda1993/pynetem
-        # - https://github.com/shopify/toxiproxy
-        # - https://manpages.ubuntu.com/manpages/trusty/man1/wirefilter.1.html
-        case _:
-            raise RuntimeError(
-                f'IPC break method unsupported: {method}'
-            )
-async def break_ipc_then_error(
-    stream: MsgStream,
-    break_ipc_with: str|None = None,
-    pre_close: bool = False,
 ):
-    await break_ipc(
-        stream=stream,
-        method=break_ipc_with,
-        pre_close=pre_close,
-    )
     async for msg in stream:
         await stream.send(msg)
+        # XXX: close the channel right after an error is raised
+        # purposely breaking the IPC transport to make sure the parent
+        # doesn't get stuck in debug or hang on the connection join.
+        # this more or less simulates an infinite msg-receive hang on
+        # the other end.
+        await stream._ctx.chan.send(None)
         assert 0
-async def iter_ipc_stream(
+async def close_stream_and_error(
     stream: MsgStream,
-    break_ipc_with: str|None = None,
-    pre_close: bool = False,
 ):
     async for msg in stream:
await stream.send(msg) await stream.send(msg)
# wipe out channel right before raising
await stream._ctx.chan.send(None)
await stream.aclose()
assert 0
@context @context
async def recv_and_spawn_net_killers( async def recv_and_spawn_net_killers(
ctx: Context, ctx: Context,
break_ipc_after: bool|int = False, break_ipc_after: bool | int = False,
pre_close: bool = False,
) -> None: ) -> None:
''' '''
@ -129,53 +61,26 @@ async def recv_and_spawn_net_killers(
async for i in stream: async for i in stream:
print(f'child echoing {i}') print(f'child echoing {i}')
await stream.send(i) await stream.send(i)
if ( if (
break_ipc_after break_ipc_after
and and i > break_ipc_after
i >= break_ipc_after
): ):
n.start_soon( '#################################\n'
iter_ipc_stream, 'Simulating child-side IPC BREAK!\n'
stream, '#################################'
) n.start_soon(break_channel_silently_then_error, stream)
n.start_soon( n.start_soon(close_stream_and_error, stream)
partial(
break_ipc_then_error,
stream=stream,
pre_close=pre_close,
)
)
@acm
async def stuff_hangin_ctlc(timeout: float = 1) -> None:
with trio.move_on_after(timeout) as cs:
yield timeout
if cs.cancelled_caught:
# pretend to be a user seeing no streaming action
# thinking it's a hang, and then hitting ctl-c..
print(
f"i'm a user on the PARENT side and thingz hangin "
f'after timeout={timeout} ???\n\n'
'MASHING CTlR-C..!?\n'
)
raise KeyboardInterrupt
async def main( async def main(
debug_mode: bool = False, debug_mode: bool = False,
start_method: str = 'trio', start_method: str = 'trio',
loglevel: str = 'cancel',
# by default we break the parent IPC first (if configured to break # by default we break the parent IPC first (if configured to break
# at all), but this can be changed so the child does first (even if # at all), but this can be changed so the child does first (even if
# both are set to break). # both are set to break).
break_parent_ipc_after: int|bool = False, break_parent_ipc_after: int | bool = False,
break_child_ipc_after: int|bool = False, break_child_ipc_after: int | bool = False,
pre_close: bool = False,
) -> None: ) -> None:
@ -186,122 +91,59 @@ async def main(
# NOTE: even debugger is used we shouldn't get # NOTE: even debugger is used we shouldn't get
# a hang since it never engages due to broken IPC # a hang since it never engages due to broken IPC
debug_mode=debug_mode, debug_mode=debug_mode,
loglevel=loglevel, loglevel='warning',
) as an, ) as an,
): ):
sub_name: str = 'chitty_hijo'
portal = await an.start_actor( portal = await an.start_actor(
sub_name, 'chitty_hijo',
enable_modules=[__name__], enable_modules=[__name__],
) )
async with ( async with portal.open_context(
stuff_hangin_ctlc(timeout=2) as timeout,
_testing.expect_ctxc(
yay=(
break_parent_ipc_after
or break_child_ipc_after
),
# TODO: we CAN'T remove this right?
# since we need the ctxc to bubble up from either
# the stream API after the `None` msg is sent
# (which actually implicitly cancels all remote
# tasks in the hijo) or from simluated
# KBI-mash-from-user
# or should we expect that a KBI triggers the ctxc
# and KBI in an eg?
reraise=True,
),
portal.open_context(
recv_and_spawn_net_killers, recv_and_spawn_net_killers,
break_ipc_after=break_child_ipc_after, break_ipc_after=break_child_ipc_after,
pre_close=pre_close,
) as (ctx, sent), ) as (ctx, sent):
):
rx_eoc: bool = False
ipc_break_sent: bool = False
async with ctx.open_stream() as stream: async with ctx.open_stream() as stream:
for i in range(1000): for i in range(1000):
if ( if (
break_parent_ipc_after break_parent_ipc_after
and and i > break_parent_ipc_after
i > break_parent_ipc_after
and
not ipc_break_sent
): ):
print( print(
'#################################\n' '#################################\n'
'Simulating PARENT-side IPC BREAK!\n' 'Simulating parent-side IPC BREAK!\n'
'#################################\n' '#################################'
) )
await stream._ctx.chan.send(None)
# TODO: other methods? see break func above.
# await stream._ctx.chan.send(None)
# await stream._ctx.chan.transport.stream.send_eof()
await stream._ctx.chan.transport.stream.aclose()
ipc_break_sent = True
# it actually breaks right here in the # it actually breaks right here in the
# mp_spawn/forkserver backends and thus the zombie # mp_spawn/forkserver backends and thus the zombie
# reaper never even kicks in? # reaper never even kicks in?
print(f'parent sending {i}') print(f'parent sending {i}')
try:
await stream.send(i) await stream.send(i)
except ContextCancelled as ctxc:
print(
'parent received ctxc on `stream.send()`\n'
f'{ctxc}\n'
)
assert 'root' in ctxc.canceller
assert sub_name in ctx.canceller
# TODO: is this needed or no? with trio.move_on_after(2) as cs:
raise
# timeout: int = 1
# with trio.move_on_after(timeout) as cs:
async with stuff_hangin_ctlc() as timeout:
print(
f'PARENT `stream.receive()` with timeout={timeout}\n'
)
# NOTE: in the parent side IPC failure case this # NOTE: in the parent side IPC failure case this
# will raise an ``EndOfChannel`` after the child # will raise an ``EndOfChannel`` after the child
# is killed and sends a stop msg back to its # is killed and sends a stop msg back to its
# caller/this-parent. # caller/this-parent.
try:
rx = await stream.receive() rx = await stream.receive()
print(
"I'm a happy PARENT user and echoed to me is\n" print(f"I'm a happy user and echoed to me is {rx}")
f'{rx}\n'
) if cs.cancelled_caught:
except trio.EndOfChannel: # pretend to be a user seeing no streaming action
rx_eoc: bool = True # thinking it's a hang, and then hitting ctl-c..
print('MsgStream got EoC for PARENT') print("YOO i'm a user anddd thingz hangin..")
raise
print( print(
'Streaming finished and we got Eoc.\n' "YOO i'm mad send side dun but thingz hangin..\n"
'Canceling `.open_context()` in root with\n' 'MASHING CTlR-C Ctl-c..'
'CTlR-C..'
) )
if rx_eoc:
assert stream.closed
try:
await stream.send(i)
pytest.fail('stream not closed?')
except (
trio.ClosedResourceError,
trio.EndOfChannel,
) as send_err:
if rx_eoc:
assert send_err is stream._eoc
else:
assert send_err is stream._closed
raise KeyboardInterrupt raise KeyboardInterrupt


@ -1,9 +0,0 @@
'''
Reproduce a bug where enabling debug mode for a sub-actor actually causes
a hang on teardown...
'''
import asyncio
import trio
import tractor


@ -8,10 +8,7 @@ This uses no extra threads, fancy semaphores or futures; all we need
is ``tractor``'s channels. is ``tractor``'s channels.
""" """
from contextlib import ( from contextlib import asynccontextmanager
asynccontextmanager as acm,
aclosing,
)
from typing import Callable from typing import Callable
import itertools import itertools
import math import math
@ -19,6 +16,7 @@ import time
import tractor import tractor
import trio import trio
from async_generator import aclosing
PRIMES = [ PRIMES = [
@ -46,7 +44,7 @@ async def is_prime(n):
return True return True
@acm @asynccontextmanager
async def worker_pool(workers=4): async def worker_pool(workers=4):
"""Though it's a trivial special case for ``tractor``, the well """Though it's a trivial special case for ``tractor``, the well
known "worker pool" seems to be the defacto "but, I want this known "worker pool" seems to be the defacto "but, I want this


@ -13,7 +13,7 @@ async def simple_rpc(
''' '''
# signal to parent that we're up much like # signal to parent that we're up much like
# ``trio.TaskStatus.started()`` # ``trio_typing.TaskStatus.started()``
await ctx.started(data + 1) await ctx.started(data + 1)
async with ctx.open_stream() as stream: async with ctx.open_stream() as stream:


@ -26,23 +26,3 @@ all_bullets = true
directory = "trivial" directory = "trivial"
name = "Trivial/Internal Changes" name = "Trivial/Internal Changes"
showcontent = true showcontent = true
[tool.pytest.ini_options]
minversion = '6.0'
testpaths = [
'tests'
]
addopts = [
# TODO: figure out why this isn't working..
'--rootdir=./tests',
'--import-mode=importlib',
# don't show frickin captured logs AGAIN in the report..
'--show-capture=no',
]
log_cli = false
# TODO: maybe some of these layout choices?
# https://docs.pytest.org/en/8.0.x/explanation/goodpractices.html#choosing-a-test-layout-import-rules
# pythonpath = "src"


@ -6,3 +6,4 @@ mypy
trio_typing trio_typing
pexpect pexpect
towncrier towncrier
numpy


@ -36,20 +36,18 @@ setup(
platforms=['linux', 'windows'], platforms=['linux', 'windows'],
packages=[ packages=[
'tractor', 'tractor',
'tractor.experimental', # wacky ideas 'tractor.experimental',
'tractor.trionics', # trio extensions 'tractor.trionics',
'tractor.msg', # lowlevel data types
], ],
install_requires=[ install_requires=[
# trio related # trio related
# proper range spec: # proper range spec:
# https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/#id5 # https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/#id5
'trio >= 0.24', 'trio >= 0.22',
'async_generator',
# 'async_generator', # in stdlib mostly! 'trio_typing',
# 'trio_typing', # trio==0.23.0 has type hints! 'exceptiongroup',
# 'exceptiongroup', # in stdlib as of 3.11!
# tooling # tooling
'tricycle', 'tricycle',


@ -7,19 +7,91 @@ import os
import random import random
import signal import signal
import platform import platform
import pathlib
import time import time
import inspect
from functools import partial, wraps
import pytest import pytest
import trio
import tractor import tractor
from tractor._testing import (
examples_dir as examples_dir,
tractor_test as tractor_test,
expect_ctxc as expect_ctxc,
)
# TODO: include wtv plugin(s) we build in `._testing.pytest`?
pytest_plugins = ['pytester'] pytest_plugins = ['pytester']
def tractor_test(fn):
"""
Use:
@tractor_test
async def test_whatever():
await ...
If fixtures:
- ``reg_addr`` (a socket addr tuple where arbiter is listening)
- ``loglevel`` (logging level passed to tractor internals)
- ``start_method`` (subprocess spawning backend)
are defined in the `pytest` fixture space they will be automatically
injected to tests declaring these funcargs.
"""
@wraps(fn)
def wrapper(
*args,
loglevel=None,
reg_addr=None,
start_method=None,
**kwargs
):
# __tracebackhide__ = True
if 'reg_addr' in inspect.signature(fn).parameters:
# injects test suite fixture value to test as well
# as `run()`
kwargs['reg_addr'] = reg_addr
if 'loglevel' in inspect.signature(fn).parameters:
# allows test suites to define a 'loglevel' fixture
# that activates the internal logging
kwargs['loglevel'] = loglevel
if start_method is None:
if platform.system() == "Windows":
start_method = 'trio'
if 'start_method' in inspect.signature(fn).parameters:
# set of subprocess spawning backends
kwargs['start_method'] = start_method
if kwargs:
# use explicit root actor start
async def _main():
async with tractor.open_root_actor(
# **kwargs,
registry_addrs=[reg_addr] if reg_addr else None,
loglevel=loglevel,
start_method=start_method,
# TODO: only enable when pytest is passed --pdb
# debug_mode=True,
):
await fn(*args, **kwargs)
main = _main
else:
# use implicit root actor start
main = partial(fn, *args, **kwargs)
return trio.run(main)
return wrapper
# Sending signal.SIGINT on subprocess fails on windows. Use CTRL_* alternatives # Sending signal.SIGINT on subprocess fails on windows. Use CTRL_* alternatives
if platform.system() == 'Windows': if platform.system() == 'Windows':
_KILL_SIGNAL = signal.CTRL_BREAK_EVENT _KILL_SIGNAL = signal.CTRL_BREAK_EVENT
@ -39,6 +111,23 @@ no_windows = pytest.mark.skipif(
) )
def repodir() -> pathlib.Path:
'''
Return the abspath to the repo directory.
'''
# 2 parents up to step up through tests/<repo_dir>
return pathlib.Path(__file__).parent.parent.absolute()
def examples_dir() -> pathlib.Path:
'''
Return the abspath to the examples directory as `pathlib.Path`.
'''
return repodir() / 'examples'
def pytest_addoption(parser): def pytest_addoption(parser):
parser.addoption( parser.addoption(
"--ll", action="store", dest='loglevel', "--ll", action="store", dest='loglevel',
@ -76,18 +165,11 @@ _ci_env: bool = os.environ.get('CI', False)
@pytest.fixture(scope='session') @pytest.fixture(scope='session')
def ci_env() -> bool: def ci_env() -> bool:
''' """Detect CI environment.
Detect CI environment. """
'''
return _ci_env return _ci_env
# TODO: also move this to `._testing` for now?
# -[ ] possibly generalize and re-use for multi-tree spawning
# along with the new stuff for multi-addrs in distribute_dis
# branch?
#
# choose randomly at import time # choose randomly at import time
_reg_addr: tuple[str, int] = ( _reg_addr: tuple[str, int] = (
'127.0.0.1', '127.0.0.1',
@ -141,7 +223,6 @@ def sig_prog(proc, sig):
assert ret assert ret
# TODO: factor into @cm and move to `._testing`?
@pytest.fixture @pytest.fixture
def daemon( def daemon(
loglevel: str, loglevel: str,
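The `tractor_test` wrapper shown in this file's diff forwards a fixture value only when the wrapped test declares a parameter of that name, which it checks with `inspect.signature`. The following stripped-down sketch shows that selective-injection pattern on its own. The names `inject_declared`, `wants_loglevel` and `wants_nothing` are hypothetical, and the root-actor startup and `trio.run()` call from the real wrapper are left out:

```python
import inspect
from functools import wraps


def inject_declared(fn):
    """Pass `loglevel`/`start_method` only if `fn` declares them."""
    params = inspect.signature(fn).parameters

    @wraps(fn)
    def wrapper(*args, loglevel=None, start_method=None, **kwargs):
        # Forward each value only when the wrapped function has a
        # parameter of that name; otherwise drop it silently.
        if 'loglevel' in params:
            kwargs['loglevel'] = loglevel
        if 'start_method' in params:
            kwargs['start_method'] = start_method
        return fn(*args, **kwargs)

    return wrapper


@inject_declared
def wants_loglevel(loglevel=None):
    return loglevel


@inject_declared
def wants_nothing():
    return 'ok'
```

Here `wants_loglevel(loglevel='info', start_method='trio')` receives only `loglevel`, while `wants_nothing(loglevel='info')` is called with no arguments at all.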


@ -3,29 +3,22 @@ Sketchy network blackoutz, ugly byzantine gens, puedes eschuchar la
cancelacion?.. cancelacion?..
''' '''
import itertools
from functools import partial from functools import partial
from types import ModuleType
import pytest import pytest
from _pytest.pathlib import import_path from _pytest.pathlib import import_path
import trio import trio
import tractor import tractor
from tractor._testing import (
from conftest import (
examples_dir, examples_dir,
) )
@pytest.mark.parametrize( @pytest.mark.parametrize(
'pre_aclose_msgstream', 'debug_mode',
[ [False, True],
False, ids=['no_debug_mode', 'debug_mode'],
True,
],
ids=[
'no_msgstream_aclose',
'pre_aclose_msgstream',
],
) )
@pytest.mark.parametrize( @pytest.mark.parametrize(
'ipc_break', 'ipc_break',
@ -70,10 +63,8 @@ from tractor._testing import (
) )
def test_ipc_channel_break_during_stream( def test_ipc_channel_break_during_stream(
debug_mode: bool, debug_mode: bool,
loglevel: str,
spawn_backend: str, spawn_backend: str,
ipc_break: dict|None, ipc_break: dict | None,
pre_aclose_msgstream: bool,
): ):
''' '''
Ensure we can have an IPC channel break its connection during Ensure we can have an IPC channel break its connection during
@ -92,130 +83,70 @@ def test_ipc_channel_break_during_stream(
# requires the user to do ctl-c to cancel the actor tree. # requires the user to do ctl-c to cancel the actor tree.
expect_final_exc = trio.ClosedResourceError expect_final_exc = trio.ClosedResourceError
mod: ModuleType = import_path( mod = import_path(
examples_dir() / 'advanced_faults' / 'ipc_failure_during_stream.py', examples_dir() / 'advanced_faults' / 'ipc_failure_during_stream.py',
root=examples_dir(), root=examples_dir(),
) )
# by def we expect KBI from user after a simulated "hang
# period" wherein the user eventually hits ctl-c to kill the
# root-actor tree.
expect_final_exc: BaseException = KeyboardInterrupt
if (
# only expect EoC if trans is broken on the child side,
ipc_break['break_child_ipc_after'] is not False
# AND we tell the child to call `MsgStream.aclose()`.
and pre_aclose_msgstream
):
# expect_final_exc = trio.EndOfChannel
# ^XXX NOPE! XXX^ since now `.open_stream()` absorbs this
# gracefully!
expect_final_exc = KeyboardInterrupt expect_final_exc = KeyboardInterrupt
# NOTE when ONLY the child breaks or it breaks BEFORE the # when ONLY the child breaks we expect the parent to get a closed
# parent we expect the parent to get a closed resource error # resource error on the next `MsgStream.receive()` and then fail out
# on the next `MsgStream.receive()` and then fail out and # and cancel the child from there.
# cancel the child from there.
#
# ONLY CHILD breaks
if ( if (
# only child breaks
(
ipc_break['break_child_ipc_after'] ipc_break['break_child_ipc_after']
and and ipc_break['break_parent_ipc_after'] is False
ipc_break['break_parent_ipc_after'] is False )
):
# NOTE: we DO NOT expect this any more since
# the child side's channel will be broken silently
# and nothing on the parent side will indicate this!
# expect_final_exc = trio.ClosedResourceError
# NOTE: child will send a 'stop' msg before it breaks # both break but, parent breaks first
# the transport channel BUT, that will be absorbed by the or (
# `ctx.open_stream()` block and thus the `.open_context()`
# should hang, after which the test script simulates
# a user sending ctl-c by raising a KBI.
if pre_aclose_msgstream:
expect_final_exc = KeyboardInterrupt
# XXX OLD XXX
# if child calls `MsgStream.aclose()` then expect EoC.
# ^ XXX not any more ^ since eoc is always absorbed
# gracefully and NOT bubbled to the `.open_context()`
# block!
# expect_final_exc = trio.EndOfChannel
# BOTH but, CHILD breaks FIRST
elif (
ipc_break['break_child_ipc_after'] is not False ipc_break['break_child_ipc_after'] is not False
and ( and (
ipc_break['break_parent_ipc_after'] ipc_break['break_parent_ipc_after']
> ipc_break['break_child_ipc_after'] > ipc_break['break_child_ipc_after']
) )
): )
if pre_aclose_msgstream:
expect_final_exc = KeyboardInterrupt
# NOTE when the parent IPC side dies (even if the child's does as well
# but the child fails BEFORE the parent) we always expect the
# IPC layer to raise a closed-resource, NEVER do we expect
# a stop msg since the parent-side ctx apis will error out
# IMMEDIATELY before the child ever sends any 'stop' msg.
#
# ONLY PARENT breaks
elif (
ipc_break['break_parent_ipc_after']
and
ipc_break['break_child_ipc_after'] is False
): ):
expect_final_exc = trio.ClosedResourceError expect_final_exc = trio.ClosedResourceError
# BOTH but, PARENT breaks FIRST # when the parent IPC side dies (even if the child's does as well
# but the child fails BEFORE the parent) we expect the channel to be
# sent a stop msg from the child at some point which will signal the
# parent that the stream has been terminated.
# NOTE: when the parent breaks "after" the child you get this same
# case as well, the child breaks the IPC channel with a stop msg
# before any closure takes place.
elif ( elif (
# only parent breaks
(
ipc_break['break_parent_ipc_after']
and ipc_break['break_child_ipc_after'] is False
)
# both break but, child breaks first
or (
ipc_break['break_parent_ipc_after'] is not False ipc_break['break_parent_ipc_after'] is not False
and ( and (
ipc_break['break_child_ipc_after'] ipc_break['break_child_ipc_after']
> > ipc_break['break_parent_ipc_after']
ipc_break['break_parent_ipc_after'] )
) )
): ):
expect_final_exc = trio.ClosedResourceError expect_final_exc = trio.EndOfChannel
with pytest.raises( with pytest.raises(expect_final_exc):
expected_exception=(
expect_final_exc,
ExceptionGroup,
),
) as excinfo:
try:
trio.run( trio.run(
partial( partial(
mod.main, mod.main,
debug_mode=debug_mode, debug_mode=debug_mode,
start_method=spawn_backend, start_method=spawn_backend,
loglevel=loglevel,
pre_close=pre_aclose_msgstream,
**ipc_break, **ipc_break,
) )
) )
except KeyboardInterrupt as kbi:
_err = kbi
if expect_final_exc is not KeyboardInterrupt:
pytest.fail(
'Rxed unexpected KBI !?\n'
f'{repr(kbi)}'
)
raise
# get raw instance from pytest wrapper
value = excinfo.value
if isinstance(value, ExceptionGroup):
value = next(
itertools.dropwhile(
lambda exc: not isinstance(exc, expect_final_exc),
value.exceptions,
)
)
assert value
@tractor.context @tractor.context
@ -238,7 +169,6 @@ def test_stream_closed_right_after_ipc_break_and_zombie_lord_engages():
''' '''
async def main(): async def main():
with trio.fail_after(3):
async with tractor.open_nursery() as n: async with tractor.open_nursery() as n:
portal = await n.start_actor( portal = await n.start_actor(
'ipc_breaker', 'ipc_breaker',
@ -256,10 +186,7 @@ def test_stream_closed_right_after_ipc_break_and_zombie_lord_engages():
print('parent waiting on context') print('parent waiting on context')
print( print('parent exited context')
'parent exited context\n'
'parent raising KBI..\n'
)
raise KeyboardInterrupt raise KeyboardInterrupt
with pytest.raises(KeyboardInterrupt): with pytest.raises(KeyboardInterrupt):


@ -6,7 +6,6 @@ from collections import Counter
import itertools import itertools
import platform import platform
import pytest
import trio import trio
import tractor import tractor
@ -144,16 +143,8 @@ def test_dynamic_pub_sub():
try: try:
trio.run(main) trio.run(main)
except ( except trio.TooSlowError:
trio.TooSlowError, pass
ExceptionGroup,
) as err:
if isinstance(err, ExceptionGroup):
for suberr in err.exceptions:
if isinstance(suberr, trio.TooSlowError):
break
else:
pytest.fail('Never got a `TooSlowError` ?')
@tractor.context @tractor.context
@ -307,69 +298,44 @@ async def inf_streamer(
async with ( async with (
ctx.open_stream() as stream, ctx.open_stream() as stream,
trio.open_nursery() as tn, trio.open_nursery() as n,
): ):
async def close_stream_on_sentinel(): async def bail_on_sentinel():
async for msg in stream: async for msg in stream:
if msg == 'done': if msg == 'done':
print(
'streamer RXed "done" sentinel msg!\n'
'CLOSING `MsgStream`!'
)
await stream.aclose() await stream.aclose()
else: else:
print(f'streamer received {msg}') print(f'streamer received {msg}')
else:
print('streamer exited recv loop')
# start termination detector # start termination detector
tn.start_soon(close_stream_on_sentinel) n.start_soon(bail_on_sentinel)
cap: int = 10000 # so that we don't spin forever when bug.. for val in itertools.count():
for val in range(cap):
try: try:
print(f'streamer sending {val}')
await stream.send(val) await stream.send(val)
if val > cap:
raise RuntimeError(
'Streamer never cancelled by setinel?'
)
await trio.sleep(0.001)
# close out the stream gracefully
except trio.ClosedResourceError: except trio.ClosedResourceError:
print('transport closed on streamer side!') # close out the stream gracefully
assert stream.closed
break break
else:
raise RuntimeError(
'Streamer not cancelled before finished sending?'
)
print('streamer exited .open_streamer() block') print('terminating streamer')
def test_local_task_fanout_from_stream( def test_local_task_fanout_from_stream():
debug_mode: bool,
):
''' '''
Single stream with multiple local consumer tasks using the Single stream with multiple local consumer tasks using the
``MsgStream.subscribe()`` api. ``MsgStream.subscribe()`` api.
Ensure all tasks receive all values after stream completes Ensure all tasks receive all values after stream completes sending.
sending.
''' '''
consumers: int = 22 consumers = 22
async def main(): async def main():
counts = Counter() counts = Counter()
async with tractor.open_nursery( async with tractor.open_nursery() as tn:
debug_mode=debug_mode, p = await tn.start_actor(
) as tn:
p: tractor.Portal = await tn.start_actor(
'inf_streamer', 'inf_streamer',
enable_modules=[__name__], enable_modules=[__name__],
) )
@ -377,6 +343,7 @@ def test_local_task_fanout_from_stream(
p.open_context(inf_streamer) as (ctx, _), p.open_context(inf_streamer) as (ctx, _),
ctx.open_stream() as stream, ctx.open_stream() as stream,
): ):
async def pull_and_count(name: str): async def pull_and_count(name: str):
# name = trio.lowlevel.current_task().name # name = trio.lowlevel.current_task().name
async with stream.subscribe() as recver: async with stream.subscribe() as recver:
@ -385,7 +352,7 @@ def test_local_task_fanout_from_stream(
tractor.trionics.BroadcastReceiver tractor.trionics.BroadcastReceiver
) )
async for val in recver: async for val in recver:
print(f'bx {name} rx: {val}') # print(f'{name}: {val}')
counts[name] += 1 counts[name] += 1
print(f'{name} bcaster ended') print(f'{name} bcaster ended')
@ -395,14 +362,10 @@ def test_local_task_fanout_from_stream(
with trio.fail_after(3): with trio.fail_after(3):
async with trio.open_nursery() as nurse: async with trio.open_nursery() as nurse:
for i in range(consumers): for i in range(consumers):
nurse.start_soon( nurse.start_soon(pull_and_count, i)
pull_and_count,
i,
)
# delay to let bcast consumers pull msgs
await trio.sleep(0.5) await trio.sleep(0.5)
print('terminating nursery of bcast rxer consumers!') print('\nterminating')
await stream.send('done') await stream.send('done')
print('closed stream connection') print('closed stream connection')


@ -8,13 +8,15 @@ import platform
import time import time
from itertools import repeat from itertools import repeat
from exceptiongroup import (
BaseExceptionGroup,
ExceptionGroup,
)
import pytest import pytest
import trio import trio
import tractor import tractor
from tractor._testing import (
tractor_test, from conftest import tractor_test, no_windows
)
from conftest import no_windows
def is_win(): def is_win():
@ -46,13 +48,11 @@ async def do_nuthin():
ids=['no_args', 'unexpected_args'], ids=['no_args', 'unexpected_args'],
) )
def test_remote_error(reg_addr, args_err): def test_remote_error(reg_addr, args_err):
''' """Verify an error raised in a subactor that is propagated
Verify an error raised in a subactor that is propagated
to the parent nursery, contains the underlying boxed builtin to the parent nursery, contains the underlying boxed builtin
error type info and causes cancellation and reraising all the error type info and causes cancellation and reraising all the
way up the stack. way up the stack.
"""
'''
args, errtype = args_err args, errtype = args_err
async def main(): async def main():
@ -65,9 +65,7 @@ def test_remote_error(reg_addr, args_err):
# an exception group outside the nursery since the error # an exception group outside the nursery since the error
# here and the far end task error are one in the same? # here and the far end task error are one in the same?
portal = await nursery.run_in_actor( portal = await nursery.run_in_actor(
assert_err, assert_err, name='errorer', **args
name='errorer',
**args
) )
# get result(s) from main task # get result(s) from main task


@ -6,15 +6,14 @@ sub-sub-actor daemons.
''' '''
from typing import Optional from typing import Optional
import asyncio import asyncio
from contextlib import ( from contextlib import asynccontextmanager as acm
asynccontextmanager as acm,
aclosing,
)
import pytest import pytest
import trio import trio
from trio_typing import TaskStatus
import tractor import tractor
from tractor import RemoteActorError from tractor import RemoteActorError
from async_generator import aclosing
async def aio_streamer( async def aio_streamer(


@ -5,7 +5,9 @@ import trio
import tractor import tractor
from tractor import open_actor_cluster from tractor import open_actor_cluster
from tractor.trionics import gather_contexts from tractor.trionics import gather_contexts
from tractor._testing import tractor_test
from conftest import tractor_test
MESSAGE = 'tractoring at full speed' MESSAGE = 'tractoring at full speed'


@ -5,12 +5,10 @@ Verify the we raise errors when streams are opened prior to
sync-opening a ``tractor.Context`` beforehand. sync-opening a ``tractor.Context`` beforehand.
''' '''
# from contextlib import asynccontextmanager as acm
from itertools import count from itertools import count
import platform import platform
from pprint import pformat from typing import Optional
from typing import (
Callable,
)
import pytest import pytest
import trio import trio
@ -25,10 +23,7 @@ from tractor._exceptions import (
ContextCancelled, ContextCancelled,
) )
from tractor._testing import ( from conftest import tractor_test
tractor_test,
expect_ctxc,
)
# ``Context`` semantics are as follows, # ``Context`` semantics are as follows,
# ------------------------------------ # ------------------------------------
@ -74,7 +69,7 @@ _state: bool = False
@tractor.context @tractor.context
async def too_many_starteds( async def too_many_starteds(
ctx: Context, ctx: tractor.Context,
) -> None: ) -> None:
''' '''
Call ``Context.started()`` more than once (an error). Call ``Context.started()`` more than once (an error).
@ -89,7 +84,7 @@ async def too_many_starteds(
@tractor.context @tractor.context
async def not_started_but_stream_opened( async def not_started_but_stream_opened(
ctx: Context, ctx: tractor.Context,
) -> None: ) -> None:
''' '''
Enter ``Context.open_stream()`` without calling ``.started()``. Enter ``Context.open_stream()`` without calling ``.started()``.
@ -110,15 +105,11 @@ async def not_started_but_stream_opened(
], ],
ids='misuse_type={}'.format, ids='misuse_type={}'.format,
) )
def test_started_misuse( def test_started_misuse(target):
target: Callable,
debug_mode: bool,
):
async def main(): async def main():
async with tractor.open_nursery( async with tractor.open_nursery() as n:
debug_mode=debug_mode, portal = await n.start_actor(
) as an:
portal = await an.start_actor(
target.__name__, target.__name__,
enable_modules=[__name__], enable_modules=[__name__],
) )
@ -133,7 +124,7 @@ def test_started_misuse(
@tractor.context @tractor.context
async def simple_setup_teardown( async def simple_setup_teardown(
ctx: Context, ctx: tractor.Context,
data: int, data: int,
block_forever: bool = False, block_forever: bool = False,
@ -179,7 +170,6 @@ def test_simple_context(
error_parent, error_parent,
callee_blocks_forever, callee_blocks_forever,
pointlessly_open_stream, pointlessly_open_stream,
debug_mode: bool,
): ):
timeout = 1.5 if not platform.system() == 'Windows' else 4 timeout = 1.5 if not platform.system() == 'Windows' else 4
@ -187,22 +177,20 @@ def test_simple_context(
async def main(): async def main():
with trio.fail_after(timeout): with trio.fail_after(timeout):
async with tractor.open_nursery( async with tractor.open_nursery() as nursery:
debug_mode=debug_mode,
) as an: portal = await nursery.start_actor(
portal = await an.start_actor(
'simple_context', 'simple_context',
enable_modules=[__name__], enable_modules=[__name__],
) )
try: try:
async with ( async with portal.open_context(
portal.open_context(
simple_setup_teardown, simple_setup_teardown,
data=10, data=10,
block_forever=callee_blocks_forever, block_forever=callee_blocks_forever,
) as (ctx, sent), ) as (ctx, sent):
):
assert sent == 11 assert sent == 11
if callee_blocks_forever: if callee_blocks_forever:
@ -272,7 +260,6 @@ def test_caller_cancels(
cancel_method: str, cancel_method: str,
chk_ctx_result_before_exit: bool, chk_ctx_result_before_exit: bool,
callee_returns_early: bool, callee_returns_early: bool,
debug_mode: bool,
): ):
''' '''
Verify that when the opening side of a context (aka the caller) Verify that when the opening side of a context (aka the caller)
@ -281,100 +268,37 @@ def test_caller_cancels(
''' '''
async def check_canceller( async def check_canceller(
ctx: Context, ctx: tractor.Context,
) -> None: ) -> None:
actor: Actor = current_actor() # should not raise yet return the remote
uid: tuple = actor.uid # context cancelled error.
_ctxc: ContextCancelled|None = None
if (
cancel_method == 'portal'
and not callee_returns_early
):
try:
res = await ctx.result() res = await ctx.result()
assert 0, 'Portal cancel should raise!'
except ContextCancelled as ctxc:
# with trio.CancelScope(shield=True):
# await tractor.pause()
_ctxc = ctxc
assert ctx.chan._cancel_called
assert ctxc.canceller == uid
assert ctxc is ctx.maybe_error
# NOTE: should not ever raise even in the `ctx`
# case since self-cancellation should swallow the ctxc
# silently!
else:
try:
res = await ctx.result()
except ContextCancelled as ctxc:
pytest.fail(f'should not have raised ctxc\n{ctxc}')
# we actually get a result
if callee_returns_early: if callee_returns_early:
assert res == 'yo' assert res == 'yo'
assert ctx.outcome is res
assert ctx.maybe_error is None
else: else:
err: Exception = ctx.outcome err = res
assert isinstance(err, ContextCancelled) assert isinstance(err, ContextCancelled)
assert ( assert (
tuple(err.canceller) tuple(err.canceller)
== ==
uid current_actor().uid
) )
assert (
err
is ctx.maybe_error
is ctx._remote_error
)
if le := ctx._local_error:
assert err is le
# else:
# TODO: what should this be then?
# not defined until block closes right?
#
# await tractor.pause()
# assert ctx._local_error is None
# TODO: don't need this right?
# if _ctxc:
# raise _ctxc
async def main(): async def main():
async with tractor.open_nursery() as nursery:
async with tractor.open_nursery( portal = await nursery.start_actor(
debug_mode=debug_mode,
) as an:
portal = await an.start_actor(
'simple_context', 'simple_context',
enable_modules=[__name__], enable_modules=[__name__],
) )
timeout: float = ( timeout = 0.5 if not callee_returns_early else 2
0.5
if not callee_returns_early
else 2
)
with trio.fail_after(timeout): with trio.fail_after(timeout):
async with ( async with portal.open_context(
expect_ctxc(
yay=(
not callee_returns_early
and cancel_method == 'portal'
)
),
portal.open_context(
simple_setup_teardown, simple_setup_teardown,
data=10, data=10,
block_forever=not callee_returns_early, block_forever=not callee_returns_early,
) as (ctx, sent), ) as (ctx, sent):
):
if callee_returns_early: if callee_returns_early:
# ensure we block long enough before sending # ensure we block long enough before sending
@ -383,17 +307,9 @@ def test_caller_cancels(
await trio.sleep(0.5) await trio.sleep(0.5)
if cancel_method == 'ctx': if cancel_method == 'ctx':
print('cancelling with `Context.cancel()`')
await ctx.cancel() await ctx.cancel()
elif cancel_method == 'portal':
print('cancelling with `Portal.cancel_actor()`')
await portal.cancel_actor()
else: else:
pytest.fail( await portal.cancel_actor()
f'Unknown `cancel_method={cancel_method} ?'
)
if chk_ctx_result_before_exit: if chk_ctx_result_before_exit:
await check_canceller(ctx) await check_canceller(ctx)
@ -404,23 +320,6 @@ def test_caller_cancels(
if cancel_method != 'portal': if cancel_method != 'portal':
await portal.cancel_actor() await portal.cancel_actor()
# XXX NOTE XXX: non-normal yet purposeful
# test-specific ctxc suppression is implemented!
#
# WHY: the `.cancel_actor()` case (cancel_method='portal')
# will cause both:
# * the `ctx.result()` inside `.open_context().__aexit__()`
# * AND the `ctx.result()` inside `check_canceller()`
# to raise ctxc.
#
# which should in turn cause `ctx._scope` to
# catch any cancellation?
if (
not callee_returns_early
and cancel_method != 'portal'
):
assert not ctx._scope.cancelled_caught
trio.run(main) trio.run(main)
@ -439,7 +338,7 @@ def test_caller_cancels(
@tractor.context @tractor.context
async def close_ctx_immediately( async def close_ctx_immediately(
ctx: Context, ctx: tractor.Context,
) -> None: ) -> None:
@ -451,33 +350,17 @@ async def close_ctx_immediately(
@tractor_test @tractor_test
async def test_callee_closes_ctx_after_stream_open( async def test_callee_closes_ctx_after_stream_open():
debug_mode: bool, 'callee context closes without using stream'
):
'''
callee context closes without using stream.
This should result in a msg sequence async with tractor.open_nursery() as n:
|_<root>_
|_<fast_stream_closer>
<= {'started': <Any>, 'cid': <str>} portal = await n.start_actor(
<= {'stop': True, 'cid': <str>}
<= {'result': Any, ..}
(ignored by child)
=> {'stop': True, 'cid': <str>}
'''
async with tractor.open_nursery(
debug_mode=debug_mode,
) as an:
portal = await an.start_actor(
'fast_stream_closer', 'fast_stream_closer',
enable_modules=[__name__], enable_modules=[__name__],
) )
with trio.fail_after(0.5): with trio.fail_after(2):
async with portal.open_context( async with portal.open_context(
close_ctx_immediately, close_ctx_immediately,
@ -485,9 +368,10 @@ async def test_callee_closes_ctx_after_stream_open(
# cancel_on_exit=True, # cancel_on_exit=True,
) as (ctx, sent): ) as (ctx, sent):
assert sent is None assert sent is None
with trio.fail_after(0.4): with trio.fail_after(0.5):
async with ctx.open_stream() as stream: async with ctx.open_stream() as stream:
# should fall through since ``StopAsyncIteration`` # should fall through since ``StopAsyncIteration``
@ -495,14 +379,11 @@ async def test_callee_closes_ctx_after_stream_open(
# a ``trio.EndOfChannel`` by # a ``trio.EndOfChannel`` by
# ``trio.abc.ReceiveChannel.__anext__()`` # ``trio.abc.ReceiveChannel.__anext__()``
async for _ in stream: async for _ in stream:
# trigger failure if we DO NOT
# get an EOC!
assert 0 assert 0
else: else:
# verify stream is now closed # verify stream is now closed
try: try:
with trio.fail_after(0.3):
await stream.receive() await stream.receive()
except trio.EndOfChannel: except trio.EndOfChannel:
pass pass
@ -523,7 +404,8 @@ async def test_callee_closes_ctx_after_stream_open(
@tractor.context @tractor.context
async def expect_cancelled( async def expect_cancelled(
ctx: Context,
ctx: tractor.Context,
) -> None: ) -> None:
global _state global _state
@ -537,29 +419,12 @@ async def expect_cancelled(
await stream.send(msg) # echo server await stream.send(msg) # echo server
except trio.Cancelled: except trio.Cancelled:
# on ctx.cancel() the internal RPC scope is cancelled but
# never caught until the func exits.
assert ctx._scope.cancel_called
assert not ctx._scope.cancelled_caught
# should be the RPC cmd request for `._cancel_task()`
assert ctx._cancel_msg
# which, has not yet resolved to an error outcome
# since this rpc func has not yet exited.
assert not ctx.maybe_error
assert not ctx._final_result_is_set()
# debug REPL if needed
# with trio.CancelScope(shield=True):
# await tractor.pause()
# expected case # expected case
_state = False _state = False
raise raise
else: else:
assert 0, "callee wasn't cancelled !?" assert 0, "Wasn't cancelled!?"
@pytest.mark.parametrize( @pytest.mark.parametrize(
@ -569,17 +434,13 @@ async def expect_cancelled(
@tractor_test @tractor_test
async def test_caller_closes_ctx_after_callee_opens_stream( async def test_caller_closes_ctx_after_callee_opens_stream(
use_ctx_cancel_method: bool, use_ctx_cancel_method: bool,
debug_mode: bool,
): ):
''' 'caller context closes without using stream'
caller context closes without using/opening stream
''' async with tractor.open_nursery() as an:
async with tractor.open_nursery(
debug_mode=debug_mode,
) as an:
root: Actor = current_actor() root: Actor = current_actor()
portal = await an.start_actor( portal = await an.start_actor(
'ctx_cancelled', 'ctx_cancelled',
enable_modules=[__name__], enable_modules=[__name__],
@ -592,13 +453,11 @@ async def test_caller_closes_ctx_after_callee_opens_stream(
await portal.run(assert_state, value=True) await portal.run(assert_state, value=True)
# call `ctx.cancel()` explicitly # call cancel explicitly
if use_ctx_cancel_method: if use_ctx_cancel_method:
await ctx.cancel() await ctx.cancel()
# NOTE: means the local side `ctx._scope` will
# have been cancelled by a ctxc ack and thus
# `._scope.cancelled_caught` should be set.
try: try:
async with ctx.open_stream() as stream: async with ctx.open_stream() as stream:
async for msg in stream: async for msg in stream:
@ -627,10 +486,7 @@ async def test_caller_closes_ctx_after_callee_opens_stream(
assert portal.channel.connected() assert portal.channel.connected()
# ctx is closed here # ctx is closed here
await portal.run( await portal.run(assert_state, value=False)
assert_state,
value=False,
)
else: else:
try: try:
@ -641,21 +497,9 @@ async def test_caller_closes_ctx_after_callee_opens_stream(
# NO-OP -> since already called above # NO-OP -> since already called above
await ctx.cancel() await ctx.cancel()
# NOTE: local scope should have absorbed the cancellation since # local scope should have absorbed the cancellation
# in this case we call `ctx.cancel()` and the local assert ctx.cancelled_caught
# `._scope` does not get `.cancel_called` and thus assert ctx._remote_error is ctx._local_error
`.cancelled_caught` neither will ever be set.
if use_ctx_cancel_method:
assert not ctx._scope.cancelled_caught
# rxed ctxc response from far end
assert ctx.cancel_acked
assert (
ctx._remote_error
is ctx._local_error
is ctx.maybe_error
is ctx.outcome
)
try: try:
async with ctx.open_stream() as stream: async with ctx.open_stream() as stream:
@ -678,13 +522,11 @@ async def test_caller_closes_ctx_after_callee_opens_stream(
@tractor_test @tractor_test
async def test_multitask_caller_cancels_from_nonroot_task( async def test_multitask_caller_cancels_from_nonroot_task():
debug_mode: bool,
): async with tractor.open_nursery() as n:
async with tractor.open_nursery(
debug_mode=debug_mode, portal = await n.start_actor(
) as an:
portal = await an.start_actor(
'ctx_cancelled', 'ctx_cancelled',
enable_modules=[__name__], enable_modules=[__name__],
) )
@ -731,7 +573,7 @@ async def test_multitask_caller_cancels_from_nonroot_task(
@tractor.context @tractor.context
async def cancel_self( async def cancel_self(
ctx: Context, ctx: tractor.Context,
) -> None: ) -> None:
global _state global _state
@ -768,20 +610,16 @@ async def cancel_self(
raise RuntimeError('Context didnt cancel itself?!') raise RuntimeError('Context didnt cancel itself?!')
@tractor_test @tractor_test
async def test_callee_cancels_before_started( async def test_callee_cancels_before_started():
debug_mode: bool,
):
''' '''
Callee calls `Context.cancel()` while streaming and caller Callee calls `Context.cancel()` while streaming and caller
sees stream terminated in `ContextCancelled`. sees stream terminated in `ContextCancelled`.
''' '''
async with tractor.open_nursery( async with tractor.open_nursery() as n:
debug_mode=debug_mode,
) as an: portal = await n.start_actor(
portal = await an.start_actor(
'cancels_self', 'cancels_self',
enable_modules=[__name__], enable_modules=[__name__],
) )
@ -807,7 +645,7 @@ async def test_callee_cancels_before_started(
@tractor.context @tractor.context
async def never_open_stream( async def never_open_stream(
ctx: Context, ctx: tractor.Context,
) -> None: ) -> None:
''' '''
@ -821,8 +659,8 @@ async def never_open_stream(
@tractor.context @tractor.context
async def keep_sending_from_callee( async def keep_sending_from_callee(
ctx: Context, ctx: tractor.Context,
msg_buffer_size: int|None = None, msg_buffer_size: Optional[int] = None,
) -> None: ) -> None:
''' '''
@ -847,10 +685,7 @@ async def keep_sending_from_callee(
], ],
ids='overrun_condition={}'.format, ids='overrun_condition={}'.format,
) )
def test_one_end_stream_not_opened( def test_one_end_stream_not_opened(overrun_by):
overrun_by: tuple[str, int, Callable],
debug_mode: bool,
):
''' '''
This should exemplify the bug from: This should exemplify the bug from:
https://github.com/goodboy/tractor/issues/265 https://github.com/goodboy/tractor/issues/265
@ -861,10 +696,8 @@ def test_one_end_stream_not_opened(
buf_size = buf_size_increase + Actor.msg_buffer_size buf_size = buf_size_increase + Actor.msg_buffer_size
async def main(): async def main():
async with tractor.open_nursery( async with tractor.open_nursery() as n:
debug_mode=debug_mode, portal = await n.start_actor(
) as an:
portal = await an.start_actor(
entrypoint.__name__, entrypoint.__name__,
enable_modules=[__name__], enable_modules=[__name__],
) )
@ -921,7 +754,7 @@ def test_one_end_stream_not_opened(
@tractor.context @tractor.context
async def echo_back_sequence( async def echo_back_sequence(
ctx: Context, ctx: tractor.Context,
seq: list[int], seq: list[int],
wait_for_cancel: bool, wait_for_cancel: bool,
allow_overruns_side: str, allow_overruns_side: str,
@ -938,10 +771,7 @@ async def echo_back_sequence(
# NOTE: ensure that if the caller is expecting to cancel this task # NOTE: ensure that if the caller is expecting to cancel this task
# that we stay echoing much longer than they are so we don't # that we stay echoing much longer than they are so we don't
# return early instead of receive the cancel msg. # return early instead of receive the cancel msg.
total_batches: int = ( total_batches: int = 1000 if wait_for_cancel else 6
1000 if wait_for_cancel
else 6
)
await ctx.started() await ctx.started()
# await tractor.breakpoint() # await tractor.breakpoint()
@ -960,23 +790,8 @@ async def echo_back_sequence(
) )
seq = list(seq) # bleh, msgpack sometimes ain't decoded right seq = list(seq) # bleh, msgpack sometimes ain't decoded right
for i in range(total_batches): for _ in range(total_batches):
print(f'starting new stream batch {i} iter in child')
batch = [] batch = []
# EoC case, delay a little instead of hot
# iter-stopping (since apparently py3.11+ can do that
# faster than a ctxc can be sent) on the async for
# loop when child was requested to ctxc.
if (
stream.closed
or
ctx.cancel_called
):
print('child stream already closed!?!')
await trio.sleep(0.05)
continue
async for msg in stream: async for msg in stream:
batch.append(msg) batch.append(msg)
if batch == seq: if batch == seq:
@ -987,18 +802,15 @@ async def echo_back_sequence(
print('callee waiting on next') print('callee waiting on next')
print(f'callee echoing back latest batch\n{batch}')
for msg in batch: for msg in batch:
print(f'callee sending msg\n{msg}') print(f'callee sending {msg}')
await stream.send(msg) await stream.send(msg)
try:
return 'yo'
finally:
print( print(
'exiting callee with context:\n' 'EXITING CALLEEE:\n'
f'{pformat(ctx)}\n' f'{ctx.canceller}'
) )
return 'yo'
@pytest.mark.parametrize( @pytest.mark.parametrize(
@ -1024,10 +836,7 @@ def test_maybe_allow_overruns_stream(
cancel_ctx: bool, cancel_ctx: bool,
slow_side: str, slow_side: str,
allow_overruns_side: str, allow_overruns_side: str,
# conftest wide
loglevel: str, loglevel: str,
debug_mode: bool,
): ):
''' '''
Demonstrate small overruns of each task back and forth Demonstrate small overruns of each task back and forth
@ -1046,14 +855,13 @@ def test_maybe_allow_overruns_stream(
''' '''
async def main(): async def main():
async with tractor.open_nursery( async with tractor.open_nursery() as n:
debug_mode=debug_mode, portal = await n.start_actor(
) as an:
portal = await an.start_actor(
'callee_sends_forever', 'callee_sends_forever',
enable_modules=[__name__], enable_modules=[__name__],
loglevel=loglevel, loglevel=loglevel,
debug_mode=debug_mode,
# debug_mode=True,
) )
seq = list(range(10)) seq = list(range(10))
async with portal.open_context( async with portal.open_context(
@ -1062,8 +870,8 @@ def test_maybe_allow_overruns_stream(
wait_for_cancel=cancel_ctx, wait_for_cancel=cancel_ctx,
be_slow=(slow_side == 'child'), be_slow=(slow_side == 'child'),
allow_overruns_side=allow_overruns_side, allow_overruns_side=allow_overruns_side,
) as (ctx, sent): ) as (ctx, sent):
assert sent is None assert sent is None
async with ctx.open_stream( async with ctx.open_stream(
@ -1091,10 +899,10 @@ def test_maybe_allow_overruns_stream(
if cancel_ctx: if cancel_ctx:
# cancel the remote task # cancel the remote task
print('Requesting `ctx.cancel()` in parent!') print('sending root side cancel')
await ctx.cancel() await ctx.cancel()
res: str|ContextCancelled = await ctx.result() res = await ctx.result()
if cancel_ctx: if cancel_ctx:
assert isinstance(res, ContextCancelled) assert isinstance(res, ContextCancelled)
@ -1149,52 +957,3 @@ def test_maybe_allow_overruns_stream(
# if this hits the logic blocks from above are not # if this hits the logic blocks from above are not
# exhaustive.. # exhaustive..
pytest.fail('PARAMETRIZED CASE GEN PROBLEM YO') pytest.fail('PARAMETRIZED CASE GEN PROBLEM YO')
def test_ctx_with_self_actor(
loglevel: str,
debug_mode: bool,
):
'''
NOTE: for now this is an INVALID OP!
BUT, eventually presuming we add a "side" key to `Actor.get_context()`,
we might be able to get this working symmetrically, but should we??
Open a context back to the same actor and ensure all cancellation
and error semantics hold the same.
'''
async def main():
async with tractor.open_nursery(
debug_mode=debug_mode,
enable_modules=[__name__],
) as an:
assert an
async with (
tractor.find_actor('root') as portal,
portal.open_context(
expect_cancelled,
# echo_back_sequence,
# seq=seq,
# wait_for_cancel=cancel_ctx,
# be_slow=(slow_side == 'child'),
# allow_overruns_side=allow_overruns_side,
) as (ctx, sent),
ctx.open_stream() as ipc,
):
assert sent is None
seq = list(range(10))
for i in seq:
await ipc.send(i)
rx: int = await ipc.receive()
assert rx == i
await ctx.cancel()
with pytest.raises(RuntimeError) as excinfo:
trio.run(main)
assert 'Invalid Operation' in repr(excinfo.value)
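
For reference, the removed side of this diff wraps `portal.open_context()` in `expect_ctxc(yay=...)` from `tractor._testing`, apparently so that a `ContextCancelled` is tolerated only in the parametrized cases where one is expected. The helper's implementation is not shown in this diff. Below is a minimal stdlib-only sketch of that idea, assuming a hypothetical `expect_exc()` helper and a stand-in `FakeCancelled` error; neither name is part of tractor and the real helper may behave differently.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeCancelled(Exception):
    """Hypothetical stand-in for a remote-cancel error (not a tractor type)."""


@asynccontextmanager
async def expect_exc(exc_type, yay: bool):
    """
    When `yay` is true: swallow `exc_type` and fail if it never arrives.
    When `yay` is false: do nothing, any exception propagates untouched.
    """
    if not yay:
        yield
        return

    try:
        yield
    except exc_type:
        # the expected error arrived, suppress it
        return

    # the block exited cleanly even though an error was expected
    raise AssertionError(f'{exc_type.__name__} was expected but not raised')


async def run_case(raise_it: bool, yay: bool) -> str:
    # mimics a test body which may or may not be cancelled
    async with expect_exc(FakeCancelled, yay=yay):
        if raise_it:
            raise FakeCancelled('remote side was cancelled')
    return 'done'


if __name__ == '__main__':
    print(asyncio.run(run_case(raise_it=True, yay=True)))
```

The single `yay` flag lets one parametrized test body cover both the "must raise" and "must not raise" cases without duplicating the `async with` block.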


@ -11,9 +11,11 @@ TODO:
""" """
import itertools import itertools
from os import path
from typing import Optional from typing import Optional
import platform import platform
import pathlib import pathlib
import sys
import time import time
import pytest import pytest
@ -23,10 +25,8 @@ from pexpect.exceptions import (
EOF, EOF,
) )
from tractor._testing import (
examples_dir,
)
from conftest import ( from conftest import (
examples_dir,
_ci_env, _ci_env,
) )


@ -9,9 +9,10 @@ import itertools
import pytest import pytest
import tractor import tractor
from tractor._testing import tractor_test
import trio import trio
from conftest import tractor_test
@tractor_test @tractor_test
async def test_reg_then_unreg(reg_addr): async def test_reg_then_unreg(reg_addr):


@ -11,7 +11,8 @@ import platform
import shutil import shutil
import pytest import pytest
from tractor._testing import (
from conftest import (
examples_dir, examples_dir,
) )


@ -8,6 +8,7 @@ import builtins
import itertools import itertools
import importlib import importlib
from exceptiongroup import BaseExceptionGroup
import pytest import pytest
import trio import trio
import tractor import tractor
@ -17,7 +18,6 @@ from tractor import (
ContextCancelled, ContextCancelled,
) )
from tractor.trionics import BroadcastReceiver from tractor.trionics import BroadcastReceiver
from tractor._testing import expect_ctxc
async def sleep_and_err( async def sleep_and_err(
@ -68,7 +68,7 @@ def test_trio_cancels_aio_on_actor_side(reg_addr):
async def asyncio_actor( async def asyncio_actor(
target: str, target: str,
expect_err: Exception|None = None expect_err: Optional[Exception] = None
) -> None: ) -> None:
@ -112,21 +112,10 @@ def test_aio_simple_error(reg_addr):
infect_asyncio=True, infect_asyncio=True,
) )
with pytest.raises( with pytest.raises(RemoteActorError) as excinfo:
expected_exception=(RemoteActorError, ExceptionGroup),
) as excinfo:
trio.run(main) trio.run(main)
err = excinfo.value err = excinfo.value
# might get multiple `trio.Cancelled`s as well inside an inception
if isinstance(err, ExceptionGroup):
err = next(itertools.dropwhile(
lambda exc: not isinstance(exc, tractor.RemoteActorError),
err.exceptions
))
assert err
assert isinstance(err, RemoteActorError) assert isinstance(err, RemoteActorError)
assert err.type == AssertionError assert err.type == AssertionError
@ -201,8 +190,7 @@ async def trio_ctx(
@pytest.mark.parametrize( @pytest.mark.parametrize(
'parent_cancels', 'parent_cancels', [False, True],
['context', 'actor', False],
ids='parent_actor_cancels_child={}'.format ids='parent_actor_cancels_child={}'.format
) )
def test_context_spawns_aio_task_that_errors( def test_context_spawns_aio_task_that_errors(
@ -226,36 +214,18 @@ def test_context_spawns_aio_task_that_errors(
# debug_mode=True, # debug_mode=True,
loglevel='cancel', loglevel='cancel',
) )
async with ( async with p.open_context(
expect_ctxc(
yay=parent_cancels == 'actor',
),
p.open_context(
trio_ctx, trio_ctx,
) as (ctx, first), ) as (ctx, first):
):
assert first == 'start' assert first == 'start'
if parent_cancels == 'actor': if parent_cancels:
await p.cancel_actor() await p.cancel_actor()
elif parent_cancels == 'context':
await ctx.cancel()
else:
await trio.sleep_forever() await trio.sleep_forever()
async with expect_ctxc( return await ctx.result()
yay=parent_cancels == 'actor',
):
await ctx.result()
if parent_cancels == 'context':
# to tear down sub-actor
await p.cancel_actor()
return ctx.outcome
if parent_cancels: if parent_cancels:
# bc the parent made the cancel request, # bc the parent made the cancel request,
@ -299,22 +269,11 @@ def test_aio_cancelled_from_aio_causes_trio_cancelled(reg_addr):
infect_asyncio=True, infect_asyncio=True,
) )
with pytest.raises( with pytest.raises(RemoteActorError) as excinfo:
expected_exception=(RemoteActorError, ExceptionGroup),
) as excinfo:
trio.run(main) trio.run(main)
# might get multiple `trio.Cancelled`s as well inside an inception
err = excinfo.value
if isinstance(err, ExceptionGroup):
err = next(itertools.dropwhile(
lambda exc: not isinstance(exc, tractor.RemoteActorError),
err.exceptions
))
assert err
# ensure boxed error is correct # ensure boxed error is correct
assert err.type == to_asyncio.AsyncioCancelled assert excinfo.value.type == to_asyncio.AsyncioCancelled
# TODO: verify open_channel_from will fail on this.. # TODO: verify open_channel_from will fail on this..
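
For reference, the removed side of this file's diff unwraps an `ExceptionGroup` by taking `next(itertools.dropwhile(...))` over `err.exceptions` to find the first `RemoteActorError`. Below is a minimal sketch of that lookup on a plain list of exceptions, assuming a hypothetical `first_of_type()` helper and a stand-in `BoxedError` class (not tractor names). Unlike the code in the diff, the sketch passes a default to `next()` so that a group with no match yields `None` instead of raising `StopIteration`.

```python
import itertools


class BoxedError(Exception):
    """Hypothetical stand-in for `tractor.RemoteActorError`."""


def first_of_type(excs, exc_type):
    """Return the first exception in `excs` that is an `exc_type`, else None."""
    return next(
        # skip leading entries until one matches the wanted type
        itertools.dropwhile(
            lambda exc: not isinstance(exc, exc_type),
            excs,
        ),
        None,  # default when nothing matches
    )


if __name__ == '__main__':
    group = [KeyError('a'), BoxedError('b'), ValueError('c')]
    found = first_of_type(group, BoxedError)
    print(type(found).__name__)
```

A generator expression with `isinstance()` filtering would work equally well; `dropwhile` is used here only to mirror the shape of the code in the diff.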


@ -10,9 +10,6 @@ import pytest
import trio import trio
import tractor import tractor
from tractor import ( # typing from tractor import ( # typing
Actor,
current_actor,
open_nursery,
Portal, Portal,
Context, Context,
ContextCancelled, ContextCancelled,
@ -126,9 +123,7 @@ async def error_before_started(
await peer_ctx.cancel() await peer_ctx.cancel()
def test_do_not_swallow_error_before_started_by_remote_contextcancelled( def test_do_not_swallow_error_before_started_by_remote_contextcancelled():
debug_mode: bool,
):
''' '''
Verify that an error raised in a remote context which itself Verify that an error raised in a remote context which itself
opens YET ANOTHER remote context, which it then cancels, does not opens YET ANOTHER remote context, which it then cancels, does not
@ -137,9 +132,7 @@ def test_do_not_swallow_error_before_started_by_remote_contextcancelled(
''' '''
async def main(): async def main():
async with tractor.open_nursery( async with tractor.open_nursery() as n:
debug_mode=debug_mode,
) as n:
portal = await n.start_actor( portal = await n.start_actor(
'errorer', 'errorer',
enable_modules=[__name__], enable_modules=[__name__],
@ -220,12 +213,11 @@ async def stream_from_peer(
# - what about IPC-transport specific errors, should # - what about IPC-transport specific errors, should
# they bubble from the async for and trigger # they bubble from the async for and trigger
# other special cases? # other special cases?
#
# NOTE: current ctl flow: # NOTE: current ctl flow:
# - stream raises `trio.EndOfChannel` and # - stream raises `trio.EndOfChannel` and
# exits the loop # exits the loop
# - `.open_context()` will raise the ctxc received # - `.open_context()` will raise the ctxcanc
# from the sleeper. # received from the sleeper.
async for msg in stream: async for msg in stream:
assert msg is not None assert msg is not None
print(msg) print(msg)
@ -233,37 +225,25 @@ async def stream_from_peer(
# NOTE: cancellation of the (sleeper) peer should always # NOTE: cancellation of the (sleeper) peer should always
# cause a `ContextCancelled` raise in this streaming # cause a `ContextCancelled` raise in this streaming
# actor. # actor.
except ContextCancelled as ctxc: except ContextCancelled as ctxerr:
ctxerr = ctxc err = ctxerr
assert peer_ctx._remote_error is ctxerr assert peer_ctx._remote_error is ctxerr
assert peer_ctx._remote_error.msgdata == ctxerr.msgdata assert peer_ctx.canceller == ctxerr.canceller
# the peer ctx is the canceller even though its canceller
# is the "canceller" XD
assert peer_name in peer_ctx.canceller
assert "canceller" in ctxerr.canceller
# caller peer should not be the cancel requester # caller peer should not be the cancel requester
assert not ctx.cancel_called assert not ctx.cancel_called
assert not ctx.cancel_acked # XXX can never be true since `._invoke` only
# XXX can NEVER BE TRUE since `._invoke` only
# sets this AFTER the nursery block this task # sets this AFTER the nursery block this task
# was started in, exits. # was started in, exits.
assert not ctx._scope.cancelled_caught assert not ctx.cancelled_caught
# we never requested cancellation, it was the 'canceller' # we never requested cancellation
# peer.
assert not peer_ctx.cancel_called assert not peer_ctx.cancel_called
assert not peer_ctx.cancel_acked
# the `.open_context()` exit definitely caught # the `.open_context()` exit definitely caught
# a cancellation in the internal `Context._scope` since # a cancellation in the internal `Context._scope` since
# likely the runtime called `_deliver_msg()` after # likely the runtime called `_deliver_msg()` after
# receiving the remote error from the streaming task. # receiving the remote error from the streaming task.
assert not peer_ctx._scope.cancelled_caught assert peer_ctx.cancelled_caught
# TODO / NOTE `.canceller` won't have been set yet # TODO / NOTE `.canceller` won't have been set yet
# here because that machinery is inside # here because that machinery is inside
@ -272,11 +252,10 @@ async def stream_from_peer(
# checkpoint) that cancellation was due to # checkpoint) that cancellation was due to
# a remote, we COULD assert this here..see, # a remote, we COULD assert this here..see,
# https://github.com/goodboy/tractor/issues/368 # https://github.com/goodboy/tractor/issues/368
#
# assert 'canceller' in ctx.canceller
# root/parent actor task should NEVER HAVE cancelled us! # root/parent actor task should NEVER HAVE cancelled us!
assert not ctx.canceller assert not ctx.canceller
assert 'canceller' in peer_ctx.canceller
raise raise
# TODO: IN THEORY we could have other cases depending on # TODO: IN THEORY we could have other cases depending on
@ -290,17 +269,17 @@ async def stream_from_peer(
# assert ctx.canceller[0] == 'root' # assert ctx.canceller[0] == 'root'
# assert peer_ctx.canceller[0] == 'sleeper' # assert peer_ctx.canceller[0] == 'sleeper'
raise RuntimeError('Never triggered local `ContextCancelled` ?!?') raise RuntimeError(
'peer never triggered local `ContextCancelled`?'
)
@pytest.mark.parametrize( @pytest.mark.parametrize(
'error_during_ctxerr_handling', 'error_during_ctxerr_handling',
[False, True], [False, True],
ids=lambda item: f'rte_during_ctxerr={item}',
) )
def test_peer_canceller( def test_peer_canceller(
error_during_ctxerr_handling: bool, error_during_ctxerr_handling: bool,
debug_mode: bool,
): ):
''' '''
Verify that a cancellation triggered by an in-actor-tree peer Verify that a cancellation triggered by an in-actor-tree peer
@ -357,7 +336,7 @@ def test_peer_canceller(
async def main(): async def main():
async with tractor.open_nursery( async with tractor.open_nursery(
# NOTE: to halt the peer tasks on ctxc, uncomment this. # NOTE: to halt the peer tasks on ctxc, uncomment this.
debug_mode=debug_mode, # debug_mode=True
) as an: ) as an:
canceller: Portal = await an.start_actor( canceller: Portal = await an.start_actor(
'canceller', 'canceller',
@ -371,7 +350,8 @@ def test_peer_canceller(
'just_caller', # but i just met her? 'just_caller', # but i just met her?
enable_modules=[__name__], enable_modules=[__name__],
) )
root: Actor = current_actor()
root = tractor.current_actor()
try: try:
async with ( async with (
@ -389,16 +369,15 @@ def test_peer_canceller(
) as (canceller_ctx, sent), ) as (canceller_ctx, sent),
): ):
ctxs: dict[str, Context] = { ctxs: list[Context] = [
'sleeper': sleeper_ctx, sleeper_ctx,
'caller': caller_ctx, caller_ctx,
'canceller': canceller_ctx, canceller_ctx,
} ]
try: try:
print('PRE CONTEXT RESULT') print('PRE CONTEXT RESULT')
res = await sleeper_ctx.result() await sleeper_ctx.result()
assert res
# should never get here # should never get here
pytest.fail( pytest.fail(
@ -408,19 +387,13 @@ def test_peer_canceller(
# should always raise since this root task does # should always raise since this root task does
# not request the sleeper cancellation ;) # not request the sleeper cancellation ;)
except ContextCancelled as ctxerr: except ContextCancelled as ctxerr:
print( print(f'CAUGHT REMOTE CONTEXT CANCEL {ctxerr}')
'CAUGHT REMOTE CONTEXT CANCEL\n\n'
f'{ctxerr}\n'
)
# canceller and caller peers should not # canceller and caller peers should not
# have been remotely cancelled. # have been remotely cancelled.
assert canceller_ctx.canceller is None assert canceller_ctx.canceller is None
assert caller_ctx.canceller is None assert caller_ctx.canceller is None
# we were not the actor, our peer was
assert not sleeper_ctx.cancel_acked
assert ctxerr.canceller[0] == 'canceller' assert ctxerr.canceller[0] == 'canceller'
# XXX NOTE XXX: since THIS `ContextCancelled` # XXX NOTE XXX: since THIS `ContextCancelled`
@ -428,273 +401,123 @@ def test_peer_canceller(
# `sleeper.open_context().__aexit__()` this # `sleeper.open_context().__aexit__()` this
# value is not yet set, however outside this # value is not yet set, however outside this
# block it should be. # block it should be.
assert not sleeper_ctx._scope.cancelled_caught assert not sleeper_ctx.cancelled_caught
# CASE_1: error-during-ctxc-handling,
if error_during_ctxerr_handling: if error_during_ctxerr_handling:
raise RuntimeError('Simulated error during teardown') raise RuntimeError('Simulated error during teardown')
# CASE_2: standard teardown inside the `.open_context()` block
raise raise
# XXX SHOULD NEVER EVER GET HERE XXX # XXX SHOULD NEVER EVER GET HERE XXX
except BaseException as berr: except BaseException as berr:
raise err = berr
pytest.fail('did not rx ctx-cancelled error?')
# XXX if needed to debug failure
# _err = berr
# await tractor.pause()
# await trio.sleep_forever()
pytest.fail(
'did not rx ctxc ?!?\n\n'
f'{berr}\n'
)
else: else:
pytest.fail( pytest.fail('did not rx ctx-cancelled error?')
'did not rx ctxc ?!?\n\n'
f'{ctxs}\n'
)
except ( except (
ContextCancelled, ContextCancelled,
RuntimeError, RuntimeError,
)as loc_err: )as ctxerr:
_loc_err = loc_err _err = ctxerr
# NOTE: the main state to check on `Context` is: # NOTE: the main state to check on `Context` is:
# - `.cancelled_caught` (maps to nursery cs)
# - `.cancel_called` (bool of whether this side # - `.cancel_called` (bool of whether this side
# requested) # requested)
# - `.cancel_acked` (bool of whether a ctxc
# response was received due to cancel req).
# - `.maybe_error` (highest prio error to raise
# locally)
# - `.outcome` (final error or result value)
# - `.canceller` (uid of cancel-causing actor-task) # - `.canceller` (uid of cancel-causing actor-task)
# - `._remote_error` (any `RemoteActorError` # - `._remote_error` (any `RemoteActorError`
# instance from other side of context) # instance from other side of context)
# - `._local_error` (any error caught inside the
# `.open_context()` block).
#
# XXX: Deprecated and internal only
# - `.cancelled_caught` (maps to nursery cs)
# - now just use `._scope.cancelled_caught`
# since it maps to the internal (maps to nursery cs)
#
# TODO: are we really planning to use this tho? # TODO: are we really planning to use this tho?
# - `._cancel_msg` (any msg that caused the # - `._cancel_msg` (any msg that caused the
# cancel) # cancel)
# CASE_1: error-during-ctxc-handling, # CASE: error raised during handling of
# - far end cancels due to peer 'canceller', # `ContextCancelled` inside `.open_context()`
# - `ContextCancelled` relayed to this scope, # block
# - inside `.open_context()` ctxc is caught and
# a rte raised instead
#
# => block should raise the rte but all peers
# should be cancelled by US.
#
if error_during_ctxerr_handling: if error_during_ctxerr_handling:
# since we do a rte reraise above, the assert isinstance(ctxerr, RuntimeError)
# `.open_context()` error handling should have
# raised a local rte, thus the internal
# `.open_context()` enterer task's
# cancel-scope should have raised the RTE, NOT
# a `trio.Cancelled` due to a local
# `._scope.cancel()` call.
assert not sleeper_ctx._scope.cancelled_caught
assert isinstance(loc_err, RuntimeError)
print(f'_loc_err: {_loc_err}\n')
# assert sleeper_ctx._local_error is _loc_err
# assert sleeper_ctx._local_error is _loc_err
assert not (
loc_err
is sleeper_ctx.maybe_error
is sleeper_ctx.outcome
is sleeper_ctx._remote_error
)
# NOTE: this root actor task should have # NOTE: this root actor task should have
# called `Context.cancel()` on the # called `Context.cancel()` on the
# `.__aexit__()` to every opened ctx. # `.__aexit__()` to every opened ctx.
for name, ctx in ctxs.items(): for ctx in ctxs:
assert ctx.cancel_called
# this root actor task should have # this root actor task should have
# cancelled all opened contexts except the # cancelled all opened contexts except the
# sleeper which is obvi by the "canceller" # sleeper which is obvi by the "canceller"
# peer. # peer.
re = ctx._remote_error re = ctx._remote_error
le = ctx._local_error if (
ctx is sleeper_ctx
assert ctx.cancel_called or ctx is caller_ctx
):
if ctx is sleeper_ctx:
assert 'canceller' in re.canceller
assert 'sleeper' in ctx.canceller
if ctx is canceller_ctx:
assert ( assert (
re.canceller re.canceller
== ==
root.uid ctx.canceller
)
else: # the other 2 ctxs
assert (
re.canceller
== ==
canceller.channel.uid canceller.channel.uid
) )
# since the sleeper errors while handling a else:
# peer-cancelled (by ctxc) scenario, we expect
# that the `.open_context()` block DOES call
# `.cancel()` (despite in this test case it
# being unnecessary).
assert ( assert (
sleeper_ctx.cancel_called re.canceller
and ==
not sleeper_ctx.cancel_acked ctx.canceller
==
root.uid
) )
# CASE_2: standard teardown inside the `.open_context()` block # CASE: standard teardown inside the `.open_context()` block
# - far end cancels due to peer 'canceller',
# - `ContextCancelled` relayed to this scope and
# raised locally without any raise-during-handle,
#
# => inside `.open_context()` ctxc is raised and
# propagated
#
else: else:
# since sleeper_ctx.result() IS called above assert ctxerr.canceller == sleeper_ctx.canceller
# we should have (silently) absorbed the assert (
# corresponding `ContextCancelled` for it and ctxerr.canceller[0]
# `._scope.cancel()` should never have been ==
# called. sleeper_ctx.canceller[0]
assert not sleeper_ctx._scope.cancelled_caught ==
'canceller'
assert isinstance(loc_err, ContextCancelled) )
# the received remote error's `.canceller`
# will of course be the "canceller" actor BUT
# the canceller set on the local handle to
# `sleeper_ctx` will be the "sleeper" uid
# since it's the actor that relayed us the
# error which was **caused** by the
# "canceller".
assert 'sleeper' in sleeper_ctx.canceller
assert 'canceller' == loc_err.canceller[0]
# the sleeper's remote error is the error bubbled # the sleeper's remote error is the error bubbled
# out of the context-stack above! # out of the context-stack above!
final_err = sleeper_ctx.outcome re = sleeper_ctx._remote_error
assert ( assert re is ctxerr
final_err is loc_err
is sleeper_ctx.maybe_error
is sleeper_ctx._remote_error
)
for name, ctx in ctxs.items(): for ctx in ctxs:
re: BaseException | None = ctx._remote_error
re: BaseException|None = ctx._remote_error assert re
le: BaseException|None = ctx._local_error
err = ctx.maybe_error
out = ctx.outcome
# every ctx should error!
assert out is err
# the recorded local error should always be
# the same as the one raised by the
# `sleeper_ctx.result()` call
assert (
le
and
le is loc_err
)
# root doesn't cancel sleeper since it's # root doesn't cancel sleeper since it's
# cancelled by its peer. # cancelled by its peer.
if ctx is sleeper_ctx: if ctx is sleeper_ctx:
assert re
assert (
ctx._remote_error
is ctx.maybe_error
is ctx.outcome
is ctx._local_error
)
assert not ctx.cancel_called assert not ctx.cancel_called
assert not ctx.cancel_acked
# since sleeper_ctx.result() IS called # since sleeper_ctx.result() IS called
# above we should have (silently) # above we should have (silently)
# absorbed the corresponding # absorbed the corresponding
# `ContextCancelled` for it and thus # `ContextCancelled` for it and thus
# the logic inside `.cancelled_caught` # the logic inside `.cancelled_caught`
# should trigger! # should trigger!
assert not ctx._scope.cancelled_caught assert ctx.cancelled_caught
elif ctx in ( elif ctx is caller_ctx:
caller_ctx, # since its context was remotely
canceller_ctx, # cancelled, we never needed to
): # call `Context.cancel()` bc it was
# done by the peer and also we never
assert ctx.cancel_called
assert not ctx._remote_error # TODO: figure out the details of
# this..
# neither of the `caller/canceller_ctx` should
# have called `ctx.cancel()` bc the
# canceller's task internally issues
# a `Portal.cancel_actor()` to the
# sleeper and thus never should call
# `ctx.cancel()` per se UNLESS the
# sleeper's `.result()` call above
# ctxc exception results in the
# canceller's
# `.open_context().__aexit__()` error
# handling to kick in BEFORE a remote
# error is delivered - which since
# we're asserting what we are above,
# that should normally be the case
# right?
#
assert not ctx.cancel_called
#
# assert ctx.cancel_called
# orig ^
# TODO: figure out the details of this..?
# if you look the `._local_error` here # if you look the `._local_error` here
# is a multi of ctxc + 2 Cancelleds? # is a multi of ctxc + 2 Cancelleds?
# assert not ctx._scope.cancelled_caught # assert not ctx.cancelled_caught
assert (
not ctx.cancel_called
and not ctx.cancel_acked
)
assert not ctx._scope.cancelled_caught
# elif ctx is canceller_ctx:
# assert not ctx._remote_error
# XXX NOTE XXX: ONLY the canceller
# will get a self-cancelled outcome
# whilst everyone else gets
# a peer-caused cancellation!
#
# TODO: really we should avoid calling
# .cancel() whenever an interpeer
# cancel takes place since each
# reception of a ctxc
else: else:
pytest.fail( assert ctx.cancel_called
'Uhh wut ctx is this?\n' assert not ctx.cancelled_caught
f'{ctx}\n'
)
# TODO: do we even need this flag? # TODO: do we even need this flag?
# -> each context should have received # -> each context should have received
@ -710,24 +533,14 @@ def test_peer_canceller(
# `Context.cancel()` SHOULD NOT have been # `Context.cancel()` SHOULD NOT have been
# called inside # called inside
# `Portal.open_context().__aexit__()`. # `Portal.open_context().__aexit__()`.
assert not ( assert not sleeper_ctx.cancel_called
sleeper_ctx.cancel_called
or
sleeper_ctx.cancel_acked
)
# XXX NOTE XXX: and see matching comment above but, # XXX NOTE XXX: and see matching comment above but,
# the `._scope` is only set by `trio` AFTER the # this flag is set only AFTER the `.open_context()`
# `.open_context()` block has exited and should be # has exited and should be set in both outcomes
# set in both outcomes including the case where # including the case where ctx-cancel handling
# ctx-cancel handling itself errors. # itself errors.
assert not sleeper_ctx._scope.cancelled_caught assert sleeper_ctx.cancelled_caught
assert _loc_err is sleeper_ctx._local_error
assert (
sleeper_ctx.outcome
is sleeper_ctx.maybe_error
is sleeper_ctx._remote_error
)
raise # always to ensure teardown raise # always to ensure teardown
@ -741,317 +554,3 @@ def test_peer_canceller(
assert excinfo.value.type == ContextCancelled assert excinfo.value.type == ContextCancelled
assert excinfo.value.canceller[0] == 'canceller' assert excinfo.value.canceller[0] == 'canceller'
@tractor.context
async def basic_echo_server(
ctx: Context,
peer_name: str = 'stepbro',
) -> None:
'''
Just the simplest `MsgStream` echo server which resays what
you told it but with its uid in front ;)
'''
actor: Actor = tractor.current_actor()
uid: tuple = actor.uid
await ctx.started(uid)
async with ctx.open_stream() as ipc:
async for msg in ipc:
# repack msg pair with our uid
# as first element.
(
client_uid,
i,
) = msg
resp: tuple = (
uid,
i,
)
# OOF! looks like my runtime-error is causing a lockup
# assert 0
await ipc.send(resp)
@tractor.context
async def serve_subactors(
ctx: Context,
peer_name: str,
) -> None:
async with open_nursery() as an:
await ctx.started(peer_name)
async with ctx.open_stream() as reqs:
async for msg in reqs:
peer_name: str = msg
peer: Portal = await an.start_actor(
name=peer_name,
enable_modules=[__name__],
)
print(
'Spawning new subactor\n'
f'{peer_name}\n'
f'|_{peer}\n'
)
await reqs.send((
peer.chan.uid,
peer.chan.raddr,
))
print('Spawner exiting spawn serve loop!')
@tractor.context
async def client_req_subactor(
ctx: Context,
peer_name: str,
# used to simulate a user causing an error to be raised
# directly in thread (like a KBI) to better replicate the
# case where a `modden` CLI client would hang after requesting
# a `Context.cancel()` to `bigd`'s wks spawner.
reraise_on_cancel: str|None = None,
) -> None:
# TODO: other cases to do with sub lifetimes:
# -[ ] test that we can have the server spawn a sub
# that lives longer than ctx with this client.
# -[ ] test that
# open ctx with peer spawn server and ask it to spawn a little
# bro which we'll then connect and stream with.
async with (
tractor.find_actor(
name='spawn_server',
raise_on_none=True,
# TODO: we should be isolating this from other runs!
# => ideally so we can eventually use something like
# `pytest-xdist` Bo
# registry_addrs=bigd._reg_addrs,
) as spawner,
spawner.open_context(
serve_subactors,
peer_name=peer_name,
) as (spawner_ctx, first),
):
assert first == peer_name
await ctx.started(
'yup i had brudder',
)
async with spawner_ctx.open_stream() as reqs:
# send single spawn request to the server
await reqs.send(peer_name)
with trio.fail_after(3):
(
sub_uid,
sub_raddr,
) = await reqs.receive()
await tell_little_bro(
actor_name=sub_uid[0],
caller='client',
)
# TODO: test different scope-layers of
# cancellation?
# with trio.CancelScope() as cs:
try:
await trio.sleep_forever()
# TODO: would be super nice to have a special injected
# cancel type here (maybe just our ctxc) but using
# some native mechanism in `trio` :p
except (
trio.Cancelled
) as err:
_err = err
if reraise_on_cancel:
errtype = globals()['__builtins__'][reraise_on_cancel]
assert errtype
to_reraise: BaseException = errtype()
print(f'client re-raising on cancel: {repr(to_reraise)}')
raise err
raise
# if cs.cancelled_caught:
# print('client handling expected KBI!')
# await ctx.
# await trio.sleep(
# await tractor.pause()
# await spawner_ctx.cancel()
# cancel spawned sub-actor directly?
# await sub_ctx.cancel()
# maybe cancel runtime?
# await sub.cancel_actor()
async def tell_little_bro(
actor_name: str,
caller: str = ''
):
# contact target actor, do a stream dialog.
async with (
tractor.wait_for_actor(
name=actor_name
) as lb,
lb.open_context(
basic_echo_server,
) as (sub_ctx, first),
sub_ctx.open_stream(
basic_echo_server,
) as echo_ipc,
):
actor: Actor = current_actor()
uid: tuple = actor.uid
for i in range(100):
msg: tuple = (
uid,
i,
)
await echo_ipc.send(msg)
resp = await echo_ipc.receive()
print(
f'{caller} => {actor_name}: {msg}\n'
f'{caller} <= {actor_name}: {resp}\n'
)
(
sub_uid,
_i,
) = resp
assert sub_uid != uid
assert _i == i
@pytest.mark.parametrize(
'raise_client_error',
[None, 'KeyboardInterrupt'],
)
def test_peer_spawns_and_cancels_service_subactor(
debug_mode: bool,
raise_client_error: str,
):
# NOTE: this tests for the modden `mod wks open piker` bug
# discovered as part of implementing workspace ctx
# open-.pause()-ctx.cancel() as part of the CLI..
# -> start actor-tree (server) that offers sub-actor spawns via
# context API
# -> start another full actor-tree (client) which requests to the first to
# spawn over its `@context` ep / api.
# -> client actor cancels the context and should exit gracefully
# and the server's spawned child should cancel and terminate!
peer_name: str = 'little_bro'
async def main():
async with tractor.open_nursery(
# NOTE: to halt the peer tasks on ctxc, uncomment this.
debug_mode=debug_mode,
) as an:
server: Portal = await an.start_actor(
(server_name := 'spawn_server'),
enable_modules=[__name__],
)
print(f'Spawned `{server_name}`')
client: Portal = await an.start_actor(
client_name := 'client',
enable_modules=[__name__],
)
print(f'Spawned `{client_name}`')
try:
async with (
server.open_context(
serve_subactors,
peer_name=peer_name,
) as (spawn_ctx, first),
client.open_context(
client_req_subactor,
peer_name=peer_name,
reraise_on_cancel=raise_client_error,
) as (client_ctx, client_says),
):
print(
f'Server says: {first}\n'
f'Client says: {client_says}\n'
)
# attach to client-requested-to-spawn
# (grandchild of this root actor) "little_bro"
# and ensure we can also use it as an echo
# server.
async with tractor.wait_for_actor(
name=peer_name,
) as sub:
assert sub
print(
'Sub-spawn came online\n'
f'portal: {sub}\n'
f'.uid: {sub.actor.uid}\n'
f'chan.raddr: {sub.chan.raddr}\n'
)
await tell_little_bro(
actor_name=peer_name,
caller='root',
)
# signal client to raise a KBI
await client_ctx.cancel()
print('root cancelled client, checking that sub-spawn is down')
async with tractor.find_actor(
name=peer_name,
) as sub:
assert not sub
print('root cancelling server/client sub-actors')
# await tractor.pause()
res = await client_ctx.result(hide_tb=False)
assert isinstance(res, ContextCancelled)
assert client_ctx.cancel_acked
assert res.canceller == current_actor().uid
await spawn_ctx.cancel()
# await server.cancel_actor()
# since we called `.cancel_actor()`, `.cancel_acked`
# will not be set on the ctx bc `ctx.cancel()` was not
# called directly for this context.
except ContextCancelled as ctxc:
print('caught ctxc from contexts!')
assert ctxc.canceller == current_actor().uid
assert ctxc is spawn_ctx.outcome
assert ctxc is spawn_ctx.maybe_error
raise
# assert spawn_ctx.cancel_acked
assert spawn_ctx.cancel_acked
assert client_ctx.cancel_acked
await client.cancel_actor()
await server.cancel_actor()
# WOA WOA WOA! we need this to close..!!!??
# that's super bad XD
# TODO: why isn't this working!?!?
# we're now outside the `.open_context()` block so
# the internal `Context._scope: CancelScope` should be
# gracefully "closed" ;)
# assert spawn_ctx.cancelled_caught
trio.run(main)


@ -9,7 +9,7 @@ import trio
import tractor import tractor
import pytest import pytest
from tractor._testing import tractor_test from conftest import tractor_test
def test_must_define_ctx(): def test_must_define_ctx():


@ -7,7 +7,7 @@ import pytest
import trio import trio
import tractor import tractor
from tractor._testing import tractor_test from conftest import tractor_test
@pytest.mark.trio @pytest.mark.trio


@ -7,10 +7,8 @@ import time
import pytest import pytest
import trio import trio
import tractor import tractor
from tractor._testing import (
tractor_test,
)
from conftest import ( from conftest import (
tractor_test,
sig_prog, sig_prog,
_INT_SIGNAL, _INT_SIGNAL,
_INT_RETURN_CODE, _INT_RETURN_CODE,


@ -5,7 +5,8 @@ import pytest
import trio import trio
import tractor import tractor
from tractor.experimental import msgpub from tractor.experimental import msgpub
from tractor._testing import tractor_test
from conftest import tractor_test
def test_type_checks(): def test_type_checks():


@ -1,8 +1,6 @@
''' """
RPC (or maybe better labelled as "RTS: remote task scheduling"?) RPC related
related API and error checks. """
'''
import itertools import itertools
import pytest import pytest
@ -54,13 +52,8 @@ async def short_sleep():
(['tmp_mod'], 'import doggy', ModuleNotFoundError), (['tmp_mod'], 'import doggy', ModuleNotFoundError),
(['tmp_mod'], '4doggy', SyntaxError), (['tmp_mod'], '4doggy', SyntaxError),
], ],
ids=[ ids=['no_mods', 'this_mod', 'this_mod_bad_func', 'fail_to_import',
'no_mods', 'fail_on_syntax'],
'this_mod',
'this_mod_bad_func',
'fail_to_import',
'fail_on_syntax',
],
) )
def test_rpc_errors( def test_rpc_errors(
reg_addr, reg_addr,
@ -134,16 +127,14 @@ def test_rpc_errors(
run() run()
else: else:
# underlying errors aren't propagated upwards (yet) # underlying errors aren't propagated upwards (yet)
with pytest.raises( with pytest.raises(remote_err) as err:
expected_exception=(remote_err, ExceptionGroup),
) as err:
run() run()
# get raw instance from pytest wrapper # get raw instance from pytest wrapper
value = err.value value = err.value
# might get multiple `trio.Cancelled`s as well inside an inception # might get multiple `trio.Cancelled`s as well inside an inception
if isinstance(value, ExceptionGroup): if isinstance(value, trio.MultiError):
value = next(itertools.dropwhile( value = next(itertools.dropwhile(
lambda exc: not isinstance(exc, tractor.RemoteActorError), lambda exc: not isinstance(exc, tractor.RemoteActorError),
value.exceptions value.exceptions


@ -8,7 +8,7 @@ import pytest
import trio import trio
import tractor import tractor
from tractor._testing import tractor_test from conftest import tractor_test
_file_path: str = '' _file_path: str = ''
@ -64,8 +64,7 @@ async def test_lifetime_stack_wipes_tmpfile(
except ( except (
tractor.RemoteActorError, tractor.RemoteActorError,
# tractor.BaseExceptionGroup, tractor.BaseExceptionGroup,
BaseExceptionGroup,
): ):
pass pass

167
tests/test_shm.py 100644

@ -0,0 +1,167 @@
"""
Shared mem primitives and APIs.
"""
import uuid
# import numpy
import pytest
import trio
import tractor
from tractor._shm import (
open_shm_list,
attach_shm_list,
)
@tractor.context
async def child_attach_shml_alot(
ctx: tractor.Context,
shm_key: str,
) -> None:
await ctx.started(shm_key)
# now try to attach a boatload of times in a loop..
for _ in range(1000):
shml = attach_shm_list(
key=shm_key,
readonly=False,
)
assert shml.shm.name == shm_key
await trio.sleep(0.001)
def test_child_attaches_alot():
async def main():
async with tractor.open_nursery() as an:
# allocate writeable list in parent
key = f'shml_{uuid.uuid4()}'
shml = open_shm_list(
key=key,
)
portal = await an.start_actor(
'shm_attacher',
enable_modules=[__name__],
)
async with (
portal.open_context(
child_attach_shml_alot,
shm_key=shml.key,
) as (ctx, start_val),
):
assert start_val == key
await ctx.result()
await portal.cancel_actor()
trio.run(main)
@tractor.context
async def child_read_shm_list(
ctx: tractor.Context,
shm_key: str,
use_str: bool,
frame_size: int,
) -> None:
# attach in child
shml = attach_shm_list(
key=shm_key,
# dtype=str if use_str else float,
)
await ctx.started(shml.key)
async with ctx.open_stream() as stream:
async for i in stream:
print(f'(child): reading shm list index: {i}')
if use_str:
expect = str(float(i))
else:
expect = float(i)
if frame_size == 1:
val = shml[i]
assert expect == val
print(f'(child): reading value: {val}')
else:
frame = shml[i - frame_size:i]
print(f'(child): reading frame: {frame}')
@pytest.mark.parametrize(
'use_str',
[False, True],
ids=lambda i: f'use_str_values={i}',
)
@pytest.mark.parametrize(
'frame_size',
[1, 2**6, 2**10],
ids=lambda i: f'frame_size={i}',
)
def test_parent_writer_child_reader(
use_str: bool,
frame_size: int,
):
async def main():
async with tractor.open_nursery(
# debug_mode=True,
) as an:
portal = await an.start_actor(
'shm_reader',
enable_modules=[__name__],
debug_mode=True,
)
# allocate writeable list in parent
key = 'shm_list'
seq_size = int(2 * 2 ** 10)
shml = open_shm_list(
key=key,
size=seq_size,
dtype=str if use_str else float,
readonly=False,
)
async with (
portal.open_context(
child_read_shm_list,
shm_key=key,
use_str=use_str,
frame_size=frame_size,
) as (ctx, sent),
ctx.open_stream() as stream,
):
assert sent == key
for i in range(seq_size):
val = float(i)
if use_str:
val = str(val)
# print(f'(parent): writing {val}')
shml[i] = val
# only on frame fills do we
# signal to the child that a frame's
# worth is ready.
if (i % frame_size) == 0:
print(f'(parent): signalling frame full on {val}')
await stream.send(i)
else:
print(f'(parent): signalling final frame on {val}')
await stream.send(i)
await portal.cancel_actor()
trio.run(main)
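The new test file above has a parent allocate a shared-memory list and a child attach to it by key. A stdlib-only sketch of the same attach-by-name idea follows, assuming (this diff does not confirm it) that `tractor._shm` follows the `multiprocessing.shared_memory.ShareableList` model; `write_then_read_by_name` is an illustrative helper, not tractor API:

```python
from multiprocessing.shared_memory import ShareableList


def write_then_read_by_name(size: int = 8) -> list:
    # "parent" side: allocate a fixed-size list of floats
    writer = ShareableList([0.0] * size)
    try:
        # "child" side: attach to the same segment via its name
        reader = ShareableList(name=writer.shm.name)
        try:
            for i in range(size):
                writer[i] = float(i)
            # values written through one handle are visible via the other
            return [reader[i] for i in range(size)]
        finally:
            reader.shm.close()
    finally:
        # the allocating side is responsible for unlinking the segment
        writer.shm.close()
        writer.shm.unlink()


values = write_then_read_by_name()
```

Both handles live in one process here only to keep the sketch self-contained; in the test the reader runs in a separate actor process.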


@ -8,7 +8,7 @@ import pytest
import trio import trio
import tractor import tractor
from tractor._testing import tractor_test from conftest import tractor_test
data_to_pass_down = {'doggy': 10, 'kitty': 4} data_to_pass_down = {'doggy': 10, 'kitty': 4}


@ -5,7 +5,7 @@ want to see changed.
''' '''
import pytest import pytest
import trio import trio
from trio import TaskStatus from trio_typing import TaskStatus
@pytest.mark.parametrize( @pytest.mark.parametrize(

File diff suppressed because it is too large


@ -106,29 +106,25 @@ def _trio_main(
Entry point for a `trio_run_in_process` subactor. Entry point for a `trio_run_in_process` subactor.
''' '''
log.info(f"Started new trio process for {actor.uid}")
if actor.loglevel is not None:
log.info(
f"Setting loglevel for {actor.uid} to {actor.loglevel}")
get_console_log(actor.loglevel)
log.info(
f"Started {actor.uid}")
_state._current_actor = actor _state._current_actor = actor
log.debug(f"parent_addr is {parent_addr}")
trio_main = partial( trio_main = partial(
async_main, async_main,
actor, actor,
parent_addr=parent_addr parent_addr=parent_addr
) )
if actor.loglevel is not None:
get_console_log(actor.loglevel)
import os
actor_info: str = (
f'|_{actor}\n'
f' uid: {actor.uid}\n'
f' pid: {os.getpid()}\n'
f' parent_addr: {parent_addr}\n'
f' loglevel: {actor.loglevel}\n'
)
log.info(
'Started new trio process:\n'
+
actor_info
)
try: try:
if infect_asyncio: if infect_asyncio:
actor._infected_aio = True actor._infected_aio = True
@ -137,14 +133,8 @@ def _trio_main(
trio.run(trio_main) trio.run(trio_main)
except KeyboardInterrupt: except KeyboardInterrupt:
log.cancel( log.cancel(
'Actor received KBI\n' f'Actor@{actor.uid} received KBI'
+
actor_info
) )
finally: finally:
log.info( log.info(f"Actor {actor.uid} terminated")
'Actor terminated\n'
+
actor_info
)


@ -27,21 +27,17 @@ from typing import (
Type, Type,
TYPE_CHECKING, TYPE_CHECKING,
) )
import textwrap
import traceback import traceback
import exceptiongroup as eg
import trio import trio
from tractor._state import current_actor from ._state import current_actor
from tractor.log import get_logger
if TYPE_CHECKING: if TYPE_CHECKING:
from ._context import Context from ._context import Context
from .log import StackLevelAdapter
from ._stream import MsgStream from ._stream import MsgStream
from ._ipc import Channel from .log import StackLevelAdapter
log = get_logger('tractor')
_this_mod = importlib.import_module(__name__) _this_mod = importlib.import_module(__name__)
@ -50,25 +46,6 @@ class ActorFailure(Exception):
"General actor failure" "General actor failure"
class InternalError(RuntimeError):
'''
Entirely unexpected internal machinery error indicating
a completely invalid state or interface.
'''
_body_fields: list[str] = [
'src_actor_uid',
'canceller',
'sender',
]
_msgdata_keys: list[str] = [
'type_str',
] + _body_fields
# TODO: rename to just `RemoteError`? # TODO: rename to just `RemoteError`?
class RemoteActorError(Exception): class RemoteActorError(Exception):
''' '''
@ -80,10 +57,6 @@ class RemoteActorError(Exception):
a special "error" IPC msg sent by some remote actor-runtime. a special "error" IPC msg sent by some remote actor-runtime.
''' '''
reprol_fields: list[str] = [
'src_actor_uid',
]
def __init__( def __init__(
self, self,
message: str, message: str,
@ -101,83 +74,24 @@ class RemoteActorError(Exception):
# - .remote_type # - .remote_type
# also pertains to our long long outstanding issue XD # also pertains to our long long outstanding issue XD
# https://github.com/goodboy/tractor/issues/5 # https://github.com/goodboy/tractor/issues/5
self.boxed_type: str = suberror_type self.type: str = suberror_type
self.msgdata: dict[str, Any] = msgdata self.msgdata: dict[str, Any] = msgdata
@property @property
def type(self) -> str: def src_actor_uid(self) -> tuple[str, str] | None:
return self.boxed_type
@property
def type_str(self) -> str:
return str(type(self.boxed_type).__name__)
@property
def src_actor_uid(self) -> tuple[str, str]|None:
return self.msgdata.get('src_actor_uid') return self.msgdata.get('src_actor_uid')
@property
def tb_str(
self,
indent: str = ' '*3,
) -> str:
if remote_tb := self.msgdata.get('tb_str'):
return textwrap.indent(
remote_tb,
prefix=indent,
)
return ''
def reprol(self) -> str:
'''
Represent this error for "one line" display, like in
a field of our `Context.__repr__()` output.
'''
_repr: str = f'{type(self).__name__}('
for key in self.reprol_fields:
val: Any|None = self.msgdata.get(key)
if val:
_repr += f'{key}={repr(val)} '
return _repr
def __repr__(self) -> str: def __repr__(self) -> str:
if remote_tb := self.msgdata.get('tb_str'):
fields: str = '' pformat(remote_tb)
for key in _body_fields:
val: str|None = self.msgdata.get(key)
if val:
fields += f'{key}={val}\n'
fields: str = textwrap.indent(
fields,
# prefix=' '*2,
prefix=' |_',
)
indent: str = ''*1
body: str = (
f'{fields}'
f' |\n'
f' ------ - ------\n\n'
f'{self.tb_str}\n'
f' ------ - ------\n'
f' _|\n'
)
# f'|\n'
# f' |\n'
if indent:
body: str = textwrap.indent(
body,
prefix=indent,
)
return ( return (
f'<{type(self).__name__}(\n' f'{type(self).__name__}(\n'
f'{body}' f'msgdata={pformat(self.msgdata)}\n'
')>' ')'
) )
return super().__repr__()
# TODO: local reconstruction of remote exception deats # TODO: local reconstruction of remote exception deats
# def unbox(self) -> BaseException: # def unbox(self) -> BaseException:
# ... # ...
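One side of the `__repr__()` hunk above lays out the boxed error's fields and remote traceback with `textwrap.indent()`. A reduced sketch of that layout technique, where `format_boxed_error` and its parameters are hypothetical and the exact decoration in the diff is not reproduced:

```python
import textwrap


def format_boxed_error(
    type_name: str,
    fields: dict,
    tb_str: str,
) -> str:
    # one `key=value` line per field, skipping unset values
    field_lines = ''.join(
        f'{key}={val}\n'
        for key, val in fields.items()
        if val
    )
    body = (
        # mark each field line, then indent the remote traceback
        textwrap.indent(field_lines, prefix=' |_')
        + textwrap.indent(tb_str, prefix=' ' * 3)
    )
    return f'<{type_name}(\n{body})>'


text = format_boxed_error(
    'RemoteActorError',
    {'src_actor_uid': ('sleeper', 'abc123'), 'canceller': None},
    'Traceback (most recent call last):\nKeyError: doggy\n',
)
```

`textwrap.indent()` leaves whitespace-only lines unprefixed by default, which keeps blank traceback lines clean.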
@ -185,9 +99,8 @@ class RemoteActorError(Exception):
class InternalActorError(RemoteActorError): class InternalActorError(RemoteActorError):
''' '''
(Remote) internal `tractor` error indicating failure of some Remote internal ``tractor`` error indicating
primitive, machinery state or lowlevel task that should never failure of some primitive or machinery.
occur.
''' '''
@ -198,43 +111,12 @@ class ContextCancelled(RemoteActorError):
``Portal.cancel_actor()`` or ``Context.cancel()``. ``Portal.cancel_actor()`` or ``Context.cancel()``.
''' '''
reprol_fields: list[str] = [
'canceller',
]
@property @property
def canceller(self) -> tuple[str, str]|None: def canceller(self) -> tuple[str, str] | None:
'''
Return the (maybe) `Actor.uid` for the requesting-author
of this ctxc.
Emit a warning msg when `.canceller` has not been set,
which usually indicates that a `None` msg-loop sentinel was
sent before expected in the runtime. This can happen in
a few situations:
- (simulating) an IPC transport network outage
- a (malicious) pkt sent specifically to cancel an actor's
runtime non-gracefully without ensuring ongoing RPC tasks are
incrementally cancelled as is done with:
`Actor`
|_`.cancel()`
|_`.cancel_soon()`
|_`._cancel_task()`
'''
value = self.msgdata.get('canceller') value = self.msgdata.get('canceller')
if value: if value:
return tuple(value) return tuple(value)
log.warning(
'IPC Context cancelled without a requesting actor?\n'
'Maybe the IPC transport ended abruptly?\n\n'
f'{self}'
)
# to make `.__repr__()` work uniformly
# src_actor_uid = canceller
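The `.canceller` property above converts the stored value with `tuple(...)`. A likely reason, stated here as an assumption, is that wire codecs typically decode sequences as lists, which would break `==` checks against an actor uid tuple. A small sketch with a hypothetical `CtxCancelledSketch` class:

```python
from __future__ import annotations

from typing import Any


class CtxCancelledSketch(Exception):
    # hypothetical stand-in for the boxed-error type in the diff
    def __init__(self, message: str, **msgdata: Any) -> None:
        super().__init__(message)
        self.msgdata = msgdata

    @property
    def canceller(self) -> tuple[str, str] | None:
        value = self.msgdata.get('canceller')
        if value:
            # normalize a decoded list back to a tuple so it
            # compares equal to a `(name, uuid)` uid
            return tuple(value)
        return None


ctxc = CtxCancelledSketch('cancelled', canceller=['canceller', 'abc123'])
unset = CtxCancelledSketch('cancelled')
```

With this, `ctxc.canceller[0] == 'canceller'` style checks like those in the tests behave the same whether the uid arrived as a list or a tuple.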
class TransportClosed(trio.ClosedResourceError): class TransportClosed(trio.ClosedResourceError):
"Underlying channel transport was closed prior to use" "Underlying channel transport was closed prior to use"
@ -256,9 +138,6 @@ class StreamOverrun(
RemoteActorError, RemoteActorError,
trio.TooSlowError, trio.TooSlowError,
): ):
reprol_fields: list[str] = [
'sender',
]
''' '''
This stream was overrun by sender This stream was overrun by sender
@ -306,7 +185,6 @@ def pack_error(
] = { ] = {
'tb_str': tb_str, 'tb_str': tb_str,
'type_str': type(exc).__name__, 'type_str': type(exc).__name__,
'boxed_type': type(exc).__name__,
'src_actor_uid': current_actor().uid, 'src_actor_uid': current_actor().uid,
} }
@ -321,6 +199,7 @@ def pack_error(
): ):
error_msg.update(exc.msgdata) error_msg.update(exc.msgdata)
pkt: dict = {'error': error_msg} pkt: dict = {'error': error_msg}
if cid: if cid:
pkt['cid'] = cid pkt['cid'] = cid
@ -331,10 +210,8 @@ def pack_error(
def unpack_error( def unpack_error(
msg: dict[str, Any], msg: dict[str, Any],
chan=None,
chan: Channel|None = None, err_type=RemoteActorError,
box_type: RemoteActorError = RemoteActorError,
hide_tb: bool = True, hide_tb: bool = True,
) -> None|Exception: ) -> None|Exception:
@ -358,20 +235,18 @@ def unpack_error(
# retrieve the remote error's msg encoded details # retrieve the remote error's msg encoded details
tb_str: str = error_dict.get('tb_str', '') tb_str: str = error_dict.get('tb_str', '')
message: str = f'{chan.uid}\n' + tb_str message: str = f'{chan.uid}\n' + tb_str
type_name: str = ( type_name: str = error_dict['type_str']
error_dict.get('type_str')
or error_dict['boxed_type']
)
suberror_type: Type[BaseException] = Exception suberror_type: Type[BaseException] = Exception
if type_name == 'ContextCancelled': if type_name == 'ContextCancelled':
box_type = ContextCancelled err_type = ContextCancelled
suberror_type = box_type suberror_type = err_type
else: # try to lookup a suitable local error type else: # try to lookup a suitable local error type
for ns in [ for ns in [
builtins, builtins,
_this_mod, _this_mod,
eg,
trio, trio,
]: ]:
if suberror_type := getattr( if suberror_type := getattr(
@ -381,7 +256,7 @@ def unpack_error(
): ):
break break
exc = box_type( exc = err_type(
message, message,
suberror_type=suberror_type, suberror_type=suberror_type,
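The `pack_error()` and `unpack_error()` hunks above ship an exception as a dict of strings and then look the type name back up in a list of namespaces. A simplified sketch of that round trip follows; `pack_error_sketch`, `lookup_error_type` and the `src_uid` parameter are hypothetical, only `builtins` is searched, and the real functions may format the traceback differently:

```python
from __future__ import annotations

import builtins
import traceback
from typing import Any


def pack_error_sketch(
    exc: BaseException,
    src_uid: tuple[str, str],  # stand-in for `current_actor().uid`
    cid: str | None = None,
) -> dict[str, Any]:
    # send the traceback text and the type *name*; the exception
    # object itself does not cross the process boundary
    tb_str = ''.join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    pkt: dict[str, Any] = {'error': {
        'tb_str': tb_str,
        'type_str': type(exc).__name__,
        'src_actor_uid': src_uid,
    }}
    if cid:
        pkt['cid'] = cid
    return pkt


def lookup_error_type(type_name: str) -> type[BaseException]:
    # fall back to plain `Exception` when no namespace has the name
    for ns in [builtins]:
        found = getattr(ns, type_name, None)
        if isinstance(found, type) and issubclass(found, BaseException):
            return found
    return Exception


try:
    raise KeyError('doggy')
except KeyError as err:
    pkt = pack_error_sketch(err, src_uid=('sleeper', 'abc123'), cid='1')

boxed_type = lookup_error_type(pkt['error']['type_str'])
```

Looking up by name means only error types importable on the receiving side can be recovered; anything else degrades to the generic fallback.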
@ -394,13 +269,12 @@ def unpack_error(
def is_multi_cancelled(exc: BaseException) -> bool: def is_multi_cancelled(exc: BaseException) -> bool:
''' '''
Predicate to determine if a possible ``BaseExceptionGroup`` contains Predicate to determine if a possible ``eg.BaseExceptionGroup`` contains
only ``trio.Cancelled`` sub-exceptions (and is likely the result of only ``trio.Cancelled`` sub-exceptions (and is likely the result of
cancelling a collection of subtasks). cancelling a collection of subtasks).
''' '''
# if isinstance(exc, eg.BaseExceptionGroup): if isinstance(exc, eg.BaseExceptionGroup):
if isinstance(exc, BaseExceptionGroup):
return exc.subgroup( return exc.subgroup(
lambda exc: isinstance(exc, trio.Cancelled) lambda exc: isinstance(exc, trio.Cancelled)
) is not None ) is not None
@ -413,63 +287,37 @@ def _raise_from_no_key_in_msg(
msg: dict, msg: dict,
src_err: KeyError, src_err: KeyError,
log: StackLevelAdapter, # caller specific `log` obj log: StackLevelAdapter, # caller specific `log` obj
expect_key: str = 'yield', expect_key: str = 'yield',
stream: MsgStream | None = None, stream: MsgStream | None = None,
# allow "deeper" tbs when debugging B^o
hide_tb: bool = True,
) -> bool: ) -> bool:
''' '''
Raise an appropriate local error when a Raise an appropriate local error when a `MsgStream` msg arrives
`MsgStream` msg arrives which does not which does not contain the expected (under normal operation)
contain the expected (at least under normal `'yield'` field.
operation) `'yield'` field.
`Context` and any embedded `MsgStream` termination,
as well as remote task errors are handled in order
of priority as:
- any 'error' msg is re-boxed and raised locally as
-> `RemoteActorError`|`ContextCancelled`
- a `MsgStream` 'stop' msg is constructed, assigned
and raised locally as -> `trio.EndOfChannel`
- All other mis-keyed msgs (like say a "final result"
'return' msg, normally delivered from `Context.result()`)
are re-boxed inside a `MessagingError` with an explicit
exc content describing the missing IPC-msg-key.
''' '''
__tracebackhide__: bool = hide_tb __tracebackhide__: bool = True
# an internal error should never get here # internal error should never get here
try: try:
cid: str = msg['cid'] cid: str = msg['cid']
except KeyError as src_err: except KeyError as src_err:
raise MessagingError( raise MessagingError(
f'IPC `Context` rx-ed msg without a ctx-id (cid)!?\n' f'IPC `Context` rx-ed msg without a ctx-id (cid)!?\n'
f'cid: {cid}\n\n' f'cid: {cid}\n'
'received msg:\n'
f'{pformat(msg)}\n' f'{pformat(msg)}\n'
) from src_err ) from src_err
# TODO: test that shows stream raising an expected error!!! # TODO: test that shows stream raising an expected error!!!
# raise the error message in a boxed exception type!
if msg.get('error'): if msg.get('error'):
# raise the error message
raise unpack_error( raise unpack_error(
msg, msg,
ctx.chan, ctx.chan,
hide_tb=hide_tb,
) from None ) from None
# `MsgStream` termination msg.
# TODO: does it make more sense to pack
# the stream._eoc outside this in the caller always?
elif ( elif (
msg.get('stop') msg.get('stop')
or ( or (
@ -482,26 +330,29 @@ def _raise_from_no_key_in_msg(
f'cid: {cid}\n' f'cid: {cid}\n'
) )
# XXX: important to set so that a new ``.receive()``
# call (likely by another task using a broadcast receiver)
# doesn't accidentally pull the ``return`` message
# value out of the underlying feed mem chan!
stream._eoc: bool = True
# TODO: if a local task is already blocking on # TODO: if a local task is already blocking on
# a `Context.result()` and thus a `.receive()` on the # a `Context.result()` and thus a `.receive()` on the
# rx-chan, we close the chan and set state ensuring that # rx-chan, we close the chan and set state ensuring that
# an eoc is raised! # an eoc is raised!
# # when the send is closed we assume the stream has
# # terminated and signal this local iterator to stop
# await stream.aclose()
# XXX: this causes ``ReceiveChannel.__anext__()`` to # XXX: this causes ``ReceiveChannel.__anext__()`` to
# raise a ``StopAsyncIteration`` **and** in our catch # raise a ``StopAsyncIteration`` **and** in our catch
# block below it will trigger ``.aclose()``. # block below it will trigger ``.aclose()``.
eoc = trio.EndOfChannel( raise trio.EndOfChannel(
f'Context stream ended due to msg:\n\n' f'Context stream ended due to msg:\n'
f'{pformat(msg)}\n' f'{pformat(msg)}'
) ) from src_err
# XXX: important to set so that a new `.receive()`
# call (likely by another task using a broadcast receiver)
# doesn't accidentally pull the `return` message
# value out of the underlying feed mem chan which is
# destined for the `Context.result()` call during ctx-exit!
stream._eoc: Exception = eoc
raise eoc from src_err
if ( if (
stream stream
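The priority order described in the `_raise_from_no_key_in_msg()` docstring can be sketched roughly as follows. This is an illustration under stated assumptions, not the function from the diff: `RemoteError`, `EndOfChannel`, `MessagingError`, `raise_from_no_key()` and `classify()` are hypothetical stand-ins for `RemoteActorError`/`ContextCancelled`, `trio.EndOfChannel` and the real `MessagingError`, and messages are modelled as plain dicts.

```python
from pprint import pformat


class RemoteError(Exception):
    """Stand-in for a re-boxed remote error (assumption)."""


class EndOfChannel(Exception):
    """Stand-in for `trio.EndOfChannel` (assumption)."""


class MessagingError(Exception):
    """Stand-in for the IPC messaging error type (assumption)."""


def raise_from_no_key(msg: dict, expect_key: str = 'yield') -> None:
    # Every context msg is expected to carry a context id.
    if 'cid' not in msg:
        raise MessagingError(f'msg without a ctx-id (cid):\n{pformat(msg)}')

    # 1. an 'error' msg takes priority and is raised as a boxed remote error
    if msg.get('error'):
        raise RemoteError(pformat(msg['error']))

    # 2. a 'stop' msg signals stream termination
    if msg.get('stop'):
        raise EndOfChannel(f'stream ended due to msg:\n{pformat(msg)}')

    # 3. anything else is mis-keyed relative to what the caller expected
    raise MessagingError(
        f'msg is missing expected key {expect_key!r}:\n{pformat(msg)}'
    )


def classify(msg: dict) -> str:
    # Helper for demonstration: report which error type a msg maps to.
    try:
        raise_from_no_key(msg)
    except Exception as exc:
        return type(exc).__name__
    return 'no error'
```

Keeping the three branches in this order means a remote failure is never mistaken for an ordinary end-of-stream, which is the behaviour the docstring describes.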
@ -19,33 +19,34 @@ Inter-process comms abstractions
""" """
from __future__ import annotations from __future__ import annotations
import platform
import struct
import typing
from collections.abc import ( from collections.abc import (
AsyncGenerator, AsyncGenerator,
AsyncIterator, AsyncIterator,
) )
from contextlib import asynccontextmanager as acm
import platform
from pprint import pformat
import struct
import typing
from typing import ( from typing import (
Any, Any,
runtime_checkable, runtime_checkable,
Optional,
Protocol, Protocol,
Type, Type,
TypeVar, TypeVar,
) )
import msgspec
from tricycle import BufferedReceiveStream from tricycle import BufferedReceiveStream
import msgspec
import trio import trio
from async_generator import asynccontextmanager
from tractor.log import get_logger from .log import get_logger
from tractor._exceptions import TransportClosed from ._exceptions import TransportClosed
log = get_logger(__name__) log = get_logger(__name__)
_is_windows = platform.system() == 'Windows' _is_windows = platform.system() == 'Windows'
log = get_logger(__name__)
def get_stream_addrs(stream: trio.SocketStream) -> tuple: def get_stream_addrs(stream: trio.SocketStream) -> tuple:
@ -111,13 +112,6 @@ class MsgpackTCPStream(MsgTransport):
using the ``msgspec`` codec lib. using the ``msgspec`` codec lib.
''' '''
layer_key: int = 4
name_key: str = 'tcp'
# TODO: better naming for this?
# -[ ] check how libp2p does naming for such things?
codec_key: str = 'msgpack'
def __init__( def __init__(
self, self,
stream: trio.SocketStream, stream: trio.SocketStream,
@ -205,17 +199,7 @@ class MsgpackTCPStream(MsgTransport):
else: else:
raise raise
async def send( async def send(self, msg: Any) -> None:
self,
msg: Any,
# hide_tb: bool = False,
) -> None:
'''
Send a msgpack coded blob-as-msg over TCP.
'''
# __tracebackhide__: bool = hide_tb
async with self._send_lock: async with self._send_lock:
bytes_data: bytes = self.encode(msg) bytes_data: bytes = self.encode(msg)
@ -283,7 +267,7 @@ class Channel:
def __init__( def __init__(
self, self,
destaddr: tuple[str, int]|None, destaddr: Optional[tuple[str, int]],
msg_transport_type_key: tuple[str, str] = ('msgpack', 'tcp'), msg_transport_type_key: tuple[str, str] = ('msgpack', 'tcp'),
@ -301,14 +285,14 @@ class Channel:
# Either created in ``.connect()`` or passed in by # Either created in ``.connect()`` or passed in by
# user in ``.from_stream()``. # user in ``.from_stream()``.
self._stream: trio.SocketStream|None = None self._stream: Optional[trio.SocketStream] = None
self._transport: MsgTransport|None = None self.msgstream: Optional[MsgTransport] = None
# set after handshake - always uid of far end # set after handshake - always uid of far end
self.uid: tuple[str, str]|None = None self.uid: Optional[tuple[str, str]] = None
self._agen = self._aiter_recv() self._agen = self._aiter_recv()
self._exc: Exception|None = None # set if far end actor errors self._exc: Optional[Exception] = None # set if far end actor errors
self._closed: bool = False self._closed: bool = False
# flag set by ``Portal.cancel_actor()`` indicating remote # flag set by ``Portal.cancel_actor()`` indicating remote
@ -316,15 +300,6 @@ class Channel:
# runtime. # runtime.
self._cancel_called: bool = False self._cancel_called: bool = False
@property
def msgstream(self) -> MsgTransport:
log.info('`Channel.msgstream` is an old name, use `._transport`')
return self._transport
@property
def transport(self) -> MsgTransport:
return self._transport
@classmethod @classmethod
def from_stream( def from_stream(
cls, cls,
@ -334,44 +309,40 @@ class Channel:
) -> Channel: ) -> Channel:
src, dst = get_stream_addrs(stream) src, dst = get_stream_addrs(stream)
chan = Channel( chan = Channel(destaddr=dst, **kwargs)
destaddr=dst,
**kwargs,
)
# set immediately here from provided instance # set immediately here from provided instance
chan._stream: trio.SocketStream = stream chan._stream = stream
chan.set_msg_transport(stream) chan.set_msg_transport(stream)
return chan return chan
def set_msg_transport( def set_msg_transport(
self, self,
stream: trio.SocketStream, stream: trio.SocketStream,
type_key: tuple[str, str]|None = None, type_key: Optional[tuple[str, str]] = None,
) -> MsgTransport: ) -> MsgTransport:
type_key = type_key or self._transport_key type_key = type_key or self._transport_key
self._transport = get_msg_transport(type_key)(stream) self.msgstream = get_msg_transport(type_key)(stream)
return self._transport return self.msgstream
def __repr__(self) -> str: def __repr__(self) -> str:
if not self._transport: if self.msgstream:
return '<Channel with inactive transport?>'
return repr( return repr(
self._transport.stream.socket._sock self.msgstream.stream.socket._sock
).replace( # type: ignore ).replace( # type: ignore
"socket.socket", "socket.socket",
"Channel", "Channel",
) )
return object.__repr__(self)
@property @property
def laddr(self) -> tuple[str, int]|None: def laddr(self) -> Optional[tuple[str, int]]:
return self._transport.laddr if self._transport else None return self.msgstream.laddr if self.msgstream else None
@property @property
def raddr(self) -> tuple[str, int]|None: def raddr(self) -> Optional[tuple[str, int]]:
return self._transport.raddr if self._transport else None return self.msgstream.raddr if self.msgstream else None
async def connect( async def connect(
self, self,
@ -390,42 +361,26 @@ class Channel:
*destaddr, *destaddr,
**kwargs **kwargs
) )
transport = self.set_msg_transport(stream) msgstream = self.set_msg_transport(stream)
log.transport( log.transport(
f'Opened channel[{type(transport)}]: {self.laddr} -> {self.raddr}' f'Opened channel[{type(msgstream)}]: {self.laddr} -> {self.raddr}'
) )
return transport return msgstream
async def send( async def send(self, item: Any) -> None:
self,
payload: Any,
# hide_tb: bool = False, log.transport(f"send `{item}`") # type: ignore
assert self.msgstream
) -> None: await self.msgstream.send(item)
'''
Send a coded msg-blob over the transport.
'''
# __tracebackhide__: bool = hide_tb
log.transport(
'=> send IPC msg:\n\n'
f'{pformat(payload)}\n'
) # type: ignore
assert self._transport
await self._transport.send(
payload,
# hide_tb=hide_tb,
)
async def recv(self) -> Any: async def recv(self) -> Any:
assert self._transport assert self.msgstream
return await self._transport.recv() return await self.msgstream.recv()
# try: # try:
# return await self._transport.recv() # return await self.msgstream.recv()
# except trio.BrokenResourceError: # except trio.BrokenResourceError:
# if self._autorecon: # if self._autorecon:
# await self._reconnect() # await self._reconnect()
@ -438,8 +393,8 @@ class Channel:
f'Closing channel to {self.uid} ' f'Closing channel to {self.uid} '
f'{self.laddr} -> {self.raddr}' f'{self.laddr} -> {self.raddr}'
) )
assert self._transport assert self.msgstream
await self._transport.stream.aclose() await self.msgstream.stream.aclose()
self._closed = True self._closed = True
async def __aenter__(self): async def __aenter__(self):
@ -490,16 +445,16 @@ class Channel:
Async iterate items from underlying stream. Async iterate items from underlying stream.
''' '''
assert self._transport assert self.msgstream
while True: while True:
try: try:
async for item in self._transport: async for item in self.msgstream:
yield item yield item
# sent = yield item # sent = yield item
# if sent is not None: # if sent is not None:
# # optimization, passing None through all the # # optimization, passing None through all the
# # time is pointless # # time is pointless
# await self._transport.send(sent) # await self.msgstream.send(sent)
except trio.BrokenResourceError: except trio.BrokenResourceError:
# if not self._autorecon: # if not self._autorecon:
@ -512,10 +467,10 @@ class Channel:
# continue # continue
def connected(self) -> bool: def connected(self) -> bool:
return self._transport.connected() if self._transport else False return self.msgstream.connected() if self.msgstream else False
@acm @asynccontextmanager
async def _connect_chan( async def _connect_chan(
host: str, host: str,
port: int port: int
@ -24,73 +24,55 @@ OS processes, possibly on different (hardware) hosts.
''' '''
from __future__ import annotations from __future__ import annotations
from contextlib import asynccontextmanager as acm
import importlib import importlib
import inspect import inspect
from typing import ( from typing import (
Any, Any, Optional,
Callable, Callable, AsyncGenerator,
AsyncGenerator, Type,
# Type,
) )
from functools import partial from functools import partial
from dataclasses import dataclass from dataclasses import dataclass
import warnings import warnings
import trio import trio
from async_generator import asynccontextmanager
from .trionics import maybe_open_nursery from .trionics import maybe_open_nursery
from ._state import ( from ._state import current_actor
current_actor,
)
from ._ipc import Channel from ._ipc import Channel
from .log import get_logger from .log import get_logger
from .msg import NamespacePath from .msg import NamespacePath
from ._exceptions import ( from ._exceptions import (
_raise_from_no_key_in_msg,
unpack_error, unpack_error,
NoResult, NoResult,
ContextCancelled,
) )
from ._context import ( from ._context import (
Context, Context,
open_context_from_portal,
) )
from ._streaming import ( from ._streaming import (
MsgStream, MsgStream,
) )
from .devx._debug import maybe_wait_for_debugger
log = get_logger(__name__) log = get_logger(__name__)
# TODO: rename to `unwrap_result()` and use
# `._raise_from_no_key_in_msg()` (after tweak to
# accept a `chan: Channel` arg) in key block!
def _unwrap_msg( def _unwrap_msg(
msg: dict[str, Any], msg: dict[str, Any],
channel: Channel, channel: Channel
hide_tb: bool = True,
) -> Any: ) -> Any:
''' __tracebackhide__ = True
Unwrap a final result from a `{return: <Any>}` IPC msg.
'''
__tracebackhide__: bool = hide_tb
try: try:
return msg['return'] return msg['return']
except KeyError as ke: except KeyError as ke:
# internal error should never get here # internal error should never get here
assert msg.get('cid'), ( assert msg.get('cid'), "Received internal error at portal?"
"Received internal error at portal?" raise unpack_error(msg, channel) from ke
)
raise unpack_error(
msg,
channel
) from ke
class Portal: class Portal:
@ -117,9 +99,9 @@ class Portal:
cancel_timeout: float = 0.5 cancel_timeout: float = 0.5
def __init__(self, channel: Channel) -> None: def __init__(self, channel: Channel) -> None:
self.chan = channel self.channel = channel
# during the portal's lifetime # during the portal's lifetime
self._result_msg: dict|None = None self._result_msg: Optional[dict] = None
# When set to a ``Context`` (when _submit_for_result is called) # When set to a ``Context`` (when _submit_for_result is called)
# it is expected that ``result()`` will be awaited at some # it is expected that ``result()`` will be awaited at some
@ -128,18 +110,6 @@ class Portal:
self._streams: set[MsgStream] = set() self._streams: set[MsgStream] = set()
self.actor = current_actor() self.actor = current_actor()
@property
def channel(self) -> Channel:
'''
Proxy to legacy attr name..
Consider the shorter `Portal.chan` instead of `.channel` ;)
'''
log.debug(
'Consider the shorter `Portal.chan` instead of `.channel` ;)'
)
return self.chan
async def _submit_for_result( async def _submit_for_result(
self, self,
ns: str, ns: str,
@ -147,14 +117,14 @@ class Portal:
**kwargs **kwargs
) -> None: ) -> None:
assert self._expect_result is None, ( assert self._expect_result is None, \
"A pending main result has already been submitted" "A pending main result has already been submitted"
)
self._expect_result = await self.actor.start_remote_task( self._expect_result = await self.actor.start_remote_task(
self.channel, self.channel,
nsf=NamespacePath(f'{ns}:{func}'), ns,
kwargs=kwargs func,
kwargs
) )
async def _return_once( async def _return_once(
@ -164,7 +134,7 @@ class Portal:
) -> dict[str, Any]: ) -> dict[str, Any]:
assert ctx._remote_func_type == 'asyncfunc' # single response assert ctx._remote_func_type == 'asyncfunc' # single response
msg: dict = await ctx._recv_chan.receive() msg = await ctx._recv_chan.receive()
return msg return msg
async def result(self) -> Any: async def result(self) -> Any:
@ -195,10 +165,7 @@ class Portal:
self._expect_result self._expect_result
) )
return _unwrap_msg( return _unwrap_msg(self._result_msg, self.channel)
self._result_msg,
self.channel,
)
async def _cancel_streams(self): async def _cancel_streams(self):
# terminate all locally running async generator # terminate all locally running async generator
@ -240,33 +207,26 @@ class Portal:
purpose. purpose.
''' '''
chan: Channel = self.channel if not self.channel.connected():
if not chan.connected(): log.cancel("This channel is already closed can't cancel")
log.runtime(
'This channel is already closed, skipping cancel request..'
)
return False return False
reminfo: str = (
f'`Portal.cancel_actor()` => {self.channel.uid}\n'
f' |_{chan}\n'
)
log.cancel( log.cancel(
f'Sending runtime `.cancel()` request to peer\n\n' f"Sending actor cancel request to {self.channel.uid} on "
f'{reminfo}' f"{self.channel}")
)
self.channel._cancel_called = True
self.channel._cancel_called: bool = True
try: try:
# send cancel cmd - might not get response # send cancel cmd - might not get response
# XXX: sure would be nice to make this work with # XXX: sure would be nice to make this work with
# a proper shield # a proper shield
with trio.move_on_after( with trio.move_on_after(
timeout timeout
or or self.cancel_timeout
self.cancel_timeout
) as cs: ) as cs:
cs.shield: bool = True cs.shield = True
await self.run_from_ns( await self.run_from_ns(
'self', 'self',
'cancel', 'cancel',
@ -274,12 +234,7 @@ class Portal:
return True return True
if cs.cancelled_caught: if cs.cancelled_caught:
# may timeout and we never get an ack (obvi racy) log.cancel(f"May have failed to cancel {self.channel.uid}")
# but that doesn't mean it wasn't cancelled.
log.debug(
'May have failed to cancel peer?\n'
f'{reminfo}'
)
# if we get here some weird cancellation case happened # if we get here some weird cancellation case happened
return False return False
@ -288,11 +243,9 @@ class Portal:
trio.ClosedResourceError, trio.ClosedResourceError,
trio.BrokenResourceError, trio.BrokenResourceError,
): ):
log.debug( log.cancel(
'IPC chan for actor already closed or broken?\n\n' f"{self.channel} for {self.channel.uid} was already "
f'{self.channel.uid}\n' "closed or broken?")
f' |_{self.channel}\n'
)
return False return False
async def run_from_ns( async def run_from_ns(
@ -313,31 +266,25 @@ class Portal:
A special namespace `self` can be used to invoke `Actor` A special namespace `self` can be used to invoke `Actor`
instance methods in the remote runtime. Currently this instance methods in the remote runtime. Currently this
should only ever be used for `Actor` (method) runtime should be used solely for ``tractor`` runtime
internals! internals.
''' '''
nsf = NamespacePath(
f'{namespace_path}:{function_name}'
)
ctx = await self.actor.start_remote_task( ctx = await self.actor.start_remote_task(
chan=self.channel, self.channel,
nsf=nsf, namespace_path,
kwargs=kwargs, function_name,
kwargs,
) )
ctx._portal = self ctx._portal = self
msg = await self._return_once(ctx) msg = await self._return_once(ctx)
return _unwrap_msg( return _unwrap_msg(msg, self.channel)
msg,
self.channel,
)
async def run( async def run(
self, self,
func: str, func: str,
fn_name: str|None = None, fn_name: Optional[str] = None,
**kwargs **kwargs
) -> Any: ) -> Any:
''' '''
Submit a remote function to be scheduled and run by actor, in Submit a remote function to be scheduled and run by actor, in
@ -356,9 +303,8 @@ class Portal:
DeprecationWarning, DeprecationWarning,
stacklevel=2, stacklevel=2,
) )
fn_mod_path: str = func fn_mod_path = func
assert isinstance(fn_name, str) assert isinstance(fn_name, str)
nsf = NamespacePath(f'{fn_mod_path}:{fn_name}')
else: # function reference was passed directly else: # function reference was passed directly
if ( if (
@ -371,12 +317,13 @@ class Portal:
raise TypeError( raise TypeError(
f'{func} must be a non-streaming async function!') f'{func} must be a non-streaming async function!')
nsf = NamespacePath.from_ref(func) fn_mod_path, fn_name = NamespacePath.from_ref(func).to_tuple()
ctx = await self.actor.start_remote_task( ctx = await self.actor.start_remote_task(
self.channel, self.channel,
nsf=nsf, fn_mod_path,
kwargs=kwargs, fn_name,
kwargs,
) )
ctx._portal = self ctx._portal = self
return _unwrap_msg( return _unwrap_msg(
@ -384,7 +331,7 @@ class Portal:
self.channel, self.channel,
) )
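Both sides of this diff address the remote function through a `'module:name'` style path, either passed whole as `nsf=` or split with `.to_tuple()`. The idea can be sketched with a small `str` subclass. `NsPathSketch` is a hypothetical illustration written for this note and should not be read as the actual `tractor.msg.NamespacePath` implementation.

```python
import importlib
import json
from typing import Callable


class NsPathSketch(str):
    """Hypothetical 'module:name' reference to an importable function."""

    @classmethod
    def from_ref(cls, ref: Callable) -> 'NsPathSketch':
        # build the path from the function's own metadata
        return cls(f'{ref.__module__}:{ref.__name__}')

    def to_tuple(self) -> tuple[str, str]:
        # split back into (module path, function name)
        mod_path, _, name = self.partition(':')
        return mod_path, name

    def load_ref(self) -> Callable:
        # resolve the path to the live object on the receiving side
        mod_path, name = self.to_tuple()
        return getattr(importlib.import_module(mod_path), name)


path = NsPathSketch.from_ref(json.dumps)
mod_path, fn_name = path.to_tuple()
```

Because the path is an ordinary string it can be sent over IPC as-is and resolved again on the far end, which is presumably why a single `nsf` argument can replace the separate module and function name arguments.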
@acm @asynccontextmanager
async def open_stream_from( async def open_stream_from(
self, self,
async_gen_func: Callable, # typing: ignore async_gen_func: Callable, # typing: ignore
@ -400,10 +347,15 @@ class Portal:
raise TypeError( raise TypeError(
f'{async_gen_func} must be an async generator function!') f'{async_gen_func} must be an async generator function!')
ctx: Context = await self.actor.start_remote_task( fn_mod_path, fn_name = NamespacePath.from_ref(
async_gen_func
).to_tuple()
ctx = await self.actor.start_remote_task(
self.channel, self.channel,
nsf=NamespacePath.from_ref(async_gen_func), fn_mod_path,
kwargs=kwargs, fn_name,
kwargs
) )
ctx._portal = self ctx._portal = self
@ -413,8 +365,7 @@ class Portal:
try: try:
# deliver receive only stream # deliver receive only stream
async with MsgStream( async with MsgStream(
ctx=ctx, ctx, ctx._recv_chan,
rx_chan=ctx._recv_chan,
) as rchan: ) as rchan:
self._streams.add(rchan) self._streams.add(rchan)
yield rchan yield rchan
@ -441,12 +392,367 @@ class Portal:
# await recv_chan.aclose() # await recv_chan.aclose()
self._streams.remove(rchan) self._streams.remove(rchan)
# NOTE: impl is found in `._context` mod to make @asynccontextmanager
# reading/grokking the details simpler code-org-wise. This async def open_context(
# method does not have to be used over that `@acm` module func
# directly, it is for convenience and from the original API self,
# design. func: Callable,
open_context = open_context_from_portal allow_overruns: bool = False,
**kwargs,
) -> AsyncGenerator[tuple[Context, Any], None]:
'''
Open an inter-actor "task context"; a remote task is
scheduled and cancel-scope-state-linked to a `trio.run()` across
memory boundaries in another actor's runtime.
This is an `@acm` API which allows for deterministic setup
and teardown of a remotely scheduled task in another remote
actor. Once opened, the 2 now "linked" tasks run completely
in parallel in each actor's runtime with their enclosing
`trio.CancelScope`s kept in a synced state wherein if
either side errors or cancels an equivalent error is
relayed to the other side via an SC-compat IPC protocol.
The yielded `tuple` is a pair delivering a `tractor.Context`
and any first value "sent" by the "callee" task via a call
to `Context.started(<value: Any>)`; this side of the
context does not unblock until the "callee" task calls
`.started()` in similar style to `trio.Nursery.start()`.
When the "callee" (side that is "called"/started by a call
to *this* method) returns, the caller side (this) unblocks
and any final value delivered from the other end can be
retrieved using the `Context.result()` api.
The yielded ``Context`` instance further allows for opening
bidirectional streams, explicit cancellation and
structured-concurrency-synchronized final result-msg
collection. See ``tractor.Context`` for more details.
'''
# conduct target func method structural checks
if not inspect.iscoroutinefunction(func) and (
getattr(func, '_tractor_contex_function', False)
):
raise TypeError(
f'{func} must be an async generator function!')
# TODO: i think from here onward should probably
# just be factored into an `@acm` inside a new
# `_context.py` mod.
fn_mod_path, fn_name = NamespacePath.from_ref(func).to_tuple()
ctx = await self.actor.start_remote_task(
self.channel,
fn_mod_path,
fn_name,
kwargs,
# NOTE: it's important to expose this since you might
# get the case where the parent who opened the context does
# not open a stream until after some slow startup/init
# period, in which case when the first msg is read from
# the feeder mem chan, say when first calling
# `Context.open_stream(allow_overruns=True)`, the overrun condition will be
# raised before any ignoring of overflow msgs can take
# place..
allow_overruns=allow_overruns,
)
assert ctx._remote_func_type == 'context'
msg: dict = await ctx._recv_chan.receive()
try:
# the "first" value here is delivered by the callee's
# ``Context.started()`` call.
first: Any = msg['started']
ctx._started_called: bool = True
except KeyError as src_error:
_raise_from_no_key_in_msg(
ctx=ctx,
msg=msg,
src_err=src_error,
log=log,
expect_key='started',
)
ctx._portal: Portal = self
uid: tuple = self.channel.uid
cid: str = ctx.cid
# placeholder for any exception raised in the runtime
# or by user tasks which cause this context's closure.
scope_err: BaseException | None = None
try:
async with trio.open_nursery() as nurse:
ctx._scope_nursery: trio.Nursery = nurse
ctx._scope: trio.CancelScope = nurse.cancel_scope
# deliver context instance and .started() msg value
# in enter tuple.
yield ctx, first
# when in allow_overruns mode there may be
# lingering overflow sender tasks remaining?
if nurse.child_tasks:
# XXX: ensure we are in overrun state
# with ``._allow_overruns=True`` bc otherwise
# there should be no tasks in this nursery!
if (
not ctx._allow_overruns
or len(nurse.child_tasks) > 1
):
raise RuntimeError(
'Context has sub-tasks but is '
'not in `allow_overruns=True` mode!?'
)
# ensure cancel of all overflow sender tasks
# started in the ctx nursery.
ctx._scope.cancel()
# XXX NOTE XXX: maybe shield against
# self-context-cancellation (which raises a local
# `ContextCancelled`) when requested (via
# `Context.cancel()`) by the same task (tree) which entered
# THIS `.open_context()`.
#
# NOTE: There are 2 operating cases for a "graceful cancel"
# of a `Context`. In both cases any `ContextCancelled`
# raised in this scope-block came from a transport msg
# relayed from some remote-actor-task which our runtime set
# as a `Context._remote_error`
#
# the CASES:
#
# - if that context IS THE SAME ONE that called
# `Context.cancel()`, we want to absorb the error
# silently and let this `.open_context()` block to exit
# without raising.
#
# - if it is from some OTHER context (we did NOT call
# `.cancel()`), we want to re-RAISE IT whilst also
# setting our own ctx's "reason for cancel" to be that
# other context's cancellation condition; we set our
# `.canceller: tuple[str, str]` to be same value as
# caught here in a `ContextCancelled.canceller`.
#
# Again, there are 2 cases:
#
# 1-some other context opened in this `.open_context()`
# block cancelled due to a self or peer cancellation
# request in which case we DO let the error bubble to the
# opener.
#
# 2-THIS "caller" task somewhere invoked `Context.cancel()`
# and received a `ContextCancelled` from the "callee"
# task, in which case we mask the `ContextCancelled` from
# bubbling to this "caller" (much like how `trio.Nursery`
# swallows any `trio.Cancelled` bubbled by a call to
# `Nursery.cancel_scope.cancel()`)
except ContextCancelled as ctxc:
scope_err = ctxc
# CASE 2: context was cancelled by local task calling
# `.cancel()`, we don't raise and the exit block should
# exit silently.
if (
ctx._cancel_called
and (
ctxc is ctx._remote_error
or
ctxc.canceller is self.canceller
)
):
log.debug(
f'Context {ctx} cancelled gracefully with:\n'
f'{ctxc}'
)
# CASE 1: this context was never cancelled via a local
# task (tree) having called `Context.cancel()`, raise
# the error since it was caused by someone else!
else:
raise
# the above `._scope` can be cancelled due to:
# 1. an explicit self cancel via `Context.cancel()` or
# `Actor.cancel()`,
# 2. any "callee"-side remote error, possibly also a cancellation
# request by some peer,
# 3. any "caller" (aka THIS scope's) local error raised in the above `yield`
except (
# CASE 3: standard local error in this caller/yieldee
Exception,
# CASES 1 & 2: normally manifested as
# a `Context._scope_nursery` raised
# exception-group of,
# 1.-`trio.Cancelled`s, since
# `._scope.cancel()` will have been called and any
# `ContextCancelled` absorbed and thus NOT RAISED in
# any `Context._maybe_raise_remote_err()`,
# 2.-`BaseExceptionGroup[ContextCancelled | RemoteActorError]`
# from any error raised in the "callee" side with
# a group only raised if there was any more than one
# task started here in the "caller" in the
# `yield`-ed to task.
BaseExceptionGroup, # since overrun handler tasks may have been spawned
trio.Cancelled, # NOTE: NOT from inside the ctx._scope
KeyboardInterrupt,
) as err:
scope_err = err
# XXX: ALWAYS request the context to CANCEL ON any ERROR.
# NOTE: `Context.cancel()` is conversely NEVER CALLED in
# the `ContextCancelled` "self cancellation absorbed" case
# handled in the block above!
log.cancel(
'Context cancelled for task due to\n'
f'{err}\n'
'Sending cancel request..\n'
f'task:{cid}\n'
f'actor:{uid}'
)
try:
await ctx.cancel()
except trio.BrokenResourceError:
log.warning(
'IPC connection for context is broken?\n'
f'task:{cid}\n'
f'actor:{uid}'
)
raise # duh
# no local scope error, the "clean exit with a result" case.
else:
if ctx.chan.connected():
log.info(
'Waiting on final context-task result for\n'
f'task: {cid}\n'
f'actor: {uid}'
)
# XXX NOTE XXX: the below call to
# `Context.result()` will ALWAYS raise
# a `ContextCancelled` (via an embedded call to
# `Context._maybe_raise_remote_err()`) IFF
# a `Context._remote_error` was set by the runtime
# via a call to
# `Context._maybe_cancel_and_set_remote_error()`.
# As per `Context._deliver_msg()`, that error IS
# ALWAYS SET any time "callee" side fails and causes "caller
# side" cancellation via a `ContextCancelled` here.
# result = await ctx.result()
try:
result = await ctx.result()
log.runtime(
f'Context {fn_name} returned value from callee:\n'
f'`{result}`'
)
except BaseException as berr:
# on normal teardown, if we get some error
# raised in `Context.result()` we still want to
# save that error on the ctx's state to
# determine things like `.cancelled_caught` for
# cases where there was remote cancellation but
# this task didn't know until final teardown
# / value collection.
scope_err = berr
raise
finally:
# though it should be impossible for any tasks
# operating *in* this scope to have survived
# we tear down the runtime feeder chan last
# to avoid premature stream clobbers.
rxchan: trio.ReceiveChannel = ctx._recv_chan
if (
rxchan
# maybe TODO: yes i know the below check is
# touching `trio` memchan internals..BUT, there are
# only a couple ways to avoid a `trio.Cancelled`
# bubbling from the `.aclose()` call below:
#
# - catch and mask it via the cancel-scope-shielded call
# as we are rn (manual and frowned upon) OR,
# - specially handle the case where `scope_err` is
# one of {`BaseExceptionGroup`, `trio.Cancelled`}
# and then presume that the `.aclose()` call will
# raise a `trio.Cancelled` and just don't call it
# in those cases..
#
# that latter approach is more logic, LOC, and more
# convoluted so for now stick with the first
# pseudo-hack-workaround where we just try to avoid
# the shielded call as much as we can detect from
# the memchan's `._closed` state..
#
# XXX MOTIVATION XXX-> we generally want to raise
# any underlying actor-runtime/internals error that
# surfaces from a bug in tractor itself so it can
# be easily detected/fixed AND, we also want to
# minimize noisy runtime tracebacks (normally due
# to the cross-actor linked task scope machinery
# teardown) displayed to user-code and instead only
# displaying `ContextCancelled` traces where the
# cause of crash/exit IS due to something in
# user/app code on either end of the context.
and not rxchan._closed
):
# XXX NOTE XXX: and again as per above, we mask any
# `trio.Cancelled` raised here so as to NOT mask
# out any exception group or legit (remote) ctx
# error that sourced from the remote task or its
# runtime.
with trio.CancelScope(shield=True):
await ctx._recv_chan.aclose()
# XXX: we always raise remote errors locally and
# generally speaking mask runtime-machinery related
# multi-`trio.Cancelled`s. As such, any `scope_error`
# which was the underlying cause of this context's exit
# should be stored as the `Context._local_error` and
# used in determining `Context.cancelled_caught: bool`.
if scope_err is not None:
ctx._local_error: BaseException = scope_err
etype: Type[BaseException] = type(scope_err)
# CASE 2
if ctx._cancel_called:
log.cancel(
f'Context {fn_name} cancelled by caller with\n'
f'{etype}'
)
# CASE 1
else:
log.cancel(
f'Context cancelled by callee with {etype}\n'
f'target: `{fn_name}`\n'
f'task:{cid}\n'
f'actor:{uid}'
)
# XXX: (MEGA IMPORTANT) if this is a root opened process we
# wait for any immediate child in debug before popping the
# context from the runtime msg loop otherwise inside
# ``Actor._push_result()`` the msg will be discarded and in
# the case where that msg is a global debugger unlock (via
# a "stop" msg for a stream), this can result in a deadlock
# where the root is waiting on the lock to clear but the
# child has already cleared it and clobbered IPC.
await maybe_wait_for_debugger()
# FINALLY, remove the context from runtime tracking and
# exit!
self.actor._contexts.pop(
(self.channel.uid, ctx.cid),
None,
)
@dataclass @dataclass
@ -477,11 +783,11 @@ class LocalPortal:
return await func(**kwargs) return await func(**kwargs)
@acm @asynccontextmanager
async def open_portal( async def open_portal(
channel: Channel, channel: Channel,
nursery: trio.Nursery|None = None, nursery: Optional[trio.Nursery] = None,
start_msg_loop: bool = True, start_msg_loop: bool = True,
shield: bool = False, shield: bool = False,
@ -506,7 +812,7 @@ async def open_portal(
if channel.uid is None: if channel.uid is None:
await actor._do_handshake(channel) await actor._do_handshake(channel)
msg_loop_cs: trio.CancelScope|None = None msg_loop_cs: Optional[trio.CancelScope] = None
if start_msg_loop: if start_msg_loop:
from ._runtime import process_messages from ._runtime import process_messages
msg_loop_cs = await nursery.start( msg_loop_cs = await nursery.start(


@ -28,16 +28,15 @@ import os
import warnings import warnings
from exceptiongroup import BaseExceptionGroup
import trio import trio
from ._runtime import ( from ._runtime import (
Actor, Actor,
Arbiter, Arbiter,
# TODO: rename and make a non-actor subtype?
# Arbiter as Registry,
async_main, async_main,
) )
from . import _debug from .devx import _debug
from . import _spawn from . import _spawn
from . import _state from . import _state
from . import log from . import log
@ -99,7 +98,7 @@ async def open_root_actor(
# https://github.com/python-trio/trio/issues/1155#issuecomment-742964018 # https://github.com/python-trio/trio/issues/1155#issuecomment-742964018
builtin_bp_handler = sys.breakpointhook builtin_bp_handler = sys.breakpointhook
orig_bp_path: str | None = os.environ.get('PYTHONBREAKPOINT', None) orig_bp_path: str | None = os.environ.get('PYTHONBREAKPOINT', None)
os.environ['PYTHONBREAKPOINT'] = 'tractor._debug.pause_from_sync' os.environ['PYTHONBREAKPOINT'] = 'tractor.devx._debug.pause_from_sync'
# attempt to retrieve ``trio``'s sigint handler and stash it # attempt to retrieve ``trio``'s sigint handler and stash it
# on our debugger lock state. # on our debugger lock state.
@ -146,7 +145,7 @@ async def open_root_actor(
# expose internal debug module to every actor allowing # expose internal debug module to every actor allowing
# for use of ``await tractor.breakpoint()`` # for use of ``await tractor.breakpoint()``
enable_modules.append('tractor._debug') enable_modules.append('tractor.devx._debug')
# if debug mode gets enabled *at least* use that level of # if debug mode gets enabled *at least* use that level of
# logging for some informative console prompts. # logging for some informative console prompts.
@ -303,12 +302,12 @@ async def open_root_actor(
) as err: ) as err:
entered: bool = await _debug._maybe_enter_pm(err) entered: bool = await _debug._maybe_enter_pm(err)
if ( if (
not entered not entered
and and not is_multi_cancelled(err)
not is_multi_cancelled(err)
): ):
logger.exception('Root actor crashed:\n') logger.exception("Root actor crashed:")
# ALWAYS re-raise any error bubbled up from the # ALWAYS re-raise any error bubbled up from the
# runtime! # runtime!
@ -324,13 +323,12 @@ async def open_root_actor(
# for an in nurseries: # for an in nurseries:
# tempn.start_soon(an.exited.wait) # tempn.start_soon(an.exited.wait)
logger.info( logger.cancel("Shutting down root actor")
'Closing down root actor' await actor.cancel(
requesting_uid=actor.uid,
) )
await actor.cancel(None) # self cancel
finally: finally:
_state._current_actor = None _state._current_actor = None
_state._last_actor_terminated = actor
# restore built-in `breakpoint()` hook state # restore built-in `breakpoint()` hook state
sys.breakpointhook = builtin_bp_handler sys.breakpointhook = builtin_bp_handler
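The save/override/restore handling of `sys.breakpointhook` in `open_root_actor()` above can be sketched in isolation. This is a minimal stdlib-only illustration, not the runtime's code: `fake_pause()` and `with_custom_breakpoint()` are hypothetical names standing in for a debugger entrypoint and the surrounding setup/teardown.

```python
import sys

calls: list[str] = []


def fake_pause(*args, **kwargs) -> None:
    # hypothetical stand-in for a debugger entrypoint such as
    # `pause_from_sync()`; it only records that it was invoked.
    calls.append('paused')


def with_custom_breakpoint() -> None:
    # stash the current hook so it can be restored on teardown
    builtin_bp_handler = sys.breakpointhook
    sys.breakpointhook = fake_pause
    try:
        # the builtin `breakpoint()` dispatches to `sys.breakpointhook`
        breakpoint()
    finally:
        # restore built-in `breakpoint()` hook state
        sys.breakpointhook = builtin_bp_handler


original_hook = sys.breakpointhook
with_custom_breakpoint()
```

Restoring inside `finally` means the original hook comes back even if the code under the override raises.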

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

833
tractor/_shm.py 100644

@ -0,0 +1,833 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
SC friendly shared memory management geared at real-time
processing.
Support for ``numpy`` compatible array-buffers is provided but is
considered optional within the context of this runtime-library.
"""
from __future__ import annotations
from sys import byteorder
import time
from typing import Optional
from multiprocessing import shared_memory as shm
from multiprocessing.shared_memory import (
SharedMemory,
ShareableList,
)
from msgspec import Struct
import tractor
from .log import get_logger
_USE_POSIX = getattr(shm, '_USE_POSIX', False)
if _USE_POSIX:
from _posixshmem import shm_unlink
try:
import numpy as np
from numpy.lib import recfunctions as rfn
import nptyping
except ImportError:
pass
log = get_logger(__name__)
def disable_mantracker():
'''
Disable all ``multiprocessing`` "resource tracking" machinery since
it's an absolute multi-threaded mess of non-SC madness.
'''
from multiprocessing import resource_tracker as mantracker
# Tell the "resource tracker" thing to fuck off.
class ManTracker(mantracker.ResourceTracker):
def register(self, name, rtype):
pass
def unregister(self, name, rtype):
pass
def ensure_running(self):
pass
# "know your land and know your prey"
# https://www.dailymotion.com/video/x6ozzco
mantracker._resource_tracker = ManTracker()
mantracker.register = mantracker._resource_tracker.register
mantracker.ensure_running = mantracker._resource_tracker.ensure_running
mantracker.unregister = mantracker._resource_tracker.unregister
mantracker.getfd = mantracker._resource_tracker.getfd
disable_mantracker()
class SharedInt:
'''
Wrapper around a single entry shared memory array which
holds an ``int`` value used as an index counter.
'''
def __init__(
self,
shm: SharedMemory,
) -> None:
self._shm = shm
@property
def value(self) -> int:
return int.from_bytes(self._shm.buf, byteorder)
@value.setter
def value(self, value) -> None:
self._shm.buf[:] = value.to_bytes(self._shm.size, byteorder)
def destroy(self) -> None:
if _USE_POSIX:
# We manually unlink to bypass all the "resource tracker"
# nonsense meant for non-SC systems.
name = self._shm.name
try:
shm_unlink(name)
except FileNotFoundError:
# might be a teardown race here?
log.warning(f'Shm for {name} already unlinked?')
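The byte-level encoding used by `SharedInt.value` can be tried without allocating a real shm segment. The sketch below assumes a hypothetical `BufferInt` class that wraps a plain `bytearray`-backed `memoryview` in place of `SharedMemory.buf`; only the `int.from_bytes()` / `int.to_bytes()` round trip mirrors the class above.

```python
from sys import byteorder


class BufferInt:
    '''
    Hypothetical stand-in for `SharedInt` which wraps any writable
    buffer (here a `bytearray`) instead of a `SharedMemory` segment.

    '''
    def __init__(self, buf: memoryview) -> None:
        self._buf = buf

    @property
    def value(self) -> int:
        # decode the whole buffer as one unsigned native-endian int
        return int.from_bytes(self._buf, byteorder)

    @value.setter
    def value(self, value: int) -> None:
        # encode to exactly the buffer's byte length; raises
        # `OverflowError` when the value does not fit.
        self._buf[:] = value.to_bytes(len(self._buf), byteorder)


counter = BufferInt(memoryview(bytearray(4)))
counter.value = 616
```

With a 4 byte buffer any value of `2 ** 32` or more (or any negative value) raises `OverflowError`, which is worth keeping in mind for the `size=4` index segments allocated further down.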
class NDToken(Struct, frozen=True):
'''
Internal representation of a shared memory ``numpy`` array "token"
which can be used to key and load a system (OS) wide shm entry
and correctly read the array by type signature.
This type is msg safe.
'''
shm_name: str # this serves as a "key" value
shm_first_index_name: str
shm_last_index_name: str
dtype_descr: tuple
size: int # in struct-array index / row terms
# TODO: use nptyping here on dtypes
@property
def dtype(self) -> list[tuple[str, str, tuple[int, ...]]]:
return np.dtype(
list(
map(tuple, self.dtype_descr)
)
).descr
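    def as_msg(self) -> dict:
        # NOTE: a plain ``msgspec.Struct`` offers no ``.to_dict()``
        # method, so convert via the ``msgspec.structs`` helper.
        from msgspec.structs import asdict
        return asdict(self)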
@classmethod
def from_msg(cls, msg: dict) -> NDToken:
if isinstance(msg, NDToken):
return msg
# TODO: native struct decoding
# return _token_dec.decode(msg)
msg['dtype_descr'] = tuple(map(tuple, msg['dtype_descr']))
return NDToken(**msg)
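Why `from_msg()` re-tuples `dtype_descr` can be seen with a small round trip. This sketch is an assumption-laden analogue: `Token` is a hypothetical trimmed-down dataclass (not `NDToken`), and stdlib `json` stands in for the real wire codec, chosen only because it also decodes sequences as lists.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Token:
    # hypothetical, trimmed-down analogue of `NDToken`
    shm_name: str
    dtype_descr: tuple
    size: int

    @classmethod
    def from_msg(cls, msg: dict) -> 'Token':
        # re-tuple the nested field descriptions which arrive as lists
        msg = dict(msg)
        msg['dtype_descr'] = tuple(map(tuple, msg['dtype_descr']))
        return cls(**msg)


token = Token(
    shm_name='demo',
    dtype_descr=(('index', '<i8'), ('price', '<f8')),
    size=10,
)
# simulate an IPC hop: tuples do not survive the encoding
wire_msg: dict = json.loads(json.dumps(asdict(token)))
restored = Token.from_msg(wire_msg)
```

Without the re-tupling step the decoded token would hold lists and compare unequal to the sender's copy.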
# _token_dec = msgspec.msgpack.Decoder(NDToken)
# TODO: this api?
# _known_tokens = tractor.ActorVar('_shm_tokens', {})
# _known_tokens = tractor.ContextStack('_known_tokens', )
# _known_tokens = trio.RunVar('shms', {})
# TODO: this should maybe be provided via
# a `.trionics.maybe_open_context()` wrapper factory?
# process-local store of keys to tokens
_known_tokens: dict[str, NDToken] = {}
def get_shm_token(key: str) -> NDToken | None:
'''
Convenience func to check if a token
for the provided key is known by this process.
Returns the stored token for ``key``, or ``None`` if it is unknown.
'''
return _known_tokens.get(key)
def _make_token(
key: str,
size: int,
dtype: np.dtype,
) -> NDToken:
'''
Create a serializable token that can be used
to access a shared array.
'''
return NDToken(
shm_name=key,
shm_first_index_name=key + "_first",
shm_last_index_name=key + "_last",
dtype_descr=tuple(np.dtype(dtype).descr),
size=size,
)
class ShmArray:
'''
A shared memory ``numpy.ndarray`` API.
An underlying shared memory buffer is allocated based on
a user specified ``numpy.ndarray``. This fixed size array
can be read and written to by pushing data both onto the "front"
or "back" of a set index range. The indexes for the "first" and
"last" index are themselves stored in shared memory (accessed via
``SharedInt`` interfaces) values such that multiple processes can
interact with the same array using a synchronized-index.
'''
def __init__(
self,
shmarr: np.ndarray,
first: SharedInt,
last: SharedInt,
shm: SharedMemory,
# readonly: bool = True,
) -> None:
self._array = shmarr
# indexes for first and last indices corresponding
# to filled data
self._first = first
self._last = last
self._len = len(shmarr)
self._shm = shm
self._post_init: bool = False
# pushing data does not write the index (aka primary key)
self._write_fields: list[str] | None = None
dtype = shmarr.dtype
if dtype.fields:
self._write_fields = list(shmarr.dtype.fields.keys())[1:]
# TODO: ringbuf api?
@property
def _token(self) -> NDToken:
return NDToken(
shm_name=self._shm.name,
shm_first_index_name=self._first._shm.name,
shm_last_index_name=self._last._shm.name,
dtype_descr=tuple(self._array.dtype.descr),
size=self._len,
)
@property
def token(self) -> dict:
"""Shared memory token that can be serialized and used by
another process to attach to this array.
"""
return self._token.as_msg()
@property
def index(self) -> int:
return self._last.value % self._len
@property
def array(self) -> np.ndarray:
'''
Return an up-to-date ``np.ndarray`` view of the
so-far-written data to the underlying shm buffer.
'''
a = self._array[self._first.value:self._last.value]
# first, last = self._first.value, self._last.value
# a = self._array[first:last]
# TODO: eventually comment this once we've not seen it in the
# wild in a long time..
# XXX: race where first/last indexes cause a reader
# to load an empty array..
if len(a) == 0 and self._post_init:
raise RuntimeError('Empty array race condition hit!?')
# breakpoint()
return a
def ustruct(
self,
fields: Optional[list[str]] = None,
# type that all field values will be cast to
# in the returned view.
common_dtype: np.dtype = float,
) -> np.ndarray:
array = self._array
if fields:
selection = array[fields]
# fcount = len(fields)
else:
selection = array
# fcount = len(array.dtype.fields)
# XXX: manual ``.view()`` attempt that also doesn't work.
# uview = selection.view(
# dtype='<f16',
# ).reshape(-1, 4, order='A')
# assert len(selection) == len(uview)
u = rfn.structured_to_unstructured(
selection,
# dtype=float,
copy=True,
)
# unstruct = np.ndarray(u.shape, dtype=a.dtype, buffer=shm.buf)
# array[:] = a[:]
return u
# return ShmArray(
# shmarr=u,
# first=self._first,
# last=self._last,
# shm=self._shm
# )
def last(
self,
length: int = 1,
) -> np.ndarray:
'''
Return the last ``length``'s worth of ("row") entries from the
array.
'''
return self.array[-length:]
def push(
self,
data: np.ndarray,
field_map: Optional[dict[str, str]] = None,
prepend: bool = False,
update_first: bool = True,
start: int | None = None,
) -> int:
'''
Ring buffer like "push" to append data
into the buffer and return updated "last" index.
NB: no actual ring logic yet to give a "loop around" on overflow
condition, lel.
'''
length = len(data)
if prepend:
index = (start or self._first.value) - length
if index < 0:
raise ValueError(
f'Array size of {self._len} was overrun during prepend.\n'
f'You have passed {abs(index)} too many datums.'
)
else:
index = start if start is not None else self._last.value
end = index + length
if field_map:
src_names, dst_names = zip(*field_map.items())
else:
dst_names = src_names = self._write_fields
try:
self._array[
list(dst_names)
][index:end] = data[list(src_names)][:]
# NOTE: there was a race here between updating
# the first and last indices and when the next reader
# tries to access ``.array`` (which due to the index
# overlap will be empty). Pretty sure we've fixed it now
# but leaving this here as a reminder.
if (
prepend
and update_first
and length
):
assert index < self._first.value
if (
index < self._first.value
and update_first
):
assert prepend, 'prepend=True not passed but index decreased?'
self._first.value = index
elif not prepend:
self._last.value = end
self._post_init = True
return end
except ValueError as err:
if field_map:
raise
# should raise if diff detected
self.diff_err_fields(data)
raise err
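The first/last index bookkeeping in `push()` may be easier to follow in a pure-python model. The sketch below uses a hypothetical `IndexedBuffer` class backed by a plain `list` rather than a shared `numpy` array; like the original it has no wrap-around logic, and it does no bounds check on appends.

```python
class IndexedBuffer:
    '''
    Hypothetical pure-python model of the `ShmArray.push()` index
    bookkeeping using a plain `list` in place of a shared array.

    '''
    def __init__(self, size: int, start: int) -> None:
        self._data: list = [None] * size
        # both indices begin at the same "append start" slot
        self.first: int = start
        self.last: int = start

    def push(self, items: list, prepend: bool = False) -> int:
        length = len(items)
        if prepend:
            # history is written *before* the current first index
            index = self.first - length
            if index < 0:
                raise ValueError(
                    f'{abs(index)} too many items for prepend'
                )
        else:
            # real-time data is appended at the current last index
            index = self.last

        end = index + length
        self._data[index:end] = items
        if prepend:
            self.first = index
        else:
            self.last = end
        return end

    @property
    def filled(self) -> list:
        # the so-far-written region between the two indices
        return self._data[self.first:self.last]


buf = IndexedBuffer(size=8, start=4)
buf.push([10, 11])                 # append into slots 4..5
buf.push([1, 2, 3], prepend=True)  # prepend into slots 1..3
```

After both pushes the filled region spans slots 1 through 5, with history sitting directly in front of the appended data.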
def diff_err_fields(
self,
data: np.ndarray,
) -> None:
# reraise with any field discrepancy
our_fields, their_fields = (
set(self._array.dtype.fields),
set(data.dtype.fields),
)
only_in_ours = our_fields - their_fields
only_in_theirs = their_fields - our_fields
if only_in_ours:
raise TypeError(
f"Input array is missing field(s): {only_in_ours}"
)
elif only_in_theirs:
raise TypeError(
f"Input array has unknown field(s): {only_in_theirs}"
)
# TODO: support "silent" prepends that don't update ._first.value?
    def prepend(
        self,
        data: np.ndarray,
    ) -> int:
        end = self.push(data, prepend=True)
        assert end
        # return the end index as promised by the annotation
        return end
def close(self) -> None:
self._first._shm.close()
self._last._shm.close()
self._shm.close()
def destroy(self) -> None:
if _USE_POSIX:
# We manually unlink to bypass all the "resource tracker"
# nonsense meant for non-SC systems.
shm_unlink(self._shm.name)
self._first.destroy()
self._last.destroy()
def flush(self) -> None:
# TODO: flush to storage backend like marketstore?
...
def open_shm_ndarray(
size: int,
key: str | None = None,
dtype: np.dtype | None = None,
append_start_index: int | None = None,
readonly: bool = False,
) -> ShmArray:
'''
Open a shared-memory ``numpy`` array using the standard library.
This call unlinks (aka permanently destroys) the buffer on teardown
and thus should be used from the parent-most accessor (process).
'''
# create new shared mem segment for which we
# have write permission
a = np.zeros(size, dtype=dtype)
a['index'] = np.arange(len(a))
shm = SharedMemory(
name=key,
create=True,
size=a.nbytes
)
array = np.ndarray(
a.shape,
dtype=a.dtype,
buffer=shm.buf
)
array[:] = a[:]
array.setflags(write=int(not readonly))
token = _make_token(
key=key,
size=size,
dtype=dtype,
)
# create single entry arrays for storing the first and last indices
first = SharedInt(
shm=SharedMemory(
name=token.shm_first_index_name,
create=True,
size=4, # std int
)
)
last = SharedInt(
shm=SharedMemory(
name=token.shm_last_index_name,
create=True,
size=4, # std int
)
)
# Start the "real-time" append-updated (or "pushed-to") section
# after some start index: ``append_start_index``. This allows appending
# from a start point in the array which isn't the 0 index and looks
# something like,
# -------------------------
# | | i
# _________________________
# <-------------> <------->
# history real-time
#
# Once fully "prepended", the history section will leave the
# ``ShmArray._first.value: int = 0`` and the yet-to-be written
# real-time section will start at ``ShmArray.index: int``.
# this sets the index to nearly 2/3rds into the length of
# the buffer leaving at least a "days worth of second samples"
# for the real-time section.
if append_start_index is None:
append_start_index = round(size * 0.616)
last.value = first.value = append_start_index
shmarr = ShmArray(
array,
first,
last,
shm,
)
assert shmarr._token == token
_known_tokens[key] = shmarr.token
# "unlink" created shm on process teardown by
# pushing teardown calls onto actor context stack
stack = tractor.current_actor().lifetime_stack
stack.callback(shmarr.close)
stack.callback(shmarr.destroy)
return shmarr
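The order in which the two teardown callbacks fire depends on the stack type. The sketch below assumes `lifetime_stack` behaves like a stdlib `contextlib.ExitStack` (its type is not shown in this diff) and uses string markers in place of the real `shmarr.close` / `shmarr.destroy` calls.

```python
from contextlib import ExitStack

order: list[str] = []

with ExitStack() as stack:
    # registered in the same order as in `open_shm_ndarray()`
    stack.callback(order.append, 'close')
    stack.callback(order.append, 'destroy')

# callbacks run last-in-first-out when the stack unwinds
```

Under that assumption the unlink (`destroy`) would run before the local `close`, so the registration order may be worth a second look if the opposite sequence is intended.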
def attach_shm_ndarray(
token: NDToken | dict,
readonly: bool = True,
) -> ShmArray:
'''
Attach to an existing shared memory array previously
created by another process using ``open_shm_ndarray``.
No new shared mem is allocated but wrapper types for read/write
access are constructed.
'''
token = NDToken.from_msg(token)
key = token.shm_name
if key in _known_tokens:
assert NDToken.from_msg(_known_tokens[key]) == token, "WTF"
# XXX: ugh, looks like due to the ``shm_open()`` C api we can't
# actually place files in a subdir, see discussion here:
# https://stackoverflow.com/a/11103289
# attach to array buffer and view as per dtype
_err: Optional[Exception] = None
for _ in range(3):
try:
shm = SharedMemory(
name=key,
create=False,
)
break
except OSError as oserr:
_err = oserr
time.sleep(0.1)
else:
if _err:
raise _err
shmarr = np.ndarray(
(token.size,),
dtype=token.dtype,
buffer=shm.buf
)
shmarr.setflags(write=int(not readonly))
first = SharedInt(
shm=SharedMemory(
name=token.shm_first_index_name,
create=False,
size=4, # std int
),
)
last = SharedInt(
shm=SharedMemory(
name=token.shm_last_index_name,
create=False,
size=4, # std int
),
)
# make sure we can read
first.value
sha = ShmArray(
shmarr,
first,
last,
shm,
)
# read test
sha.array
# Stash key -> token knowledge for future queries
# via `maybe_open_shm_ndarray()` but only after we know
# we can attach.
if key not in _known_tokens:
_known_tokens[key] = token
# "close" attached shm on actor teardown
tractor.current_actor().lifetime_stack.callback(sha.close)
return sha
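The bounded retry in `attach_shm_ndarray()` relies on Python's `for ... else` clause. A minimal sketch of that pattern follows; `attach_with_retry()` and `flaky_open()` are hypothetical helpers, with the latter simulating a segment that only becomes attachable on the third attempt.

```python
import time
from typing import Any, Callable


def attach_with_retry(
    opener: Callable[[], Any],
    retries: int = 3,
    delay: float = 0.0,
) -> Any:
    _err: Exception | None = None
    for _ in range(retries):
        try:
            handle = opener()
            break
        except OSError as oserr:
            # remember the failure and give the creator time to finish
            _err = oserr
            time.sleep(delay)
    else:
        # only reached when the loop never hit `break`
        assert _err is not None
        raise _err

    return handle


attempts = {'n': 0}


def flaky_open() -> str:
    # hypothetical opener: fails twice, then succeeds
    attempts['n'] += 1
    if attempts['n'] < 3:
        raise FileNotFoundError('segment not created yet')
    return 'attached'


result = attach_with_retry(flaky_open)
```

`FileNotFoundError` is a subclass of `OSError`, so a not-yet-created segment is retried rather than raised immediately.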
def maybe_open_shm_ndarray(
    key: str,  # unique identifier for segment
    size: int,
    dtype: np.dtype | None = None,
    append_start_index: int = 0,
    readonly: bool = True,

) -> tuple[ShmArray, bool]:
    '''
    Attempt to attach to a shared memory block using a "key" lookup
    to registered blocks in the user's overall "system" registry
    (presumes you don't have the block's explicit token).

    This function is meant to solve the problem of discovering whether
    a shared array token has been allocated or discovered by the actor
    running in **this** process. Systems where multiple actors may seek
    to access a common block can use this function to attempt to acquire
    a token as discovered by the actors who have previously stored
    a "key" -> ``NDToken`` map in an actor local (aka python global)
    variable.

    If you know the explicit ``NDToken`` for your memory segment instead
    use ``attach_shm_ndarray``.

    '''
    try:
        # see if we already know this key
        token = _known_tokens[key]
        return (
            attach_shm_ndarray(
                token=token,
                readonly=readonly,
            ),
            False,  # not newly opened
        )
    except KeyError:
        log.warning(f"Could not find {key} in shms cache")
        if dtype:
            # a token can only be constructed (and an attach
            # attempted) when the dtype is known.
            token = _make_token(
                key,
                size=size,
                dtype=dtype,
            )
            try:
                return (
                    attach_shm_ndarray(
                        token=token,
                        readonly=readonly,
                    ),
                    False,
                )
            except FileNotFoundError:
                log.warning(f"Could not attach to shm with token {token}")

        # This actor does not know about memory
        # associated with the provided "key".
        # Attempt to open a block and expect
        # to fail if a block has been allocated
        # on the OS by someone else.
        return (
            open_shm_ndarray(
                key=key,
                size=size,
                dtype=dtype,
                append_start_index=append_start_index,
                readonly=readonly,
            ),
            True,
        )
class ShmList(ShareableList):
'''
Carbon copy of ``.shared_memory.ShareableList`` with a few
enhancements:
- readonly mode via instance var flag `._readonly: bool`
- ``.__getitem__()`` accepts ``slice`` inputs
- exposes the underlying buffer "name" as a ``.key: str``
'''
def __init__(
self,
sequence: list | None = None,
*,
name: str | None = None,
readonly: bool = True
) -> None:
self._readonly = readonly
self._key = name
return super().__init__(
sequence=sequence,
name=name,
)
@property
def key(self) -> str:
return self._key
@property
def readonly(self) -> bool:
return self._readonly
def __setitem__(
self,
position,
value,
) -> None:
# mimic ``numpy`` error
if self._readonly:
raise ValueError('assignment destination is read-only')
return super().__setitem__(position, value)
def __getitem__(
self,
indexish,
) -> list:
# NOTE: this is a non-writeable view (copy?) of the buffer
# in a new list instance.
if isinstance(indexish, slice):
return list(self)[indexish]
return super().__getitem__(indexish)
# TODO: should we offer a `.array` and `.push()` equivalent
# to the `ShmArray`?
# currently we have the following limitations:
# - can't write slices of input using traditional slice-assign
# syntax due to the ``ShareableList.__setitem__()`` implementation.
# - ``list(shmlist)`` returns a non-mutable copy instead of
# a writeable view which would be handier for numpy-style ops.
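The copy-versus-view limitation noted above can be demonstrated without any shared memory. The sketch below uses a hypothetical `ReadOnlyList` wrapper around a plain `list` (not `ShareableList`) that copies the two `ShmList` behaviours: the read-only guard in `__setitem__()` and slice support in `__getitem__()`.

```python
class ReadOnlyList:
    '''
    Hypothetical plain-`list` analogue of `ShmList` showing the
    read-only guard and the slice-returns-a-copy behaviour.

    '''
    def __init__(self, items: list, readonly: bool = True) -> None:
        self._items = list(items)
        self._readonly = readonly

    def __iter__(self):
        return iter(self._items)

    def __setitem__(self, position, value) -> None:
        if self._readonly:
            raise ValueError('assignment destination is read-only')
        self._items[position] = value

    def __getitem__(self, indexish):
        if isinstance(indexish, slice):
            # slices are served from a detached copy, not a view
            return list(self)[indexish]
        return self._items[indexish]


ro = ReadOnlyList([0., 1., 2., 3.])
window = ro[1:3]
window[0] = 99.  # mutates only the copy
```

Because the slice is a fresh `list`, writes to it never reach the underlying storage, which is the gap the TODO above describes.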
def open_shm_list(
key: str,
sequence: list | None = None,
size: int = int(2 ** 10),
dtype: float | int | bool | str | bytes | None = float,
readonly: bool = True,
) -> ShmList:
if sequence is None:
default = {
float: 0.,
int: 0,
bool: True,
str: 'doggy',
None: None,
}[dtype]
sequence = [default] * size
shml = ShmList(
sequence=sequence,
name=key,
readonly=readonly,
)
# "close" attached shm on actor teardown
try:
actor = tractor.current_actor()
actor.lifetime_stack.callback(shml.shm.close)
actor.lifetime_stack.callback(shml.shm.unlink)
except RuntimeError:
log.warning('tractor runtime not active, skipping teardown steps')
return shml
def attach_shm_list(
key: str,
readonly: bool = False,
) -> ShmList:
return ShmList(
name=key,
readonly=readonly,
)


@ -31,24 +31,25 @@ from typing import (
TYPE_CHECKING, TYPE_CHECKING,
) )
from exceptiongroup import BaseExceptionGroup
import trio import trio
from trio import TaskStatus from trio_typing import TaskStatus
from ._debug import ( from .devx._debug import (
maybe_wait_for_debugger, maybe_wait_for_debugger,
acquire_debug_lock, acquire_debug_lock,
) )
from tractor._state import ( from ._state import (
current_actor, current_actor,
is_main_process, is_main_process,
is_root_process, is_root_process,
debug_mode, debug_mode,
) )
from tractor.log import get_logger from .log import get_logger
from tractor._portal import Portal from ._portal import Portal
from tractor._runtime import Actor from ._runtime import Actor
from tractor._entry import _mp_main from ._entry import _mp_main
from tractor._exceptions import ActorFailure from ._exceptions import ActorFailure
if TYPE_CHECKING: if TYPE_CHECKING:
@ -143,7 +144,7 @@ async def exhaust_portal(
# XXX: streams should never be reaped here since they should # XXX: streams should never be reaped here since they should
# always be established and shutdown using a context manager api # always be established and shutdown using a context manager api
final: Any = await portal.result() final = await portal.result()
except ( except (
Exception, Exception,
@ -151,23 +152,13 @@ async def exhaust_portal(
) as err: ) as err:
# we reraise in the parent task via a ``BaseExceptionGroup`` # we reraise in the parent task via a ``BaseExceptionGroup``
return err return err
except trio.Cancelled as err: except trio.Cancelled as err:
# lol, of course we need this too ;P # lol, of course we need this too ;P
# TODO: merge with above? # TODO: merge with above?
log.warning( log.warning(f"Cancelled result waiter for {portal.actor.uid}")
'Cancelled portal result waiter task:\n'
f'uid: {portal.channel.uid}\n'
f'error: {err}\n'
)
return err return err
else: else:
log.debug( log.debug(f"Returning final result: {final}")
f'Returning final result from portal:\n'
f'uid: {portal.channel.uid}\n'
f'result: {final}\n'
)
return final return final
@ -179,46 +170,34 @@ async def cancel_on_completion(
) -> None: ) -> None:
''' '''
Cancel actor gracefully once its "main" portal's Cancel actor gracefully once it's "main" portal's
result arrives. result arrives.
Should only be called for actors spawned via the Should only be called for actors spawned with `run_in_actor()`.
`Portal.run_in_actor()` API.
=> and really this API will be deprecated and should be
re-implemented as a `.hilevel.one_shot_task_nursery()`..)
''' '''
# if this call errors we store the exception for later # if this call errors we store the exception for later
# in ``errors`` which will be reraised inside # in ``errors`` which will be reraised inside
# an exception group and we still send out a cancel request # an exception group and we still send out a cancel request
result: Any|Exception = await exhaust_portal(portal, actor) result = await exhaust_portal(portal, actor)
if isinstance(result, Exception): if isinstance(result, Exception):
errors[actor.uid]: Exception = result errors[actor.uid] = result
log.cancel( log.warning(
'Cancelling subactor runtime due to error:\n\n' f"Cancelling {portal.channel.uid} after error {result}"
f'Portal.cancel_actor() => {portal.channel.uid}\n\n'
f'error: {result}\n'
) )
else: else:
log.runtime( log.runtime(
'Cancelling subactor gracefully:\n\n' f"Cancelling {portal.channel.uid} gracefully "
f'Portal.cancel_actor() => {portal.channel.uid}\n\n' f"after result {result}")
f'result: {result}\n'
)
# cancel the process now that we have a final result # cancel the process now that we have a final result
await portal.cancel_actor() await portal.cancel_actor()
async def hard_kill( async def do_hard_kill(
proc: trio.Process, proc: trio.Process,
terminate_after: int = 1.6, terminate_after: int = 3,
# NOTE: for mucking with `.pause()`-ing inside the runtime
# whilst also hacking on it XD
# terminate_after: int = 99999,
# NOTE: for mucking with `.pause()`-ing inside the runtime # NOTE: for mucking with `.pause()`-ing inside the runtime
# whilst also hacking on it XD # whilst also hacking on it XD
@ -240,14 +219,11 @@ async def hard_kill(
to be handled. to be handled.
''' '''
log.cancel(
'Terminating sub-proc:\n'
f'|_{proc}\n'
)
# NOTE: this timeout used to do nothing since we were shielding # NOTE: this timeout used to do nothing since we were shielding
# the ``.wait()`` inside ``new_proc()`` which will pretty much # the ``.wait()`` inside ``new_proc()`` which will pretty much
# never release until the process exits, now it acts as # never release until the process exits, now it acts as
# a hard-kill time ultimatum. # a hard-kill time ultimatum.
log.debug(f"Terminating {proc}")
with trio.move_on_after(terminate_after) as cs: with trio.move_on_after(terminate_after) as cs:
# NOTE: code below was copied verbatim from the now deprecated # NOTE: code below was copied verbatim from the now deprecated
@ -284,17 +260,11 @@ async def hard_kill(
# zombies (as a feature) we ask the OS to send in the # zombies (as a feature) we ask the OS to send in the
# removal squad as the last resort. # removal squad as the last resort.
if cs.cancelled_caught: if cs.cancelled_caught:
# TODO: toss in the skynet-logo face as ascii art? log.critical(f"#ZOMBIE_LORD_IS_HERE: {proc}")
log.critical(
# 'Well, the #ZOMBIE_LORD_IS_HERE# to collect\n'
'#T-800 deployed to collect zombie B0\n'
f'|\n'
f'|_{proc}\n'
)
proc.kill() proc.kill()
async def soft_kill( async def soft_wait(
proc: ProcessType, proc: ProcessType,
wait_func: Callable[ wait_func: Callable[
@ -305,25 +275,16 @@ async def soft_kill(
) -> None: ) -> None:
''' '''
Wait for proc termination but **don't yet** teardown Wait for proc termination but **dont' yet** teardown
std-streams since it will clobber any ongoing pdb REPL std-streams (since it will clobber any ongoing pdb REPL
session. session). This is our "soft" (and thus itself cancellable)
join/reap on an actor-runtime-in-process.
This is our "soft"/graceful, and thus itself also cancellable,
join/reap on an actor-runtime-in-process shutdown; it is
**not** the same as a "hard kill" via an OS signal (for that
see `.hard_kill()`).
''' '''
uid: tuple[str, str] = portal.channel.uid uid = portal.channel.uid
try: try:
log.cancel( log.cancel(f'Soft waiting on actor:\n{uid}')
'Soft killing sub-actor via `Portal.cancel_actor()`\n'
f'|_{proc}\n'
)
# wait on sub-proc to signal termination
await wait_func(proc) await wait_func(proc)
except trio.Cancelled: except trio.Cancelled:
# if cancelled during a soft wait, cancel the child # if cancelled during a soft wait, cancel the child
# actor before entering the hard reap sequence # actor before entering the hard reap sequence
@ -335,9 +296,8 @@ async def soft_kill(
async def cancel_on_proc_deth(): async def cancel_on_proc_deth():
''' '''
"Cancel-the-cancel" request: if we detect that the Cancel the actor cancel request if we detect that
underlying sub-process exited prior to that the process terminated.
a `Portal.cancel_actor()` call completing .
''' '''
await wait_func(proc) await wait_func(proc)
@ -354,10 +314,10 @@ async def soft_kill(
if proc.poll() is None: # type: ignore if proc.poll() is None: # type: ignore
log.warning( log.warning(
'Subactor still alive after cancel request?\n\n' 'Actor still alive after cancel request:\n'
f'uid: {uid}\n' f'{uid}'
f'|_{proc}\n'
) )
n.cancel_scope.cancel() n.cancel_scope.cancel()
raise raise
@ -381,7 +341,7 @@ async def new_proc(
) -> None: ) -> None:
# lookup backend spawning target # lookup backend spawning target
target: Callable = _methods[_spawn_method] target = _methods[_spawn_method]
# mark the new actor with the global spawn method # mark the new actor with the global spawn method
subactor._spawn_method = _spawn_method subactor._spawn_method = _spawn_method
@ -449,22 +409,19 @@ async def trio_proc(
spawn_cmd.append("--asyncio") spawn_cmd.append("--asyncio")
cancelled_during_spawn: bool = False cancelled_during_spawn: bool = False
proc: trio.Process|None = None proc: trio.Process | None = None
try: try:
try: try:
# TODO: needs ``trio_typing`` patch? # TODO: needs ``trio_typing`` patch?
proc = await trio.lowlevel.open_process(spawn_cmd) proc = await trio.lowlevel.open_process(spawn_cmd)
log.runtime(
'Started new sub-proc\n' log.runtime(f"Started {proc}")
f'|_{proc}\n'
)
# wait for actor to spawn and connect back to us # wait for actor to spawn and connect back to us
# channel should have handshake completed by the # channel should have handshake completed by the
# local actor by the time we get a ref to it # local actor by the time we get a ref to it
event, chan = await actor_nursery._actor.wait_for_peer( event, chan = await actor_nursery._actor.wait_for_peer(
subactor.uid subactor.uid)
)
except trio.Cancelled: except trio.Cancelled:
cancelled_during_spawn = True cancelled_during_spawn = True
@ -525,7 +482,7 @@ async def trio_proc(
# This is a "soft" (cancellable) join/reap which # This is a "soft" (cancellable) join/reap which
# will remote cancel the actor on a ``trio.Cancelled`` # will remote cancel the actor on a ``trio.Cancelled``
# condition. # condition.
await soft_kill( await soft_wait(
proc, proc,
trio.Process.wait, trio.Process.wait,
portal portal
@ -534,9 +491,8 @@ async def trio_proc(
# cancel result waiter that may have been spawned in # cancel result waiter that may have been spawned in
# tandem if not done already # tandem if not done already
log.cancel( log.cancel(
'Cancelling existing result waiter task for ' "Cancelling existing result waiter task for "
f'{subactor.uid}' f"{subactor.uid}")
)
nursery.cancel_scope.cancel() nursery.cancel_scope.cancel()
finally: finally:
@ -554,16 +510,7 @@ async def trio_proc(
with trio.move_on_after(0.5): with trio.move_on_after(0.5):
await proc.wait() await proc.wait()
log.pdb( if is_root_process():
'Delaying subproc reaper while debugger locked..'
)
await maybe_wait_for_debugger(
child_in_debug=_runtime_vars.get(
'_debug_mode', False
),
# TODO: need a diff value then default?
# poll_steps=9999999,
)
# TODO: solve the following issue where we need # TODO: solve the following issue where we need
# to do a similar wait like this but in an # to do a similar wait like this but in an
# "intermediary" parent actor that itself isn't # "intermediary" parent actor that itself isn't
@ -571,22 +518,14 @@ async def trio_proc(
# to hold off on relaying SIGINT until that child # to hold off on relaying SIGINT until that child
# is complete. # is complete.
# https://github.com/goodboy/tractor/issues/320 # https://github.com/goodboy/tractor/issues/320
# -[ ] we need to handle non-root parent-actors specially await maybe_wait_for_debugger(
# by somehow determining if a child is in debug and then child_in_debug=_runtime_vars.get(
# avoiding cancel/kill of said child by this '_debug_mode', False),
# (intermediary) parent until such a time as the root says )
# the pdb lock is released and we are good to tear down
# (our children)..
#
# -[ ] so maybe something like this where we try to
# acquire the lock and get notified of who has it,
# check that uid against our known children?
# this_uid: tuple[str, str] = current_actor().uid
# await acquire_debug_lock(this_uid)
if proc.poll() is None: if proc.poll() is None:
log.cancel(f"Attempting to hard kill {proc}") log.cancel(f"Attempting to hard kill {proc}")
await hard_kill(proc) await do_hard_kill(proc)
log.debug(f"Joined {proc}") log.debug(f"Joined {proc}")
else: else:
@ -730,7 +669,7 @@ async def mp_proc(
# This is a "soft" (cancellable) join/reap which # This is a "soft" (cancellable) join/reap which
# will remote cancel the actor on a ``trio.Cancelled`` # will remote cancel the actor on a ``trio.Cancelled``
# condition. # condition.
await soft_kill( await soft_wait(
proc, proc,
proc_waiter, proc_waiter,
portal portal


@ -18,18 +18,12 @@
Per process state Per process state
""" """
from __future__ import annotations
from typing import ( from typing import (
Optional,
Any, Any,
TYPE_CHECKING,
) )
if TYPE_CHECKING: _current_actor: Optional['Actor'] = None # type: ignore # noqa
from ._runtime import Actor
_current_actor: Actor|None = None # type: ignore # noqa
_last_actor_terminated: Actor|None = None
_runtime_vars: dict[str, Any] = { _runtime_vars: dict[str, Any] = {
'_debug_mode': False, '_debug_mode': False,
'_is_root': False, '_is_root': False,
@ -37,49 +31,14 @@ _runtime_vars: dict[str, Any] = {
} }
def last_actor() -> Actor|None: def current_actor(err_on_no_runtime: bool = True) -> 'Actor': # type: ignore # noqa
'''
Try to return last active `Actor` singleton
for this process.
For case where runtime already exited but someone is asking
about the "last" actor probably to get its `.uid: tuple`.
'''
return _last_actor_terminated
def current_actor(
err_on_no_runtime: bool = True,
) -> Actor:
''' '''
Get the process-local actor instance. Get the process-local actor instance.
''' '''
if (
err_on_no_runtime
and _current_actor is None
):
msg: str = 'No local actor has been initialized yet'
from ._exceptions import NoRuntime from ._exceptions import NoRuntime
if _current_actor is None and err_on_no_runtime:
if last := last_actor(): raise NoRuntime("No local actor has been initialized yet")
msg += (
f'Apparently the last active actor was\n'
f'|_{last}\n'
f'|_{last.uid}\n'
)
# no actor runtime has (as of yet) ever been started for
# this process.
else:
msg += (
'No last actor found?\n'
'Did you forget to open one of:\n\n'
'- `tractor.open_root_actor()`\n'
'- `tractor.open_nursery()`\n'
)
raise NoRuntime(msg)
return _current_actor return _current_actor
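The richer `current_actor()` variant in this hunk can be sketched in isolation as follows. This is a minimal, dependency-free illustration and not the library's code: `Actor` and `NoRuntime` here are hypothetical stand-ins for the real classes in `._runtime` and `._exceptions`, and the message text is abbreviated.

```python
from typing import Optional


class NoRuntime(RuntimeError):
    """Stand-in (assumption) for the error imported from `._exceptions`."""


class Actor:
    """Hypothetical minimal actor that only carries a `uid` pair."""

    def __init__(self, name: str, uuid: str) -> None:
        self.uid = (name, uuid)


# process-local singletons, mirroring the module-level names in the diff
_current_actor: Optional[Actor] = None
_last_actor_terminated: Optional[Actor] = None


def last_actor() -> Optional[Actor]:
    # may be asked for after the runtime has already exited
    return _last_actor_terminated


def current_actor(err_on_no_runtime: bool = True) -> Optional[Actor]:
    if err_on_no_runtime and _current_actor is None:
        msg = 'No local actor has been initialized yet\n'
        last = last_actor()
        if last is not None:
            # a runtime existed earlier in this process: say which one
            msg += f'Last active actor was {last.uid}\n'
        else:
            # no runtime was ever started in this process
            msg += 'No last actor found?\n'
        raise NoRuntime(msg)
    return _current_actor
```

Passing `err_on_no_runtime=False` returns `None` instead of raising, which is how `open_nursery()` in this diff decides whether to start the runtime implicitly.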


@ -21,9 +21,8 @@ The machinery and types behind ``Context.open_stream()``
''' '''
from __future__ import annotations from __future__ import annotations
from contextlib import asynccontextmanager as acm
import inspect import inspect
from pprint import pformat from contextlib import asynccontextmanager as acm
from typing import ( from typing import (
Any, Any,
Callable, Callable,
@ -36,7 +35,6 @@ import trio
from ._exceptions import ( from ._exceptions import (
_raise_from_no_key_in_msg, _raise_from_no_key_in_msg,
ContextCancelled,
) )
from .log import get_logger from .log import get_logger
from .trionics import ( from .trionics import (
@ -86,76 +84,15 @@ class MsgStream(trio.abc.Channel):
self._broadcaster = _broadcaster self._broadcaster = _broadcaster
# flag to denote end of stream # flag to denote end of stream
self._eoc: bool|trio.EndOfChannel = False self._eoc: bool = False
self._closed: bool|trio.ClosedResourceError = False self._closed: bool = False
# delegate directly to underlying mem channel # delegate directly to underlying mem channel
def receive_nowait( def receive_nowait(self):
self, msg = self._rx_chan.receive_nowait()
allow_msg_keys: list[str] = ['yield'],
):
msg: dict = self._rx_chan.receive_nowait()
for (
i,
key,
) in enumerate(allow_msg_keys):
try: try:
return msg[key]
except KeyError as kerr:
if i < (len(allow_msg_keys) - 1):
continue
_raise_from_no_key_in_msg(
ctx=self._ctx,
msg=msg,
src_err=kerr,
log=log,
expect_key=key,
stream=self,
)
async def receive(
self,
hide_tb: bool = True,
):
'''
Receive a single msg from the IPC transport, the next in
sequence sent by the far end task (possibly in order as
determined by the underlying protocol).
'''
__tracebackhide__: bool = hide_tb
# NOTE: `trio.ReceiveChannel` implements
# EOC handling as follows (aka uses it
# to gracefully exit async for loops):
#
# async def __anext__(self) -> ReceiveType:
# try:
# return await self.receive()
# except trio.EndOfChannel:
# raise StopAsyncIteration
#
# see ``.aclose()`` for notes on the old behaviour prior to
# introducing this
if self._eoc:
raise self._eoc
if self._closed:
raise self._closed
src_err: Exception|None = None # orig tb
try:
try:
msg = await self._rx_chan.receive()
return msg['yield'] return msg['yield']
except KeyError as kerr: except KeyError as kerr:
src_err = kerr
# NOTE: may raise any of the below error types
# including EoC when a 'stop' msg is found.
_raise_from_no_key_in_msg( _raise_from_no_key_in_msg(
ctx=self._ctx, ctx=self._ctx,
msg=msg, msg=msg,
@ -165,169 +102,98 @@ class MsgStream(trio.abc.Channel):
stream=self, stream=self,
) )
# XXX: the stream terminates on either of: async def receive(self):
# - via `self._rx_chan.receive()` raising after manual closure '''
# by the rpc-runtime OR, Receive a single msg from the IPC transport, the next in
# - via a received `{'stop': ...}` msg from remote side. sequence sent by the far end task (possibly in order as
# |_ NOTE: previously this was triggered by calling determined by the underlying protocol).
# ``._rx_chan.aclose()`` on the send side of the channel inside
# `Actor._push_result()`, but now the 'stop' message handling
# has been put just above inside `_raise_from_no_key_in_msg()`.
except (
trio.EndOfChannel,
) as eoc:
src_err = eoc
self._eoc = eoc
# TODO: Locally, we want to close this stream gracefully, by '''
# terminating any local consumers tasks deterministically. # NOTE: `trio.ReceiveChannel` implements
# Once we have broadcast support, we **don't** want to be # EOC handling as follows (aka uses it
# closing this stream and not flushing a final value to # to gracefully exit async for loops):
# remaining (clone) consumers who may not have been #
# scheduled to receive it yet. # async def __anext__(self) -> ReceiveType:
# try: # try:
# maybe_err_msg_or_res: dict = self._rx_chan.receive_nowait() # return await self.receive()
# if maybe_err_msg_or_res: # except trio.EndOfChannel:
# log.warning( # raise StopAsyncIteration
# 'Discarding un-processed msg:\n'
# f'{maybe_err_msg_or_res}'
# )
# except trio.WouldBlock:
# # no queued msgs that might be another remote
# # error, so just raise the original EoC
# pass
# raise eoc # see ``.aclose()`` for notes on the old behaviour prior to
# introducing this
if self._eoc:
raise trio.EndOfChannel
if self._closed:
raise trio.ClosedResourceError('This stream was closed')
try:
msg = await self._rx_chan.receive()
return msg['yield']
except KeyError as kerr:
_raise_from_no_key_in_msg(
ctx=self._ctx,
msg=msg,
src_err=kerr,
log=log,
expect_key='yield',
stream=self,
)
except (
trio.ClosedResourceError, # by self._rx_chan
trio.EndOfChannel, # by self._rx_chan or `stop` msg from far end
):
# XXX: we close the stream on any of these error conditions:
# a ``ClosedResourceError`` indicates that the internal # a ``ClosedResourceError`` indicates that the internal
# feeder memory receive channel was closed likely by the # feeder memory receive channel was closed likely by the
# runtime after the associated transport-channel # runtime after the associated transport-channel
# disconnected or broke. # disconnected or broke.
except trio.ClosedResourceError as cre: # by self._rx_chan.receive()
src_err = cre # an ``EndOfChannel`` indicates either the internal recv
log.warning( # memchan exhausted **or** we raised it just above after
'`Context._rx_chan` was already closed?' # receiving a `stop` message from the far end of the stream.
)
self._closed = cre # Previously this was triggered by calling ``.aclose()`` on
# the send side of the channel inside
# ``Actor._push_result()`` (should still be commented code
# there - which should eventually get removed), but now the
# 'stop' message handling has been put just above.
# TODO: Locally, we want to close this stream gracefully, by
# terminating any local consumers tasks deterministically.
# Once we have broadcast support, we **don't** want to be
# closing this stream and not flushing a final value to
# remaining (clone) consumers who may not have been
# scheduled to receive it yet.
# when the send is closed we assume the stream has # when the send is closed we assume the stream has
# terminated and signal this local iterator to stop # terminated and signal this local iterator to stop
drained: list[Exception|dict] = await self.aclose() await self.aclose()
if drained:
# from .devx import pause
# await pause()
log.warning(
'Drained context msgs during closure:\n'
f'{drained}'
)
# TODO: pass these to the `._ctx._drained_msgs: deque`
# and then iterate them as part of any `.result()` call?
# NOTE XXX: if the context was cancelled or remote-errored raise # propagate
# but we received the stream close msg first, we
# probably want to instead raise the remote error
# over the end-of-stream connection error since likely
# the remote error was the source cause?
ctx: Context = self._ctx
ctx.maybe_raise(
raise_ctxc_from_self_call=True,
)
# propagate any error but hide low-level frame details async def aclose(self):
# from the caller by default for debug noise reduction.
if (
hide_tb
# XXX NOTE XXX don't reraise on certain
# stream-specific internal error types like,
#
# - `trio.EoC` since we want to use the exact instance
# to ensure that it is the error that bubbles upward
# for silent absorption by `Context.open_stream()`.
and not self._eoc
# - `RemoteActorError` (or `ContextCancelled`) if it gets
# raised from `_raise_from_no_key_in_msg()` since we
# want the same (as the above bullet) for any
# `.open_context()` block bubbled error raised by
# any nearby ctx API remote-failures.
# and not isinstance(src_err, RemoteActorError)
):
raise type(src_err)(*src_err.args) from src_err
else:
raise src_err
async def aclose(self) -> list[Exception|dict]:
''' '''
Cancel associated remote actor task and local memory channel on Cancel associated remote actor task and local memory channel on
close. close.
Notes:
- REMEMBER that this is also called by `.__aexit__()` so
careful consideration must be made to handle whatever
internal state is mutated, particularly in terms of
draining IPC msgs!
- more or less we try to maintain adherence to trio's `.aclose()` semantics:
https://trio.readthedocs.io/en/stable/reference-io.html#trio.abc.AsyncResource.aclose
''' '''
# XXX: keep proper adherence to trio's `.aclose()` semantics:
# https://trio.readthedocs.io/en/stable/reference-io.html#trio.abc.AsyncResource.aclose
rx_chan = self._rx_chan
# rx_chan = self._rx_chan if rx_chan._closed:
log.cancel(f"{self} is already closed")
# XXX NOTE XXX
# it's SUPER IMPORTANT that we ensure we don't DOUBLE
# DRAIN msgs on closure so avoid getting stuck hanging on
# the `._rx_chan` since we call this method on
# `.__aexit__()` as well!!!
# => SO ENSURE WE CATCH ALL TERMINATION STATES in this
# block including the EoC..
if self.closed:
# this stream has already been closed so silently succeed as # this stream has already been closed so silently succeed as
# per ``trio.AsyncResource`` semantics. # per ``trio.AsyncResource`` semantics.
# https://trio.readthedocs.io/en/stable/reference-io.html#trio.abc.AsyncResource.aclose # https://trio.readthedocs.io/en/stable/reference-io.html#trio.abc.AsyncResource.aclose
return [] return
ctx: Context = self._ctx self._eoc = True
drained: list[Exception|dict] = []
while not drained:
try:
maybe_final_msg = self.receive_nowait(
allow_msg_keys=['yield', 'return'],
)
if maybe_final_msg:
log.debug(
'Drained un-processed stream msg:\n'
f'{pformat(maybe_final_msg)}'
)
# TODO: inject into parent `Context` buf?
drained.append(maybe_final_msg)
# NOTE: we only need these handlers due to the
# `.receive_nowait()` call above which may re-raise
# one of these errors on a msg key error!
except trio.WouldBlock as be:
drained.append(be)
break
except trio.EndOfChannel as eoc:
self._eoc: Exception = eoc
drained.append(eoc)
break
except trio.ClosedResourceError as cre:
self._closed = cre
drained.append(cre)
break
except ContextCancelled as ctxc:
# log.exception('GOT CTXC')
log.cancel(
'Context was cancelled during stream closure:\n'
f'canceller: {ctxc.canceller}\n'
f'{pformat(ctxc.msgdata)}'
)
break
# NOTE: this is super subtle IPC messaging stuff: # NOTE: this is super subtle IPC messaging stuff:
# Relay stop iteration to far end **iff** we're # Relay stop iteration to far end **iff** we're
@ -358,40 +224,26 @@ class MsgStream(trio.abc.Channel):
except ( except (
trio.BrokenResourceError, trio.BrokenResourceError,
trio.ClosedResourceError trio.ClosedResourceError
) as re: ):
# the underlying channel may already have been pulled # the underlying channel may already have been pulled
# in which case our stop message is meaningless since # in which case our stop message is meaningless since
# it can't traverse the transport. # it can't traverse the transport.
ctx = self._ctx
log.warning( log.warning(
f'Stream was already destroyed?\n' f'Stream was already destroyed?\n'
f'actor: {ctx.chan.uid}\n' f'actor: {ctx.chan.uid}\n'
f'ctx id: {ctx.cid}' f'ctx id: {ctx.cid}'
) )
drained.append(re)
self._closed = re
# if caught_eoc: self._closed = True
# # from .devx import _debug
# # await _debug.pause()
# with trio.CancelScope(shield=True):
# await rx_chan.aclose()
if not self._eoc: # Do we close the local mem chan ``self._rx_chan`` ??!?
log.cancel(
'Stream closed before it received an EoC?\n' # NO, DEFINITELY NOT if we're a bi-dir ``MsgStream``!
'Setting eoc manually..\n..' # BECAUSE this same core-msg-loop mem recv-chan is used to deliver
) # the potential final result from the surrounding inter-actor
self._eoc: bool = trio.EndOfChannel( # `Context` so we don't want to close it until that context has
f'Context stream closed by {self._ctx.side}\n' # run to completion.
f'|_{self}\n'
)
# ?XXX WAIT, why do we not close the local mem chan `._rx_chan` XXX?
# => NO, DEFINITELY NOT! <=
# if we're a bi-dir ``MsgStream`` BECAUSE this same
# core-msg-loop mem recv-chan is used to deliver the
# potential final result from the surrounding inter-actor
# `Context` so we don't want to close it until that
# context has run to completion.
# XXX: Notes on old behaviour: # XXX: Notes on old behaviour:
# await rx_chan.aclose() # await rx_chan.aclose()
@ -420,26 +272,6 @@ class MsgStream(trio.abc.Channel):
# runtime's closure of ``rx_chan`` in the case where we may # runtime's closure of ``rx_chan`` in the case where we may
# still need to consume msgs that are "in transit" from the far # still need to consume msgs that are "in transit" from the far
# end (eg. for ``Context.result()``). # end (eg. for ``Context.result()``).
# self._closed = True
return drained
@property
def closed(self) -> bool:
rxc: bool = self._rx_chan._closed
_closed: bool|Exception = self._closed
_eoc: bool|trio.EndOfChannel = self._eoc
if rxc or _closed or _eoc:
log.runtime(
f'`MsgStream` is already closed\n'
f'{self}\n'
f' |_cid: {self._ctx.cid}\n'
f' |_rx_chan._closed: {type(rxc)} = {rxc}\n'
f' |_closed: {type(_closed)} = {_closed}\n'
f' |_eoc: {type(_eoc)} = {_eoc}'
)
return True
return False
@acm @acm
async def subscribe( async def subscribe(
@ -495,43 +327,19 @@ class MsgStream(trio.abc.Channel):
async def send( async def send(
self, self,
data: Any, data: Any
hide_tb: bool = True,
) -> None: ) -> None:
''' '''
Send a message over this stream to the far end. Send a message over this stream to the far end.
''' '''
__tracebackhide__: bool = hide_tb if self._ctx._remote_error:
raise self._ctx._remote_error # from None
# raise any already known error immediately
self._ctx.maybe_raise()
if self._eoc:
raise self._eoc
if self._closed: if self._closed:
raise self._closed raise trio.ClosedResourceError('This stream was already closed')
try: await self._ctx.chan.send({'yield': data, 'cid': self._ctx.cid})
await self._ctx.chan.send(
payload={
'yield': data,
'cid': self._ctx.cid,
},
# hide_tb=hide_tb,
)
except (
trio.ClosedResourceError,
trio.BrokenResourceError,
BrokenPipeError,
) as trans_err:
if hide_tb:
raise type(trans_err)(
*trans_err.args
) from trans_err
else:
raise
def stream(func: Callable) -> Callable: def stream(func: Callable) -> Callable:
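One side of the `MsgStream` hunks above types `_eoc` as `bool|trio.EndOfChannel` and re-raises the stored instance with `raise self._eoc`. A rough sketch of that "exception instance as the flag" idea is below. It is a simplification under stated assumptions: `EndOfChannel` is a local stand-in for `trio.EndOfChannel`, `SketchStream` is hypothetical, and the `'stop'` handling is inlined here whereas the diff delegates it to `_raise_from_no_key_in_msg()`.

```python
class EndOfChannel(Exception):
    """Stand-in (assumption) for `trio.EndOfChannel`."""


class SketchStream:
    """Hypothetical, synchronous toy; not the real `MsgStream`."""

    def __init__(self, msgs):
        self._msgs = list(msgs)
        # `False` until termination, then the exception instance itself
        self._eoc = False

    def receive(self):
        if self._eoc:
            # re-raise the *same* instance on every later call
            raise self._eoc
        # sketch only: an empty queue would raise `IndexError` here
        msg = self._msgs.pop(0)
        try:
            return msg['yield']
        except KeyError:
            if 'stop' in msg:
                self._eoc = EndOfChannel('far end sent a stop msg')
                raise self._eoc from None
            raise
```

Keeping the instance rather than a plain `True` means a caller can compare errors by identity, which appears to be what the diff's comment about using "the exact instance" is after.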


@ -21,22 +21,19 @@
from contextlib import asynccontextmanager as acm from contextlib import asynccontextmanager as acm
from functools import partial from functools import partial
import inspect import inspect
from pprint import pformat
from typing import TYPE_CHECKING from typing import TYPE_CHECKING
import typing import typing
import warnings import warnings
from exceptiongroup import BaseExceptionGroup
import trio import trio
from ._debug import maybe_wait_for_debugger from .devx._debug import maybe_wait_for_debugger
from ._state import current_actor, is_main_process from ._state import current_actor, is_main_process
from .log import get_logger, get_loglevel from .log import get_logger, get_loglevel
from ._runtime import Actor from ._runtime import Actor
from ._portal import Portal from ._portal import Portal
from ._exceptions import ( from ._exceptions import is_multi_cancelled
is_multi_cancelled,
ContextCancelled,
)
from ._root import open_root_actor from ._root import open_root_actor
from . import _state from . import _state
from . import _spawn from . import _spawn
@ -106,14 +103,6 @@ class ActorNursery:
self.errors = errors self.errors = errors
self.exited = trio.Event() self.exited = trio.Event()
# NOTE: when no explicit call is made to
# `.open_root_actor()` by application code,
# `.open_nursery()` will implicitly call it to start the
# actor-tree runtime. In this case we mark ourselves as
# such so that runtime components can be aware for logging
# and syncing purposes to any actor opened nurseries.
self._implicit_runtime_started: bool = False
async def start_actor( async def start_actor(
self, self,
name: str, name: str,
@ -167,7 +156,7 @@ class ActorNursery:
# start a task to spawn a process # start a task to spawn a process
# blocks until process has been started and a portal setup # blocks until process has been started and a portal setup
nursery: trio.Nursery = nursery or self._da_nursery nursery = nursery or self._da_nursery
# XXX: the type ignore is actually due to a `mypy` bug # XXX: the type ignore is actually due to a `mypy` bug
return await nursery.start( # type: ignore return await nursery.start( # type: ignore
@ -200,16 +189,14 @@ class ActorNursery:
**kwargs, # explicit args to ``fn`` **kwargs, # explicit args to ``fn``
) -> Portal: ) -> Portal:
''' """Spawn a new actor, run a lone task, then terminate the actor and
Spawn a new actor, run a lone task, then terminate the actor and
return its result. return its result.
Actors spawned using this method are kept alive at nursery teardown Actors spawned using this method are kept alive at nursery teardown
until the task spawned by executing ``fn`` completes at which point until the task spawned by executing ``fn`` completes at which point
the actor is terminated. the actor is terminated.
"""
''' mod_path = fn.__module__
mod_path: str = fn.__module__
if name is None: if name is None:
# use the explicit function name if not provided # use the explicit function name if not provided
@ -244,37 +231,21 @@ class ActorNursery:
) )
return portal return portal
async def cancel( async def cancel(self, hard_kill: bool = False) -> None:
self, """Cancel this nursery by instructing each subactor to cancel
hard_kill: bool = False,
) -> None:
'''
Cancel this nursery by instructing each subactor to cancel
itself and wait for all subactors to terminate. itself and wait for all subactors to terminate.
If ``hard_kill`` is set to ``True`` then kill the processes If ``hard_kill`` is set to ``True`` then kill the processes
directly without any far end graceful ``trio`` cancellation. directly without any far end graceful ``trio`` cancellation.
"""
'''
self.cancelled = True self.cancelled = True
# TODO: impl a repr for spawn more compact log.cancel(f"Cancelling nursery in {self._actor.uid}")
# then `._children`..
children: dict = self._children
child_count: int = len(children)
msg: str = f'Cancelling actor nursery with {child_count} children\n'
with trio.move_on_after(3) as cs: with trio.move_on_after(3) as cs:
async with trio.open_nursery() as tn:
subactor: Actor async with trio.open_nursery() as nursery:
proc: trio.Process
portal: Portal for subactor, proc, portal in self._children.values():
for (
subactor,
proc,
portal,
) in children.values():
# TODO: are we ever even going to use this or # TODO: are we ever even going to use this or
# is the spawning backend responsible for such # is the spawning backend responsible for such
@ -286,13 +257,12 @@ class ActorNursery:
if portal is None: # actor hasn't fully spawned yet if portal is None: # actor hasn't fully spawned yet
event = self._actor._peer_connected[subactor.uid] event = self._actor._peer_connected[subactor.uid]
log.warning( log.warning(
f"{subactor.uid} never 't finished spawning?" f"{subactor.uid} wasn't finished spawning?")
)
await event.wait() await event.wait()
# channel/portal should now be up # channel/portal should now be up
_, _, portal = children[subactor.uid] _, _, portal = self._children[subactor.uid]
# XXX should be impossible to get here # XXX should be impossible to get here
# unless method was called from within # unless method was called from within
@ -309,24 +279,14 @@ class ActorNursery:
# spawn cancel tasks for each sub-actor # spawn cancel tasks for each sub-actor
assert portal assert portal
if portal.channel.connected(): if portal.channel.connected():
tn.start_soon(portal.cancel_actor) nursery.start_soon(portal.cancel_actor)
log.cancel(msg)
# if we cancelled the cancel (we hung cancelling remote actors) # if we cancelled the cancel (we hung cancelling remote actors)
# then hard kill all sub-processes # then hard kill all sub-processes
if cs.cancelled_caught: if cs.cancelled_caught:
log.error( log.error(
f'Failed to cancel {self}?\n' f"Failed to cancel {self}\nHard killing process tree!")
'Hard killing underlying subprocess tree!\n' for subactor, proc, portal in self._children.values():
)
subactor: Actor
proc: trio.Process
portal: Portal
for (
subactor,
proc,
portal,
) in children.values():
log.warning(f"Hard killing process {proc}") log.warning(f"Hard killing process {proc}")
proc.terminate() proc.terminate()
@ -366,7 +326,7 @@ async def _open_and_supervise_one_cancels_all_nursery(
# the above "daemon actor" nursery will be notified. # the above "daemon actor" nursery will be notified.
async with trio.open_nursery() as ria_nursery: async with trio.open_nursery() as ria_nursery:
an = ActorNursery( anursery = ActorNursery(
actor, actor,
ria_nursery, ria_nursery,
da_nursery, da_nursery,
@ -375,16 +335,16 @@ async def _open_and_supervise_one_cancels_all_nursery(
try: try:
# spawning of actors happens in the caller's scope # spawning of actors happens in the caller's scope
# after we yield upwards # after we yield upwards
yield an yield anursery
# When we didn't error in the caller's scope, # When we didn't error in the caller's scope,
# signal all process-monitor-tasks to conduct # signal all process-monitor-tasks to conduct
# the "hard join phase". # the "hard join phase".
log.runtime( log.runtime(
'Waiting on subactors to complete:\n' f"Waiting on subactors {anursery._children} "
f'{pformat(an._children)}\n' "to complete"
) )
an._join_procs.set() anursery._join_procs.set()
except BaseException as inner_err: except BaseException as inner_err:
errors[actor.uid] = inner_err errors[actor.uid] = inner_err
@ -396,60 +356,37 @@ async def _open_and_supervise_one_cancels_all_nursery(
# Instead try to wait for pdb to be released before # Instead try to wait for pdb to be released before
# tearing down. # tearing down.
await maybe_wait_for_debugger( await maybe_wait_for_debugger(
child_in_debug=an._at_least_one_child_in_debug child_in_debug=anursery._at_least_one_child_in_debug
) )
# if the caller's scope errored then we activate our # if the caller's scope errored then we activate our
# one-cancels-all supervisor strategy (don't # one-cancels-all supervisor strategy (don't
# worry more are coming). # worry more are coming).
an._join_procs.set() anursery._join_procs.set()
# XXX NOTE XXX: hypothetically an error could # XXX: hypothetically an error could be
# be raised and then a cancel signal shows up # raised and then a cancel signal shows up
# slightly after in which case the `else:` # slightly after in which case the `else:`
# block here might not complete? For now, # block here might not complete? For now,
# shield both. # shield both.
with trio.CancelScope(shield=True): with trio.CancelScope(shield=True):
etype: type = type(inner_err) etype = type(inner_err)
if etype in ( if etype in (
trio.Cancelled, trio.Cancelled,
KeyboardInterrupt, KeyboardInterrupt
) or ( ) or (
is_multi_cancelled(inner_err) is_multi_cancelled(inner_err)
): ):
log.cancel( log.cancel(
f'Actor-nursery cancelled by {etype}\n\n' f"Nursery for {current_actor().uid} "
f"was cancelled with {etype}")
f'{current_actor().uid}\n'
f' |_{an}\n\n'
# TODO: show tb str?
# f'{tb_str}'
)
elif etype in {
ContextCancelled,
}:
log.cancel(
'Actor-nursery caught remote cancellation\n\n'
f'{inner_err.tb_str}'
)
else: else:
log.exception( log.exception(
'Nursery errored with:\n' f"Nursery for {current_actor().uid} "
f"errored with")
# TODO: same thing as in
# `._invoke()` to compute how to
# place this div-line in the
# middle of the above msg
# content..
# -[ ] prolly helper-func it too
# in our `.log` module..
# '------ - ------'
)
# cancel all subactors # cancel all subactors
await an.cancel() await anursery.cancel()
# ria_nursery scope end # ria_nursery scope end
@ -470,22 +407,18 @@ async def _open_and_supervise_one_cancels_all_nursery(
# XXX: yet another guard before allowing the cancel # XXX: yet another guard before allowing the cancel
# sequence in case a (single) child is in debug. # sequence in case a (single) child is in debug.
await maybe_wait_for_debugger( await maybe_wait_for_debugger(
child_in_debug=an._at_least_one_child_in_debug child_in_debug=anursery._at_least_one_child_in_debug
) )
# If actor-local error was raised while waiting on # If actor-local error was raised while waiting on
# ".run_in_actor()" actors then we also want to cancel all # ".run_in_actor()" actors then we also want to cancel all
# remaining sub-actors (due to our lone strategy: # remaining sub-actors (due to our lone strategy:
# one-cancels-all). # one-cancels-all).
if an._children: log.cancel(f"Nursery cancelling due to {err}")
log.cancel( if anursery._children:
'Actor-nursery cancelling due error type:\n'
f'{err}\n'
)
with trio.CancelScope(shield=True): with trio.CancelScope(shield=True):
await an.cancel() await anursery.cancel()
raise raise
finally: finally:
# No errors were raised while awaiting ".run_in_actor()" # No errors were raised while awaiting ".run_in_actor()"
# actors but those actors may have returned remote errors as # actors but those actors may have returned remote errors as
@ -494,9 +427,9 @@ async def _open_and_supervise_one_cancels_all_nursery(
# collected in ``errors`` so cancel all actors, summarize # collected in ``errors`` so cancel all actors, summarize
# all errors and re-raise. # all errors and re-raise.
if errors: if errors:
if an._children: if anursery._children:
with trio.CancelScope(shield=True): with trio.CancelScope(shield=True):
await an.cancel() await anursery.cancel()
# use `BaseExceptionGroup` as needed # use `BaseExceptionGroup` as needed
if len(errors) > 1: if len(errors) > 1:
@ -531,20 +464,19 @@ async def open_nursery(
which cancellation scopes correspond to each spawned subactor set. which cancellation scopes correspond to each spawned subactor set.
''' '''
implicit_runtime: bool = False implicit_runtime = False
actor: Actor = current_actor(err_on_no_runtime=False)
an: ActorNursery|None = None actor = current_actor(err_on_no_runtime=False)
try: try:
if ( if actor is None and is_main_process():
actor is None
and is_main_process()
):
# if we are the parent process start the # if we are the parent process start the
# actor runtime implicitly # actor runtime implicitly
log.info("Starting actor runtime!") log.info("Starting actor runtime!")
# mark us for teardown on exit # mark us for teardown on exit
implicit_runtime: bool = True implicit_runtime = True
async with open_root_actor(**kwargs) as actor: async with open_root_actor(**kwargs) as actor:
assert actor is current_actor() assert actor is current_actor()
@ -552,42 +484,24 @@ async def open_nursery(
try: try:
async with _open_and_supervise_one_cancels_all_nursery( async with _open_and_supervise_one_cancels_all_nursery(
actor actor
) as an: ) as anursery:
yield anursery
# NOTE: mark this nursery as having
# implicitly started the root actor so
# that `._runtime` machinery can avoid
# certain teardown synchronization
# blocking/waits and any associated (warn)
# logging when it's known that this
# nursery shouldn't be exited before the
# root actor is.
an._implicit_runtime_started = True
yield an
finally: finally:
# XXX: this event will be set after the root actor anursery.exited.set()
# runtime is already torn down, so we want to
# avoid any blocking on it.
an.exited.set()
else: # sub-nursery case else: # sub-nursery case
try: try:
async with _open_and_supervise_one_cancels_all_nursery( async with _open_and_supervise_one_cancels_all_nursery(
actor actor
) as an: ) as anursery:
yield an yield anursery
finally: finally:
an.exited.set() anursery.exited.set()
finally: finally:
msg: str = ( log.debug("Nursery teardown complete")
'Actor-nursery exited\n'
f'|_{an}\n\n'
)
# shutdown runtime if it was started # shutdown runtime if it was started
if implicit_runtime: if implicit_runtime:
msg += '=> Shutting down actor runtime <=\n' log.info("Shutting down actor tree")
log.info(msg)
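`ActorNursery.cancel()` above requests graceful cancellation inside `trio.move_on_after(3)` and falls back to `proc.terminate()` when `cs.cancelled_caught` is set. The same two-phase shape can be sketched with the standard library only. Note that this swaps in `asyncio.wait_for()` for trio's cancel scope purely so the example has no dependencies; `FakeChild` and `cancel_all()` are hypothetical names, and unlike the diff the sketch only terminates children that are still running.

```python
import asyncio


class FakeChild:
    """Hypothetical stand-in for one (subactor, proc, portal) entry."""

    def __init__(self, name: str, cancel_delay: float) -> None:
        self.name = name
        self.cancel_delay = cancel_delay
        self.state = 'running'

    async def cancel_actor(self) -> None:
        # graceful request: needs cooperation from the child, may hang
        await asyncio.sleep(self.cancel_delay)
        self.state = 'cancelled'

    def terminate(self) -> None:
        # hard kill: needs no cooperation from the child
        self.state = 'terminated'


async def cancel_all(children, timeout: float = 3.0) -> bool:
    """Return True if the graceful phase timed out."""
    try:
        # phase 1: ask every child to cancel itself, bounded by a deadline
        await asyncio.wait_for(
            asyncio.gather(*(c.cancel_actor() for c in children)),
            timeout,
        )
        return False
    except asyncio.TimeoutError:
        # phase 2: the cancel itself was cancelled, so kill what is left
        for child in children:
            if child.state == 'running':
                child.terminate()
        return True
```

A caller would run it as `asyncio.run(cancel_all(children, timeout=3.0))`.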


@ -1,74 +0,0 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Various helpers/utils for auditing your `tractor` app and/or the
core runtime.
'''
from contextlib import asynccontextmanager as acm
import pathlib
import tractor
from .pytest import (
tractor_test as tractor_test
)
def repodir() -> pathlib.Path:
'''
Return the abspath to the repo directory.
'''
# 3 parents up to step out of tractor/_testing/ to <repo_dir>
return pathlib.Path(
__file__
# 3 .parents bc:
# <._testing-pkg>.<tractor-pkg>.<git-repo-dir>
# /$HOME/../<tractor-repo-dir>/tractor/_testing/__init__.py
).parent.parent.parent.absolute()
def examples_dir() -> pathlib.Path:
'''
Return the abspath to the examples directory as `pathlib.Path`.
'''
return repodir() / 'examples'
@acm
async def expect_ctxc(
yay: bool,
reraise: bool = False,
) -> None:
'''
Small acm to catch `ContextCancelled` errors when expected
below it in a `async with ()` block.
'''
if yay:
try:
yield
raise RuntimeError('Never raised ctxc?')
except tractor.ContextCancelled:
if reraise:
raise
else:
return
else:
yield
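
Review note: `expect_ctxc()` works because an async context manager that catches the thrown-in error and returns will suppress it for the caller. Below is a minimal sketch of that pattern which runs without `tractor` installed. `FakeCancelled` is a hypothetical stand-in for `tractor.ContextCancelled`, `expect_exc()` is an illustrative name, and `asyncio` replaces `trio` only to keep the example stdlib-only.

```python
import asyncio
from contextlib import asynccontextmanager as acm


class FakeCancelled(Exception):
    '''Hypothetical stand-in for `tractor.ContextCancelled`.'''


@acm
async def expect_exc(
    yay: bool,
    reraise: bool = False,
):
    if yay:
        try:
            yield
            # only reached when the block body did NOT raise
            raise RuntimeError('Never raised the expected error?')
        except FakeCancelled:
            if reraise:
                raise
            # returning after catching suppresses the error
            return
    else:
        yield


async def demo() -> str:
    async with expect_exc(yay=True):
        raise FakeCancelled
    return 'suppressed'


result = asyncio.run(demo())
```

With `reraise=True` the same error propagates to the caller, which matches how the helper is meant to be used when a test wants to inspect the exception itself.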


@ -1,113 +0,0 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
`pytest` utils helpers and plugins for testing `tractor`'s runtime
and applications.
'''
from functools import (
partial,
wraps,
)
import inspect
import platform
import tractor
import trio
def tractor_test(fn):
'''
Decorator for async test funcs to present them as "native"
looking sync funcs runnable by `pytest` using `trio.run()`.
Use:
@tractor_test
async def test_whatever():
await ...
If fixtures:
- ``reg_addr`` (a socket addr tuple where arbiter is listening)
- ``loglevel`` (logging level passed to tractor internals)
- ``start_method`` (subprocess spawning backend)
are defined in the `pytest` fixture space they will be automatically
injected to tests declaring these funcargs.
'''
@wraps(fn)
def wrapper(
*args,
loglevel=None,
reg_addr=None,
start_method: str|None = None,
debug_mode: bool = False,
**kwargs
):
# __tracebackhide__ = True
# NOTE: inject any fixture names declared by the test func
# by manually checking its signature!
if 'reg_addr' in inspect.signature(fn).parameters:
# injects test suite fixture value to test as well
# as `run()`
kwargs['reg_addr'] = reg_addr
if 'loglevel' in inspect.signature(fn).parameters:
# allows test suites to define a 'loglevel' fixture
# that activates the internal logging
kwargs['loglevel'] = loglevel
if start_method is None:
if platform.system() == "Windows":
start_method = 'trio'
if 'start_method' in inspect.signature(fn).parameters:
# set of subprocess spawning backends
kwargs['start_method'] = start_method
if 'debug_mode' in inspect.signature(fn).parameters:
# toggles the runtime's debug mode
kwargs['debug_mode'] = debug_mode
if kwargs:
# use explicit root actor start
async def _main():
async with tractor.open_root_actor(
# **kwargs,
registry_addrs=[reg_addr] if reg_addr else None,
loglevel=loglevel,
start_method=start_method,
# TODO: only enable when pytest is passed --pdb
debug_mode=debug_mode,
):
await fn(*args, **kwargs)
main = _main
else:
# use implicit root actor start
main = partial(fn, *args, **kwargs)
return trio.run(main)
return wrapper
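
Review note: the core trick in `tractor_test()` is checking `inspect.signature(fn).parameters` so that only fixtures the test actually declares get passed through. A rough, stdlib-only sketch of that idea follows. `inject_declared()` and the two sample test funcs are hypothetical names, and `asyncio.run()` stands in for `trio.run()` purely so the example has no third-party dependencies.

```python
import asyncio
from functools import wraps
import inspect


def inject_declared(fn):
    '''
    Present an async test func as a sync one, passing along only
    the fixture values that the func declares as parameters.

    '''
    declared = inspect.signature(fn).parameters

    @wraps(fn)
    def wrapper(
        *args,
        loglevel=None,
        reg_addr=None,
        **kwargs,
    ):
        if 'reg_addr' in declared:
            kwargs['reg_addr'] = reg_addr
        if 'loglevel' in declared:
            kwargs['loglevel'] = loglevel
        # assumption: `asyncio.run()` replaces `trio.run()` here
        return asyncio.run(fn(*args, **kwargs))

    return wrapper


@inject_declared
async def check_level(loglevel):
    return loglevel


@inject_declared
async def check_nothing():
    return 'no fixtures'
```

Fixture values that the wrapped func does not declare are simply dropped, so `check_nothing(loglevel='info')` still runs.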


@ -0,0 +1,47 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Runtime "developer experience" utils and addons to aid our
(advanced) users and core devs in building distributed applications
and working with/on the actor runtime.
"""
from ._debug import (
maybe_wait_for_debugger,
acquire_debug_lock,
breakpoint,
pause,
pause_from_sync,
shield_sigint_handler,
MultiActorPdb,
open_crash_handler,
maybe_open_crash_handler,
post_mortem,
)
__all__ = [
'maybe_wait_for_debugger',
'acquire_debug_lock',
'breakpoint',
'pause',
'pause_from_sync',
'shield_sigint_handler',
'MultiActorPdb',
'open_crash_handler',
'maybe_open_crash_handler',
'post_mortem',
]


@ -27,7 +27,11 @@ from functools import (
partial, partial,
cached_property, cached_property,
) )
from contextlib import asynccontextmanager as acm from contextlib import (
asynccontextmanager as acm,
contextmanager as cm,
nullcontext,
)
from typing import ( from typing import (
Any, Any,
Callable, Callable,
@ -44,22 +48,24 @@ from trio_typing import (
# Task, # Task,
) )
from .log import get_logger from ..log import get_logger
from ._discovery import get_root from .._state import (
from ._state import (
is_root_process, is_root_process,
debug_mode, debug_mode,
) )
from ._exceptions import ( from .._exceptions import (
is_multi_cancelled, is_multi_cancelled,
ContextCancelled, ContextCancelled,
) )
from ._ipc import Channel from .._ipc import Channel
log = get_logger(__name__) log = get_logger(__name__)
__all__ = ['breakpoint', 'post_mortem'] __all__ = [
'breakpoint',
'post_mortem',
]
class Lock: class Lock:
@ -324,7 +330,7 @@ async def lock_tty_for_child(
f'Actor {subactor_uid} is blocked from acquiring debug lock\n' f'Actor {subactor_uid} is blocked from acquiring debug lock\n'
f"remote task: {task_name}:{subactor_uid}" f"remote task: {task_name}:{subactor_uid}"
) )
ctx._enter_debugger_on_cancel = False ctx._enter_debugger_on_cancel: bool = False
await ctx.cancel(f'Debug lock blocked for {subactor_uid}') await ctx.cancel(f'Debug lock blocked for {subactor_uid}')
return 'pdb_lock_blocked' return 'pdb_lock_blocked'
@ -375,12 +381,14 @@ async def wait_for_parent_stdin_hijack(
This function is used by any sub-actor to acquire mutex access to This function is used by any sub-actor to acquire mutex access to
the ``pdb`` REPL and thus the root's TTY for interactive debugging the ``pdb`` REPL and thus the root's TTY for interactive debugging
(see below inside ``_pause()``). It can be used to ensure that (see below inside ``pause()``). It can be used to ensure that
an intermediate nursery-owning actor does not clobber its children an intermediate nursery-owning actor does not clobber its children
if they are in debug (see below inside if they are in debug (see below inside
``maybe_wait_for_debugger()``). ``maybe_wait_for_debugger()``).
''' '''
from .._discovery import get_root
with trio.CancelScope(shield=True) as cs: with trio.CancelScope(shield=True) as cs:
Lock._debugger_request_cs = cs Lock._debugger_request_cs = cs
@ -390,7 +398,7 @@ async def wait_for_parent_stdin_hijack(
# this syncs to child's ``Context.started()`` call. # this syncs to child's ``Context.started()`` call.
async with portal.open_context( async with portal.open_context(
tractor._debug.lock_tty_for_child, lock_tty_for_child,
subactor_uid=actor_uid, subactor_uid=actor_uid,
) as (ctx, val): ) as (ctx, val):
@ -441,171 +449,6 @@ def mk_mpdb() -> tuple[MultiActorPdb, Callable]:
return pdb, Lock.unshield_sigint return pdb, Lock.unshield_sigint
async def _pause(
debug_func: Callable | None = None,
release_lock_signal: trio.Event | None = None,
# TODO:
# shield: bool = False
task_status: TaskStatus[trio.Event] = trio.TASK_STATUS_IGNORED
) -> None:
'''
A pause point (more commonly known as a "breakpoint") interrupt
instruction for engaging a blocking debugger instance to
conduct manual console-based-REPL-interaction from within
`tractor`'s async runtime, normally from some single-threaded
and currently executing actor-hosted-`trio`-task in some
(remote) process.
NOTE: we use the semantics "pause" since it better encompasses
the entirety of the necessary global-runtime-state-mutation any
actor-task must access and lock in order to get full isolated
control over the process tree's root TTY:
https://en.wikipedia.org/wiki/Breakpoint
'''
__tracebackhide__ = True
actor = tractor.current_actor()
pdb, undo_sigint = mk_mpdb()
task_name = trio.lowlevel.current_task().name
# TODO: is it possible to debug a trio.Cancelled except block?
# right now it seems like we can kinda do it by shielding
# around ``tractor.breakpoint()`` but not if we move the shielded
# scope here???
# with trio.CancelScope(shield=shield):
# await trio.lowlevel.checkpoint()
if (
not Lock.local_pdb_complete
or Lock.local_pdb_complete.is_set()
):
Lock.local_pdb_complete = trio.Event()
# TODO: need a more robust check for the "root" actor
if (
not is_root_process()
and actor._parent_chan # a connected child
):
if Lock.local_task_in_debug:
# Recurrence entry case: this task already has the lock and
# is likely recurrently entering a breakpoint
if Lock.local_task_in_debug == task_name:
# noop on recurrent entry case but we want to trigger
# a checkpoint to allow other actors error-propagate and
# potentially avoid infinite re-entries in some subactor.
await trio.lowlevel.checkpoint()
return
# if **this** actor is already in debug mode block here
# waiting for the control to be released - this allows
# support for recursive entries to `tractor.breakpoint()`
log.warning(f"{actor.uid} already has a debug lock, waiting...")
await Lock.local_pdb_complete.wait()
await trio.sleep(0.1)
# mark local actor as "in debug mode" to avoid recurrent
# entries/requests to the root process
Lock.local_task_in_debug = task_name
# this **must** be awaited by the caller and is done using the
# root nursery so that the debugger can continue to run without
# being restricted by the scope of a new task nursery.
# TODO: if we want to debug a trio.Cancelled triggered exception
# we have to figure out how to avoid having the service nursery
# cancel on this task start? I *think* this works below:
# ```python
# actor._service_n.cancel_scope.shield = shield
# ```
# but not entirely sure if that's a sane way to implement it?
try:
with trio.CancelScope(shield=True):
await actor._service_n.start(
wait_for_parent_stdin_hijack,
actor.uid,
)
Lock.repl = pdb
except RuntimeError:
Lock.release()
if actor._cancel_called:
# service nursery won't be usable and we
# don't want to lock up the root either way since
# we're in (the midst of) cancellation.
return
raise
elif is_root_process():
# we also wait in the root-parent for any child that
# may have the tty locked prior
# TODO: wait, what about multiple root tasks acquiring it though?
if Lock.global_actor_in_debug == actor.uid:
# re-entrant root process already has it: noop.
return
# XXX: since we need to enter pdb synchronously below,
# we have to release the lock manually from pdb completion
# callbacks. Can't think of a nicer way than this atm.
if Lock._debug_lock.locked():
log.warning(
'Root actor attempting to shield-acquire active tty lock'
f' owned by {Lock.global_actor_in_debug}')
# must shield here to avoid hitting a ``Cancelled`` and
# a child getting stuck bc we clobbered the tty
with trio.CancelScope(shield=True):
await Lock._debug_lock.acquire()
else:
# may be cancelled
await Lock._debug_lock.acquire()
Lock.global_actor_in_debug = actor.uid
Lock.local_task_in_debug = task_name
Lock.repl = pdb
try:
# breakpoint()
if debug_func is None:
# assert release_lock_signal, (
# 'Must pass `release_lock_signal: trio.Event` if no '
# 'trace func provided!'
# )
print(f"{actor.uid} ENTERING WAIT")
task_status.started()
# with trio.CancelScope(shield=True):
# await release_lock_signal.wait()
else:
# block here one (at the appropriate frame *up*) where
# ``breakpoint()`` was awaited and begin handling stdio.
log.debug("Entering the synchronous world of pdb")
debug_func(actor, pdb)
except bdb.BdbQuit:
Lock.release()
raise
# XXX: apparently we can't do this without showing this frame
# in the backtrace on first entry to the REPL? Seems like an odd
# behaviour that should have been fixed by now. This is also why
# we scrapped all the @cm approaches that were tried previously.
# finally:
# __tracebackhide__ = True
# # frame = sys._getframe()
# # last_f = frame.f_back
# # last_f.f_globals['__tracebackhide__'] = True
# # signal.signal = pdbp.hideframe(signal.signal)
def shield_sigint_handler( def shield_sigint_handler(
signum: int, signum: int,
frame: 'frame', # type: ignore # noqa frame: 'frame', # type: ignore # noqa
@ -767,8 +610,9 @@ def shield_sigint_handler(
def _set_trace( def _set_trace(
actor: tractor.Actor | None = None, actor: tractor.Actor | None = None,
pdb: MultiActorPdb | None = None, pdb: MultiActorPdb | None = None,
shield: bool = False,
): ):
__tracebackhide__ = True __tracebackhide__: bool = True
actor: tractor.Actor = actor or tractor.current_actor() actor: tractor.Actor = actor or tractor.current_actor()
# start 2 levels up in user code # start 2 levels up in user code
@ -778,13 +622,20 @@ def _set_trace(
if ( if (
frame frame
and pdb and (
pdb
and actor is not None and actor is not None
) or shield
): ):
# pdbp.set_trace()
log.pdb(f"\nAttaching pdb to actor: {actor.uid}\n") log.pdb(f"\nAttaching pdb to actor: {actor.uid}\n")
# no f!#$&* idea, but when we're in async land # no f!#$&* idea, but when we're in async land
# we need 2x frames up? # we need 2x frames up?
frame = frame.f_back frame = frame.f_back
# frame = frame.f_back
# if shield:
# frame = frame.f_back
else: else:
pdb, undo_sigint = mk_mpdb() pdb, undo_sigint = mk_mpdb()
@ -797,15 +648,203 @@ def _set_trace(
# undo_ # undo_
# TODO: allow pausing from sync code, normally by remapping async def pause(
# python's builtin breakpoint() hook to this runtime aware version.
debug_func: Callable = _set_trace,
release_lock_signal: trio.Event | None = None,
# TODO: allow caller to pause despite task cancellation,
# exactly the same as wrapping with:
# with CancelScope(shield=True):
# await pause()
# => the REMAINING ISSUE is that the scope's .__exit__() frame
# is always shown in the debugger on entry.. and there seems to
# be no way to override it?..
# shield: bool = False,
# TODO:
# shield: bool = False
task_status: TaskStatus[trio.Event] = trio.TASK_STATUS_IGNORED
) -> None:
'''
A pause point (more commonly known as a "breakpoint") interrupt
instruction for engaging a blocking debugger instance to
conduct manual console-based-REPL-interaction from within
`tractor`'s async runtime, normally from some single-threaded
and currently executing actor-hosted-`trio`-task in some
(remote) process.
NOTE: we use the semantics "pause" since it better encompasses
the entirety of the necessary global-runtime-state-mutation any
actor-task must access and lock in order to get full isolated
control over the process tree's root TTY:
https://en.wikipedia.org/wiki/Breakpoint
'''
# __tracebackhide__ = True
actor = tractor.current_actor()
pdb, undo_sigint = mk_mpdb()
task_name = trio.lowlevel.current_task().name
if (
not Lock.local_pdb_complete
or Lock.local_pdb_complete.is_set()
):
Lock.local_pdb_complete = trio.Event()
# if shield:
debug_func = partial(
debug_func,
# shield=shield,
)
# def _exit(self, *args, **kwargs):
# __tracebackhide__: bool = True
# super().__exit__(*args, **kwargs)
# trio.CancelScope.__exit__.__tracebackhide__ = True
# import types
# with trio.CancelScope(shield=shield) as cs:
# cs.__exit__ = types.MethodType(_exit, cs)
# cs.__exit__.__tracebackhide__ = True
# TODO: need a more robust check for the "root" actor
if (
not is_root_process()
and actor._parent_chan # a connected child
):
if Lock.local_task_in_debug:
# Recurrence entry case: this task already has the lock and
# is likely recurrently entering a breakpoint
if Lock.local_task_in_debug == task_name:
# noop on recurrent entry case but we want to trigger
# a checkpoint to allow other actors error-propagate and
# potentially avoid infinite re-entries in some subactor.
await trio.lowlevel.checkpoint()
return
# if **this** actor is already in debug mode block here
# waiting for the control to be released - this allows
# support for recursive entries to `tractor.breakpoint()`
log.warning(f"{actor.uid} already has a debug lock, waiting...")
await Lock.local_pdb_complete.wait()
await trio.sleep(0.1)
# mark local actor as "in debug mode" to avoid recurrent
# entries/requests to the root process
Lock.local_task_in_debug = task_name
# this **must** be awaited by the caller and is done using the
# root nursery so that the debugger can continue to run without
# being restricted by the scope of a new task nursery.
# TODO: if we want to debug a trio.Cancelled triggered exception
# we have to figure out how to avoid having the service nursery
# cancel on this task start? I *think* this works below:
# ```python
# actor._service_n.cancel_scope.shield = shield
# ```
# but not entirely sure if that's a sane way to implement it?
try:
with trio.CancelScope(shield=True):
await actor._service_n.start(
wait_for_parent_stdin_hijack,
actor.uid,
)
Lock.repl = pdb
except RuntimeError:
Lock.release()
if actor._cancel_called:
# service nursery won't be usable and we
# don't want to lock up the root either way since
# we're in (the midst of) cancellation.
return
raise
elif is_root_process():
# we also wait in the root-parent for any child that
# may have the tty locked prior
# TODO: wait, what about multiple root tasks acquiring it though?
if Lock.global_actor_in_debug == actor.uid:
# re-entrant root process already has it: noop.
return
# XXX: since we need to enter pdb synchronously below,
# we have to release the lock manually from pdb completion
# callbacks. Can't think of a nicer way than this atm.
if Lock._debug_lock.locked():
log.warning(
'Root actor attempting to shield-acquire active tty lock'
f' owned by {Lock.global_actor_in_debug}')
# must shield here to avoid hitting a ``Cancelled`` and
# a child getting stuck bc we clobbered the tty
with trio.CancelScope(shield=True):
await Lock._debug_lock.acquire()
else:
# may be cancelled
await Lock._debug_lock.acquire()
Lock.global_actor_in_debug = actor.uid
Lock.local_task_in_debug = task_name
Lock.repl = pdb
try:
if debug_func is None:
# assert release_lock_signal, (
# 'Must pass `release_lock_signal: trio.Event` if no '
# 'trace func provided!'
# )
print(f"{actor.uid} ENTERING WAIT")
task_status.started()
# with trio.CancelScope(shield=True):
# await release_lock_signal.wait()
else:
# block here one (at the appropriate frame *up*) where
# ``breakpoint()`` was awaited and begin handling stdio.
log.debug("Entering the synchronous world of pdb")
debug_func(actor, pdb)
except bdb.BdbQuit:
Lock.release()
raise
# XXX: apparently we can't do this without showing this frame
# in the backtrace on first entry to the REPL? Seems like an odd
# behaviour that should have been fixed by now. This is also why
# we scrapped all the @cm approaches that were tried previously.
# finally:
# __tracebackhide__ = True
# # frame = sys._getframe()
# # last_f = frame.f_back
# # last_f.f_globals['__tracebackhide__'] = True
# # signal.signal = pdbp.hideframe(signal.signal)
# TODO: allow pausing from sync code.
# normally by remapping python's builtin breakpoint() hook to this
# runtime aware version which takes care of all .
def pause_from_sync() -> None: def pause_from_sync() -> None:
print("ENTER SYNC PAUSE") print("ENTER SYNC PAUSE")
actor: tractor.Actor = tractor.current_actor(
err_on_no_runtime=False,
)
if actor:
try: try:
import greenback import greenback
__tracebackhide__ = True # __tracebackhide__ = True
actor: tractor.Actor = tractor.current_actor()
# task_can_release_tty_lock = trio.Event() # task_can_release_tty_lock = trio.Event()
# spawn bg task which will lock out the TTY, we poll # spawn bg task which will lock out the TTY, we poll
@ -818,8 +857,11 @@ def pause_from_sync() -> None:
# release_lock_signal=task_can_release_tty_lock, # release_lock_signal=task_can_release_tty_lock,
)) ))
) )
except ModuleNotFoundError: except ModuleNotFoundError:
log.warning('NO GREENBACK FOUND') log.warning('NO GREENBACK FOUND')
else:
log.warning('Not inside actor-runtime')
db, undo_sigint = mk_mpdb() db, undo_sigint = mk_mpdb()
Lock.local_task_in_debug = 'sync' Lock.local_task_in_debug = 'sync'
@ -854,11 +896,7 @@ def pause_from_sync() -> None:
# using the "pause" semantics instead since # using the "pause" semantics instead since
# that better covers actually somewhat "pausing the runtime" # that better covers actually somewhat "pausing the runtime"
# for this particular parallel task to do debugging B) # for this particular parallel task to do debugging B)
pause = partial( # pp = pause # short-hand for "pause point"
_pause,
_set_trace,
)
pp = pause # short-hand for "pause point"
async def breakpoint(**kwargs): async def breakpoint(**kwargs):
@ -891,7 +929,7 @@ def _post_mortem(
post_mortem = partial( post_mortem = partial(
_pause, pause,
_post_mortem, _post_mortem,
) )
@ -1011,3 +1049,56 @@ async def maybe_wait_for_debugger(
log.debug( log.debug(
'Root acquired TTY LOCK' 'Root acquired TTY LOCK'
) )
# TODO: better naming and what additionals?
# - [ ] optional runtime plugging?
# - [ ] detection for sync vs. async code?
# - [ ] specialized REPL entry when in distributed mode?
# - [x] allow ignoring kbi Bo
@cm
def open_crash_handler(
catch: set[BaseException] = {
Exception,
BaseException,
},
ignore: set[BaseException] = {
KeyboardInterrupt,
},
):
'''
Generic "post mortem" crash handler using `pdbp` REPL debugger.
We expose this as a CLI framework addon to both `click` and
`typer` users so they can quickly wrap cmd endpoints which get
automatically wrapped to use the runtime's `debug_mode: bool`
AND `pdbp.pm()` around any code that is PRE-runtime entry
- any sync code which runs BEFORE the main call to
`trio.run()`.
'''
try:
yield
except tuple(catch) as err:
if type(err) not in ignore:
pdbp.xpm()
raise
@cm
def maybe_open_crash_handler(pdb: bool = False):
'''
Same as `open_crash_handler()` but with bool input flag
to allow conditional handling.
Normally this is used with CLI endpoints such that if the --pdb
flag is passed the pdb REPL is engaged on any crashes B)
'''
rtctx = nullcontext
if pdb:
rtctx = open_crash_handler
with rtctx():
yield
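
Review note: the two context managers above follow a "catch, hand off to a hook, always re-raise" pattern, with `nullcontext` as the disabled case. The sketch below shows the control flow without a debugger. `open_crash_hook()`, `maybe_open_crash_hook()` and the `crashes` list are hypothetical, the hook replaces the call to `pdbp.xpm()`, and it uses `isinstance()` where the diff matches on the exact `type(err)`, so subclasses of an ignored type are skipped here too.

```python
from contextlib import (
    contextmanager as cm,
    nullcontext,
)

# records every error handed to the hook
crashes: list[BaseException] = []


@cm
def open_crash_hook(
    on_crash=crashes.append,  # hypothetical stand-in for `pdbp.xpm()`
    ignore: tuple[type[BaseException], ...] = (KeyboardInterrupt,),
):
    try:
        yield
    except BaseException as err:
        if not isinstance(err, ignore):
            on_crash(err)
        # the error always propagates to the caller
        raise


@cm
def maybe_open_crash_hook(enabled: bool = False):
    rtctx = open_crash_hook if enabled else nullcontext
    with rtctx():
        yield


def run(enabled: bool) -> None:
    try:
        with maybe_open_crash_hook(enabled):
            raise ValueError('boom')
    except ValueError:
        pass


run(enabled=False)  # hook never fires
run(enabled=True)   # hook receives the ValueError
```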

tractor/devx/cli.py 100644

@ -0,0 +1,136 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
CLI framework extensions for hacking on the actor runtime.
Currently popular frameworks supported are:
- `typer` via the `@callback` API
"""
from __future__ import annotations
from contextlib import (
# asynccontextmanager as acm,
contextmanager as cm,
)
from typing import (
Any,
Callable,
)
from typing_extensions import Annotated
import typer
from ._debug import open_crash_handler
_runtime_vars: dict[str, Any] = {}
def load_runtime_vars(
ctx: typer.Context,
callback: Callable,
pdb: bool = False, # --pdb
ll: Annotated[
str,
typer.Option(
'--loglevel',
'-l',
help='BigD logging level',
),
] = 'cancel', # -l info
):
'''
Maybe engage crash handling with `pdbp` when code inside
a `typer` CLI endpoint cmd raises.
To use this callback simply take your `app = typer.Typer()` instance
and decorate this function with it like so:
.. code:: python
from tractor.devx import cli
app = typer.Typer()
# manual decoration to hook into `click`'s context system!
cli.load_runtime_vars = app.callback(
invoke_without_command=True,
)(cli.load_runtime_vars)
And then you can use the now augmented `click` CLI context as so,
.. code:: python
@app.command(
context_settings={
"allow_extra_args": True,
"ignore_unknown_options": True,
}
)
def my_cli_cmd(
ctx: typer.Context,
):
rtvars: dict = ctx.runtime_vars
pdb: bool = rtvars['pdb']
with tractor.devx.maybe_open_crash_handler(pdb=pdb):
trio.run(
partial(
my_tractor_main_task_func,
debug_mode=pdb,
loglevel=rtvars['ll'],
)
)
which will enable log level and debug mode globally for the entire
`tractor` + `trio` runtime thereafter!
Bo
'''
global _runtime_vars
_runtime_vars |= {
'pdb': pdb,
'll': ll,
}
ctx.runtime_vars: dict[str, Any] = _runtime_vars
print(
f'`typer` sub-cmd: {ctx.invoked_subcommand}\n'
f'`tractor` runtime vars: {_runtime_vars}'
)
# XXX NOTE XXX: hackzone.. if no sub-cmd is specified (the
# default if the user just invokes `bigd`) then we simply
# invoke the sole `_bigd()` cmd passing in the "parent"
# typer.Context directly to that call since we're treating it
# as a "non sub-command" or wtv..
# TODO: ideally typer would have some kinda built-in way to get
# this behaviour without having to construct and manually
# invoke our own cmd..
if (
ctx.invoked_subcommand is None
or ctx.invoked_subcommand == callback.__name__
):
cmd: typer.core.TyperCommand = typer.core.TyperCommand(
name='bigd',
callback=callback,
)
ctx.params = {'ctx': ctx}
cmd.invoke(ctx)


@ -31,7 +31,7 @@ from typing import (
Callable, Callable,
) )
from functools import partial from functools import partial
from contextlib import aclosing from async_generator import aclosing
import trio import trio
import wrapt import wrapt


@ -289,19 +289,11 @@ def get_console_log(
if not level: if not level:
return log return log
log.setLevel( log.setLevel(level.upper() if not isinstance(level, int) else level)
level.upper()
if not isinstance(level, int)
else level
)
if not any( if not any(
handler.stream == sys.stderr # type: ignore handler.stream == sys.stderr # type: ignore
for handler in logger.handlers if getattr( for handler in logger.handlers if getattr(handler, 'stream', None)
handler,
'stream',
None,
)
): ):
handler = logging.StreamHandler() handler = logging.StreamHandler()
formatter = colorlog.ColoredFormatter( formatter = colorlog.ColoredFormatter(


@ -43,33 +43,21 @@ IPC-compat cross-mem-boundary object pointer.
# - https://github.com/msgpack/msgpack-python#packingunpacking-of-custom-data-type # - https://github.com/msgpack/msgpack-python#packingunpacking-of-custom-data-type
from __future__ import annotations from __future__ import annotations
from inspect import ( from inspect import isfunction
isfunction,
ismethod,
)
from pkgutil import resolve_name from pkgutil import resolve_name
class NamespacePath(str): class NamespacePath(str):
''' '''
A serializable `str`-subtype implementing a "namespace A serializable description of a (function) Python object
pointer" to any Python object reference (like a function) location described by the target's module path and namespace
using the same format as the built-in `pkgutil.resolve_name()` key meant as a message-native "packet" to allow actors to
system. point-and-load objects by an absolute ``str`` (and thus
serializable) reference.
A value describes a target's module-path and namespace-key
separated by a ':' and thus can be easily used as
an IPC-message-native reference-type allowing memory isolated
actors to point-and-load objects via a minimal `str` value.
''' '''
_ref: object | type | None = None _ref: object | type | None = None
# TODO: support providing the ns instance in
# order to support 'self.<meth>` style to make
# `Portal.run_from_ns()` work!
# _ns: ModuleType|type|None = None
def load_ref(self) -> object | type: def load_ref(self) -> object | type:
if self._ref is None: if self._ref is None:
self._ref = resolve_name(self) self._ref = resolve_name(self)
@ -88,22 +76,12 @@ class NamespacePath(str):
''' '''
if ( if (
isfunction(ref) isinstance(ref, object)
and not isfunction(ref)
): ):
name: str = getattr(ref, '__name__')
elif ismethod(ref):
# build out the path manually i guess..?
# TODO: better way?
name: str = '.'.join([
type(ref.__self__).__name__,
ref.__func__.__name__,
])
else: # object or other?
# isinstance(ref, object)
# and not isfunction(ref)
name: str = type(ref).__name__ name: str = type(ref).__name__
else:
name: str = getattr(ref, '__name__')
# fully qualified namespace path, tuple. # fully qualified namespace path, tuple.
fqnp: tuple[str, str] = ( fqnp: tuple[str, str] = (
@ -122,13 +100,5 @@ class NamespacePath(str):
fqnp: tuple[str, str] = cls._mk_fqnp(ref) fqnp: tuple[str, str] = cls._mk_fqnp(ref)
return cls(':'.join(fqnp)) return cls(':'.join(fqnp))
def to_tuple( def to_tuple(self) -> tuple[str, str]:
self, return self._mk_fqnp(self.load_ref())
# TODO: could this work re `self:<meth>` case from above?
# load_ref: bool = True,
) -> tuple[str, str]:
return self._mk_fqnp(
self.load_ref()
)
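
Review note: the `NamespacePath` docstring describes a `'<module-path>:<name>'` string that is loaded with `pkgutil.resolve_name()`. A trimmed-down sketch of that round trip is below. `NsPath` is a hypothetical class, not the one in this diff, and it only handles objects that carry both `__module__` and `__name__`.

```python
from __future__ import annotations
import json
from pkgutil import resolve_name


class NsPath(str):
    '''
    Hypothetical, trimmed-down take on a "namespace pointer" `str`.

    '''
    _ref: object | None = None

    def load_ref(self) -> object:
        # resolve lazily and cache the loaded object
        if self._ref is None:
            self._ref = resolve_name(self)
        return self._ref

    @classmethod
    def from_ref(cls, ref: object) -> NsPath:
        # '<module-path>:<name>' as accepted by `resolve_name()`
        return cls(f'{ref.__module__}:{ref.__name__}')


path = NsPath.from_ref(json.dumps)
loaded = path.load_ref()
```

Since the value is a plain `str` it can be sent as-is in a message and resolved on the receiving side, provided the module is importable there.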


@ -35,24 +35,6 @@ from msgspec import (
structs, structs,
) )
# TODO: auto-gen type sig for input func both for
# type-msgs and logging of RPC tasks?
# taken and modified from:
# https://stackoverflow.com/a/57110117
# import inspect
# from typing import List
# def my_function(input_1: str, input_2: int) -> list[int]:
# pass
# def types_of(func):
# specs = inspect.getfullargspec(func)
# return_type = specs.annotations['return']
# input_types = [t.__name__ for s, t in specs.annotations.items() if s != 'return']
# return f'{func.__name__}({": ".join(input_types)}) -> {return_type}'
# types_of(my_function)
class DiffDump(UserList): class DiffDump(UserList):
''' '''
@ -179,7 +161,6 @@ class Struct(
# https://docs.python.org/3.11/library/pprint.html#pprint.saferepr # https://docs.python.org/3.11/library/pprint.html#pprint.saferepr
val_str: str = saferepr(v) val_str: str = saferepr(v)
# TODO: LOLOL use `textwrap.indent()` instead dawwwwwg!
obj_str += (field_ws + f'{k}: {typ_name} = {val_str},\n') obj_str += (field_ws + f'{k}: {typ_name} = {val_str},\n')
return ( return (


@ -216,14 +216,7 @@ def _run_asyncio_task(
try: try:
result = await coro result = await coro
except BaseException as aio_err: except BaseException as aio_err:
if isinstance(aio_err, CancelledError): log.exception('asyncio task errored')
log.runtime(
'`asyncio` task was cancelled..\n'
)
else:
log.exception(
'`asyncio` task errored\n'
)
chan._aio_err = aio_err chan._aio_err = aio_err
raise raise
@ -278,22 +271,12 @@ def _run_asyncio_task(
except BaseException as terr: except BaseException as terr:
task_err = terr task_err = terr
msg: str = (
'Infected `asyncio` task {etype_str}\n'
f'|_{task}\n'
)
if isinstance(terr, CancelledError): if isinstance(terr, CancelledError):
log.cancel( log.cancel(f'`asyncio` task cancelled: {task.get_name()}')
msg.format(etype_str='cancelled')
)
else: else:
log.exception( log.exception(f'`asyncio` task: {task.get_name()} errored')
msg.format(etype_str='errored')
)
assert type(terr) is type(aio_err), ( assert type(terr) is type(aio_err), 'Asyncio task error mismatch?'
'`asyncio` task error mismatch?!?'
)
if aio_err is not None: if aio_err is not None:
# XXX: uhh is this true? # XXX: uhh is this true?
@ -306,23 +289,19 @@ def _run_asyncio_task(
# We might want to change this in the future though. # We might want to change this in the future though.
from_aio.close() from_aio.close()
if task_err is None: if type(aio_err) is CancelledError:
assert aio_err log.cancel("infected task was cancelled")
aio_err.with_traceback(aio_err.__traceback__)
# log.error(
# 'infected task errorred'
# )
# TODO: show that the cancellation originated # TODO: show that the cancellation originated
# from the ``trio`` side? right? # from the ``trio`` side? right?
# elif type(aio_err) is CancelledError:
# log.cancel(
# 'infected task was cancelled'
# )
# if cancel_scope.cancelled: # if cancel_scope.cancelled:
# raise aio_err from err # raise aio_err from err
elif task_err is None:
assert aio_err
aio_err.with_traceback(aio_err.__traceback__)
log.error('infected task errorred')
# XXX: always cancel the scope on error # XXX: always cancel the scope on error
# in case the trio task is blocking # in case the trio task is blocking
# on a checkpoint. # on a checkpoint.
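
Reviewer note: both sides of the hunks above branch on whether the `asyncio` task ended with a `CancelledError` or with some other exception, and log the two cases differently. The standalone sketch below illustrates that distinction using only the standard library. It is an illustration under assumptions, not tractor's implementation: `describe_outcome()` and `demo()` are hypothetical names, and stdlib `logging` stands in for tractor's `log.cancel()` / `log.exception()`.

```python
import asyncio
import logging

log = logging.getLogger(__name__)


def describe_outcome(task: asyncio.Task) -> str:
    # `Task.exception()` raises `CancelledError` on a cancelled task,
    # so the cancellation check has to come first.
    if task.cancelled():
        log.info('asyncio task cancelled: %s', task.get_name())
        return 'cancelled'
    exc = task.exception()
    if exc is not None:
        log.error('asyncio task %s errored', task.get_name(), exc_info=exc)
        return 'errored'
    return 'ok'


async def demo() -> list[str]:
    async def sleeper() -> None:
        await asyncio.sleep(10)

    async def failer() -> None:
        raise ValueError('boom')

    async def finisher() -> int:
        return 1

    tasks = [asyncio.create_task(fn()) for fn in (sleeper, failer, finisher)]
    await asyncio.sleep(0)  # give every task a chance to start running
    tasks[0].cancel()       # cancel the one that is still sleeping
    await asyncio.wait(tasks)  # waits for completion without re-raising
    return [describe_outcome(t) for t in tasks]


if __name__ == '__main__':
    print(asyncio.run(demo()))
```

Classifying a finished task this way keeps cancellation, which is often expected during teardown, out of the error log while still surfacing real failures with a traceback.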

@ -26,6 +26,7 @@ from contextlib import asynccontextmanager
from functools import partial from functools import partial
from operator import ne from operator import ne
from typing import ( from typing import (
Optional,
Callable, Callable,
Awaitable, Awaitable,
Any, Any,
@ -44,11 +45,6 @@ from tractor.log import get_logger
log = get_logger(__name__) log = get_logger(__name__)
# TODO: use new type-vars syntax from 3.12
# https://realpython.com/python312-new-features/#dedicated-type-variable-syntax
# https://docs.python.org/3/whatsnew/3.12.html#whatsnew312-pep695
# https://docs.python.org/3/reference/simple_stmts.html#type
#
# A regular invariant generic type # A regular invariant generic type
T = TypeVar("T") T = TypeVar("T")
@ -114,7 +110,7 @@ class BroadcastState(Struct):
# broadcast event to wake up all sleeping consumer tasks # broadcast event to wake up all sleeping consumer tasks
# on a newly produced value from the sender. # on a newly produced value from the sender.
recv_ready: tuple[int, trio.Event]|None = None recv_ready: Optional[tuple[int, trio.Event]] = None
# if a ``trio.EndOfChannel`` is received on any # if a ``trio.EndOfChannel`` is received on any
# consumer all consumers should be placed in this state # consumer all consumers should be placed in this state
@ -168,7 +164,7 @@ class BroadcastReceiver(ReceiveChannel):
rx_chan: AsyncReceiver, rx_chan: AsyncReceiver,
state: BroadcastState, state: BroadcastState,
receive_afunc: Callable[[], Awaitable[Any]]|None = None, receive_afunc: Optional[Callable[[], Awaitable[Any]]] = None,
raise_on_lag: bool = True, raise_on_lag: bool = True,
) -> None: ) -> None:
@ -456,7 +452,7 @@ def broadcast_receiver(
recv_chan: AsyncReceiver, recv_chan: AsyncReceiver,
max_buffer_size: int, max_buffer_size: int,
receive_afunc: Callable[[], Awaitable[Any]]|None = None, receive_afunc: Optional[Callable[[], Awaitable[Any]]] = None,
raise_on_lag: bool = True, raise_on_lag: bool = True,
) -> BroadcastReceiver: ) -> BroadcastReceiver:

@ -33,9 +33,10 @@ from typing import (
) )
import trio import trio
from trio_typing import TaskStatus
from tractor._state import current_actor from .._state import current_actor
from tractor.log import get_logger from ..log import get_logger
log = get_logger(__name__) log = get_logger(__name__)
@ -183,7 +184,7 @@ class _Cache:
cls, cls,
mng, mng,
ctx_key: tuple, ctx_key: tuple,
task_status: trio.TaskStatus[T] = trio.TASK_STATUS_IGNORED, task_status: TaskStatus[T] = trio.TASK_STATUS_IGNORED,
) -> None: ) -> None:
async with mng as value: async with mng as value: