Compare commits

...

101 Commits

Author SHA1 Message Date
goodboy e5ee2e3de8
Merge pull request #358 from goodboy/switch_to_pdbp
Switch to `pdbp` 🏄🏼
2023-05-15 09:58:58 -04:00
Tyler Goodlet 41aa91c8eb Add news file 2023-05-15 09:35:59 -04:00
Tyler Goodlet 6758e4487c Drop lingering `pdbpp` comment-refs in tests 2023-05-15 09:14:42 -04:00
Tyler Goodlet 1c3893a383 Drop commented `pdbpp` import logic 2023-05-15 09:01:55 -04:00
Tyler Goodlet 73befac9bc Switch to `pdbp` in test reqs 2023-05-15 09:01:27 -04:00
Tyler Goodlet 79622bbeea Restore `breakpoint()` hook after runtime exits
Previously we were leaking our (pdb++) override into the Python runtime
which would always result in a runtime error whenever `breakpoint()` is
called outside our runtime; after exit of the root actor . This
explicitly restores any previous hook override (detected during startup)
or deletes the hook and restores the environment if none existed prior.

Also adds a new WIP debugging example script to ensure breakpointing
works as normal after runtime close; this will be added to the test
suite.
2023-05-15 00:47:29 -04:00
Tyler Goodlet 95535b2226 Some more 3.10+ optional type sigs 2023-05-15 00:47:29 -04:00
Tyler Goodlet 87c6e09d6b Switch readme links to point @ `pdbp` B) 2023-05-14 22:52:24 -04:00
Tyler Goodlet 9ccd3a74b6 More detailed preface description 2023-05-14 22:38:47 -04:00
Tyler Goodlet ae4ff5dc8d pdbp: adding typing to config settings vars 2023-05-14 22:38:46 -04:00
Tyler Goodlet 705538398f `pdbp`: turn off line truncating by default, fixes terminal resizing stuff 2023-05-14 22:38:16 -04:00
Tyler Goodlet 86aef5238d Hide actor nursery exit frame 2023-05-14 21:24:26 -04:00
Tyler Goodlet cc82447db6 First try: switch debug machinery over to `pdbp` B) 2023-05-14 21:24:26 -04:00
Tyler Goodlet 23cffbd940 Use multiline import for debug mod 2023-05-14 21:24:26 -04:00
Tyler Goodlet 3d202272c4 Change over debugger tests to use `PROMPT` var.. 2023-05-14 21:24:26 -04:00
Tyler Goodlet 63cdb0891f Switch to `pdbp` since noone is maintaining `pdbpp` 2023-05-14 21:24:26 -04:00
goodboy 0f7db27b68
Merge pull request #356 from goodboy/drop_proc_actxmngr
`trio.Process.aclose()`?
2023-05-14 20:59:53 -04:00
Tyler Goodlet c53d62d2f7 Add news file 2023-05-14 20:31:26 -04:00
Tyler Goodlet f667d16d66 Copy the now deprecated `trio.Process.aclose()`
Move it into our `_spawn.do_hard_kill()` since we do indeed rely on
the particular process killing sequence on "soft kill" failure cases.
2023-05-14 19:31:50 -04:00
Tyler Goodlet 24a062341e Just call `trio.Process.aclose()` directly for now? 2023-04-02 14:34:41 -04:00
goodboy e714bec8db
Merge pull request #355 from kehrazy/patch-1
fixed the `Zombie` example having wrong indentation
2023-04-01 12:11:47 -04:00
Igor 009cd6552e
fixed the `Zombie` example having wrong indentation 2023-03-31 17:50:46 +03:00
goodboy 649c5e7504
Merge pull request #343 from goodboy/breceiver_internals
Avoid inf recursion in `BroadcastReceiver.receive()`
2023-01-30 14:01:13 -05:00
Tyler Goodlet 203f95615c Add nooz 2023-01-30 12:42:26 -05:00
Tyler Goodlet efb8bec828 Add a basic no-raise-on lag test 2023-01-30 12:26:07 -05:00
Tyler Goodlet 8637778739 Expose `raise_on_lag: bool` flag through factory 2023-01-30 12:18:23 -05:00
Tyler Goodlet 47166e45f0 Be explicit with passthrough kwargs (there's so few) 2023-01-29 17:31:21 -05:00
Tyler Goodlet 4ce2dcd12b Switch back to raising `Lagged` by default
Makes the broadcast test suite not hang xD, and is our expected default
behaviour. Also removes a ton of commented legacy cruft from before the
refactor to remove the `.receive()` recursion and fixes some typing.

Oh right, and in the case where there's only one subscriber left we warn
log about it since in theory we could actually entirely unwind the
bcaster back to the original underlying, though not sure if that's sane
or works for some use cases (like wanting to have some other subscriber
get added dynamically later).
2023-01-29 15:03:34 -05:00
Tyler Goodlet 80f983818f Ignore monkey patched `.send()` type annot 2023-01-29 15:03:34 -05:00
Tyler Goodlet 6ba29f8d56 Recurse and get the last value when in warn mode 2023-01-29 15:03:34 -05:00
Tyler Goodlet 2707a0e971 Add `._raise_on_lag` flag to disable `Lag` raising 2023-01-29 15:03:34 -05:00
Tyler Goodlet c8efcdd0d3 Drop `ReceiveMsgStream` from test suite 2023-01-29 15:03:34 -05:00
Tyler Goodlet 9f9907271b Merge `ReceiveMsgStream` and `MsgStream`
Since one-way streaming can be accomplished by just *not* sending on one
side (and/or thus wrapping such usage in a more restrictive API), we
just drop the recv-only parent type. The only method different was
`MsgStream.send()`, now merged in. Further in usage of `.subscribe()`
we monkey patch the underlying stream's `.send()` onto the delivered
broadcast receiver so that subscriber tasks can two-way stream as though
using the stream directly.

This allows us to more definitively drop `tractor.open_stream_from()` in
the longer run if we so choose as well; note currently this will
potentially create an issue if a caller tries to `.send()` on such a one
way stream.
2023-01-29 15:03:34 -05:00
Tyler Goodlet c2367c1c5e Better `trio`-ize `BroadcastReceiver` internals
Driven by a bug found in `piker` where we'd get an inf recursion error
due to `BroadcastReceiver.receive()` being called when consumer tasks
are awoken but no value is ready to `.nowait_receive()`.

This new rework takes an approach closer to the interface and internals
of `trio.MemoryReceiveChannel` particularly in terms of,

- implementing a `BroadcastReceiver.receive_nowait()` and using it
  within the async `.receive()`.
- failing over to an internal `._receive_from_underlying()` when the
  `_nowait()` call raises `trio.WouldBlock`.
- adding `BroadcastState.statistics()` for debugging and testing
  dropping recursion from `.receive()`.
2023-01-29 15:03:34 -05:00
goodboy a777217674
Merge pull request #346 from goodboy/ipc_failure_while_streaming
Ipc failure while streaming
2023-01-29 15:02:54 -05:00
Tyler Goodlet 13c9eadc8f Move result log msg up and drop else block 2023-01-29 14:55:02 -05:00
Tyler Goodlet af6c325072 Bump up legacy streaming timeout a smidgen 2023-01-29 14:55:02 -05:00
Tyler Goodlet 195d2f0ed4 Add nooz 2023-01-29 14:55:02 -05:00
Tyler Goodlet aa4871b13d Call `MsgStream.aclose()` in `Context.open_stream.__aexit__()`
We weren't doing this originally I *think* just because of the path
dependent nature of the way the code was developed (originally being
mega pedantic about one-way vs. bidirectional streams) but, it doesn't
seem like there's any issue just calling the stream's `.aclose()`; also
have the benefit of just being less code and logic checks B)
2023-01-29 14:55:02 -05:00
Tyler Goodlet 556f4626db Tweak warning msg for still-alive-after-cancelled actor 2023-01-29 14:55:02 -05:00
Tyler Goodlet 3967c0ed9e Add a simplified zombie lord specific process reaping test 2023-01-29 14:55:02 -05:00
Tyler Goodlet e34823aab4 Add parent vs. child cancels first cases 2023-01-29 14:55:02 -05:00
Tyler Goodlet 6c35ba2cb6 Add IPC breakage on both parent and child side
With the new fancy `_pytest.pathlib.import_path()` we can do real
parametrization of the example-script-module code and thus configure
whether the child, parent, or both silently break the IPC connection.

Parametrize the test for all the above mentioned cases as well as the
case where the IPC never breaks but we still simulate the user hammering
ctl-c / SIGINT to terminate the actor tree. Adjust expected errors based
on each case and heavily document each of these.
2023-01-29 14:55:02 -05:00
Tyler Goodlet 3a0817ff55 Skip `advanced_faults/` subset in docs examples tests 2023-01-29 14:55:02 -05:00
Tyler Goodlet 7fddb4416b Handle `mp` spawn method cases in test suite 2023-01-29 14:55:02 -05:00
Tyler Goodlet 1d92f2552a Adjust other examples tests to expect `pathlib` objects 2023-01-29 14:55:02 -05:00
Tyler Goodlet 4f8586a928 Wrap ex in new test, change dir helpers to use `pathlib.Path` 2023-01-29 14:55:02 -05:00
Tyler Goodlet fb9ff45745 Move example to a new `advanced_faults` egs subset dir 2023-01-29 14:55:02 -05:00
Tyler Goodlet 36a83cb306 Refine example to drop IPC mid-stream
Use a task nursery in the subactor to spawn tasks which cancel the IPC
channel mid stream to simulate the most concurrent case we're likely to
see. Make `main()` accept a `debug_mode: bool` for parametrization. Fill
out detailed comments/docs on this example.
2023-01-29 14:55:02 -05:00
Tyler Goodlet 7394a187e0 Name one-way streaming (con generators) what it is 2023-01-29 14:55:02 -05:00
Tyler Goodlet df01294bb2 Show more functiony syntax in ctx-cancelled log msgs 2023-01-29 14:55:02 -05:00
Tyler Goodlet ddf3d0d1b3 Show tracebacks for un-shipped/propagated errors 2023-01-29 14:55:02 -05:00
Tyler Goodlet 158569adae Add WIP example of silent IPC breaks while streaming 2023-01-29 14:55:02 -05:00
Tyler Goodlet 97d5f7233b Fix uid2nursery lookup table type annot 2023-01-29 14:55:02 -05:00
Tyler Goodlet d27c081a15 Ensure arbiter sockaddr type before usage 2023-01-29 14:55:02 -05:00
Tyler Goodlet a4874a3227 Always set the `parent_exit: trio.Event` on exit 2023-01-29 14:55:02 -05:00
Tyler Goodlet de04bbb2bb Don't raise on a broken IPC-context when sending stop msg 2023-01-29 14:55:02 -05:00
Tyler Goodlet 4f977189c0 Handle broken mem chan on `Actor._push_result()`
When backpressure is used and a feeder mem chan breaks during msg
delivery (usually because the IPC allocating task already terminated)
instead of raising we simply warn as we do for the non-backpressure
case.

Also, add a proper `Actor.is_arbiter` test inside `._invoke()` to avoid
doing an arbiter-registry lookup if the current actor **is** the
registrar.
2023-01-29 14:55:02 -05:00
goodboy 9fd62cf71f
Merge pull request #348 from goodboy/deprecate_arbiter_addr
Begin deprecation of `arbiter_addr` -> `registry_addr`
2023-01-26 16:05:41 -05:00
Tyler Goodlet 606efa5bb7 Adjust daemon command to use new `registry_addr` 2023-01-26 16:00:08 -05:00
Tyler Goodlet 121a8cc891 Drop `Optional` usage from root mod 2023-01-26 16:00:08 -05:00
Tyler Goodlet c54b8ca4ba Begin deprecation of `arbiter_addr` -> `registry_addr` 2023-01-26 16:00:08 -05:00
goodboy de93c8257c
Merge pull request #349 from goodboy/prompt_on_ctrlc
Re-draw `pdbpp` prompt on `SIGINT`
2023-01-26 15:56:37 -05:00
Tyler Goodlet 5b8a87d0f6 Slightly better `xonsh` check hack, fix typing 2023-01-26 15:48:15 -05:00
Tyler Goodlet 9e5c8ce6f6 Add nooz file 2023-01-26 15:39:03 -05:00
Tyler Goodlet 965cd406a2 Use std `pdbpp` release 2023-01-26 15:27:55 -05:00
Tyler Goodlet 2e278ceb74 Add a super hacky check for `xonsh`, smh.. 2023-01-26 15:26:43 -05:00
Tyler Goodlet 6d124db7c9 Never run ctlc-with-intermediary-actor cases locally either 2023-01-26 12:44:13 -05:00
Tyler Goodlet dba8118553 Always attempt prompt redraw on ctl-c in REPL
The stdlib has all sorts of muckery with ignoring SIGINT in the
`Pdb._cmdloop()` but here we just override all that since we don't trust
their decisions about cancellation handling whatsoever. Adds
a `Lock.repl: MultiActorPdb` attr which is set by any task which
acquires root TTY lock indicating (via actor global state) that the
current actor is using the debugger REPL and can be expected to re-draw
the prompt on SIGINT. Further we mask out log messages from any actor
who also has the `shield_sigint_handler()` enabled to avoid logging
noise when debugging.
2023-01-26 12:44:13 -05:00
Tyler Goodlet fca2e7c10e Simplify closed abruptly log msg 2023-01-26 12:44:13 -05:00
Tyler Goodlet 5ed62c5c54 Add note about intermediary-actor in debug issue 2023-01-26 12:44:13 -05:00
goodboy 588b7ca7bf
Merge pull request #344 from goodboy/harden_cluster_tests
Harden cluster tests
2022-12-12 15:02:23 -05:00
Tyler Goodlet d8214735b9 Add bugfix nooz 2022-12-12 14:53:59 -05:00
Tyler Goodlet 48f6d514ef Handle earlier name error crash in debug test 2022-12-12 14:05:32 -05:00
Tyler Goodlet 6c8cacc9d1 Adjust all default is `None` annots (per new `mypy`) 2022-12-12 13:18:22 -05:00
Tyler Goodlet 38326e8c15 Avoid error on context double pops 2022-12-11 23:46:33 -05:00
Tyler Goodlet b5192cca8e Always greedily `list`-cast`mngrs` input sequence 2022-12-11 23:20:58 -05:00
Tyler Goodlet c606be8c64 Passthrough runtime kwargs from `open_actor_cluster()` 2022-12-11 19:56:08 -05:00
Tyler Goodlet d8e48e29ba Add `mngrs=(<gen_comprehension>)` test 2022-12-11 19:56:01 -05:00
goodboy a0f6668ce8
Merge pull request #333 from goodboy/exceptiongroups
`ExceptiongGroup`s and `trio>=0.22`
2022-10-14 20:11:26 -04:00
Tyler Goodlet 274c66cf9d Add nooz 2022-10-14 19:42:23 -04:00
Tyler Goodlet f2641c8964 Avoid "task never called `.started()`" runtime erros when cancelling 2022-10-14 19:42:23 -04:00
Tyler Goodlet c47575997a Expand nested case to include error prop and breakpointing 2022-10-14 19:42:23 -04:00
Tyler Goodlet f39414ce12 Drop error-repacking for `.run_in_actor()`s block
If we pack the nursery parent task's error into the `errors` table
directly in the handler, we don't need to specially handle packing that
same error into any exception group raised while handling sub-actor
cancellation; drops some ugly indentation ;)
2022-10-14 19:42:23 -04:00
Tyler Goodlet 0a1bf8e57d Tolerate eg in runtime test teardown 2022-10-14 19:42:23 -04:00
Tyler Goodlet e298b70edf Drop added `.pdp()` level msgs used duringn dev 2022-10-14 19:42:23 -04:00
Tyler Goodlet c0dd5d7ffc Adjust multi-daemon test to be more deterministic 2022-10-14 19:42:23 -04:00
Tyler Goodlet 347591c348 Expect egs in tests which retreive portal results 2022-10-14 19:42:23 -04:00
Tyler Goodlet 38f9d35dee Fix errors table type annot 2022-10-14 19:42:23 -04:00
Tyler Goodlet 88448f7281 Fix handler type annot 2022-10-14 19:42:23 -04:00
Tyler Goodlet 0956d5f461 Restore the `trio` SIGINT handler, cancel root lock tasks on no-peers
Pretty sure this is the final touch to alleviate all our debug lock
headaches! Instead of trying to revert to the "last" handler (as `pdb`
does internally in the stdlib) we always just revert to the handler
`trio` registers during startup. Further this seems to allow cancelling
the root-side locking task if it's detected as stale IFF we only do this
when the root actor is in a "no more IPC peers" state.

Deatz:
- (always) set `._debug.Lock._trio_handler` as the `trio` version, not
  some last used handler to make sure we're getting the ctrl-c handling
  we want when not in debug mode.
- assign the trio handler in `open_root_actor()`
  `._runtime._async_main()` to be sure it's applied in subactors as well
  as the root.
- only do debug lock blocking and root-side-locking-task cancels when
  a "no peers" condition is detected in the root actor: i.e. no IPC
  channels are detected by the root meaning it's impossible any actor
  has a sane lock-state ongoing for debug mode.
2022-10-14 18:18:01 -04:00
Tyler Goodlet c646c79a82 Adjust root-errors debug tests for blocking and egs 2022-10-14 18:18:01 -04:00
Tyler Goodlet 33f2234baf Hide some stack layers the user doesn't really need to see 2022-10-14 18:18:01 -04:00
Tyler Goodlet 7521bded3d Pack error from the parent task into the actor nursery 2022-10-14 18:16:51 -04:00
Tyler Goodlet 0f523b65fb Change cancel test over the exception group 2022-10-14 18:16:51 -04:00
Tyler Goodlet 50fe098e06 First pass, swap `MultiError` for `BaseExceptionGroup` 2022-10-14 18:16:51 -04:00
Tyler Goodlet d87d6af7e1 Add `exceptiongroup` (3.11 backport lib) as dep 2022-10-14 18:16:51 -04:00
Tyler Goodlet df69aedcd5 Pin to latest `trio` version 2022-10-14 18:16:51 -04:00
Tyler Goodlet b15e4ed9ce Adjust "no arbiter" test for new runtime defaults
Turns out this test was being silently ignored due to incorrect usage of
sync opening of our `.open_nursery()` block (with a `with` not `async
with`) and thus was an noop XD

Instead this fixes the test to call a `tractor` discovery built-in
without starting the runtime (which is now done implicitly when a user
opens a nursery) which should result in the prior expected outcome,
a `RuntimeError`.
2022-10-12 12:46:20 -04:00
Tyler Goodlet 98056f6ed7 Move logging context map into `log.py` module 2022-10-12 12:46:20 -04:00
goodboy 247d3448ae
Merge pull request #337 from goodboy/debug_lock_blocking
Debug lock blocking
2022-10-12 12:41:14 -04:00
44 changed files with 1695 additions and 686 deletions

View File

@ -6,8 +6,14 @@
``tractor`` is a `structured concurrent`_, multi-processing_ runtime
built on trio_.
Fundamentally ``tractor`` gives you parallelism via ``trio``-"*actors*":
our nurseries_ let you spawn new Python processes which each run a ``trio``
Fundamentally, ``tractor`` gives you parallelism via
``trio``-"*actors*": independent Python processes (aka
non-shared-memory threads) which maintain structured
concurrency (SC) *end-to-end* inside a *supervision tree*.
Cross-process (and thus cross-host) SC is accomplished through the
combined use of our "actor nurseries_" and an "SC-transitive IPC
protocol" constructed on top of multiple Pythons each running a ``trio``
scheduled runtime - a call to ``trio.run()``.
We believe the system adheres to the `3 axioms`_ of an "`actor model`_"
@ -23,7 +29,8 @@ Features
- **It's just** a ``trio`` API
- *Infinitely nesteable* process trees
- Builtin IPC streaming APIs with task fan-out broadcasting
- A (first ever?) "native" multi-core debugger UX for Python using `pdb++`_
- A "native" multi-core debugger REPL using `pdbp`_ (a fork & fix of
`pdb++`_ thanks to @mdmintz!)
- Support for a swappable, OS specific, process spawning layer
- A modular transport stack, allowing for custom serialization (eg. with
`msgspec`_), communications protocols, and environment specific IPC
@ -149,7 +156,7 @@ it **is a bug**.
"Native" multi-process debugging
--------------------------------
Using the magic of `pdb++`_ and our internal IPC, we've
Using the magic of `pdbp`_ and our internal IPC, we've
been able to create a native feeling debugging experience for
any (sub-)process in your ``tractor`` tree.
@ -597,6 +604,7 @@ channel`_!
.. _adherance to: https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=1821s
.. _trio gitter channel: https://gitter.im/python-trio/general
.. _matrix channel: https://matrix.to/#/!tractor:matrix.org
.. _pdbp: https://github.com/mdmintz/pdbp
.. _pdb++: https://github.com/pdbpp/pdbpp
.. _guest mode: https://trio.readthedocs.io/en/stable/reference-lowlevel.html?highlight=guest%20mode#using-guest-mode-to-run-trio-on-top-of-other-event-loops
.. _messages: https://en.wikipedia.org/wiki/Message_passing

View File

@ -0,0 +1,151 @@
'''
Complex edge case where during real-time streaming the IPC tranport
channels are wiped out (purposely in this example though it could have
been an outage) and we want to ensure that despite being in debug mode
(or not) the user can sent SIGINT once they notice the hang and the
actor tree will eventually be cancelled without leaving any zombies.
'''
import trio
from tractor import (
open_nursery,
context,
Context,
MsgStream,
)
async def break_channel_silently_then_error(
stream: MsgStream,
):
async for msg in stream:
await stream.send(msg)
# XXX: close the channel right after an error is raised
# purposely breaking the IPC transport to make sure the parent
# doesn't get stuck in debug or hang on the connection join.
# this more or less simulates an infinite msg-receive hang on
# the other end.
await stream._ctx.chan.send(None)
assert 0
async def close_stream_and_error(
stream: MsgStream,
):
async for msg in stream:
await stream.send(msg)
# wipe out channel right before raising
await stream._ctx.chan.send(None)
await stream.aclose()
assert 0
@context
async def recv_and_spawn_net_killers(
ctx: Context,
break_ipc_after: bool | int = False,
) -> None:
'''
Receive stream msgs and spawn some IPC killers mid-stream.
'''
await ctx.started()
async with (
ctx.open_stream() as stream,
trio.open_nursery() as n,
):
async for i in stream:
print(f'child echoing {i}')
await stream.send(i)
if (
break_ipc_after
and i > break_ipc_after
):
'#################################\n'
'Simulating child-side IPC BREAK!\n'
'#################################'
n.start_soon(break_channel_silently_then_error, stream)
n.start_soon(close_stream_and_error, stream)
async def main(
debug_mode: bool = False,
start_method: str = 'trio',
# by default we break the parent IPC first (if configured to break
# at all), but this can be changed so the child does first (even if
# both are set to break).
break_parent_ipc_after: int | bool = False,
break_child_ipc_after: int | bool = False,
) -> None:
async with (
open_nursery(
start_method=start_method,
# NOTE: even debugger is used we shouldn't get
# a hang since it never engages due to broken IPC
debug_mode=debug_mode,
loglevel='warning',
) as an,
):
portal = await an.start_actor(
'chitty_hijo',
enable_modules=[__name__],
)
async with portal.open_context(
recv_and_spawn_net_killers,
break_ipc_after=break_child_ipc_after,
) as (ctx, sent):
async with ctx.open_stream() as stream:
for i in range(1000):
if (
break_parent_ipc_after
and i > break_parent_ipc_after
):
print(
'#################################\n'
'Simulating parent-side IPC BREAK!\n'
'#################################'
)
await stream._ctx.chan.send(None)
# it actually breaks right here in the
# mp_spawn/forkserver backends and thus the zombie
# reaper never even kicks in?
print(f'parent sending {i}')
await stream.send(i)
with trio.move_on_after(2) as cs:
# NOTE: in the parent side IPC failure case this
# will raise an ``EndOfChannel`` after the child
# is killed and sends a stop msg back to it's
# caller/this-parent.
rx = await stream.receive()
print(f"I'm a happy user and echoed to me is {rx}")
if cs.cancelled_caught:
# pretend to be a user seeing no streaming action
# thinking it's a hang, and then hitting ctl-c..
print("YOO i'm a user anddd thingz hangin..")
print(
"YOO i'm mad send side dun but thingz hangin..\n"
'MASHING CTlR-C Ctl-c..'
)
raise KeyboardInterrupt
if __name__ == '__main__':
trio.run(main)

View File

@ -27,6 +27,17 @@ async def main():
# retreive results
async with p0.open_stream_from(breakpoint_forever) as stream:
# triggers the first name error
try:
await p1.run(name_error)
except tractor.RemoteActorError as rae:
assert rae.type is NameError
async for i in stream:
# a second time try the failing subactor and this tie
# let error propagate up to the parent/nursery.
await p1.run(name_error)

View File

@ -12,18 +12,31 @@ async def breakpoint_forever():
while True:
await tractor.breakpoint()
# NOTE: if the test never sent 'q'/'quit' commands
# on the pdb repl, without this checkpoint line the
# repl would spin in this actor forever.
# await trio.sleep(0)
async def spawn_until(depth=0):
""""A nested nursery that triggers another ``NameError``.
"""
async with tractor.open_nursery() as n:
if depth < 1:
# await n.run_in_actor('breakpoint_forever', breakpoint_forever)
await n.run_in_actor(
await n.run_in_actor(breakpoint_forever)
p = await n.run_in_actor(
name_error,
name='name_error'
)
await trio.sleep(0.5)
# rx and propagate error from child
await p.result()
else:
# recusrive call to spawn another process branching layer of
# the tree
depth -= 1
await n.run_in_actor(
spawn_until,
@ -53,6 +66,7 @@ async def main():
"""
async with tractor.open_nursery(
debug_mode=True,
# loglevel='cancel',
) as n:
# spawn both actors
@ -67,8 +81,16 @@ async def main():
name='spawner1',
)
# TODO: test this case as well where the parent don't see
# the sub-actor errors by default and instead expect a user
# ctrl-c to kill the root.
with trio.move_on_after(3):
await trio.sleep_forever()
# gah still an issue here.
await portal.result()
# should never get here
await portal1.result()

View File

@ -0,0 +1,24 @@
import os
import sys
import trio
import tractor
async def main() -> None:
async with tractor.open_nursery(debug_mode=True) as an:
assert os.environ['PYTHONBREAKPOINT'] == 'tractor._debug._set_trace'
# TODO: an assert that verifies the hook has indeed been, hooked
# XD
assert sys.breakpointhook is not tractor._debug._set_trace
breakpoint()
# TODO: an assert that verifies the hook is unhooked..
assert sys.breakpointhook
breakpoint()
if __name__ == '__main__':
trio.run(main)

View File

@ -0,0 +1,25 @@
Add support for ``trio >= 0.22`` and support for the new Python 3.11
``[Base]ExceptionGroup`` from `pep 654`_ via the backported
`exceptiongroup`_ package and some final fixes to the debug mode
subsystem.
This port ended up driving some (hopefully) final fixes to our debugger
subsystem including the solution to all lingering stdstreams locking
race-conditions and deadlock scenarios. This includes extending the
debugger tests suite as well as cancellation and ``asyncio`` mode cases.
Some of the notable details:
- always reverting to the ``trio`` SIGINT handler when leaving debug
mode.
- bypassing child attempts to acquire the debug lock when detected
to be amdist actor-runtime-cancellation.
- allowing the root actor to cancel local but IPC-stale subactor
requests-tasks for the debug lock when in a "no IPC peers" state.
Further we refined our ``ActorNursery`` semantics to be more similar to
``trio`` in the sense that parent task errors are always packed into the
actor-nursery emitted exception group and adjusted all tests and
examples accordingly.
.. _pep 654: https://peps.python.org/pep-0654/#handling-exception-groups
.. _exceptiongroup: https://github.com/python-trio/exceptiongroup

View File

@ -0,0 +1,19 @@
Rework our ``.trionics.BroadcastReceiver`` internals to avoid method
recursion and approach a design and interface closer to ``trio``'s
``MemoryReceiveChannel``.
The details of the internal changes include:
- implementing a ``BroadcastReceiver.receive_nowait()`` and using it
within the async ``.receive()`` thus avoiding recursion from
``.receive()``.
- failing over to an internal ``._receive_from_underlying()`` when the
``_nowait()`` call raises ``trio.WouldBlock``
- adding ``BroadcastState.statistics()`` for debugging and testing both
internals and by users.
- add an internal ``BroadcastReceiver._raise_on_lag: bool`` which can be
set to avoid ``Lagged`` raising for possible use cases where a user
wants to choose between a [cheap or nasty
pattern](https://zguide.zeromq.org/docs/chapter7/#The-Cheap-or-Nasty-Pattern)
the the particular stream (we use this in ``piker``'s dark clearing
engine to avoid fast feeds breaking during HFT periods).

View File

@ -0,0 +1,11 @@
Always ``list``-cast the ``mngrs`` input to
``.trionics.gather_contexts()`` and ensure its size otherwise raise
a ``ValueError``.
Turns out that trying to pass an inline-style generator comprehension
doesn't seem to work inside the ``async with`` expression? Further, in
such a case we can get a hang waiting on the all-entered event
completion when the internal mngrs iteration is a noop. Instead we
always greedily check a size and error on empty input; the lazy
iteration of a generator input is not beneficial anyway since we're
entering all manager instances in concurrent tasks.

View File

@ -0,0 +1,15 @@
Fixes to ensure IPC (channel) breakage doesn't result in hung actor
trees; the zombie reaping and general supervision machinery will always
clean up and terminate.
This includes not only the (mostly minor) fixes to solve these cases but
also a new extensive test suite in `test_advanced_faults.py` with an
accompanying highly configurable example module-script in
`examples/advanced_faults/ipc_failure_during_stream.py`. Tests ensure we
never get hang or zombies despite operating in debug mode and attempt to
simulate all possible IPC transport failure cases for a local-host actor
tree.
Further we simplify `Context.open_stream.__aexit__()` to just call
`MsgStream.aclose()` directly more or less avoiding a pure duplicate
code path.

View File

@ -0,0 +1,10 @@
Always redraw the `pdbpp` prompt on `SIGINT` during REPL use.
There was recent changes todo with Python 3.10 that required us to pin
to a specific commit in `pdbpp` which have recently been fixed minus
this last issue with `SIGINT` shielding: not clobbering or not
showing the `(Pdb++)` prompt on ctlr-c by the user. This repairs all
that by firstly removing the standard KBI intercepting of the std lib's
`pdb.Pdb._cmdloop()` as well as ensuring that only the actor with REPL
control ever reports `SIGINT` handler log msgs and prompt redraws. With
this we move back to using pypi `pdbpp` release.

View File

@ -0,0 +1,7 @@
Drop `trio.Process.aclose()` usage, copy into our spawning code.
The details are laid out in https://github.com/goodboy/tractor/issues/330.
`trio` changed is process running quite some time ago, this just copies
out the small bit we needed (from the old `.aclose()`) for hard kills
where a soft runtime cancel request fails and our "zombie killer"
implementation kicks in.

View File

@ -0,0 +1,15 @@
Switch to using the fork & fix of `pdb++`, `pdbp`:
https://github.com/mdmintz/pdbp
Allows us to sidestep a variety of issues that aren't being maintained
in the upstream project thanks to the hard work of @mdmintz!
We also include some default settings adjustments as per recent
development on the fork:
- sticky mode is still turned on by default but now activates when
a using the `ll` repl command.
- turn off line truncation by default to avoid inter-line gaps when
resizing the terimnal during use.
- when using the backtrace cmd either by `w` or `bt`, the config
automatically switches to non-sticky mode.

View File

@ -1,7 +1,7 @@
pytest
pytest-trio
pytest-timeout
pdbpp
pdbp
mypy
trio_typing
pexpect

View File

@ -26,12 +26,12 @@ with open('docs/README.rst', encoding='utf-8') as f:
setup(
name="tractor",
version='0.1.0a6dev0', # alpha zone
description='structured concurrrent "actors"',
description='structured concurrrent `trio`-"actors"',
long_description=readme,
license='AGPLv3',
author='Tyler Goodlet',
maintainer='Tyler Goodlet',
maintainer_email='jgbt@protonmail.com',
maintainer_email='goodboy_foss@protonmail.com',
url='https://github.com/goodboy/tractor',
platforms=['linux', 'windows'],
packages=[
@ -44,21 +44,23 @@ setup(
# trio related
# proper range spec:
# https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/#id5
'trio >= 0.20, < 0.22',
'trio >= 0.22',
'async_generator',
'trio_typing',
'exceptiongroup',
# tooling
'tricycle',
'trio_typing',
# tooling
'colorlog',
'wrapt',
# serialization
# IPC serialization
'msgspec',
# debug mode REPL
'pdbp',
# pip ref docs on these specs:
# https://pip.pypa.io/en/stable/reference/requirement-specifiers/#examples
# and pep:
@ -69,14 +71,9 @@ setup(
# https://github.com/pdbpp/fancycompleter/issues/37
'pyreadline3 ; platform_system == "Windows"',
# 3.10 has an outstanding unreleased issue and `pdbpp` itself
# pins to patched forks of its own dependencies as well..and
# we need a specific patch on master atm.
'pdbpp @ git+https://github.com/pdbpp/pdbpp@76c4be5#egg=pdbpp ; python_version > "3.9"', # noqa: E501
],
tests_require=['pytest'],
python_requires=">=3.9",
python_requires=">=3.10",
keywords=[
'trio',
'async',

View File

@ -7,6 +7,7 @@ import os
import random
import signal
import platform
import pathlib
import time
import inspect
from functools import partial, wraps
@ -113,14 +114,21 @@ no_windows = pytest.mark.skipif(
)
def repodir():
"""Return the abspath to the repo directory.
"""
dirname = os.path.dirname
dirpath = os.path.abspath(
dirname(dirname(os.path.realpath(__file__)))
)
return dirpath
def repodir() -> pathlib.Path:
'''
Return the abspath to the repo directory.
'''
# 2 parents up to step up through tests/<repo_dir>
return pathlib.Path(__file__).parent.parent.absolute()
def examples_dir() -> pathlib.Path:
'''
Return the abspath to the examples directory as `pathlib.Path`.
'''
return repodir() / 'examples'
def pytest_addoption(parser):
@ -151,7 +159,7 @@ def loglevel(request):
@pytest.fixture(scope='session')
def spawn_backend(request):
def spawn_backend(request) -> str:
return request.config.option.spawn_backend
@ -205,16 +213,22 @@ def sig_prog(proc, sig):
@pytest.fixture
def daemon(loglevel, testdir, arb_addr):
"""Run a daemon actor as a "remote arbiter".
"""
def daemon(
loglevel: str,
testdir,
arb_addr: tuple[str, int],
):
'''
Run a daemon actor as a "remote arbiter".
'''
if loglevel in ('trace', 'debug'):
# too much logging will lock up the subproc (smh)
loglevel = 'info'
cmdargs = [
sys.executable, '-c',
"import tractor; tractor.run_daemon([], arbiter_addr={}, loglevel={})"
"import tractor; tractor.run_daemon([], registry_addr={}, loglevel={})"
.format(
arb_addr,
"'{}'".format(loglevel) if loglevel else None)

View File

@ -0,0 +1,193 @@
'''
Sketchy network blackoutz, ugly byzantine gens, puedes eschuchar la
cancelacion?..
'''
from functools import partial
import pytest
from _pytest.pathlib import import_path
import trio
import tractor
from conftest import (
examples_dir,
)
@pytest.mark.parametrize(
'debug_mode',
[False, True],
ids=['no_debug_mode', 'debug_mode'],
)
@pytest.mark.parametrize(
'ipc_break',
[
# no breaks
{
'break_parent_ipc_after': False,
'break_child_ipc_after': False,
},
# only parent breaks
{
'break_parent_ipc_after': 500,
'break_child_ipc_after': False,
},
# only child breaks
{
'break_parent_ipc_after': False,
'break_child_ipc_after': 500,
},
# both: break parent first
{
'break_parent_ipc_after': 500,
'break_child_ipc_after': 800,
},
# both: break child first
{
'break_parent_ipc_after': 800,
'break_child_ipc_after': 500,
},
],
ids=[
'no_break',
'break_parent',
'break_child',
'break_both_parent_first',
'break_both_child_first',
],
)
def test_ipc_channel_break_during_stream(
debug_mode: bool,
spawn_backend: str,
ipc_break: dict | None,
):
'''
Ensure we can have an IPC channel break its connection during
streaming and it's still possible for the (simulated) user to kill
the actor tree using SIGINT.
We also verify the type of connection error expected in the parent
depending on which side if the IPC breaks first.
'''
if spawn_backend != 'trio':
if debug_mode:
pytest.skip('`debug_mode` only supported on `trio` spawner')
# non-`trio` spawners should never hit the hang condition that
# requires the user to do ctl-c to cancel the actor tree.
expect_final_exc = trio.ClosedResourceError
mod = import_path(
examples_dir() / 'advanced_faults' / 'ipc_failure_during_stream.py',
root=examples_dir(),
)
expect_final_exc = KeyboardInterrupt
# when ONLY the child breaks we expect the parent to get a closed
# resource error on the next `MsgStream.receive()` and then fail out
# and cancel the child from there.
if (
# only child breaks
(
ipc_break['break_child_ipc_after']
and ipc_break['break_parent_ipc_after'] is False
)
# both break but, parent breaks first
or (
ipc_break['break_child_ipc_after'] is not False
and (
ipc_break['break_parent_ipc_after']
> ipc_break['break_child_ipc_after']
)
)
):
expect_final_exc = trio.ClosedResourceError
# when the parent IPC side dies (even if the child's does as well
# but the child fails BEFORE the parent) we expect the channel to be
# sent a stop msg from the child at some point which will signal the
# parent that the stream has been terminated.
# NOTE: when the parent breaks "after" the child you get this same
# case as well, the child breaks the IPC channel with a stop msg
# before any closure takes place.
elif (
# only parent breaks
(
ipc_break['break_parent_ipc_after']
and ipc_break['break_child_ipc_after'] is False
)
# both break but, child breaks first
or (
ipc_break['break_parent_ipc_after'] is not False
and (
ipc_break['break_child_ipc_after']
> ipc_break['break_parent_ipc_after']
)
)
):
expect_final_exc = trio.EndOfChannel
with pytest.raises(expect_final_exc):
trio.run(
partial(
mod.main,
debug_mode=debug_mode,
start_method=spawn_backend,
**ipc_break,
)
)
@tractor.context
async def break_ipc_after_started(
ctx: tractor.Context,
) -> None:
await ctx.started()
async with ctx.open_stream() as stream:
await stream.aclose()
await trio.sleep(0.2)
await ctx.chan.send(None)
print('child broke IPC and terminating')
def test_stream_closed_right_after_ipc_break_and_zombie_lord_engages():
'''
Verify that is a subactor's IPC goes down just after bringing up a stream
the parent can trigger a SIGINT and the child will be reaped out-of-IPC by
the localhost process supervision machinery: aka "zombie lord".
'''
async def main():
async with tractor.open_nursery() as n:
portal = await n.start_actor(
'ipc_breaker',
enable_modules=[__name__],
)
with trio.move_on_after(1):
async with (
portal.open_context(
break_ipc_after_started
) as (ctx, sent),
):
async with ctx.open_stream():
await trio.sleep(0.5)
print('parent waiting on context')
print('parent exited context')
raise KeyboardInterrupt
with pytest.raises(KeyboardInterrupt):
trio.run(main)

View File

@ -14,7 +14,7 @@ def is_win():
return platform.system() == 'Windows'
_registry: dict[str, set[tractor.ReceiveMsgStream]] = {
_registry: dict[str, set[tractor.MsgStream]] = {
'even': set(),
'odd': set(),
}

View File

@ -8,6 +8,10 @@ import platform
import time
from itertools import repeat
from exceptiongroup import (
BaseExceptionGroup,
ExceptionGroup,
)
import pytest
import trio
import tractor
@ -56,29 +60,49 @@ def test_remote_error(arb_addr, args_err):
arbiter_addr=arb_addr,
) as nursery:
# on a remote type error caused by bad input args
# this should raise directly which means we **don't** get
# an exception group outside the nursery since the error
# here and the far end task error are one in the same?
portal = await nursery.run_in_actor(
assert_err, name='errorer', **args
)
# get result(s) from main task
try:
# this means the root actor will also raise a local
# parent task error and thus an eg will propagate out
# of this actor nursery.
await portal.result()
except tractor.RemoteActorError as err:
assert err.type == errtype
print("Look Maa that actor failed hard, hehh")
raise
# ensure boxed errors
if args:
with pytest.raises(tractor.RemoteActorError) as excinfo:
trio.run(main)
# ensure boxed error is correct
assert excinfo.value.type == errtype
else:
# the root task will also error on the `.result()` call
# so we expect an error from there AND the child.
with pytest.raises(BaseExceptionGroup) as excinfo:
trio.run(main)
# ensure boxed errors
for exc in excinfo.value.exceptions:
assert exc.type == errtype
def test_multierror(arb_addr):
"""Verify we raise a ``trio.MultiError`` out of a nursery where
'''
Verify we raise a ``BaseExceptionGroup`` out of a nursery where
more then one actor errors.
"""
'''
async def main():
async with tractor.open_nursery(
arbiter_addr=arb_addr,
@ -95,10 +119,10 @@ def test_multierror(arb_addr):
print("Look Maa that first actor failed hard, hehh")
raise
# here we should get a `trio.MultiError` containing exceptions
# here we should get a ``BaseExceptionGroup`` containing exceptions
# from both subactors
with pytest.raises(trio.MultiError):
with pytest.raises(BaseExceptionGroup):
trio.run(main)
@ -107,7 +131,7 @@ def test_multierror(arb_addr):
'num_subactors', range(25, 26),
)
def test_multierror_fast_nursery(arb_addr, start_method, num_subactors, delay):
"""Verify we raise a ``trio.MultiError`` out of a nursery where
"""Verify we raise a ``BaseExceptionGroup`` out of a nursery where
more then one actor errors and also with a delay before failure
to test failure during an ongoing spawning.
"""
@ -123,10 +147,11 @@ def test_multierror_fast_nursery(arb_addr, start_method, num_subactors, delay):
delay=delay
)
with pytest.raises(trio.MultiError) as exc_info:
# with pytest.raises(trio.MultiError) as exc_info:
with pytest.raises(BaseExceptionGroup) as exc_info:
trio.run(main)
assert exc_info.type == tractor.MultiError
assert exc_info.type == ExceptionGroup
err = exc_info.value
exceptions = err.exceptions
@ -214,8 +239,8 @@ async def test_cancel_infinite_streamer(start_method):
[
# daemon actors sit idle while single task actors error out
(1, tractor.RemoteActorError, AssertionError, (assert_err, {}), None),
(2, tractor.MultiError, AssertionError, (assert_err, {}), None),
(3, tractor.MultiError, AssertionError, (assert_err, {}), None),
(2, BaseExceptionGroup, AssertionError, (assert_err, {}), None),
(3, BaseExceptionGroup, AssertionError, (assert_err, {}), None),
# 1 daemon actor errors out while single task actors sleep forever
(3, tractor.RemoteActorError, AssertionError, (sleep_forever, {}),
@ -226,7 +251,7 @@ async def test_cancel_infinite_streamer(start_method):
(do_nuthin, {}), (assert_err, {'delay': 1}, True)),
# daemon complete quickly delay while single task
# actors error after brief delay
(3, tractor.MultiError, AssertionError,
(3, BaseExceptionGroup, AssertionError,
(assert_err, {'delay': 1}), (do_nuthin, {}, False)),
],
ids=[
@ -293,7 +318,7 @@ async def test_some_cancels_all(num_actors_and_errs, start_method, loglevel):
# should error here with a ``RemoteActorError`` or ``MultiError``
except first_err as err:
if isinstance(err, tractor.MultiError):
if isinstance(err, BaseExceptionGroup):
assert len(err.exceptions) == num_actors
for exc in err.exceptions:
if isinstance(exc, tractor.RemoteActorError):
@ -337,7 +362,7 @@ async def spawn_and_error(breadth, depth) -> None:
@tractor_test
async def test_nested_multierrors(loglevel, start_method):
'''
Test that failed actor sets are wrapped in `trio.MultiError`s. This
Test that failed actor sets are wrapped in `BaseExceptionGroup`s. This
test goes only 2 nurseries deep but we should eventually have tests
for arbitrary n-depth actor trees.
@ -365,7 +390,7 @@ async def test_nested_multierrors(loglevel, start_method):
breadth=subactor_breadth,
depth=depth,
)
except trio.MultiError as err:
except BaseExceptionGroup as err:
assert len(err.exceptions) == subactor_breadth
for subexc in err.exceptions:
@ -383,10 +408,10 @@ async def test_nested_multierrors(loglevel, start_method):
assert subexc.type in (
tractor.RemoteActorError,
trio.Cancelled,
trio.MultiError
BaseExceptionGroup,
)
elif isinstance(subexc, trio.MultiError):
elif isinstance(subexc, BaseExceptionGroup):
for subsub in subexc.exceptions:
if subsub in (tractor.RemoteActorError,):
@ -394,7 +419,7 @@ async def test_nested_multierrors(loglevel, start_method):
assert type(subsub) in (
trio.Cancelled,
trio.MultiError,
BaseExceptionGroup,
)
else:
assert isinstance(subexc, tractor.RemoteActorError)
@ -406,13 +431,13 @@ async def test_nested_multierrors(loglevel, start_method):
if is_win():
if isinstance(subexc, tractor.RemoteActorError):
assert subexc.type in (
trio.MultiError,
BaseExceptionGroup,
tractor.RemoteActorError
)
else:
assert isinstance(subexc, trio.MultiError)
assert isinstance(subexc, BaseExceptionGroup)
else:
assert subexc.type is trio.MultiError
assert subexc.type is ExceptionGroup
else:
assert subexc.type in (
tractor.RemoteActorError,

View File

@ -1,5 +1,6 @@
import itertools
import pytest
import trio
import tractor
from tractor import open_actor_cluster
@ -11,26 +12,72 @@ from conftest import tractor_test
MESSAGE = 'tractoring at full speed'
def test_empty_mngrs_input_raises() -> None:
async def main():
with trio.fail_after(1):
async with (
open_actor_cluster(
modules=[__name__],
# NOTE: ensure we can passthrough runtime opts
loglevel='info',
# debug_mode=True,
) as portals,
gather_contexts(
# NOTE: it's the use of inline-generator syntax
# here that causes the empty input.
mngrs=(
p.open_context(worker) for p in portals.values()
),
),
):
assert 0
with pytest.raises(ValueError):
trio.run(main)
@tractor.context
async def worker(ctx: tractor.Context) -> None:
async def worker(
ctx: tractor.Context,
) -> None:
await ctx.started()
async with ctx.open_stream(backpressure=True) as stream:
async with ctx.open_stream(
backpressure=True,
) as stream:
# TODO: this with the below assert causes a hang bug?
# with trio.move_on_after(1):
async for msg in stream:
# do something with msg
print(msg)
assert msg == MESSAGE
# TODO: does this ever cause a hang
# assert 0
@tractor_test
async def test_streaming_to_actor_cluster() -> None:
async with (
open_actor_cluster(modules=[__name__]) as portals,
gather_contexts(
mngrs=[p.open_context(worker) for p in portals.values()],
) as contexts,
gather_contexts(
mngrs=[ctx[0].open_stream() for ctx in contexts],
) as streams,
):
with trio.move_on_after(1):
for stream in itertools.cycle(streams):

View File

@ -10,9 +10,11 @@ TODO:
- wonder if any of it'll work on OS X?
"""
import itertools
from os import path
from typing import Optional
import platform
import pathlib
import sys
import time
@ -23,7 +25,10 @@ from pexpect.exceptions import (
EOF,
)
from conftest import repodir, _ci_env
from conftest import (
examples_dir,
_ci_env,
)
# TODO: The next great debugger audit could be done by you!
# - recurrent entry to breakpoint() from single actor *after* and an
@ -42,19 +47,13 @@ if platform.system() == 'Windows':
)
def examples_dir():
"""Return the abspath to the examples directory.
"""
return path.join(repodir(), 'examples', 'debugging/')
def mk_cmd(ex_name: str) -> str:
"""Generate a command suitable to pass to ``pexpect.spawn()``.
"""
return ' '.join(
['python',
path.join(examples_dir(), f'{ex_name}.py')]
)
'''
Generate a command suitable to pass to ``pexpect.spawn()``.
'''
script_path: pathlib.Path = examples_dir() / 'debugging' / f'{ex_name}.py'
return ' '.join(['python', str(script_path)])
# TODO: was trying to this xfail style but some weird bug i see in CI
@ -96,7 +95,7 @@ def spawn(
return _spawn
PROMPT = r"\(Pdb\+\+\)"
PROMPT = r"\(Pdb\+\)"
def expect(
@ -152,27 +151,14 @@ def ctlc(
use_ctlc = request.param
if (
sys.version_info <= (3, 10)
and use_ctlc
):
# on 3.9 it seems the REPL UX
# is highly unreliable and frankly annoying
# to test for. It does work from manual testing
# but i just don't think it's wroth it to try
# and get this working especially since we want to
# be 3.10+ mega-asap.
pytest.skip('Py3.9 and `pdbpp` son no bueno..')
if ci_env:
node = request.node
markers = node.own_markers
for mark in markers:
if mark.name == 'has_nested_actors':
pytest.skip(
f'Test for {node} uses nested actors and fails in CI\n'
f'The test seems to run fine locally but until we solve'
'this issue this CI test will be xfail:\n'
f'Test {node} has nested actors and fails with Ctrl-C.\n'
f'The test can sometimes run fine locally but until'
' we solve' 'this issue this CI test will be xfail:\n'
'https://github.com/goodboy/tractor/issues/320'
)
@ -195,13 +181,15 @@ def ctlc(
ids=lambda item: f'{item[0]} -> {item[1]}',
)
def test_root_actor_error(spawn, user_in_out):
"""Demonstrate crash handler entering pdbpp from basic error in root actor.
"""
'''
Demonstrate crash handler entering pdb from basic error in root actor.
'''
user_input, expect_err_str = user_in_out
child = spawn('root_actor_error')
# scan for the pdbpp prompt
# scan for the prompt
expect(child, PROMPT)
before = str(child.before.decode())
@ -232,8 +220,8 @@ def test_root_actor_bp(spawn, user_in_out):
user_input, expect_err_str = user_in_out
child = spawn('root_actor_breakpoint')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
# scan for the prompt
child.expect(PROMPT)
assert 'Error' not in str(child.before)
@ -274,7 +262,7 @@ def do_ctlc(
if expect_prompt:
before = str(child.before.decode())
time.sleep(delay)
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
time.sleep(delay)
if patt:
@ -293,7 +281,7 @@ def test_root_actor_bp_forever(
# entries
for _ in range(10):
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
if ctlc:
do_ctlc(child)
@ -303,7 +291,7 @@ def test_root_actor_bp_forever(
# do one continue which should trigger a
# new task to lock the tty
child.sendline('continue')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
# seems that if we hit ctrl-c too fast the
# sigint guard machinery might not kick in..
@ -314,10 +302,10 @@ def test_root_actor_bp_forever(
# XXX: this previously caused a bug!
child.sendline('n')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
child.sendline('n')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
# quit out of the loop
child.sendline('q')
@ -340,8 +328,8 @@ def test_subactor_error(
'''
child = spawn('subactor_error')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
# scan for the prompt
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error'" in before
@ -361,7 +349,7 @@ def test_subactor_error(
# creating actor
child.sendline('continue')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
before = str(child.before.decode())
# root actor gets debugger engaged
@ -388,8 +376,8 @@ def test_subactor_breakpoint(
child = spawn('subactor_breakpoint')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
# scan for the prompt
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
@ -398,7 +386,7 @@ def test_subactor_breakpoint(
# entries
for _ in range(10):
child.sendline('next')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
if ctlc:
do_ctlc(child)
@ -406,7 +394,7 @@ def test_subactor_breakpoint(
# now run some "continues" to show re-entries
for _ in range(5):
child.sendline('continue')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
@ -417,7 +405,7 @@ def test_subactor_breakpoint(
child.sendline('q')
# child process should exit but parent will capture pdb.BdbQuit
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
before = str(child.before.decode())
assert "RemoteActorError: ('breakpoint_forever'" in before
@ -449,8 +437,8 @@ def test_multi_subactors(
'''
child = spawn(r'multi_subactors')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
# scan for the prompt
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
@ -462,7 +450,7 @@ def test_multi_subactors(
# entries
for _ in range(10):
child.sendline('next')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
if ctlc:
do_ctlc(child)
@ -471,7 +459,7 @@ def test_multi_subactors(
child.sendline('c')
# first name_error failure
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error'" in before
assert "NameError" in before
@ -483,19 +471,21 @@ def test_multi_subactors(
child.sendline('c')
# 2nd name_error failure
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
assert_before(child, [
"Attaching to pdb in crashed actor: ('name_error_1'",
"NameError",
])
# TODO: will we ever get the race where this crash will show up?
# blocklist strat now prevents this crash
# assert_before(child, [
# "Attaching to pdb in crashed actor: ('name_error_1'",
# "NameError",
# ])
if ctlc:
do_ctlc(child)
# breakpoint loop should re-engage
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before
@ -511,7 +501,7 @@ def test_multi_subactors(
):
child.sendline('c')
time.sleep(0.1)
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
before = str(child.before.decode())
if ctlc:
@ -530,11 +520,11 @@ def test_multi_subactors(
# now run some "continues" to show re-entries
for _ in range(5):
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
# quit the loop and expect parent to attach
child.sendline('q')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
before = str(child.before.decode())
assert_before(child, [
@ -578,16 +568,16 @@ def test_multi_daemon_subactors(
'''
child = spawn('multi_daemon_subactors')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
# there is a race for which subactor will acquire
# the root's tty lock first
before = str(child.before.decode())
# there can be a race for which subactor will acquire
# the root's tty lock first so anticipate either crash
# message on the first entry.
bp_forever_msg = "Attaching pdb to actor: ('bp_forever'"
name_error_msg = "NameError"
name_error_msg = "NameError: name 'doggypants' is not defined"
before = str(child.before.decode())
if bp_forever_msg in before:
next_msg = name_error_msg
@ -608,10 +598,8 @@ def test_multi_daemon_subactors(
# second entry by `bp_forever`.
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert next_msg in before
child.expect(PROMPT)
assert_before(child, [next_msg])
# XXX: hooray the root clobbering the child here was fixed!
# IMO, this demonstrates the true power of SC system design.
@ -630,31 +618,50 @@ def test_multi_daemon_subactors(
if ctlc:
do_ctlc(child)
# wait for final error in root
while True:
# expect another breakpoint actor entry
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
child.expect(PROMPT)
try:
# root error should be packed as remote error
assert "_exceptions.RemoteActorError: ('name_error'" in before
break
assert_before(child, [bp_forever_msg])
except AssertionError:
assert bp_forever_msg in before
assert_before(child, [name_error_msg])
else:
if ctlc:
do_ctlc(child)
# should crash with the 2nd name error (simulates
# a retry) and then the root eventually (boxed) errors
# after 1 or more further bp actor entries.
child.sendline('c')
child.expect(PROMPT)
assert_before(child, [name_error_msg])
# wait for final error in root
# where it crashs with boxed error
while True:
try:
child.sendline('c')
child.expect(pexpect.EOF)
child.expect(PROMPT)
assert_before(
child,
[bp_forever_msg]
)
except AssertionError:
break
except TIMEOUT:
# Failed to exit using continue..?
child.sendline('q')
assert_before(
child,
[
# boxed error raised in root task
"Attaching to pdb in crashed actor: ('root'",
"_exceptions.RemoteActorError: ('name_error'",
]
)
child.sendline('c')
child.expect(pexpect.EOF)
@ -670,8 +677,8 @@ def test_multi_subactors_root_errors(
'''
child = spawn('multi_subactor_root_errors')
# scan for the pdbpp prompt
child.expect(r"\(Pdb\+\+\)")
# scan for the prompt
child.expect(PROMPT)
# at most one subactor should attach before the root is cancelled
before = str(child.before.decode())
@ -683,7 +690,15 @@ def test_multi_subactors_root_errors(
# continue again to catch 2nd name error from
# actor 'name_error_1' (which is 2nd depth).
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
# due to block list strat from #337, this will no longer
# propagate before the root errors and cancels the spawner sub-tree.
child.expect(PROMPT)
# only if the blocking condition doesn't kick in fast enough
before = str(child.before.decode())
if "Debug lock blocked for ['name_error_1'" not in before:
assert_before(child, [
"Attaching to pdb in crashed actor: ('name_error_1'",
"NameError",
@ -693,10 +708,15 @@ def test_multi_subactors_root_errors(
do_ctlc(child)
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
# check if the spawner crashed or was blocked from debug
# and if this intermediary attached check the boxed error
before = str(child.before.decode())
if "Attaching to pdb in crashed actor: ('spawn_error'" in before:
assert_before(child, [
"Attaching to pdb in crashed actor: ('spawn_error'",
# boxed error from previous step
# boxed error from spawner's child
"RemoteActorError: ('name_error_1'",
"NameError",
])
@ -705,27 +725,29 @@ def test_multi_subactors_root_errors(
do_ctlc(child)
child.sendline('c')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
# expect a root actor crash
assert_before(child, [
"Attaching to pdb in crashed actor: ('root'",
# boxed error from previous step
"RemoteActorError: ('name_error'",
"NameError",
# error from root actor and root task that created top level nursery
"Attaching to pdb in crashed actor: ('root'",
"AssertionError",
])
# warnings assert we probably don't need
# assert "Cancelling nursery in ('spawn_error'," in before
if ctlc:
do_ctlc(child)
# continue again
child.sendline('c')
child.expect(pexpect.EOF)
before = str(child.before.decode())
# error from root actor and root task that created top level nursery
assert "AssertionError" in before
assert_before(child, [
# "Attaching to pdb in crashed actor: ('root'",
# boxed error from previous step
"RemoteActorError: ('name_error'",
"NameError",
"AssertionError",
'assert 0',
])
@has_nested_actors
@ -750,24 +772,31 @@ def test_multi_nested_subactors_error_through_nurseries(
timed_out_early: bool = False
for i in range(12):
for send_char in itertools.cycle(['c', 'q']):
try:
child.expect(r"\(Pdb\+\+\)")
child.sendline('c')
time.sleep(0.1)
child.expect(PROMPT)
child.sendline(send_char)
time.sleep(0.01)
except EOF:
# race conditions on how fast the continue is sent?
print(f"Failed early on {i}?")
timed_out_early = True
break
else:
child.expect(pexpect.EOF)
if not timed_out_early:
before = str(child.before.decode())
assert "NameError" in before
assert_before(child, [
# boxed source errors
"NameError: name 'doggypants' is not defined",
"tractor._exceptions.RemoteActorError: ('name_error'",
"bdb.BdbQuit",
# first level subtrees
"tractor._exceptions.RemoteActorError: ('spawner0'",
# "tractor._exceptions.RemoteActorError: ('spawner1'",
# propagation of errors up through nested subtrees
"tractor._exceptions.RemoteActorError: ('spawn_until_0'",
"tractor._exceptions.RemoteActorError: ('spawn_until_1'",
"tractor._exceptions.RemoteActorError: ('spawn_until_2'",
])
@pytest.mark.timeout(15)
@ -787,7 +816,7 @@ def test_root_nursery_cancels_before_child_releases_tty_lock(
child = spawn('root_cancelled_but_child_is_in_tty_lock')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
before = str(child.before.decode())
assert "NameError: name 'doggypants' is not defined" in before
@ -802,7 +831,7 @@ def test_root_nursery_cancels_before_child_releases_tty_lock(
for i in range(4):
time.sleep(0.5)
try:
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
except (
EOF,
@ -859,7 +888,7 @@ def test_root_cancels_child_context_during_startup(
'''
child = spawn('fast_error_in_root_after_spawn')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
before = str(child.before.decode())
assert "AssertionError" in before
@ -876,7 +905,7 @@ def test_different_debug_mode_per_actor(
ctlc: bool,
):
child = spawn('per_actor_debug')
child.expect(r"\(Pdb\+\+\)")
child.expect(PROMPT)
# only one actor should enter the debugger
before = str(child.before.decode())

View File

@ -12,17 +12,17 @@ import shutil
import pytest
from conftest import repodir
def examples_dir():
"""Return the abspath to the examples directory.
"""
return os.path.join(repodir(), 'examples')
from conftest import (
examples_dir,
)
@pytest.fixture
def run_example_in_subproc(loglevel, testdir, arb_addr):
def run_example_in_subproc(
loglevel: str,
testdir,
arb_addr: tuple[str, int],
):
@contextmanager
def run(script_code):
@ -32,8 +32,8 @@ def run_example_in_subproc(loglevel, testdir, arb_addr):
# on windows we need to create a special __main__.py which will
# be executed with ``python -m <modulename>`` on windows..
shutil.copyfile(
os.path.join(examples_dir(), '__main__.py'),
os.path.join(str(testdir), '__main__.py')
examples_dir() / '__main__.py',
str(testdir / '__main__.py'),
)
# drop the ``if __name__ == '__main__'`` guard onwards from
@ -88,6 +88,7 @@ def run_example_in_subproc(loglevel, testdir, arb_addr):
and f[0] != '_'
and 'debugging' not in p[0]
and 'integration' not in p[0]
and 'advanced_faults' not in p[0]
],
ids=lambda t: t[1],

View File

@ -8,6 +8,7 @@ import builtins
import itertools
import importlib
from exceptiongroup import BaseExceptionGroup
import pytest
import trio
import tractor
@ -409,11 +410,12 @@ def test_trio_error_cancels_intertask_chan(arb_addr):
# should trigger remote actor error
await portal.result()
with pytest.raises(RemoteActorError) as excinfo:
with pytest.raises(BaseExceptionGroup) as excinfo:
trio.run(main)
# ensure boxed error is correct
assert excinfo.value.type == Exception
# ensure boxed errors
for exc in excinfo.value.exceptions:
assert exc.type == Exception
def test_trio_closes_early_and_channel_exits(arb_addr):
@ -442,11 +444,12 @@ def test_aio_errors_and_channel_propagates_and_closes(arb_addr):
# should trigger remote actor error
await portal.result()
with pytest.raises(RemoteActorError) as excinfo:
with pytest.raises(BaseExceptionGroup) as excinfo:
trio.run(main)
# ensure boxed error is correct
assert excinfo.value.type == Exception
# ensure boxed errors
for exc in excinfo.value.exceptions:
assert exc.type == Exception
@tractor.context

View File

@ -251,7 +251,7 @@ def test_a_quadruple_example(time_quad_ex, ci_env, spawn_backend):
results, diff = time_quad_ex
assert results
this_fast = 6 if platform.system() in ('Windows', 'Darwin') else 2.666
this_fast = 6 if platform.system() in ('Windows', 'Darwin') else 3
assert diff < this_fast

View File

@ -11,15 +11,15 @@ from conftest import tractor_test
@pytest.mark.trio
async def test_no_arbitter():
async def test_no_runtime():
"""An arbitter must be established before any nurseries
can be created.
(In other words ``tractor.open_root_actor()`` must be engaged at
some point?)
"""
with pytest.raises(RuntimeError):
with tractor.open_nursery():
with pytest.raises(RuntimeError) :
async with tractor.find_actor('doggy'):
pass

View File

@ -62,7 +62,10 @@ async def test_lifetime_stack_wipes_tmpfile(
)
).result()
except tractor.RemoteActorError:
except (
tractor.RemoteActorError,
tractor.BaseExceptionGroup,
):
pass
# tmp file should have been wiped by

View File

@ -12,7 +12,10 @@ import pytest
import trio
from trio.lowlevel import current_task
import tractor
from tractor.trionics import broadcast_receiver, Lagged
from tractor.trionics import (
broadcast_receiver,
Lagged,
)
@tractor.context
@ -37,7 +40,7 @@ async def echo_sequences(
async def ensure_sequence(
stream: tractor.ReceiveMsgStream,
stream: tractor.MsgStream,
sequence: list,
delay: Optional[float] = None,
@ -211,7 +214,8 @@ def test_faster_task_to_recv_is_cancelled_by_slower(
arb_addr,
start_method,
):
'''Ensure that if a faster task consuming from a stream is cancelled
'''
Ensure that if a faster task consuming from a stream is cancelled
the slower task can continue to receive all expected values.
'''
@ -460,3 +464,51 @@ def test_first_recver_is_cancelled():
assert value == 1
trio.run(main)
def test_no_raise_on_lag():
'''
Run a simple 2-task broadcast where one task is slow but configured
so that it does not raise `Lagged` on overruns using
`raise_on_lasg=False` and verify that the task does not raise.
'''
size = 100
tx, rx = trio.open_memory_channel(size)
brx = broadcast_receiver(rx, size)
async def slow():
async with brx.subscribe(
raise_on_lag=False,
) as br:
async for msg in br:
print(f'slow task got: {msg}')
await trio.sleep(0.1)
async def fast():
async with brx.subscribe() as br:
async for msg in br:
print(f'fast task got: {msg}')
async def main():
async with (
tractor.open_root_actor(
# NOTE: so we see the warning msg emitted by the bcaster
# internals when the no raise flag is set.
loglevel='warning',
),
trio.open_nursery() as n,
):
n.start_soon(slow)
n.start_soon(fast)
for i in range(1000):
await tx.send(i)
# simulate user nailing ctl-c after realizing
# there's a lag in the slow task.
await trio.sleep(1)
raise KeyboardInterrupt
with pytest.raises(KeyboardInterrupt):
trio.run(main)

View File

@ -18,13 +18,12 @@
tractor: structured concurrent "actors".
"""
from trio import MultiError
from exceptiongroup import BaseExceptionGroup
from ._clustering import open_actor_cluster
from ._ipc import Channel
from ._streaming import (
Context,
ReceiveMsgStream,
MsgStream,
stream,
context,
@ -45,7 +44,10 @@ from ._exceptions import (
ModuleNotExposed,
ContextCancelled,
)
from ._debug import breakpoint, post_mortem
from ._debug import (
breakpoint,
post_mortem,
)
from . import msg
from ._root import (
run_daemon,
@ -62,9 +64,8 @@ __all__ = [
'ContextCancelled',
'ModuleNotExposed',
'MsgStream',
'MultiError',
'BaseExceptionGroup',
'Portal',
'ReceiveMsgStream',
'RemoteActorError',
'breakpoint',
'context',

View File

@ -32,9 +32,12 @@ import tractor
async def open_actor_cluster(
modules: list[str],
count: int = cpu_count(),
names: Optional[list[str]] = None,
start_method: Optional[str] = None,
names: list[str] | None = None,
hard_kill: bool = False,
# passed through verbatim to ``open_root_actor()``
**runtime_kwargs,
) -> AsyncGenerator[
dict[str, tractor.Portal],
None,
@ -49,7 +52,9 @@ async def open_actor_cluster(
raise ValueError(
'Number of names is {len(names)} but count it {count}')
async with tractor.open_nursery(start_method=start_method) as an:
async with tractor.open_nursery(
**runtime_kwargs,
) as an:
async with trio.open_nursery() as n:
uid = tractor.current_actor().uid

View File

@ -20,11 +20,16 @@ Multi-core debugging for da peeps!
"""
from __future__ import annotations
import bdb
import os
import sys
import signal
from functools import partial
from functools import (
partial,
cached_property,
)
from contextlib import asynccontextmanager as acm
from typing import (
Any,
Optional,
Callable,
AsyncIterator,
@ -32,6 +37,7 @@ from typing import (
)
from types import FrameType
import pdbp
import tractor
import trio
from trio_typing import TaskStatus
@ -48,17 +54,6 @@ from ._exceptions import (
)
from ._ipc import Channel
try:
# wtf: only exported when installed in dev mode?
import pdbpp
except ImportError:
# pdbpp is installed in regular mode...it monkey patches stuff
import pdb
xpm = getattr(pdb, 'xpm', None)
assert xpm, "pdbpp is not installed?" # type: ignore
pdbpp = pdb
log = get_logger(__name__)
@ -72,11 +67,16 @@ class Lock:
Mostly to avoid a lot of ``global`` declarations for now XD.
'''
repl: MultiActorPdb | None = None
# placeholder for function to set a ``trio.Event`` on debugger exit
# pdb_release_hook: Optional[Callable] = None
_trio_handler: Callable[
[int, Optional[FrameType]], Any
] | int | None = None
# actor-wide variable pointing to current task name using debugger
local_task_in_debug: Optional[str] = None
local_task_in_debug: str | None = None
# NOTE: set by the current task waiting on the root tty lock from
# the CALLER side of the `lock_tty_for_child()` context entry-call
@ -106,18 +106,15 @@ class Lock:
def shield_sigint(cls):
cls._orig_sigint_handler = signal.signal(
signal.SIGINT,
shield_sigint,
shield_sigint_handler,
)
@classmethod
def unshield_sigint(cls):
if cls._orig_sigint_handler is not None:
# restore original sigint handler
signal.signal(
signal.SIGINT,
cls._orig_sigint_handler
)
# always restore ``trio``'s sigint handler. see notes below in
# the pdb factory about the nightmare that is that code swapping
# out the handler when the repl activates...
signal.signal(signal.SIGINT, cls._trio_handler)
cls._orig_sigint_handler = None
@classmethod
@ -144,24 +141,29 @@ class Lock:
finally:
# restore original sigint handler
cls.unshield_sigint()
cls.repl = None
class TractorConfig(pdbpp.DefaultConfig):
class TractorConfig(pdbp.DefaultConfig):
'''
Custom ``pdbpp`` goodness.
Custom ``pdbp`` goodness :surfer:
'''
# use_pygments = True
# sticky_by_default = True
enable_hidden_frames = False
use_pygments: bool = True
sticky_by_default: bool = False
enable_hidden_frames: bool = False
# much thanks @mdmintz for the hot tip!
# fixes line spacing issue when resizing terminal B)
truncate_long_lines: bool = False
class MultiActorPdb(pdbpp.Pdb):
class MultiActorPdb(pdbp.Pdb):
'''
Add teardown hooks to the regular ``pdbpp.Pdb``.
Add teardown hooks to the regular ``pdbp.Pdb``.
'''
# override the pdbpp config with our coolio one
# override the pdbp config with our coolio one
DefaultConfig = TractorConfig
# def preloop(self):
@ -182,6 +184,35 @@ class MultiActorPdb(pdbpp.Pdb):
finally:
Lock.release()
# XXX NOTE: we only override this because apparently the stdlib pdb
# bois likes to touch the SIGINT handler as much as i like to touch
# my d$%&.
def _cmdloop(self):
self.cmdloop()
@cached_property
def shname(self) -> str | None:
'''
Attempt to return the login shell name with a special check for
the infamous `xonsh` since it seems to have some issues much
different from std shells when it comes to flushing the prompt?
'''
# SUPER HACKY and only really works if `xonsh` is not used
# before spawning further sub-shells..
shpath = os.getenv('SHELL', None)
if shpath:
if (
os.getenv('XONSH_LOGIN', default=False)
or 'xonsh' in shpath
):
return 'xonsh'
return os.path.basename(shpath)
return None
@acm
async def _acquire_debug_lock_from_root_task(
@ -276,7 +307,7 @@ async def lock_tty_for_child(
) -> str:
'''
Lock the TTY in the root process of an actor tree in a new
inter-actor-context-task such that the ``pdbpp`` debugger console
inter-actor-context-task such that the ``pdbp`` debugger console
can be mutex-allocated to the calling sub-actor for REPL control
without interference by other processes / threads.
@ -363,7 +394,7 @@ async def wait_for_parent_stdin_hijack(
) as (ctx, val):
log.pdb('locked context')
log.debug('locked context')
assert val == 'Locked'
async with ctx.open_stream() as stream:
@ -382,21 +413,21 @@ async def wait_for_parent_stdin_hijack(
# sync with callee termination
assert await ctx.result() == "pdb_unlock_complete"
log.pdb('unlocked context')
log.debug('exitting child side locking task context')
except ContextCancelled:
log.warning('Root actor cancelled debug lock')
raise
finally:
log.pdb(f"Exiting debugger for actor {actor_uid}")
Lock.local_task_in_debug = None
log.pdb(f"Child {actor_uid} released parent stdio lock")
log.debug('Exiting debugger from child')
def mk_mpdb() -> tuple[MultiActorPdb, Callable]:
pdb = MultiActorPdb()
# signal.signal = pdbpp.hideframe(signal.signal)
# signal.signal = pdbp.hideframe(signal.signal)
Lock.shield_sigint()
@ -423,9 +454,8 @@ async def _breakpoint(
'''
__tracebackhide__ = True
pdb, undo_sigint = mk_mpdb()
actor = tractor.current_actor()
pdb, undo_sigint = mk_mpdb()
task_name = trio.lowlevel.current_task().name
# TODO: is it possible to debug a trio.Cancelled except block?
@ -435,7 +465,10 @@ async def _breakpoint(
# with trio.CancelScope(shield=shield):
# await trio.lowlevel.checkpoint()
if not Lock.local_pdb_complete or Lock.local_pdb_complete.is_set():
if (
not Lock.local_pdb_complete
or Lock.local_pdb_complete.is_set()
):
Lock.local_pdb_complete = trio.Event()
# TODO: need a more robust check for the "root" actor
@ -449,7 +482,10 @@ async def _breakpoint(
# Recurrence entry case: this task already has the lock and
# is likely recurrently entering a breakpoint
if Lock.local_task_in_debug == task_name:
# noop on recurrent entry case
# noop on recurrent entry case but we want to trigger
# a checkpoint to allow other actors error-propagate and
# potetially avoid infinite re-entries in some subactor.
await trio.lowlevel.checkpoint()
return
# if **this** actor is already in debug mode block here
@ -468,18 +504,29 @@ async def _breakpoint(
# root nursery so that the debugger can continue to run without
# being restricted by the scope of a new task nursery.
# NOTE: if we want to debug a trio.Cancelled triggered exception
# TODO: if we want to debug a trio.Cancelled triggered exception
# we have to figure out how to avoid having the service nursery
# cancel on this task start? I *think* this works below?
# cancel on this task start? I *think* this works below:
# ```python
# actor._service_n.cancel_scope.shield = shield
# ```
# but not entirely sure if that's a sane way to implement it?
try:
with trio.CancelScope(shield=True):
await actor._service_n.start(
wait_for_parent_stdin_hijack,
actor.uid,
)
Lock.repl = pdb
except RuntimeError:
Lock.release()
if actor._cancel_called:
# service nursery won't be usable and we
# don't want to lock up the root either way since
# we're in (the midst of) cancellation.
return
raise
elif is_root_process():
@ -509,6 +556,7 @@ async def _breakpoint(
Lock.global_actor_in_debug = actor.uid
Lock.local_task_in_debug = task_name
Lock.repl = pdb
try:
# block here one (at the appropriate frame *up*) where
@ -529,22 +577,18 @@ async def _breakpoint(
# # frame = sys._getframe()
# # last_f = frame.f_back
# # last_f.f_globals['__tracebackhide__'] = True
# # signal.signal = pdbpp.hideframe(signal.signal)
# signal.signal(
# signal.SIGINT,
# orig_handler
# )
# # signal.signal = pdbp.hideframe(signal.signal)
def shield_sigint(
def shield_sigint_handler(
signum: int,
frame: 'frame', # type: ignore # noqa
pdb_obj: Optional[MultiActorPdb] = None,
# pdb_obj: Optional[MultiActorPdb] = None,
*args,
) -> None:
'''
Specialized debugger compatible SIGINT handler.
Specialized, debugger-aware SIGINT handler.
In childred we always ignore to avoid deadlocks since cancellation
should always be managed by the parent supervising actor. The root
@ -556,6 +600,7 @@ def shield_sigint(
uid_in_debug = Lock.global_actor_in_debug
actor = tractor.current_actor()
# print(f'{actor.uid} in HANDLER with ')
def do_cancel():
# If we haven't tried to cancel the runtime then do that instead
@ -589,6 +634,9 @@ def shield_sigint(
)
return do_cancel()
# only set in the actor actually running the REPL
pdb_obj = Lock.repl
# root actor branch that reports whether or not a child
# has locked debugger.
if (
@ -601,16 +649,36 @@ def shield_sigint(
# which has already terminated to unlock.
and any_connected
):
# we are root and some actor is in debug mode
# if uid_in_debug is not None:
if pdb_obj:
name = uid_in_debug[0]
if name != 'root':
log.pdb(
f"Ignoring SIGINT while child in debug mode: `{uid_in_debug}`"
f"Ignoring SIGINT, child in debug mode: `{uid_in_debug}`"
)
else:
log.pdb(
"Ignoring SIGINT while in debug mode"
)
elif (
is_root_process()
):
if pdb_obj:
log.pdb(
"Ignoring SIGINT since debug mode is enabled"
)
if (
Lock._root_local_task_cs_in_debug
and not Lock._root_local_task_cs_in_debug.cancel_called
):
Lock._root_local_task_cs_in_debug.cancel()
# revert back to ``trio`` handler asap!
Lock.unshield_sigint()
# child actor that has locked the debugger
elif not is_root_process():
@ -626,7 +694,10 @@ def shield_sigint(
return do_cancel()
task = Lock.local_task_in_debug
if task:
if (
task
and pdb_obj
):
log.pdb(
f"Ignoring SIGINT while task in debug mode: `{task}`"
)
@ -636,20 +707,26 @@ def shield_sigint(
# https://github.com/goodboy/tractor/issues/320
# elif debug_mode():
else:
log.pdb(
"Ignoring SIGINT since debug mode is enabled"
)
else: # XXX: shouldn't ever get here?
print("WTFWTFWTF")
raise KeyboardInterrupt
# NOTE: currently (at least on ``fancycompleter`` 0.9.2)
# it lookks to be that the last command that was run (eg. ll)
# it looks to be that the last command that was run (eg. ll)
# will be repeated by default.
# TODO: maybe redraw/print last REPL output to console
# maybe redraw/print last REPL output to console since
# we want to alert the user that more input is expect since
# nothing has been done dur to ignoring sigint.
if (
pdb_obj
and sys.version_info <= (3, 10)
pdb_obj # only when this actor has a REPL engaged
):
# XXX: yah, mega hack, but how else do we catch this madness XD
if pdb_obj.shname == 'xonsh':
pdb_obj.stdout.write(pdb_obj.prompt)
pdb_obj.stdout.flush()
# TODO: make this work like sticky mode where if there is output
# detected as written to the tty we redraw this part underneath
# and erase the past draw of this same bit above?
@ -660,21 +737,13 @@ def shield_sigint(
# https://github.com/goodboy/tractor/issues/130#issuecomment-663752040
# https://github.com/prompt-toolkit/python-prompt-toolkit/blob/c2c6af8a0308f9e5d7c0e28cb8a02963fe0ce07a/prompt_toolkit/patch_stdout.py
# XXX: lol, see ``pdbpp`` issue:
# XXX LEGACY: lol, see ``pdbpp`` issue:
# https://github.com/pdbpp/pdbpp/issues/496
# TODO: pretty sure this is what we should expect to have to run
# in total but for now we're just going to wait until `pdbpp`
# figures out it's own stuff on 3.10 (and maybe we'll help).
# pdb_obj.do_longlist(None)
# XXX: we were doing this but it shouldn't be required..
print(pdb_obj.prompt, end='', flush=True)
def _set_trace(
actor: Optional[tractor.Actor] = None,
pdb: Optional[MultiActorPdb] = None,
actor: tractor.Actor | None = None,
pdb: MultiActorPdb | None = None,
):
__tracebackhide__ = True
actor = actor or tractor.current_actor()
@ -684,7 +753,11 @@ def _set_trace(
if frame:
frame = frame.f_back # type: ignore
if frame and pdb and actor is not None:
if (
frame
and pdb
and actor is not None
):
log.pdb(f"\nAttaching pdb to actor: {actor.uid}\n")
# no f!#$&* idea, but when we're in async land
# we need 2x frames up?
@ -693,7 +766,8 @@ def _set_trace(
else:
pdb, undo_sigint = mk_mpdb()
# we entered the global ``breakpoint()`` built-in from sync code?
# we entered the global ``breakpoint()`` built-in from sync
# code?
Lock.local_task_in_debug = 'sync'
pdb.set_trace(frame=frame)
@ -723,7 +797,7 @@ def _post_mortem(
# https://github.com/pdbpp/pdbpp/issues/480
# TODO: help with a 3.10+ major release if/when it arrives.
pdbpp.xpm(Pdb=lambda: pdb)
pdbp.xpm(Pdb=lambda: pdb)
post_mortem = partial(
@ -794,7 +868,10 @@ async def maybe_wait_for_debugger(
) -> None:
if not debug_mode() and not child_in_debug:
if (
not debug_mode()
and not child_in_debug
):
return
if (

View File

@ -108,7 +108,7 @@ async def query_actor(
@acm
async def find_actor(
name: str,
arbiter_sockaddr: tuple[str, int] = None
arbiter_sockaddr: tuple[str, int] | None = None
) -> AsyncGenerator[Optional[Portal], None]:
'''
@ -134,7 +134,7 @@ async def find_actor(
@acm
async def wait_for_actor(
name: str,
arbiter_sockaddr: tuple[str, int] = None
arbiter_sockaddr: tuple[str, int] | None = None
) -> AsyncGenerator[Portal, None]:
"""Wait on an actor to register with the arbiter.

View File

@ -51,7 +51,7 @@ def _mp_main(
accept_addr: tuple[str, int],
forkserver_info: tuple[Any, Any, Any, Any, Any],
start_method: SpawnMethodKey,
parent_addr: tuple[str, int] = None,
parent_addr: tuple[str, int] | None = None,
infect_asyncio: bool = False,
) -> None:
@ -98,7 +98,7 @@ def _trio_main(
actor: Actor, # type: ignore
*,
parent_addr: tuple[str, int] = None,
parent_addr: tuple[str, int] | None = None,
infect_asyncio: bool = False,
) -> None:

View File

@ -27,6 +27,7 @@ import importlib
import builtins
import traceback
import exceptiongroup as eg
import trio
@ -52,9 +53,6 @@ class RemoteActorError(Exception):
self.type = suberror_type
self.msgdata = msgdata
# TODO: a trio.MultiError.catch like context manager
# for catching underlying remote errors of a particular type
class InternalActorError(RemoteActorError):
"""Remote internal ``tractor`` error indicating
@ -123,10 +121,12 @@ def unpack_error(
err_type=RemoteActorError
) -> Exception:
"""Unpack an 'error' message from the wire
'''
Unpack an 'error' message from the wire
into a local ``RemoteActorError``.
"""
'''
__tracebackhide__ = True
error = msg['error']
tb_str = error.get('tb_str', '')
@ -139,7 +139,12 @@ def unpack_error(
suberror_type = trio.Cancelled
else: # try to lookup a suitable local error type
for ns in [builtins, _this_mod, trio]:
for ns in [
builtins,
_this_mod,
eg,
trio,
]:
try:
suberror_type = getattr(ns, type_name)
break
@ -158,12 +163,15 @@ def unpack_error(
def is_multi_cancelled(exc: BaseException) -> bool:
"""Predicate to determine if a ``trio.MultiError`` contains only
``trio.Cancelled`` sub-exceptions (and is likely the result of
'''
Predicate to determine if a possible ``eg.BaseExceptionGroup`` contains
only ``trio.Cancelled`` sub-exceptions (and is likely the result of
cancelling a collection of subtasks.
"""
return not trio.MultiError.filter(
lambda exc: exc if not isinstance(exc, trio.Cancelled) else None,
exc,
)
'''
if isinstance(exc, eg.BaseExceptionGroup):
return exc.subgroup(
lambda exc: isinstance(exc, trio.Cancelled)
) is not None
return False

View File

@ -341,7 +341,7 @@ class Channel:
async def connect(
self,
destaddr: tuple[Any, ...] = None,
destaddr: tuple[Any, ...] | None = None,
**kwargs
) -> MsgTransport:

View File

@ -45,24 +45,27 @@ from ._exceptions import (
NoResult,
ContextCancelled,
)
from ._streaming import Context, ReceiveMsgStream
from ._streaming import (
Context,
MsgStream,
)
log = get_logger(__name__)
def _unwrap_msg(
msg: dict[str, Any],
channel: Channel
) -> Any:
__tracebackhide__ = True
try:
return msg['return']
except KeyError:
# internal error should never get here
assert msg.get('cid'), "Received internal error at portal?"
raise unpack_error(msg, channel)
raise unpack_error(msg, channel) from None
class MessagingError(Exception):
@ -101,7 +104,7 @@ class Portal:
# it is expected that ``result()`` will be awaited at some
# point.
self._expect_result: Optional[Context] = None
self._streams: set[ReceiveMsgStream] = set()
self._streams: set[MsgStream] = set()
self.actor = current_actor()
async def _submit_for_result(
@ -136,6 +139,7 @@ class Portal:
Return the result(s) from the remote actor's "main" task.
'''
# __tracebackhide__ = True
# Check for non-rpc errors slapped on the
# channel for which we always raise
exc = self.channel._exc
@ -185,7 +189,7 @@ class Portal:
async def cancel_actor(
self,
timeout: float = None,
timeout: float | None = None,
) -> bool:
'''
@ -315,7 +319,7 @@ class Portal:
async_gen_func: Callable, # typing: ignore
**kwargs,
) -> AsyncGenerator[ReceiveMsgStream, None]:
) -> AsyncGenerator[MsgStream, None]:
if not inspect.isasyncgenfunction(async_gen_func):
if not (
@ -340,7 +344,7 @@ class Portal:
try:
# deliver receive only stream
async with ReceiveMsgStream(
async with MsgStream(
ctx, ctx._recv_chan,
) as rchan:
self._streams.add(rchan)
@ -460,7 +464,6 @@ class Portal:
# sure it's worth being pedantic:
# Exception,
# trio.Cancelled,
# trio.MultiError,
# KeyboardInterrupt,
) as err:
@ -497,6 +500,10 @@ class Portal:
f'actor: {uid}'
)
result = await ctx.result()
log.runtime(
f'Context {fn_name} returned '
f'value from callee `{result}`'
)
# though it should be impossible for any tasks
# operating *in* this scope to have survived
@ -518,12 +525,6 @@ class Portal:
f'task:{cid}\n'
f'actor:{uid}'
)
else:
log.runtime(
f'Context {fn_name} returned '
f'value from callee `{result}`'
)
# XXX: (MEGA IMPORTANT) if this is a root opened process we
# wait for any immediate child in debug before popping the
# context from the runtime msg loop otherwise inside
@ -536,7 +537,10 @@ class Portal:
await maybe_wait_for_debugger()
# remove the context from runtime tracking
self.actor._contexts.pop((self.channel.uid, ctx.cid))
self.actor._contexts.pop(
(self.channel.uid, ctx.cid),
None,
)
@dataclass

View File

@ -22,16 +22,21 @@ from contextlib import asynccontextmanager
from functools import partial
import importlib
import logging
import signal
import sys
import os
from typing import (
Optional,
)
import typing
import warnings
from exceptiongroup import BaseExceptionGroup
import trio
from ._runtime import Actor, Arbiter, async_main
from ._runtime import (
Actor,
Arbiter,
async_main,
)
from . import _debug
from . import _spawn
from . import _state
@ -51,37 +56,45 @@ logger = log.get_logger('tractor')
@asynccontextmanager
async def open_root_actor(
*,
# defaults are above
arbiter_addr: Optional[tuple[str, int]] = (
_default_arbiter_host,
_default_arbiter_port,
),
arbiter_addr: tuple[str, int] | None = None,
name: Optional[str] = 'root',
# defaults are above
registry_addr: tuple[str, int] | None = None,
name: str | None = 'root',
# either the `multiprocessing` start method:
# https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
# OR `trio` (the new default).
start_method: Optional[_spawn.SpawnMethodKey] = None,
start_method: _spawn.SpawnMethodKey | None = None,
# enables the multi-process debugger support
debug_mode: bool = False,
# internal logging
loglevel: Optional[str] = None,
loglevel: str | None = None,
enable_modules: Optional[list] = None,
rpc_module_paths: Optional[list] = None,
enable_modules: list | None = None,
rpc_module_paths: list | None = None,
) -> typing.Any:
"""Async entry point for ``tractor``.
'''
Runtime init entry point for ``tractor``.
"""
'''
# Override the global debugger hook to make it play nice with
# ``trio``, see:
# ``trio``, see much discussion in:
# https://github.com/python-trio/trio/issues/1155#issuecomment-742964018
builtin_bp_handler = sys.breakpointhook
orig_bp_path: str | None = os.environ.get('PYTHONBREAKPOINT', None)
os.environ['PYTHONBREAKPOINT'] = 'tractor._debug._set_trace'
# attempt to retreive ``trio``'s sigint handler and stash it
# on our debugger lock state.
_debug.Lock._trio_handler = signal.getsignal(signal.SIGINT)
# mark top most level process as root actor
_state._runtime_vars['_is_root'] = True
@ -100,10 +113,22 @@ async def open_root_actor(
if start_method is not None:
_spawn.try_set_start_method(start_method)
arbiter_addr = (host, port) = arbiter_addr or (
if arbiter_addr is not None:
warnings.warn(
'`arbiter_addr` is now deprecated and has been renamed to'
'`registry_addr`.\nUse that instead..',
DeprecationWarning,
stacklevel=2,
)
registry_addr = (host, port) = (
registry_addr
or arbiter_addr
or (
_default_arbiter_host,
_default_arbiter_port,
)
)
loglevel = (loglevel or log._default_loglevel).upper()
@ -148,7 +173,7 @@ async def open_root_actor(
except OSError:
# TODO: make this a "discovery" log level?
logger.warning(f"No actor could be found @ {host}:{port}")
logger.warning(f"No actor registry found @ {host}:{port}")
# create a local actor and start up its main routine/task
if arbiter_found:
@ -158,7 +183,7 @@ async def open_root_actor(
actor = Actor(
name or 'anonymous',
arbiter_addr=arbiter_addr,
arbiter_addr=registry_addr,
loglevel=loglevel,
enable_modules=enable_modules,
)
@ -174,7 +199,7 @@ async def open_root_actor(
actor = Arbiter(
name or 'arbiter',
arbiter_addr=arbiter_addr,
arbiter_addr=registry_addr,
loglevel=loglevel,
enable_modules=enable_modules,
)
@ -205,7 +230,10 @@ async def open_root_actor(
try:
yield actor
except (Exception, trio.MultiError) as err:
except (
Exception,
BaseExceptionGroup,
) as err:
entered = await _debug._maybe_enter_pm(err)
@ -229,6 +257,15 @@ async def open_root_actor(
await actor.cancel()
finally:
_state._current_actor = None
# restore breakpoint hook state
sys.breakpointhook = builtin_bp_handler
if orig_bp_path is not None:
os.environ['PYTHONBREAKPOINT'] = orig_bp_path
else:
# clear env back to having no entry
os.environ.pop('PYTHONBREAKPOINT')
logger.runtime("Root actor terminated")
@ -236,13 +273,13 @@ def run_daemon(
enable_modules: list[str],
# runtime kwargs
name: Optional[str] = 'root',
arbiter_addr: tuple[str, int] = (
name: str | None = 'root',
registry_addr: tuple[str, int] = (
_default_arbiter_host,
_default_arbiter_port,
),
start_method: Optional[str] = None,
start_method: str | None = None,
debug_mode: bool = False,
**kwargs
@ -264,7 +301,7 @@ def run_daemon(
async def _main():
async with open_root_actor(
arbiter_addr=arbiter_addr,
registry_addr=registry_addr,
name=name,
start_method=start_method,
debug_mode=debug_mode,

View File

@ -25,21 +25,23 @@ from itertools import chain
import importlib
import importlib.util
import inspect
import uuid
import signal
import sys
from typing import (
Any, Optional,
Union, TYPE_CHECKING,
Callable,
)
import uuid
from types import ModuleType
import sys
import os
from contextlib import ExitStack
import warnings
from async_generator import aclosing
from exceptiongroup import BaseExceptionGroup
import trio # type: ignore
from trio_typing import TaskStatus
from async_generator import aclosing
from ._ipc import Channel
from ._streaming import Context
@ -194,7 +196,7 @@ async def _invoke(
res = await coro
await chan.send({'return': res, 'cid': cid})
except trio.MultiError:
except BaseExceptionGroup:
# if a context error was set then likely
# thei multierror was raised due to that
if ctx._error is not None:
@ -226,11 +228,11 @@ async def _invoke(
fname = func.__name__
if ctx._cancel_called:
msg = f'{fname} cancelled itself'
msg = f'`{fname}()` cancelled itself'
elif cs.cancel_called:
msg = (
f'{fname} was remotely cancelled by its caller '
f'`{fname}()` was remotely cancelled by its caller '
f'{ctx.chan.uid}'
)
@ -266,7 +268,7 @@ async def _invoke(
except (
Exception,
trio.MultiError
BaseExceptionGroup,
) as err:
if not is_multi_cancelled(err):
@ -317,7 +319,7 @@ async def _invoke(
BrokenPipeError,
):
# if we can't propagate the error that's a big boo boo
log.error(
log.exception(
f"Failed to ship error to caller @ {chan.uid} !?"
)
@ -349,7 +351,7 @@ def _get_mod_abspath(module):
async def try_ship_error_to_parent(
channel: Channel,
err: Union[Exception, trio.MultiError],
err: Union[Exception, BaseExceptionGroup],
) -> None:
with trio.CancelScope(shield=True):
@ -421,8 +423,8 @@ class Actor:
name: str,
*,
enable_modules: list[str] = [],
uid: str = None,
loglevel: str = None,
uid: str | None = None,
loglevel: str | None = None,
arbiter_addr: Optional[tuple[str, int]] = None,
spawn_method: Optional[str] = None
) -> None:
@ -453,7 +455,7 @@ class Actor:
self._mods: dict[str, ModuleType] = {}
self.loglevel = loglevel
self._arb_addr = (
self._arb_addr: tuple[str, int] | None = (
str(arbiter_addr[0]),
int(arbiter_addr[1])
) if arbiter_addr else None
@ -486,7 +488,10 @@ class Actor:
self._parent_chan: Optional[Channel] = None
self._forkserver_info: Optional[
tuple[Any, Any, Any, Any, Any]] = None
self._actoruid2nursery: dict[Optional[tuple[str, str]], 'ActorNursery'] = {} # type: ignore # noqa
self._actoruid2nursery: dict[
tuple[str, str],
ActorNursery | None,
] = {} # type: ignore # noqa
async def wait_for_peer(
self, uid: tuple[str, str]
@ -708,6 +713,14 @@ class Actor:
log.runtime(f"No more channels for {chan.uid}")
self._peers.pop(uid, None)
log.runtime(f"Peers is {self._peers}")
# No more channels to other actors (at all) registered
# as connected.
if not self._peers:
log.runtime("Signalling no more peer channel connections")
self._no_more_peers.set()
# NOTE: block this actor from acquiring the
# debugger-TTY-lock since we have no way to know if we
# cancelled it and further there is no way to ensure the
@ -721,23 +734,16 @@ class Actor:
# if a now stale local task has the TTY lock still
# we cancel it to allow servicing other requests for
# the lock.
db_cs = pdb_lock._root_local_task_cs_in_debug
if (
pdb_lock._root_local_task_cs_in_debug
and not pdb_lock._root_local_task_cs_in_debug.cancel_called
db_cs
and not db_cs.cancel_called
):
log.warning(
f'STALE DEBUG LOCK DETECTED FOR {uid}'
)
# TODO: figure out why this breaks tests..
# pdb_lock._root_local_task_cs_in_debug.cancel()
log.runtime(f"Peers is {self._peers}")
# No more channels to other actors (at all) registered
# as connected.
if not self._peers:
log.runtime("Signalling no more peer channel connections")
self._no_more_peers.set()
db_cs.cancel()
# XXX: is this necessary (GC should do it)?
if chan.connected():
@ -823,7 +829,12 @@ class Actor:
if ctx._backpressure:
log.warning(text)
try:
await send_chan.send(msg)
except trio.BrokenResourceError:
# XXX: local consumer has closed their side
# so cancel the far end streaming task
log.warning(f"{chan} is already closed")
else:
try:
raise StreamOverrun(text) from None
@ -977,7 +988,7 @@ class Actor:
handler_nursery: trio.Nursery,
*,
# (host, port) to bind for channel server
accept_host: tuple[str, int] = None,
accept_host: tuple[str, int] | None = None,
accept_port: int = 0,
task_status: TaskStatus[trio.Nursery] = trio.TASK_STATUS_IGNORED,
) -> None:
@ -1228,6 +1239,10 @@ async def async_main(
and when cancelled effectively cancels the actor.
'''
# attempt to retreive ``trio``'s sigint handler and stash it
# on our debugger lock state.
_debug.Lock._trio_handler = signal.getsignal(signal.SIGINT)
registered_with_arbiter = False
try:
@ -1364,10 +1379,12 @@ async def async_main(
actor.lifetime_stack.close()
# Unregister actor from the arbiter
if registered_with_arbiter and (
actor._arb_addr is not None
if (
registered_with_arbiter
and not actor.is_arbiter
):
failed = False
assert isinstance(actor._arb_addr, tuple)
with trio.move_on_after(0.5) as cs:
cs.shield = True
try:
@ -1549,7 +1566,10 @@ async def process_messages(
partial(_invoke, actor, cid, chan, func, kwargs),
name=funcname,
)
except (RuntimeError, trio.MultiError):
except (
RuntimeError,
BaseExceptionGroup,
):
# avoid reporting a benign race condition
# during actor runtime teardown.
nursery_cancelled_before_task = True
@ -1589,12 +1609,18 @@ async def process_messages(
# handshake for them (yet) and instead we simply bail out of
# the message loop and expect the teardown sequence to clean
# up.
log.runtime(f'channel from {chan.uid} closed abruptly:\n{chan}')
log.runtime(
f'channel from {chan.uid} closed abruptly:\n'
f'-> {chan.raddr}\n'
)
# transport **was** disconnected
return True
except (Exception, trio.MultiError) as err:
except (
Exception,
BaseExceptionGroup,
) as err:
if nursery_cancelled_before_task:
sn = actor._service_n
assert sn and sn.cancel_scope.cancel_called
@ -1635,17 +1661,28 @@ class Arbiter(Actor):
'''
is_arbiter = True
def __init__(self, *args, **kwargs):
def __init__(self, *args, **kwargs) -> None:
self._registry: dict[
tuple[str, str],
tuple[str, int],
] = {}
self._waiters = {}
self._waiters: dict[
str,
# either an event to sync to receiving an actor uid (which
# is filled in once the actor has sucessfully registered),
# or that uid after registry is complete.
list[trio.Event | tuple[str, str]]
] = {}
super().__init__(*args, **kwargs)
async def find_actor(self, name: str) -> Optional[tuple[str, int]]:
async def find_actor(
self,
name: str,
) -> tuple[str, int] | None:
for uid, sockaddr in self._registry.items():
if name in uid:
return sockaddr
@ -1680,7 +1717,8 @@ class Arbiter(Actor):
registered.
'''
sockaddrs = []
sockaddrs: list[tuple[str, int]] = []
sockaddr: tuple[str, int]
for (aname, _), sockaddr in self._registry.items():
if name == aname:
@ -1690,7 +1728,9 @@ class Arbiter(Actor):
waiter = trio.Event()
self._waiters.setdefault(name, []).append(waiter)
await waiter.wait()
for uid in self._waiters[name]:
if not isinstance(uid, trio.Event):
sockaddrs.append(self._registry[uid])
return sockaddrs
@ -1701,11 +1741,11 @@ class Arbiter(Actor):
sockaddr: tuple[str, int]
) -> None:
uid = name, uuid = (str(uid[0]), str(uid[1]))
uid = name, _ = (str(uid[0]), str(uid[1]))
self._registry[uid] = (str(sockaddr[0]), int(sockaddr[1]))
# pop and signal all waiter events
events = self._waiters.pop(name, ())
events = self._waiters.pop(name, [])
self._waiters.setdefault(name, []).append(uid)
for event in events:
if isinstance(event, trio.Event):

View File

@ -23,14 +23,14 @@ import sys
import platform
from typing import (
Any,
Awaitable,
Literal,
Optional,
Callable,
TypeVar,
TYPE_CHECKING,
)
from collections.abc import Awaitable
from exceptiongroup import BaseExceptionGroup
import trio
from trio_typing import TaskStatus
@ -59,7 +59,7 @@ if TYPE_CHECKING:
log = get_logger('tractor')
# placeholder for an mp start context if so using that backend
_ctx: Optional[mp.context.BaseContext] = None
_ctx: mp.context.BaseContext | None = None
SpawnMethodKey = Literal[
'trio', # supported on all platforms
'mp_spawn',
@ -85,7 +85,7 @@ else:
def try_set_start_method(
key: SpawnMethodKey
) -> Optional[mp.context.BaseContext]:
) -> mp.context.BaseContext | None:
'''
Attempt to set the method for process starting, aka the "actor
spawning backend".
@ -139,6 +139,7 @@ async def exhaust_portal(
If the main task is an async generator do our best to consume
what's left of it.
'''
__tracebackhide__ = True
try:
log.debug(f"Waiting on final result from {actor.uid}")
@ -146,8 +147,11 @@ async def exhaust_portal(
# always be established and shutdown using a context manager api
final = await portal.result()
except (Exception, trio.MultiError) as err:
# we reraise in the parent task via a ``trio.MultiError``
except (
Exception,
BaseExceptionGroup,
) as err:
# we reraise in the parent task via a ``BaseExceptionGroup``
return err
except trio.Cancelled as err:
# lol, of course we need this too ;P
@ -175,7 +179,7 @@ async def cancel_on_completion(
'''
# if this call errors we store the exception for later
# in ``errors`` which will be reraised inside
# a MultiError and we still send out a cancel request
# an exception group and we still send out a cancel request
result = await exhaust_portal(portal, actor)
if isinstance(result, Exception):
errors[actor.uid] = result
@ -195,16 +199,37 @@ async def cancel_on_completion(
async def do_hard_kill(
proc: trio.Process,
terminate_after: int = 3,
) -> None:
# NOTE: this timeout used to do nothing since we were shielding
# the ``.wait()`` inside ``new_proc()`` which will pretty much
# never release until the process exits, now it acts as
# a hard-kill time ultimatum.
log.debug(f"Terminating {proc}")
with trio.move_on_after(terminate_after) as cs:
# NOTE: This ``__aexit__()`` shields internally.
async with proc: # calls ``trio.Process.aclose()``
log.debug(f"Terminating {proc}")
# NOTE: code below was copied verbatim from the now deprecated
# (in 0.20.0) ``trio._subrocess.Process.aclose()``, orig doc
# string:
#
# Close any pipes we have to the process (both input and output)
# and wait for it to exit. If cancelled, kills the process and
# waits for it to finish exiting before propagating the
# cancellation.
with trio.CancelScope(shield=True):
if proc.stdin is not None:
await proc.stdin.aclose()
if proc.stdout is not None:
await proc.stdout.aclose()
if proc.stderr is not None:
await proc.stderr.aclose()
try:
await proc.wait()
finally:
if proc.returncode is None:
proc.kill()
with trio.CancelScope(shield=True):
await proc.wait()
if cs.cancelled_caught:
# XXX: should pretty much never get here unless we have
@ -255,7 +280,9 @@ async def soft_wait(
if proc.poll() is None: # type: ignore
log.warning(
f'Process still alive after cancel request:\n{uid}')
'Actor still alive after cancel request:\n'
f'{uid}'
)
n.cancel_scope.cancel()
raise
@ -348,12 +375,11 @@ async def trio_proc(
spawn_cmd.append("--asyncio")
cancelled_during_spawn: bool = False
proc: Optional[trio.Process] = None
proc: trio.Process | None = None
try:
try:
# TODO: needs ``trio_typing`` patch?
proc = await trio.lowlevel.open_process( # type: ignore
spawn_cmd)
proc = await trio.lowlevel.open_process(spawn_cmd)
log.runtime(f"Started {proc}")
@ -437,8 +463,8 @@ async def trio_proc(
nursery.cancel_scope.cancel()
finally:
# The "hard" reap since no actor zombies are allowed!
# XXX: do this **after** cancellation/tearfown to avoid
# XXX NOTE XXX: The "hard" reap since no actor zombies are
# allowed! Do this **after** cancellation/teardown to avoid
# killing the process too early.
if proc:
log.cancel(f'Hard reap sequence starting for {subactor.uid}')
@ -452,6 +478,13 @@ async def trio_proc(
await proc.wait()
if is_root_process():
# TODO: solve the following issue where we need
# to do a similar wait like this but in an
# "intermediary" parent actor that itself isn't
# in debug but has a child that is, and we need
# to hold off on relaying SIGINT until that child
# is complete.
# https://github.com/goodboy/tractor/issues/320
await maybe_wait_for_debugger(
child_in_debug=_runtime_vars.get(
'_debug_mode', False),

View File

@ -22,7 +22,6 @@ from typing import (
Optional,
Any,
)
from collections.abc import Mapping
import trio
@ -46,30 +45,6 @@ def current_actor(err_on_no_runtime: bool = True) -> 'Actor': # type: ignore #
return _current_actor
_conc_name_getters = {
'task': trio.lowlevel.current_task,
'actor': current_actor
}
class ActorContextInfo(Mapping):
"Dyanmic lookup for local actor and task names"
_context_keys = ('task', 'actor')
def __len__(self):
return len(self._context_keys)
def __iter__(self):
return iter(self._context_keys)
def __getitem__(self, key: str) -> str:
try:
return _conc_name_getters[key]().name # type: ignore
except RuntimeError:
# no local actor/task context initialized yet
return f'no {key} context'
def is_main_process() -> bool:
"""Bool determining if this actor is running in the top-most process.
"""

View File

@ -50,12 +50,13 @@ log = get_logger(__name__)
# - use __slots__ on ``Context``?
class ReceiveMsgStream(trio.abc.ReceiveChannel):
class MsgStream(trio.abc.Channel):
'''
A IPC message stream for receiving logically sequenced values over
an inter-actor ``Channel``. This is the type returned to a local
task which entered either ``Portal.open_stream_from()`` or
``Context.open_stream()``.
A bidirectional message stream for receiving logically sequenced
values over an inter-actor IPC ``Channel``.
This is the type returned to a local task which entered either
``Portal.open_stream_from()`` or ``Context.open_stream()``.
Termination rules:
@ -97,6 +98,9 @@ class ReceiveMsgStream(trio.abc.ReceiveChannel):
if self._eoc:
raise trio.EndOfChannel
if self._closed:
raise trio.ClosedResourceError('This stream was closed')
try:
msg = await self._rx_chan.receive()
return msg['yield']
@ -110,6 +114,9 @@ class ReceiveMsgStream(trio.abc.ReceiveChannel):
# - 'error'
# possibly just handle msg['stop'] here!
if self._closed:
raise trio.ClosedResourceError('This stream was closed')
if msg.get('stop') or self._eoc:
log.debug(f"{self} was stopped at remote end")
@ -189,7 +196,6 @@ class ReceiveMsgStream(trio.abc.ReceiveChannel):
return
self._eoc = True
self._closed = True
# NOTE: this is super subtle IPC messaging stuff:
# Relay stop iteration to far end **iff** we're
@ -206,12 +212,8 @@ class ReceiveMsgStream(trio.abc.ReceiveChannel):
# In the bidirectional case, `Context.open_stream()` will create
# the `Actor._cids2qs` entry from a call to
# `Actor.get_context()` and will send the stop message in
# ``__aexit__()`` on teardown so it **does not** need to be
# called here.
if not self._ctx._portal:
# Only for 2 way streams can we can send stop from the
# caller side.
# `Actor.get_context()` and will call us here to send the stop
# msg in ``__aexit__()`` on teardown.
try:
# NOTE: if this call is cancelled we expect this end to
# handle as though the stop was never sent (though if it
@ -228,7 +230,14 @@ class ReceiveMsgStream(trio.abc.ReceiveChannel):
# the underlying channel may already have been pulled
# in which case our stop message is meaningless since
# it can't traverse the transport.
log.debug(f'Channel for {self} was already closed')
ctx = self._ctx
log.warning(
f'Stream was already destroyed?\n'
f'actor: {ctx.chan.uid}\n'
f'ctx id: {ctx.cid}'
)
self._closed = True
# Do we close the local mem chan ``self._rx_chan`` ??!?
@ -271,7 +280,8 @@ class ReceiveMsgStream(trio.abc.ReceiveChannel):
self,
) -> AsyncIterator[BroadcastReceiver]:
'''Allocate and return a ``BroadcastReceiver`` which delegates
'''
Allocate and return a ``BroadcastReceiver`` which delegates
to this message stream.
This allows multiple local tasks to receive each their own copy
@ -308,15 +318,15 @@ class ReceiveMsgStream(trio.abc.ReceiveChannel):
async with self._broadcaster.subscribe() as bstream:
assert bstream.key != self._broadcaster.key
assert bstream._recv == self._broadcaster._recv
# NOTE: we patch on a `.send()` to the bcaster so that the
# caller can still conduct 2-way streaming using this
# ``bstream`` handle transparently as though it was the msg
# stream instance.
bstream.send = self.send # type: ignore
yield bstream
class MsgStream(ReceiveMsgStream, trio.abc.Channel):
'''
Bidirectional message stream for use within an inter-actor actor
``Context```.
'''
async def send(
self,
data: Any
@ -593,23 +603,23 @@ class Context:
async with MsgStream(
ctx=self,
rx_chan=ctx._recv_chan,
) as rchan:
) as stream:
if self._portal:
self._portal._streams.add(rchan)
self._portal._streams.add(stream)
try:
self._stream_opened = True
# ensure we aren't cancelled before delivering
# the stream
# XXX: do we need this?
# ensure we aren't cancelled before yielding the stream
# await trio.lowlevel.checkpoint()
yield rchan
yield stream
# XXX: Make the stream "one-shot use". On exit, signal
# NOTE: Make the stream "one-shot use". On exit, signal
# ``trio.EndOfChannel``/``StopAsyncIteration`` to the
# far end.
await self.send_stop()
await stream.aclose()
finally:
if self._portal:

View File

@ -18,6 +18,7 @@
``trio`` inspired apis and helpers
"""
from contextlib import asynccontextmanager as acm
from functools import partial
import inspect
from typing import (
@ -27,8 +28,8 @@ from typing import (
import typing
import warnings
from exceptiongroup import BaseExceptionGroup
import trio
from async_generator import asynccontextmanager
from ._debug import maybe_wait_for_debugger
from ._state import current_actor, is_main_process
@ -82,7 +83,7 @@ class ActorNursery:
actor: Actor,
ria_nursery: trio.Nursery,
da_nursery: trio.Nursery,
errors: dict[tuple[str, str], Exception],
errors: dict[tuple[str, str], BaseException],
) -> None:
# self.supervisor = supervisor # TODO
self._actor: Actor = actor
@ -110,11 +111,11 @@ class ActorNursery:
name: str,
*,
bind_addr: tuple[str, int] = _default_bind_addr,
rpc_module_paths: list[str] = None,
enable_modules: list[str] = None,
loglevel: str = None, # set log level per subactor
nursery: trio.Nursery = None,
debug_mode: Optional[bool] = None,
rpc_module_paths: list[str] | None = None,
enable_modules: list[str] | None = None,
loglevel: str | None = None, # set log level per subactor
nursery: trio.Nursery | None = None,
debug_mode: Optional[bool] | None = None,
infect_asyncio: bool = False,
) -> Portal:
'''
@ -181,9 +182,9 @@ class ActorNursery:
name: Optional[str] = None,
bind_addr: tuple[str, int] = _default_bind_addr,
rpc_module_paths: Optional[list[str]] = None,
enable_modules: list[str] = None,
loglevel: str = None, # set log level per subactor
rpc_module_paths: list[str] | None = None,
enable_modules: list[str] | None = None,
loglevel: str | None = None, # set log level per subactor
infect_asyncio: bool = False,
**kwargs, # explicit args to ``fn``
@ -294,13 +295,17 @@ class ActorNursery:
self._join_procs.set()
@asynccontextmanager
@acm
async def _open_and_supervise_one_cancels_all_nursery(
actor: Actor,
) -> typing.AsyncGenerator[ActorNursery, None]:
# TODO: yay or nay?
__tracebackhide__ = True
# the collection of errors retreived from spawned sub-actors
errors: dict[tuple[str, str], Exception] = {}
errors: dict[tuple[str, str], BaseException] = {}
# This is the outermost level "deamon actor" nursery. It is awaited
# **after** the below inner "run in actor nursery". This allows for
@ -333,19 +338,17 @@ async def _open_and_supervise_one_cancels_all_nursery(
# after we yield upwards
yield anursery
# When we didn't error in the caller's scope,
# signal all process-monitor-tasks to conduct
# the "hard join phase".
log.runtime(
f"Waiting on subactors {anursery._children} "
"to complete"
)
# Last bit before first nursery block ends in the case
# where we didn't error in the caller's scope
# signal all process monitor tasks to conduct
# hard join phase.
anursery._join_procs.set()
except BaseException as err:
except BaseException as inner_err:
errors[actor.uid] = inner_err
# If we error in the root but the debugger is
# engaged we don't want to prematurely kill (and
@ -362,19 +365,18 @@ async def _open_and_supervise_one_cancels_all_nursery(
# worry more are coming).
anursery._join_procs.set()
try:
# XXX: hypothetically an error could be
# raised and then a cancel signal shows up
# slightly after in which case the `else:`
# block here might not complete? For now,
# shield both.
with trio.CancelScope(shield=True):
etype = type(err)
etype = type(inner_err)
if etype in (
trio.Cancelled,
KeyboardInterrupt
) or (
is_multi_cancelled(err)
is_multi_cancelled(inner_err)
):
log.cancel(
f"Nursery for {current_actor().uid} "
@ -382,29 +384,23 @@ async def _open_and_supervise_one_cancels_all_nursery(
else:
log.exception(
f"Nursery for {current_actor().uid} "
f"errored with {err}, ")
f"errored with")
# cancel all subactors
await anursery.cancel()
except trio.MultiError as merr:
# If we receive additional errors while waiting on
# remaining subactors that were cancelled,
# aggregate those errors with the original error
# that triggered this teardown.
if err not in merr.exceptions:
raise trio.MultiError(merr.exceptions + [err])
else:
raise
# ria_nursery scope end
# XXX: do we need a `trio.Cancelled` catch here as well?
# this is the catch around the ``.run_in_actor()`` nursery
# TODO: this is the handler around the ``.run_in_actor()``
# nursery. Ideally we can drop this entirely in the future as
# the whole ``.run_in_actor()`` API should be built "on top of"
# this lower level spawn-request-cancel "daemon actor" API where
# a local in-actor task nursery is used with one-to-one task
# + `await Portal.run()` calls and the results/errors are
# handled directly (inline) and errors by the local nursery.
except (
Exception,
trio.MultiError,
BaseExceptionGroup,
trio.Cancelled
) as err:
@ -436,18 +432,20 @@ async def _open_and_supervise_one_cancels_all_nursery(
with trio.CancelScope(shield=True):
await anursery.cancel()
# use `MultiError` as needed
# use `BaseExceptionGroup` as needed
if len(errors) > 1:
raise trio.MultiError(tuple(errors.values()))
raise BaseExceptionGroup(
'tractor.ActorNursery errored with',
tuple(errors.values()),
)
else:
raise list(errors.values())[0]
# ria_nursery scope end - nursery checkpoint
# after nursery exit
# da_nursery scope end - nursery checkpoint
# final exit
@asynccontextmanager
@acm
async def open_nursery(
**kwargs,

View File

@ -48,7 +48,7 @@ log = get_logger('messaging')
async def fan_out_to_ctxs(
pub_async_gen_func: typing.Callable, # it's an async gen ... gd mypy
topics2ctxs: dict[str, list],
packetizer: typing.Callable = None,
packetizer: typing.Callable | None = None,
) -> None:
'''
Request and fan out quotes to each subscribed actor channel.
@ -144,7 +144,7 @@ _pubtask2lock: dict[str, trio.StrictFIFOLock] = {}
def pub(
wrapped: typing.Callable = None,
wrapped: typing.Callable | None = None,
*,
tasks: set[str] = set(),
):
@ -249,8 +249,8 @@ def pub(
topics: set[str],
*args,
# *,
task_name: str = None, # default: only one task allocated
packetizer: Callable = None,
task_name: str | None = None, # default: only one task allocated
packetizer: Callable | None = None,
**kwargs,
):
if task_name is None:

View File

@ -18,12 +18,14 @@
Log like a forester!
"""
from collections.abc import Mapping
import sys
import logging
import colorlog # type: ignore
from typing import Optional
from ._state import ActorContextInfo
import trio
from ._state import current_actor
_proj_name: str = 'tractor'
@ -36,7 +38,8 @@ LOG_FORMAT = (
# "{bold_white}{log_color}{asctime}{reset}"
"{log_color}{asctime}{reset}"
" {bold_white}{thin_white}({reset}"
"{thin_white}{actor}, {process}, {task}){reset}{bold_white}{thin_white})"
"{thin_white}{actor_name}[{actor_uid}], "
"{process}, {task}){reset}{bold_white}{thin_white})"
" {reset}{log_color}[{reset}{bold_log_color}{levelname}{reset}{log_color}]"
" {log_color}{name}"
" {thin_white}{filename}{log_color}:{reset}{thin_white}{lineno}{log_color}"
@ -136,9 +139,40 @@ class StackLevelAdapter(logging.LoggerAdapter):
)
_conc_name_getters = {
'task': lambda: trio.lowlevel.current_task().name,
'actor': lambda: current_actor(),
'actor_name': lambda: current_actor().name,
'actor_uid': lambda: current_actor().uid[1][:6],
}
class ActorContextInfo(Mapping):
"Dyanmic lookup for local actor and task names"
_context_keys = (
'task',
'actor',
'actor_name',
'actor_uid',
)
def __len__(self):
return len(self._context_keys)
def __iter__(self):
return iter(self._context_keys)
def __getitem__(self, key: str) -> str:
try:
return _conc_name_getters[key]()
except RuntimeError:
# no local actor/task context initialized yet
return f'no {key} context'
def get_logger(
name: str = None,
name: str | None = None,
_root_name: str = _proj_name,
) -> StackLevelAdapter:
@ -173,7 +207,7 @@ def get_logger(
def get_console_log(
level: str = None,
level: str | None = None,
**kwargs,
) -> logging.LoggerAdapter:
'''Get the package logger and enable a handler which writes to stderr.

View File

@ -23,7 +23,6 @@ from __future__ import annotations
from abc import abstractmethod
from collections import deque
from contextlib import asynccontextmanager
from dataclasses import dataclass
from functools import partial
from operator import ne
from typing import Optional, Callable, Awaitable, Any, AsyncIterator, Protocol
@ -33,7 +32,10 @@ import trio
from trio._core._run import Task
from trio.abc import ReceiveChannel
from trio.lowlevel import current_task
from msgspec import Struct
from tractor.log import get_logger
log = get_logger(__name__)
# A regular invariant generic type
T = TypeVar("T")
@ -86,8 +88,7 @@ class Lagged(trio.TooSlowError):
'''
@dataclass
class BroadcastState:
class BroadcastState(Struct):
'''
Common state to all receivers of a broadcast.
@ -110,7 +111,35 @@ class BroadcastState:
eoc: bool = False
# If the broadcaster was cancelled, we might as well track it
cancelled: bool = False
cancelled: dict[int, Task] = {}
def statistics(self) -> dict[str, Any]:
'''
Return broadcast receiver group "statistics" like many of
``trio``'s internal task-sync primitives.
'''
key: int | None
ev: trio.Event | None
subs = self.subs
if self.recv_ready is not None:
key, ev = self.recv_ready
else:
key = ev = None
qlens: dict[int, int] = {}
for tid, sz in subs.items():
qlens[tid] = sz if sz != -1 else 0
return {
'open_consumers': len(subs),
'queued_len_by_task': qlens,
'max_buffer_size': self.maxlen,
'tasks_waiting': ev.statistics().tasks_waiting if ev else 0,
'tasks_cancelled': self.cancelled,
'next_value_receiver_id': key,
}
class BroadcastReceiver(ReceiveChannel):
@ -128,23 +157,40 @@ class BroadcastReceiver(ReceiveChannel):
rx_chan: AsyncReceiver,
state: BroadcastState,
receive_afunc: Optional[Callable[[], Awaitable[Any]]] = None,
raise_on_lag: bool = True,
) -> None:
# register the original underlying (clone)
self.key = id(self)
self._state = state
# each consumer has an int count which indicates
# which index contains the next value that the task has not yet
# consumed and thus should read. In the "up-to-date" case the
# consumer task must wait for a new value from the underlying
# receiver and we use ``-1`` as the sentinel for this state.
state.subs[self.key] = -1
# underlying for this receiver
self._rx = rx_chan
self._recv = receive_afunc or rx_chan.receive
self._closed: bool = False
self._raise_on_lag = raise_on_lag
async def receive(self) -> ReceiveType:
def receive_nowait(
self,
_key: int | None = None,
_state: BroadcastState | None = None,
key = self.key
state = self._state
) -> Any:
'''
Sync version of `.receive()` which does all the low level work
of receiving from the underlying/wrapped receive channel.
'''
key = _key or self.key
state = _state or self._state
# TODO: ideally we can make some way to "lock out" the
# underlying receive channel in some way such that if some task
@ -177,32 +223,47 @@ class BroadcastReceiver(ReceiveChannel):
# return this value."
# https://docs.rs/tokio/1.11.0/tokio/sync/broadcast/index.html#lagging
mxln = state.maxlen
lost = seq - mxln
# decrement to the last value and expect
# consumer to either handle the ``Lagged`` and come back
# or bail out on its own (thus un-subscribing)
state.subs[key] = state.maxlen - 1
state.subs[key] = mxln - 1
# this task was overrun by the producer side
task: Task = current_task()
raise Lagged(f'Task {task.name} was overrun')
msg = f'Task `{task.name}` overrun and dropped `{lost}` values'
if self._raise_on_lag:
raise Lagged(msg)
else:
log.warning(msg)
return self.receive_nowait(_key, _state)
state.subs[key] -= 1
return value
# current task already has the latest value **and** is the
# first task to begin waiting for a new one
if state.recv_ready is None:
raise trio.WouldBlock
async def _receive_from_underlying(
self,
key: int,
state: BroadcastState,
) -> ReceiveType:
if self._closed:
raise trio.ClosedResourceError
event = trio.Event()
assert state.recv_ready is None
state.recv_ready = key, event
try:
# if we're cancelled here it should be
# fine to bail without affecting any other consumers
# right?
try:
value = await self._recv()
# items with lower indices are "newer"
@ -220,7 +281,6 @@ class BroadcastReceiver(ReceiveChannel):
# already retreived the last value
# XXX: which of these impls is fastest?
# subs = state.subs.copy()
# subs.pop(key)
@ -251,54 +311,85 @@ class BroadcastReceiver(ReceiveChannel):
# consumers will be awoken with a sequence of -1
# and will potentially try to rewait the underlying
# receiver instead of just cancelling immediately.
self._state.cancelled = True
self._state.cancelled[key] = current_task()
if event.statistics().tasks_waiting:
event.set()
raise
finally:
# Reset receiver waiter task event for next blocking condition.
# this MUST be reset even if the above ``.recv()`` call
# was cancelled to avoid the next consumer from blocking on
# an event that won't be set!
state.recv_ready = None
async def receive(self) -> ReceiveType:
key = self.key
state = self._state
try:
return self.receive_nowait(
_key=key,
_state=state,
)
except trio.WouldBlock:
pass
# current task already has the latest value **and** is the
# first task to begin waiting for a new one so we begin blocking
# until rescheduled with the a new value from the underlying.
if state.recv_ready is None:
return await self._receive_from_underlying(key, state)
# This task is all caught up and ready to receive the latest
# value, so queue sched it on the internal event.
# value, so queue/schedule it to be woken on the next internal
# event.
else:
seq = state.subs[key]
assert seq == -1 # sanity
while state.recv_ready is not None:
# seq = state.subs[key]
# assert seq == -1 # sanity
_, ev = state.recv_ready
await ev.wait()
try:
return self.receive_nowait(
_key=key,
_state=state,
)
except trio.WouldBlock:
if self._closed:
raise trio.ClosedResourceError
# NOTE: if we ever would like the behaviour where if the
# first task to recv on the underlying is cancelled but it
# still DOES trigger the ``.recv_ready``, event we'll likely need
# this logic:
subs = state.subs
if (
len(subs) == 1
and key in subs
# or cancelled
):
# XXX: we are the last and only user of this BR so
# likely it makes sense to unwind back to the
# underlying?
# import tractor
# await tractor.breakpoint()
log.warning(
f'Only one sub left for {self}?\n'
'We can probably unwind from breceiver?'
)
if seq > -1:
# stuff from above..
seq = state.subs[key]
value = state.queue[seq]
state.subs[key] -= 1
return value
elif seq == -1:
# XXX: In the case where the first task to allocate the
# ``.recv_ready`` event is cancelled we will be woken with
# a non-incremented sequence number and thus will read the
# oldest value if we use that. Instead we need to detect if
# we have not been incremented and then receive again.
return await self.receive()
# ``.recv_ready`` event is cancelled we will be woken
# with a non-incremented sequence number (the ``-1``
# sentinel) and thus will read the oldest value if we
# use that. Instead we need to detect if we have not
# been incremented and then receive again.
# return await self.receive()
else:
raise ValueError(f'Invalid sequence {seq}!?')
return await self._receive_from_underlying(key, state)
@asynccontextmanager
async def subscribe(
self,
raise_on_lag: bool = True,
) -> AsyncIterator[BroadcastReceiver]:
'''
Subscribe for values from this broadcast receiver.
@ -316,6 +407,7 @@ class BroadcastReceiver(ReceiveChannel):
rx_chan=self._rx,
state=state,
receive_afunc=self._recv,
raise_on_lag=raise_on_lag,
)
# assert clone in state.subs
assert br.key in state.subs
@ -352,7 +444,8 @@ def broadcast_receiver(
recv_chan: AsyncReceiver,
max_buffer_size: int,
**kwargs,
receive_afunc: Optional[Callable[[], Awaitable[Any]]] = None,
raise_on_lag: bool = True,
) -> BroadcastReceiver:
@ -363,5 +456,6 @@ def broadcast_receiver(
maxlen=max_buffer_size,
subs={},
),
**kwargs,
receive_afunc=receive_afunc,
raise_on_lag=raise_on_lag,
)

View File

@ -47,7 +47,7 @@ T = TypeVar("T")
@acm
async def maybe_open_nursery(
nursery: trio.Nursery = None,
nursery: trio.Nursery | None = None,
shield: bool = False,
) -> AsyncGenerator[trio.Nursery, Any]:
'''
@ -109,6 +109,17 @@ async def gather_contexts(
all_entered = trio.Event()
parent_exit = trio.Event()
# XXX: ensure greedy sequence of manager instances
# since a lazy inline generator doesn't seem to work
# with `async with` syntax.
mngrs = list(mngrs)
if not mngrs:
raise ValueError(
'input mngrs is empty?\n'
'Did try to use inline generator syntax?'
)
async with trio.open_nursery() as n:
for mngr in mngrs:
n.start_soon(
@ -122,12 +133,12 @@ async def gather_contexts(
# deliver control once all managers have started up
await all_entered.wait()
# NOTE: order *should* be preserved in the output values
# since ``dict``s are now implicitly ordered.
try:
yield tuple(unwrapped.values())
# we don't need a try/finally since cancellation will be triggered
# by the surrounding nursery on error.
finally:
# NOTE: this is ABSOLUTELY REQUIRED to avoid
# the following wacky bug:
# <tractorbugurlhere>
parent_exit.set()