forked from goodboy/tractor
1
0
Fork 0

Compare commits

...

60 Commits

Author SHA1 Message Date
Tyler Goodlet fec2ba004c Bleh more nested actor hackin.. 2022-07-31 15:42:13 -04:00
Tyler Goodlet ba9c914221 Allow for tree death in CI? 2022-07-31 14:41:39 -04:00
Tyler Goodlet 30ee3f2dcc Just skip expecting attach msg in CI for ctlc cases 2022-07-29 20:52:54 -04:00
Tyler Goodlet 0b4fc4fc47 Use `assert_before` more extensively 2022-07-29 19:56:11 -04:00
Tyler Goodlet 6b8c193221 Add another name hack for CI race conditions.. 2022-07-29 18:57:44 -04:00
Tyler Goodlet 05167bdc70 Try less times for EOF 2022-07-29 17:51:33 -04:00
Tyler Goodlet fa21083b51 Factor sigint overriding into lock methods 2022-07-29 17:51:12 -04:00
Tyler Goodlet e6ad7a117b Move all module vars into a `Lock` type 2022-07-29 16:03:36 -04:00
Tyler Goodlet 4366873582 Handle missing prompt render case? 2022-07-29 15:59:10 -04:00
Tyler Goodlet 9e6a22e52e Drop pytest full-tracing in CI again 2022-07-29 13:00:54 -04:00
Tyler Goodlet 0ab49cd244 Disable `pygments` highlighting on ctlc tests 2022-07-29 13:00:06 -04:00
Tyler Goodlet 3fafa87ea9 Avoid infinite wait for EOF 2022-07-29 12:20:56 -04:00
Tyler Goodlet a6f5b9396a Use `pytest-timeout` plug to try and prevent CI hang 2022-07-29 11:11:54 -04:00
Tyler Goodlet 61af2dc5aa Add timeout on spawn error msg check 2022-07-29 09:31:11 -04:00
Tyler Goodlet ba857fe85c Increase verbosity in ci tests for now 2022-07-29 09:30:55 -04:00
Tyler Goodlet cb221b9e7c Timeout on arbiter ping, avoid TCP SYN hangs in CI? 2022-07-28 14:54:03 -04:00
Tyler Goodlet 3bc4778950 Drop ctlc tests on Py3.9...
After many tries I just don't think it's worth it to make the tests work
since the repl UX in `pdbpp` is so unreliable in the latest release and
honestly we're trying to go 3.10+ ASAP.

Further,
- entirely drop the pattern matching inside the `do_ctlc()` for now.
- add a `subactor_error` parametrization that catches a case that
  previously caused a hang (when you use 'next' immediately after the
  first crash/debug lock (the fix was pushed just before this commit).
2022-07-28 14:24:51 -04:00
Tyler Goodlet 5ae21e4753 Factor lock-state release logic into helper
The common logic to both remove our custom SIGINT handler as well
as signal the actor global event that pdb is complete. Call this
whenever we exit a post mortem call and thus any time some rpc task
get's debugged inside `._actor._invoke()`.

Further, we have to manually print the REPL prompt on 3.9 for some wack
reason, so stick a version guard in the sigint handler for that..
2022-07-28 14:02:58 -04:00
Tyler Goodlet d4a36e57d1 Run release hook around `next` repl commands as well 2022-07-28 09:27:39 -04:00
Tyler Goodlet 58956ae950 Try disabling prompt expect in ctrlc cases 2022-07-28 09:00:41 -04:00
Tyler Goodlet a864f1e729 Add timeout guard around caller side context open 2022-07-27 15:13:27 -04:00
Tyler Goodlet a4bc5f79ad Always consider the debugger when exiting contexts
When in an uncertain teardown state and in debug mode a context can be
popped from actor runtime before a child finished debugging (the case
when the parent is tearing down but the child hasn't closed/completed
its tty lock IPC exit phase) and the child sends the "stop" message to
unlock the debugger but it's ignored bc the parent has already dropped
the ctx. Instead we call `._debug.maybe_wait_for_deugger()` before these
context removals to avoid the root getting stuck thinking the lock was
never released.

Further, add special `Actor._cancel_task()` handling code inside
`_invoke()` which continues to execute the method despite the IPC
channel to the caller being broken and thus avoiding potential hangs due
to a target (child) actor task remaining alive.
2022-07-27 11:40:03 -04:00
Tyler Goodlet c132b7f624 Add before assert helper and print console bytes on fail 2022-07-27 11:40:03 -04:00
Tyler Goodlet b659326d5b Do child decode on `do_ctlc` exit? 2022-07-27 11:40:03 -04:00
Tyler Goodlet d971e9bc9d Add ctl-c to remaining tests, only expect prompt in non-CI 2022-07-27 11:40:03 -04:00
Tyler Goodlet 611120c67c Add ctl-c case to `subactor_breakpoint` example test 2022-07-27 11:40:03 -04:00
Tyler Goodlet 7f6cace40b Add note about default behaviour of `fancycompleter` 2022-07-27 11:40:03 -04:00
Tyler Goodlet fe4adbf6f0 Disable ctl-c prompt checks again 2022-07-27 11:40:03 -04:00
Tyler Goodlet 6ccfeb17d5 Add back prompt expect via flag 2022-07-27 11:40:03 -04:00
Tyler Goodlet 9bed332a94 Test: drop expect prompt 2022-07-27 11:40:03 -04:00
Tyler Goodlet 13df959d90 Just drop all longlisting for now and leave comments 2022-07-27 11:40:03 -04:00
Tyler Goodlet d0074291a1 Go back to original longlist code 2022-07-27 11:40:03 -04:00
Tyler Goodlet 8559ad69f3 Just don't call longlist on 3.10+ for now 2022-07-27 11:40:03 -04:00
Tyler Goodlet e519df1bd2 Add longer delays around ctl-c loop, don't expect longlist 2022-07-27 11:40:02 -04:00
Tyler Goodlet 24fd87d969 Add sleep around ctl-c iteration loop 2022-07-27 11:40:02 -04:00
Tyler Goodlet 91054a8a42 Pin to specific `pdbppp` master commit 2022-07-27 11:40:02 -04:00
Tyler Goodlet cdc7bf6549 General typing fixes for `mypy` 2022-07-27 11:40:02 -04:00
Tyler Goodlet c865d01e85 Only call `.poll()` if a method on the spawn backend 2022-07-27 11:40:02 -04:00
Tyler Goodlet e1caeeb8de Fix loglevel in subactor test; actually pass the level XD 2022-07-27 11:40:02 -04:00
Tyler Goodlet 7c25aa176f Pin to `trio >= 0.20` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 3b7985292f TOSQUASH: add note around delay 2022-07-27 11:40:02 -04:00
Tyler Goodlet e8fc820b92 Port to new `.lowlevel.open_process()` API 2022-07-27 11:40:02 -04:00
Tyler Goodlet b2fdbc44d1 Guard against asyncio canclled logged to console 2022-07-27 11:40:02 -04:00
Tyler Goodlet f7823a46b8 Add slight delay 2nd ctlc round.. 2022-07-27 11:40:02 -04:00
Tyler Goodlet f76c809c39 Call longlist normally when on py < 3.10 2022-07-27 11:40:02 -04:00
Tyler Goodlet 9e56881163 Only report disconnected actors if proc is still alive? 2022-07-27 11:40:02 -04:00
Tyler Goodlet 8291ee09b3 TOSQUASH: more loglevel for debug bs 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4a441f0988 Only do `pdbpp` from `git` install on 3.10+ 2022-07-27 11:40:02 -04:00
Tyler Goodlet df0108a0bb I dunno, seems like `breakpoint()` needs this? 2022-07-27 11:40:02 -04:00
Tyler Goodlet 8537e17251 TOSQUASH: debug mode loglevel 2022-07-27 11:40:02 -04:00
Tyler Goodlet 20acb50d94 Add basic module-not-found when opening a ctx eg. 2022-07-27 11:40:02 -04:00
Tyler Goodlet eab895864f Always enable debug level logging if mode enabled 2022-07-27 11:40:02 -04:00
Tyler Goodlet 65a9f69d6c Add help msg for non `__main__` modules as well 2022-07-27 11:40:02 -04:00
Tyler Goodlet 24b6cc0209 Add basic ctl-c testing cases to suite 2022-07-27 11:40:02 -04:00
Tyler Goodlet f488db6d8d Hack around double long list print issue..
See https://github.com/pdbpp/pdbpp/issues/496
2022-07-27 11:40:02 -04:00
Tyler Goodlet c5d335c057 Show full KBI trace for help with CI hangs 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4594fe3501 Move pydantic-click hang example to new dir, skip in test suite 2022-07-27 11:40:02 -04:00
Tyler Goodlet 5f0262fd98 Add spaces before values in log msg 2022-07-27 11:40:02 -04:00
Tyler Goodlet 59e7f29eed Add runtime level msg around channel draining 2022-07-27 11:40:02 -04:00
Tyler Goodlet e2dfd6e99d Always undo SIGINT overrides , cancel detached children
Ensure that even when `pdb` resumption methods are called during a crash
where `trio`'s runtime has already terminated (eg. `Event.set()` will
raise) we always revert our sigint handler to the original. Further
inside the handler if we hit a case where a child is in debug and
(thinks it) has the global pdb lock, if it has no IPC connection to
a parent, simply presume tty sync-coordination is now lost and cancel
the child immediately.
2022-07-27 11:40:02 -04:00
15 changed files with 750 additions and 333 deletions

View File

@ -74,7 +74,7 @@ jobs:
run: pip install -U . -r requirements-test.txt -r requirements-docs.txt --upgrade-strategy eager run: pip install -U . -r requirements-test.txt -r requirements-docs.txt --upgrade-strategy eager
- name: Run tests - name: Run tests
run: pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rs run: pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rs -v
# We skip 3.10 on windows for now due to # We skip 3.10 on windows for now due to
# https://github.com/pytest-dev/pytest/issues/8733 # https://github.com/pytest-dev/pytest/issues/8733
@ -111,4 +111,4 @@ jobs:
run: pip install -U . -r requirements-test.txt -r requirements-docs.txt --upgrade-strategy eager run: pip install -U . -r requirements-test.txt -r requirements-docs.txt --upgrade-strategy eager
- name: Run tests - name: Run tests
run: pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rs run: pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rs --full-trace

View File

@ -0,0 +1,40 @@
import trio
import tractor
@tractor.context
async def just_sleep(
ctx: tractor.Context,
**kwargs,
) -> None:
'''
Start and sleep.
'''
await ctx.started()
await trio.sleep_forever()
async def main() -> None:
async with tractor.open_nursery(
debug_mode=True,
) as n:
portal = await n.start_actor(
'ctx_child',
# XXX: we don't enable the current module in order
# to trigger `ModuleNotFound`.
enable_modules=[],
)
async with portal.open_context(
just_sleep, # taken from pytest parameterization
) as (ctx, sent):
raise KeyboardInterrupt
if __name__ == '__main__':
trio.run(main)

View File

@ -1,5 +1,6 @@
pytest pytest
pytest-trio pytest-trio
pytest-timeout
pdbpp pdbpp
mypy<0.920 mypy<0.920
trio_typing<0.7.0 trio_typing<0.7.0

View File

@ -43,7 +43,7 @@ setup(
install_requires=[ install_requires=[
# trio related # trio related
'trio>0.8', 'trio >= 0.20',
'async_generator', 'async_generator',
'trio_typing', 'trio_typing',
@ -54,15 +54,22 @@ setup(
# tooling # tooling
'colorlog', 'colorlog',
'wrapt', 'wrapt',
# pip ref docs on these specs:
# https://pip.pypa.io/en/stable/reference/requirement-specifiers/#examples
# and pep:
# https://peps.python.org/pep-0440/#version-specifiers
'pdbpp <= 0.10.1; python_version < "3.10"',
# windows deps workaround for ``pdbpp`` # windows deps workaround for ``pdbpp``
# https://github.com/pdbpp/pdbpp/issues/498 # https://github.com/pdbpp/pdbpp/issues/498
# https://github.com/pdbpp/fancycompleter/issues/37 # https://github.com/pdbpp/fancycompleter/issues/37
'pyreadline3 ; platform_system == "Windows"', 'pyreadline3 ; platform_system == "Windows"',
'pdbpp',
# 3.10 has an outstanding unreleased issue and `pdbpp` itself # 3.10 has an outstanding unreleased issue and `pdbpp` itself
# pins to patched forks of its own dependencies as well. # pins to patched forks of its own dependencies as well..and
"pdbpp @ git+https://github.com/pdbpp/pdbpp@master#egg=pdbpp", # noqa: E501 # we need a specific patch on master atm.
'pdbpp @ git+https://github.com/pdbpp/pdbpp@76c4be5#egg=pdbpp ; python_version > "3.9"', # noqa: E501
# serialization # serialization
'msgspec >= "0.4.0"' 'msgspec >= "0.4.0"'
@ -87,8 +94,8 @@ setup(
"License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)", "License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)",
"Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: 3 :: Only", "Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.9", "Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Intended Audience :: Science/Research", "Intended Audience :: Science/Research",
"Intended Audience :: Developers", "Intended Audience :: Developers",
"Topic :: System :: Distributed Computing", "Topic :: System :: Distributed Computing",

View File

@ -85,11 +85,14 @@ def spawn_backend(request):
return request.config.option.spawn_backend return request.config.option.spawn_backend
_ci_env: bool = os.environ.get('CI', False)
@pytest.fixture(scope='session') @pytest.fixture(scope='session')
def ci_env() -> bool: def ci_env() -> bool:
"""Detect CI envoirment. """Detect CI envoirment.
""" """
return os.environ.get('TRAVIS', False) or os.environ.get('CI', False) return _ci_env
@pytest.fixture(scope='session') @pytest.fixture(scope='session')

View File

@ -265,42 +265,44 @@ async def test_callee_closes_ctx_after_stream_open():
enable_modules=[__name__], enable_modules=[__name__],
) )
async with portal.open_context( with trio.fail_after(2):
close_ctx_immediately, async with portal.open_context(
close_ctx_immediately,
# flag to avoid waiting the final result # flag to avoid waiting the final result
# cancel_on_exit=True, # cancel_on_exit=True,
) as (ctx, sent): ) as (ctx, sent):
assert sent is None assert sent is None
with trio.fail_after(0.5): with trio.fail_after(0.5):
async with ctx.open_stream() as stream: async with ctx.open_stream() as stream:
# should fall through since ``StopAsyncIteration`` # should fall through since ``StopAsyncIteration``
# should be raised through translation of # should be raised through translation of
# a ``trio.EndOfChannel`` by # a ``trio.EndOfChannel`` by
# ``trio.abc.ReceiveChannel.__anext__()`` # ``trio.abc.ReceiveChannel.__anext__()``
async for _ in stream: async for _ in stream:
assert 0 assert 0
else: else:
# verify stream is now closed # verify stream is now closed
try: try:
await stream.receive() await stream.receive()
except trio.EndOfChannel: except trio.EndOfChannel:
pass
# TODO: should be just raise the closed resource err
# directly here to enforce not allowing a re-open
# of a stream to the context (at least until a time of
# if/when we decide that's a good idea?)
try:
with trio.fail_after(0.5):
async with ctx.open_stream() as stream:
pass pass
except trio.ClosedResourceError:
# TODO: should be just raise the closed resource err
# directly here to enforce not allowing a re-open
# of a stream to the context (at least until a time of
# if/when we decide that's a good idea?)
try:
async with ctx.open_stream() as stream:
pass pass
except trio.ClosedResourceError:
pass
await portal.cancel_actor() await portal.cancel_actor()

View File

@ -1,5 +1,5 @@
""" """
That native debug better work! That "native" debug mode better work!
All these tests can be understood (somewhat) by running the equivalent All these tests can be understood (somewhat) by running the equivalent
`examples/debugging/` scripts manually. `examples/debugging/` scripts manually.
@ -10,16 +10,21 @@ TODO:
- wonder if any of it'll work on OS X? - wonder if any of it'll work on OS X?
""" """
import time
from os import path from os import path
from typing import Optional
import platform import platform
import sys
import time
import pytest import pytest
import pexpect import pexpect
from pexpect.exceptions import (
TIMEOUT,
EOF,
)
from conftest import repodir from conftest import repodir
# TODO: The next great debugger audit could be done by you! # TODO: The next great debugger audit could be done by you!
# - recurrent entry to breakpoint() from single actor *after* and an # - recurrent entry to breakpoint() from single actor *after* and an
# error in another task? # error in another task?
@ -73,6 +78,52 @@ def spawn(
return _spawn return _spawn
def assert_before(
child,
patts: list[str],
) -> None:
before = str(child.before.decode())
for patt in patts:
try:
assert patt in before
except AssertionError:
print(before)
raise
@pytest.fixture(
params=[False, True],
ids='ctl-c={}'.format,
)
def ctlc(request) -> bool:
use_ctlc = request.param
if (
sys.version_info <= (3, 10)
and use_ctlc
):
# on 3.9 it seems the REPL UX
# is highly unreliable and frankly annoying
# to test for. It does work from manual testing
# but i just don't think it's wroth it to try
# and get this working especially since we want to
# be 3.10+ mega-asap.
pytest.skip('Py3.9 and `pdbpp` son no bueno..')
if use_ctlc:
# XXX: disable pygments highlighting for auto-tests
# since some envs (like actions CI) will struggle
# the the added color-char encoding..
from tractor._debug import TractorConfig
TractorConfig.use_pygements = False
yield use_ctlc
@pytest.mark.parametrize( @pytest.mark.parametrize(
'user_in_out', 'user_in_out',
[ [
@ -137,20 +188,67 @@ def test_root_actor_bp(spawn, user_in_out):
assert expect_err_str in str(child.before) assert expect_err_str in str(child.before)
def test_root_actor_bp_forever(spawn): def do_ctlc(
child,
count: int = 3,
delay: float = 0.1,
patt: Optional[str] = None,
# XXX: literally no idea why this is an issue in CI but likely will
# flush out (hopefully) with proper 3.10 release of `pdbpp`...
expect_prompt: bool = True,
) -> None:
# make sure ctl-c sends don't do anything but repeat output
for _ in range(count):
time.sleep(delay)
child.sendcontrol('c')
# TODO: figure out why this makes CI fail..
# if you run this test manually it works just fine..
from conftest import _ci_env
if expect_prompt and not _ci_env:
before = str(child.before.decode())
time.sleep(delay)
child.expect(r"\(Pdb\+\+\)")
time.sleep(delay)
if patt:
# should see the last line on console
assert patt in before
def test_root_actor_bp_forever(
spawn,
ctlc: bool,
):
"Re-enter a breakpoint from the root actor-task." "Re-enter a breakpoint from the root actor-task."
child = spawn('root_actor_breakpoint_forever') child = spawn('root_actor_breakpoint_forever')
# do some "next" commands to demonstrate recurrent breakpoint # do some "next" commands to demonstrate recurrent breakpoint
# entries # entries
for _ in range(10): for _ in range(10):
child.sendline('next')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
# do one continue which should trigger a new task to lock the tty if ctlc:
do_ctlc(child)
child.sendline('next')
# do one continue which should trigger a
# new task to lock the tty
child.sendline('continue') child.sendline('continue')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
# seems that if we hit ctrl-c too fast the
# sigint guard machinery might not kick in..
time.sleep(0.001)
if ctlc:
do_ctlc(child)
# XXX: this previously caused a bug! # XXX: this previously caused a bug!
child.sendline('n') child.sendline('n')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
@ -158,8 +256,21 @@ def test_root_actor_bp_forever(spawn):
child.sendline('n') child.sendline('n')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
# quit out of the loop
child.sendline('q')
child.expect(pexpect.EOF)
def test_subactor_error(spawn):
@pytest.mark.parametrize(
'do_next',
(True, False),
ids='do_next={}'.format,
)
def test_subactor_error(
spawn,
ctlc: bool,
do_next: bool,
):
"Single subactor raising an error" "Single subactor raising an error"
child = spawn('subactor_error') child = spawn('subactor_error')
@ -170,23 +281,33 @@ def test_subactor_error(spawn):
before = str(child.before.decode()) before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error'" in before assert "Attaching to pdb in crashed actor: ('name_error'" in before
# send user command if do_next:
# (in this case it's the same for 'continue' vs. 'quit') child.sendline('n')
child.sendline('continue')
# the debugger should enter a second time in the nursery else:
# creating actor # make sure ctl-c sends don't do anything but repeat output
if ctlc:
do_ctlc(
child,
)
# send user command and (in this case it's the same for 'continue'
# vs. 'quit') the debugger should enter a second time in the nursery
# creating actor
child.sendline('continue')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode()) before = str(child.before.decode())
# root actor gets debugger engaged # root actor gets debugger engaged
assert "Attaching to pdb in crashed actor: ('root'" in before assert "Attaching to pdb in crashed actor: ('root'" in before
# error is a remote error propagated from the subactor # error is a remote error propagated from the subactor
assert "RemoteActorError: ('name_error'" in before assert "RemoteActorError: ('name_error'" in before
# another round
if ctlc:
do_ctlc(child)
child.sendline('c') child.sendline('c')
child.expect('\r\n') child.expect('\r\n')
@ -194,7 +315,10 @@ def test_subactor_error(spawn):
child.expect(pexpect.EOF) child.expect(pexpect.EOF)
def test_subactor_breakpoint(spawn): def test_subactor_breakpoint(
spawn,
ctlc: bool,
):
"Single subactor with an infinite breakpoint loop" "Single subactor with an infinite breakpoint loop"
child = spawn('subactor_breakpoint') child = spawn('subactor_breakpoint')
@ -211,6 +335,9 @@ def test_subactor_breakpoint(spawn):
child.sendline('next') child.sendline('next')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
if ctlc:
do_ctlc(child)
# now run some "continues" to show re-entries # now run some "continues" to show re-entries
for _ in range(5): for _ in range(5):
child.sendline('continue') child.sendline('continue')
@ -218,6 +345,9 @@ def test_subactor_breakpoint(spawn):
before = str(child.before.decode()) before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before assert "Attaching pdb to actor: ('breakpoint_forever'" in before
if ctlc:
do_ctlc(child)
# finally quit the loop # finally quit the loop
child.sendline('q') child.sendline('q')
@ -228,6 +358,9 @@ def test_subactor_breakpoint(spawn):
assert "RemoteActorError: ('breakpoint_forever'" in before assert "RemoteActorError: ('breakpoint_forever'" in before
assert 'bdb.BdbQuit' in before assert 'bdb.BdbQuit' in before
if ctlc:
do_ctlc(child)
# quit the parent # quit the parent
child.sendline('c') child.sendline('c')
@ -239,11 +372,15 @@ def test_subactor_breakpoint(spawn):
assert 'bdb.BdbQuit' in before assert 'bdb.BdbQuit' in before
def test_multi_subactors(spawn): def test_multi_subactors(
""" spawn,
Multiple subactors, both erroring and breakpointing as well as ctlc: bool,
a nested subactor erroring. ):
""" '''
Multiple subactors, both erroring and
breakpointing as well as a nested subactor erroring.
'''
child = spawn(r'multi_subactors') child = spawn(r'multi_subactors')
# scan for the pdbpp prompt # scan for the pdbpp prompt
@ -252,12 +389,18 @@ def test_multi_subactors(spawn):
before = str(child.before.decode()) before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before assert "Attaching pdb to actor: ('breakpoint_forever'" in before
if ctlc:
do_ctlc(child)
# do some "next" commands to demonstrate recurrent breakpoint # do some "next" commands to demonstrate recurrent breakpoint
# entries # entries
for _ in range(10): for _ in range(10):
child.sendline('next') child.sendline('next')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
if ctlc:
do_ctlc(child)
# continue to next error # continue to next error
child.sendline('c') child.sendline('c')
@ -267,14 +410,28 @@ def test_multi_subactors(spawn):
assert "Attaching to pdb in crashed actor: ('name_error'" in before assert "Attaching to pdb in crashed actor: ('name_error'" in before
assert "NameError" in before assert "NameError" in before
if ctlc:
do_ctlc(child)
# continue again # continue again
child.sendline('c') child.sendline('c')
# 2nd name_error failure # 2nd name_error failure
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode())
assert "Attaching to pdb in crashed actor: ('name_error_1'" in before # XXX: lol honestly no idea why CI is cuck but
assert "NameError" in before # seems like this likely falls into our unhandled nested
# case and isn't working in that env due to raciness..
from conftest import _ci_env
if not ctlc and _ci_env:
name = 'name_error' if ctlc else 'name_error_1'
assert_before(child, [
f"Attaching to pdb in crashed actor: ('{name}'",
"NameError",
])
if ctlc:
do_ctlc(child)
# breakpoint loop should re-engage # breakpoint loop should re-engage
child.sendline('c') child.sendline('c')
@ -282,20 +439,30 @@ def test_multi_subactors(spawn):
before = str(child.before.decode()) before = str(child.before.decode())
assert "Attaching pdb to actor: ('breakpoint_forever'" in before assert "Attaching pdb to actor: ('breakpoint_forever'" in before
if ctlc:
do_ctlc(child)
# wait for spawn error to show up # wait for spawn error to show up
spawn_err = "Attaching to pdb in crashed actor: ('spawn_error'" spawn_err = "Attaching to pdb in crashed actor: ('spawn_error'"
while spawn_err not in before: start = time.time()
while (
spawn_err not in before
and (time.time() - start) < 3
):
child.sendline('c') child.sendline('c')
time.sleep(0.1) time.sleep(0.1)
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode()) before = str(child.before.decode())
if ctlc:
do_ctlc(child)
# 2nd depth nursery should trigger # 2nd depth nursery should trigger
# child.sendline('c') if not ctlc:
# child.expect(r"\(Pdb\+\+\)") assert_before(child, [
# before = str(child.before.decode()) spawn_err,
assert spawn_err in before "RemoteActorError: ('name_error_1'",
assert "RemoteActorError: ('name_error_1'" in before ])
# now run some "continues" to show re-entries # now run some "continues" to show re-entries
for _ in range(5): for _ in range(5):
@ -304,33 +471,62 @@ def test_multi_subactors(spawn):
# quit the loop and expect parent to attach # quit the loop and expect parent to attach
child.sendline('q') child.sendline('q')
child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode()) try:
# debugger attaches to root child.expect(r"\(Pdb\+\+\)")
assert "Attaching to pdb in crashed actor: ('root'" in before except TIMEOUT:
# expect a multierror with exceptions for each sub-actor if _ci_env and not ctlc:
assert "RemoteActorError: ('breakpoint_forever'" in before raise
assert "RemoteActorError: ('name_error'" in before
assert "RemoteActorError: ('spawn_error'" in before # in ci seems like this can sometimes just result
assert "RemoteActorError: ('name_error_1'" in before # in full tree death?
assert 'bdb.BdbQuit' in before print('tree died?')
else:
before = str(child.before.decode())
assert_before(child, [
# debugger attaches to root
"Attaching to pdb in crashed actor: ('root'",
# expect a multierror with exceptions for each sub-actor
"RemoteActorError: ('breakpoint_forever'",
"RemoteActorError: ('name_error'",
"RemoteActorError: ('spawn_error'",
"RemoteActorError: ('name_error_1'",
'bdb.BdbQuit',
])
if ctlc:
do_ctlc(child)
# process should exit # process should exit
child.sendline('c') child.sendline('c')
child.expect(pexpect.EOF)
try:
child.expect(pexpect.EOF)
except TIMEOUT:
child.expect(r"\(Pdb\+\+\)")
# repeat of previous multierror for final output # repeat of previous multierror for final output
before = str(child.before.decode()) assert_before(child, [
assert "RemoteActorError: ('breakpoint_forever'" in before "RemoteActorError: ('breakpoint_forever'",
assert "RemoteActorError: ('name_error'" in before "RemoteActorError: ('name_error'",
assert "RemoteActorError: ('spawn_error'" in before "RemoteActorError: ('spawn_error'",
assert "RemoteActorError: ('name_error_1'" in before "RemoteActorError: ('name_error_1'",
assert 'bdb.BdbQuit' in before 'bdb.BdbQuit',
])
def test_multi_daemon_subactors(spawn, loglevel): def test_multi_daemon_subactors(
"""Multiple daemon subactors, both erroring and breakpointing within a spawn,
loglevel: str,
ctlc: bool
):
'''
Multiple daemon subactors, both erroring and breakpointing within a
stream. stream.
"""
'''
child = spawn('multi_daemon_subactors') child = spawn('multi_daemon_subactors')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
@ -352,6 +548,9 @@ def test_multi_daemon_subactors(spawn, loglevel):
else: else:
raise ValueError("Neither log msg was found !?") raise ValueError("Neither log msg was found !?")
if ctlc:
do_ctlc(child)
# NOTE: previously since we did not have clobber prevention # NOTE: previously since we did not have clobber prevention
# in the root actor this final resume could result in the debugger # in the root actor this final resume could result in the debugger
# tearing down since both child actors would be cancelled and it was # tearing down since both child actors would be cancelled and it was
@ -371,7 +570,7 @@ def test_multi_daemon_subactors(spawn, loglevel):
# now the root actor won't clobber the bp_forever child # now the root actor won't clobber the bp_forever child
# during it's first access to the debug lock, but will instead # during it's first access to the debug lock, but will instead
# wait for the lock to release, by the edge triggered # wait for the lock to release, by the edge triggered
# ``_debug._no_remote_has_tty`` event before sending cancel messages # ``_debug.Lock.no_remote_has_tty`` event before sending cancel messages
# (via portals) to its underlings B) # (via portals) to its underlings B)
# at some point here there should have been some warning msg from # at some point here there should have been some warning msg from
@ -379,6 +578,9 @@ def test_multi_daemon_subactors(spawn, loglevel):
# it seems unreliable in testing here to gnab it: # it seems unreliable in testing here to gnab it:
# assert "in use by child ('bp_forever'," in before # assert "in use by child ('bp_forever'," in before
if ctlc:
do_ctlc(child)
# wait for final error in root # wait for final error in root
while True: while True:
@ -394,17 +596,23 @@ def test_multi_daemon_subactors(spawn, loglevel):
except AssertionError: except AssertionError:
assert bp_forever_msg in before assert bp_forever_msg in before
if ctlc:
do_ctlc(child)
try: try:
child.sendline('c') child.sendline('c')
child.expect(pexpect.EOF) child.expect(pexpect.EOF)
except pexpect.exceptions.TIMEOUT: except TIMEOUT:
# Failed to exit using continue..? # Failed to exit using continue..?
child.sendline('q') child.sendline('q')
child.expect(pexpect.EOF) child.expect(pexpect.EOF)
def test_multi_subactors_root_errors(spawn): def test_multi_subactors_root_errors(
spawn,
ctlc: bool
):
''' '''
Multiple subactors, both erroring and breakpointing as well as Multiple subactors, both erroring and breakpointing as well as
a nested subactor erroring. a nested subactor erroring.
@ -419,33 +627,58 @@ def test_multi_subactors_root_errors(spawn):
before = str(child.before.decode()) before = str(child.before.decode())
assert "NameError: name 'doggypants' is not defined" in before assert "NameError: name 'doggypants' is not defined" in before
if ctlc:
do_ctlc(child)
# continue again to catch 2nd name error from # continue again to catch 2nd name error from
# actor 'name_error_1' (which is 2nd depth). # actor 'name_error_1' (which is 2nd depth).
child.sendline('c') child.sendline('c')
child.expect(r"\(Pdb\+\+\)") try:
before = str(child.before.decode()) child.expect(r"\(Pdb\+\+\)")
assert "Attaching to pdb in crashed actor: ('name_error_1'" in before except TIMEOUT:
assert "NameError" in before child.sendline('')
# XXX: lol honestly no idea why CI is cuck but
# seems like this likely falls into our unhandled nested
# case and isn't working in that env due to raciness..
from conftest import _ci_env
if not ctlc and _ci_env:
name = 'name_error' if ctlc else 'name_error_1'
assert_before(child, [
f"Attaching to pdb in crashed actor: ('{name}'",
"NameError",
])
if ctlc:
do_ctlc(child)
child.sendline('c') child.sendline('c')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode()) assert_before(child, [
assert "Attaching to pdb in crashed actor: ('spawn_error'" in before "Attaching to pdb in crashed actor: ('spawn_error'",
# boxed error from previous step # boxed error from previous step
assert "RemoteActorError: ('name_error_1'" in before "RemoteActorError: ('name_error_1'",
assert "NameError" in before "NameError",
])
if ctlc:
do_ctlc(child)
child.sendline('c') child.sendline('c')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
before = str(child.before.decode()) assert_before(child, [
assert "Attaching to pdb in crashed actor: ('root'" in before "Attaching to pdb in crashed actor: ('root'",
# boxed error from first level failure # boxed error from previous step
assert "RemoteActorError: ('name_error'" in before "RemoteActorError: ('name_error'",
assert "NameError" in before "NameError",
])
# warnings assert we probably don't need # warnings assert we probably don't need
# assert "Cancelling nursery in ('spawn_error'," in before # assert "Cancelling nursery in ('spawn_error'," in before
if ctlc:
do_ctlc(child)
# continue again # continue again
child.sendline('c') child.sendline('c')
child.expect(pexpect.EOF) child.expect(pexpect.EOF)
@ -455,7 +688,13 @@ def test_multi_subactors_root_errors(spawn):
assert "AssertionError" in before assert "AssertionError" in before
def test_multi_nested_subactors_error_through_nurseries(spawn): def test_multi_nested_subactors_error_through_nurseries(
spawn,
# TODO: address debugger issue for nested tree:
# <issuelink>
# ctlc: bool,
):
"""Verify deeply nested actors that error trigger debugger entries """Verify deeply nested actors that error trigger debugger entries
at each actor nurserly (level) all the way up the tree. at each actor nurserly (level) all the way up the tree.
@ -476,7 +715,7 @@ def test_multi_nested_subactors_error_through_nurseries(spawn):
child.sendline('c') child.sendline('c')
time.sleep(0.1) time.sleep(0.1)
except pexpect.exceptions.EOF: except EOF:
# race conditions on how fast the continue is sent? # race conditions on how fast the continue is sent?
print(f"Failed early on {i}?") print(f"Failed early on {i}?")
@ -490,9 +729,11 @@ def test_multi_nested_subactors_error_through_nurseries(spawn):
assert "NameError" in before assert "NameError" in before
@pytest.mark.timeout(15)
def test_root_nursery_cancels_before_child_releases_tty_lock( def test_root_nursery_cancels_before_child_releases_tty_lock(
spawn, spawn,
start_method start_method,
ctlc: bool,
): ):
"""Test that when the root sends a cancel message before a nested """Test that when the root sends a cancel message before a nested
child has unblocked (which can happen when it has the tty lock and child has unblocked (which can happen when it has the tty lock and
@ -509,6 +750,9 @@ def test_root_nursery_cancels_before_child_releases_tty_lock(
assert "tractor._exceptions.RemoteActorError: ('name_error'" not in before assert "tractor._exceptions.RemoteActorError: ('name_error'" not in before
time.sleep(0.5) time.sleep(0.5)
if ctlc:
do_ctlc(child)
child.sendline('c') child.sendline('c')
for i in range(4): for i in range(4):
@ -517,8 +761,8 @@ def test_root_nursery_cancels_before_child_releases_tty_lock(
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
except ( except (
pexpect.exceptions.EOF, EOF,
pexpect.exceptions.TIMEOUT, TIMEOUT,
): ):
# races all over.. # races all over..
@ -533,15 +777,23 @@ def test_root_nursery_cancels_before_child_releases_tty_lock(
before = str(child.before.decode()) before = str(child.before.decode())
assert "NameError: name 'doggypants' is not defined" in before assert "NameError: name 'doggypants' is not defined" in before
child.sendline('c') if ctlc:
do_ctlc(child)
while True: child.sendline('c')
time.sleep(0.1)
for i in range(3):
try: try:
child.expect(pexpect.EOF) child.expect(pexpect.EOF)
break break
except pexpect.exceptions.TIMEOUT: except TIMEOUT:
child.sendline('c') child.sendline('c')
time.sleep(0.1)
print('child was able to grab tty lock again?') print('child was able to grab tty lock again?')
else:
child.sendline('q')
child.expect(pexpect.EOF)
if not timed_out_early: if not timed_out_early:
@ -553,6 +805,7 @@ def test_root_nursery_cancels_before_child_releases_tty_lock(
def test_root_cancels_child_context_during_startup( def test_root_cancels_child_context_during_startup(
spawn, spawn,
ctlc: bool,
): ):
'''Verify a fast fail in the root doesn't lock up the child reaping '''Verify a fast fail in the root doesn't lock up the child reaping
and all while using the new context api. and all while using the new context api.
@ -565,12 +818,16 @@ def test_root_cancels_child_context_during_startup(
before = str(child.before.decode()) before = str(child.before.decode())
assert "AssertionError" in before assert "AssertionError" in before
if ctlc:
do_ctlc(child)
child.sendline('c') child.sendline('c')
child.expect(pexpect.EOF) child.expect(pexpect.EOF)
def test_different_debug_mode_per_actor( def test_different_debug_mode_per_actor(
spawn, spawn,
ctlc: bool,
): ):
child = spawn('per_actor_debug') child = spawn('per_actor_debug')
child.expect(r"\(Pdb\+\+\)") child.expect(r"\(Pdb\+\+\)")
@ -580,6 +837,9 @@ def test_different_debug_mode_per_actor(
assert "Attaching to pdb in crashed actor: ('debugged_boi'" in before assert "Attaching to pdb in crashed actor: ('debugged_boi'" in before
assert "RuntimeError" in before assert "RuntimeError" in before
if ctlc:
do_ctlc(child)
child.sendline('c') child.sendline('c')
child.expect(pexpect.EOF) child.expect(pexpect.EOF)

View File

@ -81,11 +81,14 @@ def run_example_in_subproc(loglevel, testdir, arb_addr):
'example_script', 'example_script',
# walk yields: (dirpath, dirnames, filenames) # walk yields: (dirpath, dirnames, filenames)
[(p[0], f) for p in os.walk(examples_dir()) for f in p[2] [
(p[0], f) for p in os.walk(examples_dir()) for f in p[2]
if '__' not in f if '__' not in f
and f[0] != '_' and f[0] != '_'
and 'debugging' not in p[0]], and 'debugging' not in p[0]
and 'integration' not in p[0]
],
ids=lambda t: t[1], ids=lambda t: t[1],
) )
@ -113,9 +116,19 @@ def test_example(run_example_in_subproc, example_script):
# print(f'STDOUT: {out}') # print(f'STDOUT: {out}')
# if we get some gnarly output let's aggregate and raise # if we get some gnarly output let's aggregate and raise
errmsg = err.decode() if err:
errlines = errmsg.splitlines() errmsg = err.decode()
if err and 'Error' in errlines[-1]: errlines = errmsg.splitlines()
raise Exception(errmsg) last_error = errlines[-1]
if (
'Error' in last_error
# XXX: currently we print this to console, but maybe
# shouldn't eventually once we figure out what's
# a better way to be explicit about aio side
# cancels?
and 'asyncio.exceptions.CancelledError' not in last_error
):
raise Exception(errmsg)
assert proc.returncode == 0 assert proc.returncode == 0

View File

@ -150,13 +150,13 @@ def test_loglevel_propagated_to_subactor(
async def main(): async def main():
async with tractor.open_nursery( async with tractor.open_nursery(
name='arbiter', name='arbiter',
loglevel=level,
start_method=start_method, start_method=start_method,
arbiter_addr=arb_addr, arbiter_addr=arb_addr,
) as tn: ) as tn:
await tn.run_in_actor( await tn.run_in_actor(
check_loglevel, check_loglevel,
loglevel=level,
level=level, level=level,
) )

View File

@ -87,9 +87,10 @@ async def _invoke(
''' '''
__tracebackhide__ = True __tracebackhide__ = True
treat_as_gen = False treat_as_gen: bool = False
failed_resp: bool = False
# possible a traceback (not sure what typing is for this..) # possibly a traceback (not sure what typing is for this..)
tb = None tb = None
cancel_scope = trio.CancelScope() cancel_scope = trio.CancelScope()
@ -190,7 +191,8 @@ async def _invoke(
ctx._scope_nursery = scope_nursery ctx._scope_nursery = scope_nursery
cs = scope_nursery.cancel_scope cs = scope_nursery.cancel_scope
task_status.started(cs) task_status.started(cs)
await chan.send({'return': await coro, 'cid': cid}) res = await coro
await chan.send({'return': res, 'cid': cid})
except trio.MultiError: except trio.MultiError:
# if a context error was set then likely # if a context error was set then likely
@ -204,7 +206,12 @@ async def _invoke(
# XXX: only pop the context tracking if # XXX: only pop the context tracking if
# a ``@tractor.context`` entrypoint was called # a ``@tractor.context`` entrypoint was called
assert chan.uid assert chan.uid
# don't pop the local context until we know the
# associated child isn't in debug any more
await _debug.maybe_wait_for_debugger()
ctx = actor._contexts.pop((chan.uid, cid)) ctx = actor._contexts.pop((chan.uid, cid))
if ctx: if ctx:
log.runtime( log.runtime(
f'Context entrypoint {func} was terminated:\n{ctx}' f'Context entrypoint {func} was terminated:\n{ctx}'
@ -235,10 +242,24 @@ async def _invoke(
else: else:
# regular async function # regular async function
await chan.send({'functype': 'asyncfunc', 'cid': cid}) try:
await chan.send({'functype': 'asyncfunc', 'cid': cid})
except trio.BrokenResourceError:
failed_resp = True
if is_rpc:
raise
else:
log.warning(
f'Failed to respond to non-rpc request: {func}'
)
with cancel_scope as cs: with cancel_scope as cs:
task_status.started(cs) task_status.started(cs)
await chan.send({'return': await coro, 'cid': cid}) result = await coro
log.cancel(f'result: {result}')
if not failed_resp:
# only send result if we know IPC isn't down
await chan.send({'return': result, 'cid': cid})
except ( except (
Exception, Exception,
@ -283,6 +304,7 @@ async def _invoke(
except ( except (
trio.ClosedResourceError, trio.ClosedResourceError,
trio.BrokenResourceError, trio.BrokenResourceError,
BrokenPipeError,
): ):
# if we can't propagate the error that's a big boo boo # if we can't propagate the error that's a big boo boo
log.error( log.error(
@ -509,13 +531,20 @@ class Actor:
mne = ModuleNotExposed(*err.args) mne = ModuleNotExposed(*err.args)
if ns == '__main__': if ns == '__main__':
msg = ( modpath = '__name__'
"\n\nMake sure you exposed the current module using:\n\n" else:
"ActorNursery.start_actor(<name>, enable_modules=" modpath = f"'{ns}'"
"[__name__])"
)
mne.msg += msg msg = (
"\n\nMake sure you exposed the target module, `{ns}`, "
"using:\n"
"ActorNursery.start_actor(<name>, enable_modules=[{mod}])"
).format(
ns=ns,
mod=modpath,
)
mne.msg += msg
raise mne raise mne
@ -603,17 +632,6 @@ class Actor:
if ( if (
local_nursery local_nursery
): ):
if disconnected:
# if the transport died and this actor is still
# registered within a local nursery, we report that the
# IPC layer may have failed unexpectedly since it may be
# the cause of other downstream errors.
entry = local_nursery._children.get(uid)
if entry:
_, proc, _ = entry
log.warning(f'Actor {uid}@{proc} IPC connection broke!?')
# if proc.poll() is not None:
# log.error('Actor {uid} proc died and IPC broke?')
log.cancel(f"Waiting on cancel request to peer {chan.uid}") log.cancel(f"Waiting on cancel request to peer {chan.uid}")
# XXX: this is a soft wait on the channel (and its # XXX: this is a soft wait on the channel (and its
@ -630,6 +648,11 @@ class Actor:
# Attempt to wait for the far end to close the channel # Attempt to wait for the far end to close the channel
# and bail after timeout (2-generals on closure). # and bail after timeout (2-generals on closure).
assert chan.msgstream assert chan.msgstream
log.runtime(
f'Draining lingering msgs from stream {chan.msgstream}'
)
async for msg in chan.msgstream.drain(): async for msg in chan.msgstream.drain():
# try to deliver any lingering msgs # try to deliver any lingering msgs
# before we destroy the channel. # before we destroy the channel.
@ -646,7 +669,20 @@ class Actor:
await local_nursery.exited.wait() await local_nursery.exited.wait()
# if local_nursery._children if disconnected:
# if the transport died and this actor is still
# registered within a local nursery, we report that the
# IPC layer may have failed unexpectedly since it may be
# the cause of other downstream errors.
entry = local_nursery._children.get(uid)
if entry:
_, proc, _ = entry
poll = getattr(proc, 'poll', None)
if poll and poll() is None:
log.cancel(
f'Actor {uid} IPC broke but proc is alive?'
)
# ``Channel`` teardown and closure sequence # ``Channel`` teardown and closure sequence
@ -688,7 +724,7 @@ class Actor:
# await chan.aclose() # await chan.aclose()
except trio.BrokenResourceError: except trio.BrokenResourceError:
log.warning(f"Channel for {chan.uid} was already closed") log.runtime(f"Channel {chan.uid} was already closed")
async def _push_result( async def _push_result(
self, self,
@ -919,17 +955,19 @@ class Actor:
chan._exc = exc chan._exc = exc
raise exc raise exc
log.runtime( log.info(
f"Processing request from {actorid}\n" f"Processing request from {actorid}\n"
f"{ns}.{funcname}({kwargs})") f"{ns}.{funcname}({kwargs})")
if ns == 'self': if ns == 'self':
func = getattr(self, funcname) func = getattr(self, funcname)
if funcname == 'cancel': if funcname == 'cancel':
# don't start entire actor runtime # don't start entire actor runtime
# cancellation if this actor is in debug # cancellation if this actor is in debug
# mode # mode
pdb_complete = _debug._local_pdb_complete pdb_complete = _debug.Lock.local_pdb_complete
if pdb_complete: if pdb_complete:
await pdb_complete.wait() await pdb_complete.wait()
@ -960,12 +998,17 @@ class Actor:
# ``_async_main()`` # ``_async_main()``
kwargs['chan'] = chan kwargs['chan'] = chan
log.cancel( log.cancel(
f'{self.uid} was remotely cancelled by\n' f'Remote request to cancel task\n'
f'{chan.uid}!' f'remote actor: {chan.uid}\n'
) f'task: {cid}'
await _invoke(
self, cid, chan, func, kwargs, is_rpc=False
) )
try:
await _invoke(
self, cid, chan, func, kwargs, is_rpc=False
)
except BaseException:
log.exception("failed to cancel task?")
continue continue
else: else:
# complain to client about restricted modules # complain to client about restricted modules
@ -1370,7 +1413,7 @@ class Actor:
# kill any debugger request task to avoid deadlock # kill any debugger request task to avoid deadlock
# with the root actor in this tree # with the root actor in this tree
dbcs = _debug._debugger_request_cs dbcs = _debug.Lock._debugger_request_cs
if dbcs is not None: if dbcs is not None:
log.cancel("Cancelling active debugger request") log.cancel("Cancelling active debugger request")
dbcs.cancel() dbcs.cancel()
@ -1403,12 +1446,14 @@ class Actor:
# n.cancel_scope.cancel() # n.cancel_scope.cancel()
async def _cancel_task(self, cid, chan): async def _cancel_task(self, cid, chan):
"""Cancel a local task by call-id / channel. '''
Cancel a local task by call-id / channel.
Note this method will be treated as a streaming function Note this method will be treated as a streaming function
by remote actor-callers due to the declaration of ``ctx`` by remote actor-callers due to the declaration of ``ctx``
in the signature (for now). in the signature (for now).
"""
'''
# right now this is only implicitly called by # right now this is only implicitly called by
# streaming IPC but it should be called # streaming IPC but it should be called
# to cancel any remotely spawned task # to cancel any remotely spawned task

View File

@ -41,6 +41,7 @@ from .log import get_logger
from ._discovery import get_root from ._discovery import get_root
from ._state import is_root_process, debug_mode from ._state import is_root_process, debug_mode
from ._exceptions import is_multi_cancelled from ._exceptions import is_multi_cancelled
from ._ipc import Channel
try: try:
@ -59,31 +60,87 @@ log = get_logger(__name__)
__all__ = ['breakpoint', 'post_mortem'] __all__ = ['breakpoint', 'post_mortem']
# TODO: wrap all these in a static global class: ``DebugLock`` maybe? class Lock:
'''
Actor global debug lock state.
# placeholder for function to set a ``trio.Event`` on debugger exit Mostly to avoid a lot of ``global`` declarations for now XD.
_pdb_release_hook: Optional[Callable] = None
# actor-wide variable pointing to current task name using debugger '''
_local_task_in_debug: Optional[str] = None # placeholder for function to set a ``trio.Event`` on debugger exit
pdb_release_hook: Optional[Callable] = None
# actor tree-wide actor uid that supposedly has the tty lock # actor-wide variable pointing to current task name using debugger
_global_actor_in_debug: Optional[Tuple[str, str]] = None local_task_in_debug: Optional[str] = None
# lock in root actor preventing multi-access to local tty # actor tree-wide actor uid that supposedly has the tty lock
_debug_lock: trio.StrictFIFOLock = trio.StrictFIFOLock() global_actor_in_debug: Optional[Tuple[str, str]] = None
_local_pdb_complete: Optional[trio.Event] = None
_no_remote_has_tty: Optional[trio.Event] = None
# XXX: set by the current task waiting on the root tty lock local_pdb_complete: Optional[trio.Event] = None
# and must be cancelled if this actor is cancelled via message no_remote_has_tty: Optional[trio.Event] = None
# otherwise deadlocks with the parent actor may ensure
_debugger_request_cs: Optional[trio.CancelScope] = None # lock in root actor preventing multi-access to local tty
_debug_lock: trio.StrictFIFOLock = trio.StrictFIFOLock()
# XXX: set by the current task waiting on the root tty lock
# and must be cancelled if this actor is cancelled via message
# otherwise deadlocks with the parent actor may ensure
_debugger_request_cs: Optional[trio.CancelScope] = None
_orig_sigint_handler: Optional[Callable] = None
@classmethod
def shield_sigint(cls):
cls._orig_sigint_handler = signal.signal(
signal.SIGINT,
shield_sigint,
)
@classmethod
def unshield_sigint(cls):
if cls._orig_sigint_handler is not None:
# restore original sigint handler
signal.signal(
signal.SIGINT,
cls._orig_sigint_handler
)
cls._orig_sigint_handler = None
@classmethod
def maybe_release(cls):
cls.local_task_in_debug = None
if cls.pdb_release_hook:
cls.pdb_release_hook()
@classmethod
def root_release(cls):
try:
cls._debug_lock.release()
except RuntimeError:
# uhhh makes no sense but been seeing the non-owner
# release error even though this is definitely the task
# that locked?
owner = cls._debug_lock.statistics().owner
if owner:
raise
cls.global_actor_in_debug = None
cls.local_task_in_debug = None
try:
# sometimes the ``trio`` might already be terminated in
# which case this call will raise.
cls.local_pdb_complete.set()
finally:
# restore original sigint handler
cls.unshield_sigint()
class TractorConfig(pdbpp.DefaultConfig): class TractorConfig(pdbpp.DefaultConfig):
"""Custom ``pdbpp`` goodness. """Custom ``pdbpp`` goodness.
""" """
use_pygments = True
# sticky_by_default = True # sticky_by_default = True
enable_hidden_frames = False enable_hidden_frames = False
@ -96,25 +153,23 @@ class MultiActorPdb(pdbpp.Pdb):
# override the pdbpp config with our coolio one # override the pdbpp config with our coolio one
DefaultConfig = TractorConfig DefaultConfig = TractorConfig
# def preloop(self):
# print('IN PRELOOP')
# super().preloop()
# TODO: figure out how to disallow recursive .set_trace() entry # TODO: figure out how to disallow recursive .set_trace() entry
# since that'll cause deadlock for us. # since that'll cause deadlock for us.
def set_continue(self): def set_continue(self):
try: try:
super().set_continue() super().set_continue()
finally: finally:
global _local_task_in_debug, _pdb_release_hook Lock.maybe_release()
_local_task_in_debug = None
if _pdb_release_hook:
_pdb_release_hook()
def set_quit(self): def set_quit(self):
try: try:
super().set_quit() super().set_quit()
finally: finally:
global _local_task_in_debug, _pdb_release_hook Lock.maybe_release()
_local_task_in_debug = None
if _pdb_release_hook:
_pdb_release_hook()
# TODO: will be needed whenever we get to true remote debugging. # TODO: will be needed whenever we get to true remote debugging.
@ -153,7 +208,6 @@ class MultiActorPdb(pdbpp.Pdb):
# log.info("Closing stdin hijack") # log.info("Closing stdin hijack")
# break # break
@acm @acm
async def _acquire_debug_lock( async def _acquire_debug_lock(
uid: Tuple[str, str] uid: Tuple[str, str]
@ -168,8 +222,6 @@ async def _acquire_debug_lock(
to the ``pdb`` repl. to the ``pdb`` repl.
''' '''
global _debug_lock, _global_actor_in_debug, _no_remote_has_tty
task_name = trio.lowlevel.current_task().name task_name = trio.lowlevel.current_task().name
log.runtime( log.runtime(
@ -183,15 +235,15 @@ async def _acquire_debug_lock(
f"entering lock checkpoint, remote task: {task_name}:{uid}" f"entering lock checkpoint, remote task: {task_name}:{uid}"
) )
we_acquired = True we_acquired = True
await _debug_lock.acquire() await Lock._debug_lock.acquire()
if _no_remote_has_tty is None: if Lock.no_remote_has_tty is None:
# mark the tty lock as being in use so that the runtime # mark the tty lock as being in use so that the runtime
# can try to avoid clobbering any connection from a child # can try to avoid clobbering any connection from a child
# that's currently relying on it. # that's currently relying on it.
_no_remote_has_tty = trio.Event() Lock.no_remote_has_tty = trio.Event()
_global_actor_in_debug = uid Lock.global_actor_in_debug = uid
log.runtime(f"TTY lock acquired, remote task: {task_name}:{uid}") log.runtime(f"TTY lock acquired, remote task: {task_name}:{uid}")
# NOTE: critical section: this yield is unshielded! # NOTE: critical section: this yield is unshielded!
@ -204,32 +256,32 @@ async def _acquire_debug_lock(
# surrounding caller side context should cancel normally # surrounding caller side context should cancel normally
# relaying back to the caller. # relaying back to the caller.
yield _debug_lock yield Lock._debug_lock
finally: finally:
# if _global_actor_in_debug == uid: # if Lock.global_actor_in_debug == uid:
if ( if (
we_acquired we_acquired
and _debug_lock.locked() and Lock._debug_lock.locked()
): ):
_debug_lock.release() Lock._debug_lock.release()
# IFF there are no more requesting tasks queued up fire, the # IFF there are no more requesting tasks queued up fire, the
# "tty-unlocked" event thereby alerting any monitors of the lock that # "tty-unlocked" event thereby alerting any monitors of the lock that
# we are now back in the "tty unlocked" state. This is basically # we are now back in the "tty unlocked" state. This is basically
# and edge triggered signal around an empty queue of sub-actor # and edge triggered signal around an empty queue of sub-actor
# tasks that may have tried to acquire the lock. # tasks that may have tried to acquire the lock.
stats = _debug_lock.statistics() stats = Lock._debug_lock.statistics()
if ( if (
not stats.owner not stats.owner
): ):
log.runtime(f"No more tasks waiting on tty lock! says {uid}") log.runtime(f"No more tasks waiting on tty lock! says {uid}")
if _no_remote_has_tty is not None: if Lock.no_remote_has_tty is not None:
_no_remote_has_tty.set() Lock.no_remote_has_tty.set()
_no_remote_has_tty = None Lock.no_remote_has_tty = None
_global_actor_in_debug = None Lock.global_actor_in_debug = None
log.runtime( log.runtime(
f"TTY lock released, remote task: {task_name}:{uid}" f"TTY lock released, remote task: {task_name}:{uid}"
@ -264,11 +316,8 @@ async def _hijack_stdin_for_child(
) )
log.debug(f"Actor {subactor_uid} is WAITING on stdin hijack lock") log.debug(f"Actor {subactor_uid} is WAITING on stdin hijack lock")
Lock.shield_sigint()
orig_handler = signal.signal(
signal.SIGINT,
shield_sigint,
)
try: try:
with ( with (
trio.CancelScope(shield=True), trio.CancelScope(shield=True),
@ -320,10 +369,7 @@ async def _hijack_stdin_for_child(
return "pdb_unlock_complete" return "pdb_unlock_complete"
finally: finally:
signal.signal( Lock.unshield_sigint()
signal.SIGINT,
orig_handler
)
async def wait_for_parent_stdin_hijack( async def wait_for_parent_stdin_hijack(
@ -341,10 +387,8 @@ async def wait_for_parent_stdin_hijack(
debug (see below inside ``maybe_wait_for_debugger()``). debug (see below inside ``maybe_wait_for_debugger()``).
''' '''
global _debugger_request_cs
with trio.CancelScope(shield=True) as cs: with trio.CancelScope(shield=True) as cs:
_debugger_request_cs = cs Lock._debugger_request_cs = cs
try: try:
async with get_root() as portal: async with get_root() as portal:
@ -364,9 +408,9 @@ async def wait_for_parent_stdin_hijack(
# unblock local caller # unblock local caller
try: try:
assert _local_pdb_complete assert Lock.local_pdb_complete
task_status.started(cs) task_status.started(cs)
await _local_pdb_complete.wait() await Lock.local_pdb_complete.wait()
finally: finally:
# TODO: shielding currently can cause hangs... # TODO: shielding currently can cause hangs...
@ -382,32 +426,25 @@ async def wait_for_parent_stdin_hijack(
log.warning('Root actor cancelled debug lock') log.warning('Root actor cancelled debug lock')
finally: finally:
log.debug(f"Exiting debugger for actor {actor_uid}") log.pdb(f"Exiting debugger for actor {actor_uid}")
global _local_task_in_debug Lock.local_task_in_debug = None
_local_task_in_debug = None log.pdb(f"Child {actor_uid} released parent stdio lock")
log.debug(f"Child {actor_uid} released parent stdio lock")
def mk_mpdb() -> tuple[MultiActorPdb, Callable]: def mk_mpdb() -> tuple[MultiActorPdb, Callable]:
pdb = MultiActorPdb() pdb = MultiActorPdb()
signal.signal = pdbpp.hideframe(signal.signal) # signal.signal = pdbpp.hideframe(signal.signal)
orig_handler = signal.signal(
signal.SIGINT, Lock.shield_sigint()
partial(shield_sigint, pdb_obj=pdb),
) # XXX: These are the important flags mentioned in
# https://github.com/python-trio/trio/issues/1155
# which resolve the traceback spews to console.
pdb.allow_kbdint = True pdb.allow_kbdint = True
pdb.nosigint = True pdb.nosigint = True
# TODO: add this as method on our pdb obj? return pdb, Lock.unshield_sigint
def undo_sigint():
# restore original sigint handler
signal.signal(
signal.SIGINT,
orig_handler
)
return pdb, undo_sigint
async def _breakpoint( async def _breakpoint(
@ -429,9 +466,6 @@ async def _breakpoint(
actor = tractor.current_actor() actor = tractor.current_actor()
task_name = trio.lowlevel.current_task().name task_name = trio.lowlevel.current_task().name
global _local_pdb_complete, _pdb_release_hook
global _local_task_in_debug, _global_actor_in_debug
# TODO: is it possible to debug a trio.Cancelled except block? # TODO: is it possible to debug a trio.Cancelled except block?
# right now it seems like we can kinda do with by shielding # right now it seems like we can kinda do with by shielding
# around ``tractor.breakpoint()`` but not if we move the shielded # around ``tractor.breakpoint()`` but not if we move the shielded
@ -439,14 +473,14 @@ async def _breakpoint(
# with trio.CancelScope(shield=shield): # with trio.CancelScope(shield=shield):
# await trio.lowlevel.checkpoint() # await trio.lowlevel.checkpoint()
if not _local_pdb_complete or _local_pdb_complete.is_set(): if not Lock.local_pdb_complete or Lock.local_pdb_complete.is_set():
_local_pdb_complete = trio.Event() Lock.local_pdb_complete = trio.Event()
# TODO: need a more robust check for the "root" actor # TODO: need a more robust check for the "root" actor
if actor._parent_chan and not is_root_process(): if actor._parent_chan and not is_root_process():
if _local_task_in_debug: if Lock.local_task_in_debug:
if _local_task_in_debug == task_name: if Lock.local_task_in_debug == task_name:
# this task already has the lock and is # this task already has the lock and is
# likely recurrently entering a breakpoint # likely recurrently entering a breakpoint
return return
@ -456,23 +490,26 @@ async def _breakpoint(
# support for recursive entries to `tractor.breakpoint()` # support for recursive entries to `tractor.breakpoint()`
log.warning(f"{actor.uid} already has a debug lock, waiting...") log.warning(f"{actor.uid} already has a debug lock, waiting...")
await _local_pdb_complete.wait() await Lock.local_pdb_complete.wait()
await trio.sleep(0.1) await trio.sleep(0.1)
# mark local actor as "in debug mode" to avoid recurrent # mark local actor as "in debug mode" to avoid recurrent
# entries/requests to the root process # entries/requests to the root process
_local_task_in_debug = task_name Lock.local_task_in_debug = task_name
def child_release_hook(): def child_release():
# _local_task_in_debug = None try:
_local_pdb_complete.set() # sometimes the ``trio`` might already be termianated in
# which case this call will raise.
# restore original sigint handler Lock.local_pdb_complete.set()
undo_sigint() finally:
# restore original sigint handler
undo_sigint()
# should always be cleared in the hijack hook aboved right?
# _local_task_in_debug = None
# assign unlock callback for debugger teardown hooks # assign unlock callback for debugger teardown hooks
# _pdb_release_hook = _local_pdb_complete.set Lock.pdb_release_hook = child_release
_pdb_release_hook = child_release_hook
# this **must** be awaited by the caller and is done using the # this **must** be awaited by the caller and is done using the
# root nursery so that the debugger can continue to run without # root nursery so that the debugger can continue to run without
@ -489,66 +526,39 @@ async def _breakpoint(
actor.uid, actor.uid,
) )
except RuntimeError: except RuntimeError:
child_release_hook() Lock.pdb_release_hook()
raise raise
elif is_root_process(): elif is_root_process():
# we also wait in the root-parent for any child that # we also wait in the root-parent for any child that
# may have the tty locked prior # may have the tty locked prior
global _debug_lock
# TODO: wait, what about multiple root tasks acquiring it though? # TODO: wait, what about multiple root tasks acquiring it though?
# root process (us) already has it; ignore # root process (us) already has it; ignore
if _global_actor_in_debug == actor.uid: if Lock.global_actor_in_debug == actor.uid:
return return
# XXX: since we need to enter pdb synchronously below, # XXX: since we need to enter pdb synchronously below,
# we have to release the lock manually from pdb completion # we have to release the lock manually from pdb completion
# callbacks. Can't think of a nicer way then this atm. # callbacks. Can't think of a nicer way then this atm.
if _debug_lock.locked(): if Lock._debug_lock.locked():
log.warning( log.warning(
'Root actor attempting to shield-acquire active tty lock' 'Root actor attempting to shield-acquire active tty lock'
f' owned by {_global_actor_in_debug}') f' owned by {Lock.global_actor_in_debug}')
# must shield here to avoid hitting a ``Cancelled`` and # must shield here to avoid hitting a ``Cancelled`` and
# a child getting stuck bc we clobbered the tty # a child getting stuck bc we clobbered the tty
with trio.CancelScope(shield=True): with trio.CancelScope(shield=True):
await _debug_lock.acquire() await Lock._debug_lock.acquire()
else: else:
# may be cancelled # may be cancelled
await _debug_lock.acquire() await Lock._debug_lock.acquire()
_global_actor_in_debug = actor.uid Lock.global_actor_in_debug = actor.uid
_local_task_in_debug = task_name Lock.local_task_in_debug = task_name
# the lock must be released on pdb completion # the lock must be released on pdb completion
def teardown(): Lock.pdb_release_hook = Lock.root_release
global _local_pdb_complete, _debug_lock
global _global_actor_in_debug, _local_task_in_debug
try:
_debug_lock.release()
except RuntimeError:
# uhhh makes no sense but been seeing the non-owner
# release error even though this is definitely the task
# that locked?
owner = _debug_lock.statistics().owner
if owner:
raise
_global_actor_in_debug = None
_local_task_in_debug = None
_local_pdb_complete.set()
# restore original sigint handler
undo_sigint()
_pdb_release_hook = teardown
# frame = sys._getframe()
# last_f = frame.f_back
# last_f.f_globals['__tracebackhide__'] = True
try: try:
# block here one (at the appropriate frame *up*) where # block here one (at the appropriate frame *up*) where
@ -557,15 +567,13 @@ async def _breakpoint(
debug_func(actor, pdb) debug_func(actor, pdb)
except bdb.BdbQuit: except bdb.BdbQuit:
if _pdb_release_hook: Lock.maybe_release()
_pdb_release_hook()
raise raise
# XXX: apparently we can't do this without showing this frame # XXX: apparently we can't do this without showing this frame
# in the backtrace on first entry to the REPL? Seems like an odd # in the backtrace on first entry to the REPL? Seems like an odd
# behaviour that should have been fixed by now. This is also why # behaviour that should have been fixed by now. This is also why
# we scrapped all the @cm approaches that were tried previously. # we scrapped all the @cm approaches that were tried previously.
# finally: # finally:
# __tracebackhide__ = True # __tracebackhide__ = True
# # frame = sys._getframe() # # frame = sys._getframe()
@ -595,12 +603,25 @@ def shield_sigint(
''' '''
__tracebackhide__ = True __tracebackhide__ = True
global _local_task_in_debug, _global_actor_in_debug uid_in_debug = Lock.global_actor_in_debug
uid_in_debug = _global_actor_in_debug
actor = tractor.current_actor() actor = tractor.current_actor()
def do_cancel():
# If we haven't tried to cancel the runtime then do that instead
# of raising a KBI (which may non-gracefully destroy
# a ``trio.run()``).
if not actor._cancel_called:
actor.cancel_soon()
# If the runtime is already cancelled it likely means the user
# hit ctrl-c again because teardown didn't full take place in
# which case we do the "hard" raising of a local KBI.
else:
raise KeyboardInterrupt
any_connected = False any_connected = False
if uid_in_debug is not None: if uid_in_debug is not None:
# try to see if the supposed (sub)actor in debug still # try to see if the supposed (sub)actor in debug still
# has an active connection to *this* actor, and if not # has an active connection to *this* actor, and if not
@ -616,6 +637,7 @@ def shield_sigint(
f'{uid_in_debug}\n' f'{uid_in_debug}\n'
'Allowing SIGINT propagation..' 'Allowing SIGINT propagation..'
) )
return do_cancel()
# root actor branch that reports whether or not a child # root actor branch that reports whether or not a child
# has locked debugger. # has locked debugger.
@ -644,7 +666,17 @@ def shield_sigint(
elif ( elif (
not is_root_process() not is_root_process()
): ):
task = _local_task_in_debug chan: Channel = actor._parent_chan
if not chan or not chan.connected():
log.warning(
'A global actor reported to be in debug '
'but no connection exists for its parent:\n'
f'{uid_in_debug}\n'
'Allowing SIGINT propagation..'
)
return do_cancel()
task = Lock.local_task_in_debug
if task: if task:
log.pdb( log.pdb(
f"Ignoring SIGINT while task in debug mode: `{task}`" f"Ignoring SIGINT while task in debug mode: `{task}`"
@ -654,28 +686,20 @@ def shield_sigint(
# that **is not** marked in debug mode? # that **is not** marked in debug mode?
# elif debug_mode(): # elif debug_mode():
else:
log.pdb(
"Ignoring SIGINT since debug mode is enabled"
)
# noone has the debugger so raise KBI
else: else:
# If we haven't tried to cancel the runtime then do that instead log.pdb(
# of raising a KBI (which may non-gracefully destroy "Ignoring SIGINT since debug mode is enabled"
# a ``trio.run()``). )
if not actor._cancel_called:
actor.cancel_soon()
# If the runtime is already cancelled it likely means the user # NOTE: currently (at least on ``fancycompleter`` 0.9.2)
# hit ctrl-c again because teardown didn't full take place in # it lookks to be that the last command that was run (eg. ll)
# which case we do the "hard" raising of a local KBI. # will be repeated by default.
else:
raise KeyboardInterrupt
# maybe redraw/print last REPL output to console
if pdb_obj:
# TODO: maybe redraw/print last REPL output to console
if (
pdb_obj
and sys.version_info <= (3, 10)
):
# TODO: make this work like sticky mode where if there is output # TODO: make this work like sticky mode where if there is output
# detected as written to the tty we redraw this part underneath # detected as written to the tty we redraw this part underneath
# and erase the past draw of this same bit above? # and erase the past draw of this same bit above?
@ -686,12 +710,16 @@ def shield_sigint(
# https://github.com/goodboy/tractor/issues/130#issuecomment-663752040 # https://github.com/goodboy/tractor/issues/130#issuecomment-663752040
# https://github.com/prompt-toolkit/python-prompt-toolkit/blob/c2c6af8a0308f9e5d7c0e28cb8a02963fe0ce07a/prompt_toolkit/patch_stdout.py # https://github.com/prompt-toolkit/python-prompt-toolkit/blob/c2c6af8a0308f9e5d7c0e28cb8a02963fe0ce07a/prompt_toolkit/patch_stdout.py
try: # XXX: lol, see ``pdbpp`` issue:
pdb_obj.do_longlist(None) # https://github.com/pdbpp/pdbpp/issues/496
print(pdb_obj.prompt, end='', flush=True)
except AttributeError: # TODO: pretty sure this is what we should expect to have to run
log.exception('pdbpp longlist failed...') # in total but for now we're just going to wait until `pdbpp`
raise KeyboardInterrupt # figures out it's own stuff on 3.10 (and maybe we'll help).
# pdb_obj.do_longlist(None)
# XXX: we were doing this but it shouldn't be required..
print(pdb_obj.prompt, end='', flush=True)
def _set_trace( def _set_trace(
@ -707,19 +735,21 @@ def _set_trace(
# last_f.f_globals['__tracebackhide__'] = True # last_f.f_globals['__tracebackhide__'] = True
# start 2 levels up in user code # start 2 levels up in user code
frame: FrameType = sys._getframe() frame: Optional[FrameType] = sys._getframe()
if frame: if frame:
frame = frame.f_back.f_back # type: ignore frame = frame.f_back # type: ignore
if pdb and actor is not None: if frame and pdb and actor is not None:
log.pdb(f"\nAttaching pdb to actor: {actor.uid}\n") log.pdb(f"\nAttaching pdb to actor: {actor.uid}\n")
# no f!#$&* idea, but when we're in async land
# we need 2x frames up?
frame = frame.f_back
else: else:
pdb, undo_sigint = mk_mpdb() pdb, undo_sigint = mk_mpdb()
# we entered the global ``breakpoint()`` built-in from sync code? # we entered the global ``breakpoint()`` built-in from sync code?
global _local_task_in_debug, _pdb_release_hook Lock.local_task_in_debug = 'sync'
_local_task_in_debug = 'sync'
pdb.set_trace(frame=frame) pdb.set_trace(frame=frame)
@ -794,6 +824,7 @@ async def _maybe_enter_pm(err):
): ):
log.debug("Actor crashed, entering debug mode") log.debug("Actor crashed, entering debug mode")
await post_mortem() await post_mortem()
Lock.maybe_release()
return True return True
else: else:
@ -838,8 +869,6 @@ async def maybe_wait_for_debugger(
if ( if (
is_root_process() is_root_process()
): ):
global _no_remote_has_tty, _global_actor_in_debug, _wait_all_tasks_lock
# If we error in the root but the debugger is # If we error in the root but the debugger is
# engaged we don't want to prematurely kill (and # engaged we don't want to prematurely kill (and
# thus clobber access to) the local tty since it # thus clobber access to) the local tty since it
@ -851,11 +880,13 @@ async def maybe_wait_for_debugger(
for _ in range(poll_steps): for _ in range(poll_steps):
if _global_actor_in_debug: if Lock.global_actor_in_debug:
sub_in_debug = tuple(_global_actor_in_debug) sub_in_debug = tuple(Lock.global_actor_in_debug)
# alive = tractor.current_actor().child_alive(sub_in_debug)
# if not alive:
# break
log.debug( log.debug('Root polling for debug')
'Root polling for debug')
with trio.CancelScope(shield=True): with trio.CancelScope(shield=True):
await trio.sleep(poll_delay) await trio.sleep(poll_delay)
@ -866,7 +897,7 @@ async def maybe_wait_for_debugger(
# XXX: doesn't seem to work # XXX: doesn't seem to work
# await trio.testing.wait_all_tasks_blocked(cushion=0) # await trio.testing.wait_all_tasks_blocked(cushion=0)
debug_complete = _no_remote_has_tty debug_complete = Lock.no_remote_has_tty
if ( if (
(debug_complete and (debug_complete and
not debug_complete.is_set()) not debug_complete.is_set())

View File

@ -511,8 +511,8 @@ class Portal:
if ctx.chan.connected(): if ctx.chan.connected():
log.info( log.info(
'Waiting on final context-task result for\n' 'Waiting on final context-task result for\n'
f'task:{cid}\n' f'task: {cid}\n'
f'actor:{uid}' f'actor: {uid}'
) )
result = await ctx.result() result = await ctx.result()
@ -542,6 +542,17 @@ class Portal:
f'value from callee `{result}`' f'value from callee `{result}`'
) )
# XXX: (MEGA IMPORTANT) if this is a root opened process we
# wait for any immediate child in debug before popping the
# context from the runtime msg loop otherwise inside
# ``Actor._push_result()`` the msg will be discarded and in
# the case where that msg is global debugger unlock (via
# a "stop" msg for a stream), this can result in a deadlock
# where the root is waiting on the lock to clear but the
# child has already cleared it and clobbered IPC.
from ._debug import maybe_wait_for_debugger
await maybe_wait_for_debugger()
# remove the context from runtime tracking # remove the context from runtime tracking
self.actor._contexts.pop((self.channel.uid, ctx.cid)) self.actor._contexts.pop((self.channel.uid, ctx.cid))

View File

@ -103,13 +103,7 @@ async def open_root_actor(
_default_arbiter_port, _default_arbiter_port,
) )
if loglevel is None: loglevel = (loglevel or log._default_loglevel).upper()
loglevel = log.get_loglevel()
else:
log._default_loglevel = loglevel
log.get_console_log(loglevel)
assert loglevel
if debug_mode and _spawn._spawn_method == 'trio': if debug_mode and _spawn._spawn_method == 'trio':
_state._runtime_vars['_debug_mode'] = True _state._runtime_vars['_debug_mode'] = True
@ -124,7 +118,7 @@ async def open_root_actor(
logging.getLevelName( logging.getLevelName(
# lul, need the upper case for the -> int map? # lul, need the upper case for the -> int map?
# sweet "dynamic function behaviour" stdlib... # sweet "dynamic function behaviour" stdlib...
loglevel.upper() loglevel,
) > logging.getLevelName('PDB') ) > logging.getLevelName('PDB')
): ):
loglevel = 'PDB' loglevel = 'PDB'
@ -134,19 +128,24 @@ async def open_root_actor(
"Debug mode is only supported for the `trio` backend!" "Debug mode is only supported for the `trio` backend!"
) )
# make a temporary connection to see if an arbiter exists log.get_console_log(loglevel)
arbiter_found = False
try: try:
# make a temporary connection to see if an arbiter exists,
# if one can't be made quickly we assume none exists.
arbiter_found = False
# TODO: this connect-and-bail forces us to have to carefully # TODO: this connect-and-bail forces us to have to carefully
# rewrap TCP 104-connection-reset errors as EOF so as to avoid # rewrap TCP 104-connection-reset errors as EOF so as to avoid
# propagating cancel-causing errors to the channel-msg loop # propagating cancel-causing errors to the channel-msg loop
# machinery. Likely it would be better to eventually have # machinery. Likely it would be better to eventually have
# a "discovery" protocol with basic handshake instead. # a "discovery" protocol with basic handshake instead.
async with _connect_chan(host, port): with trio.move_on_after(1):
arbiter_found = True async with _connect_chan(host, port):
arbiter_found = True
except OSError: except OSError:
# TODO: make this a "discovery" log level?
logger.warning(f"No actor could be found @ {host}:{port}") logger.warning(f"No actor could be found @ {host}:{port}")
# create a local actor and start up its main routine/task # create a local actor and start up its main routine/task
@ -216,7 +215,8 @@ async def open_root_actor(
finally: finally:
# NOTE: not sure if we'll ever need this but it's # NOTE: not sure if we'll ever need this but it's
# possibly better for even more determinism? # possibly better for even more determinism?
# logger.cancel(f'Waiting on {len(nurseries)} nurseries in root..') # logger.cancel(
# f'Waiting on {len(nurseries)} nurseries in root..')
# nurseries = actor._actoruid2nursery.values() # nurseries = actor._actoruid2nursery.values()
# async with trio.open_nursery() as tempn: # async with trio.open_nursery() as tempn:
# for an in nurseries: # for an in nurseries:

View File

@ -307,7 +307,8 @@ async def new_proc(
proc: Optional[trio.Process] = None proc: Optional[trio.Process] = None
try: try:
try: try:
proc = await trio.open_process(spawn_cmd) # TODO: needs ``trio_typing`` patch?
proc = await trio.lowlevel.open_process(spawn_cmd) # type: ignore
log.runtime(f"Started {proc}") log.runtime(f"Started {proc}")
@ -334,6 +335,9 @@ async def new_proc(
await proc.wait() await proc.wait()
raise raise
# a sub-proc ref **must** exist now
assert proc
portal = Portal(chan) portal = Portal(chan)
actor_nursery._children[subactor.uid] = ( actor_nursery._children[subactor.uid] = (
subactor, proc, portal) subactor, proc, portal)