Compare commits

..

26 Commits

Author SHA1 Message Date
Tyler Goodlet fee1ee315c Hide `._rpc._invoke()` frame, again.. 2025-09-12 12:19:27 -04:00
Tyler Goodlet 22e62ed88e Explain the `infect_asyncio: bool` param to pass in RTE msg 2025-09-12 12:19:27 -04:00
Tyler Goodlet fdba9e42d3 Toss in masked `.set_trace()` for unshielded `.pause()` debug 2025-09-12 12:19:27 -04:00
Tyler Goodlet 3ec72e6af8 Mask tpt-closed handling of `chan.send(return_msg)`
A partial revert of commit c05d08e426 since it seem we already
suppress tpt-closed errors lower down in `.ipc.Channel.send()`; given
that i'm pretty sure this new handler code should basically never run?

Left in a todo to remove the masked content once i'm done more
thoroughly testing under `piker`.
2025-09-12 12:19:27 -04:00
Tyler Goodlet c538cb3004 More `TransportClosed`-handling around IPC-IO
For IPC-disconnects-during-teardown edge cases, augment some `._rpc`
machinery,
- in `._invoke()` around the `await chan.send(return_msg)` where we
  suppress if the underlying `Channel` already disconnected.
- add a disjoint handler in `_errors_relayed_via_ipc()` which just
  reports-n-reraises the exc (same as prior behaviour).
  * originally i thought it needed to be handled specially (to avoid
    being crash handled) but turns out that isn't necessary?
  * hence the also-added-bu-masked-out `debug_filter` / guard expression
    around the `await debug._maybe_enter_pm()` line.
- show the `._invoke()` frame for the moment.
2025-09-12 12:19:27 -04:00
Tyler Goodlet 8842b758d7 Use new `pkg_name` in log-sys test suites 2025-09-11 17:05:35 -04:00
Tyler Goodlet 54ee624632 Implicitly name sub-logs by caller's mod
That is when no `name` is passed to `get_logger()`, try to introspect
the caller's `module.__name__` and use it to infer/get the "namespace
path" to that module the same as if using `name=__name__` as in the most
common usage.

Further, change the `_root_name` to be `pkg_name: str`, a public and
more obvious param name, and deprecate the former. This obviously adds
the necessary impl to make the new
`test_sys_log::test_implicit_mod_name_applied_for_child` test pass.

Impl detalles for `get_logger()`,
- add `pkg_name` and deprecate `_root_name`, include failover logic
  and a warning.
- implement calling module introspection using
  `inspect.stack()/getmodule()` to get both the `.__name__` and
  `.__package__` info alongside adjusted logic to set the `name`
  when not provided but only when a new `mk_sublog: bool` is set.
- tweak the `name` processing for implicitly set case,
  - rename `sub_name` -> `pkg_path: str` which is the path
    to the calling module minus that module's name component.
  - only partition `name` if `pkg_name` is `in` it.
  - use the `_root_log` for `pkg_name` duplication warnings.

Other/related,
- add types to various public mod vars missing them.
- rename `.log.log` -> `.log._root_log`.
2025-09-11 17:05:35 -04:00
Tyler Goodlet e8f2dfc088 Add an implicit-pkg-path-as-logger-name test
A bit of test driven dev to anticipate support  of `.log.get_logger()`
usage such that it can be called from arbitrary sub-modules, themselves
embedded in arbitrary sub-pkgs, of some project; the when not provided,
the `sub_name` passed to the `Logger.getChild(<sub_name>)` will be set
as the sub-pkg path "down to" the calling module.

IOW if you call something like,

`log = tractor.log.get_logger(pkg_name='mypylib')`

from some `submod.py` in a project-dir that looks like,

mypylib/
  mod.py
  subpkg/
    submod.py  <- calling module

the `log: StackLevelAdapter` child-`Logger` instance will have a
`.name: str = 'mypylib.subpkg'`, discluding the `submod` part since this
already rendered as the `{filename}` header in `log.LOG_FORMAT`.

Previously similar behaviour would be obtained by passing
`get_logger(name=__name__)` in the calling module and so much so it
motivated me to make this the default, presuming we can introspect for
the info.

Impl deats,
- duplicated a `load_module_from_path()` from `modden` to load the
  `testdir` rendered py project dir from its path.
 |_should prolly factor it down to this lib anyway bc we're going to
   need it for hot code reload? (well that and `watchfiles` Bp)
- in each of `mod.py` and `submod.py` render the `get_logger()` code
  sin `name`, expecting the (coming shortly) implicit introspection
  feat to do this.
- do `.name` and `.parent` checks against expected sub-logger values
  from `StackLevelAdapter.logger.getChildren()`.
2025-09-11 17:05:35 -04:00
Tyler Goodlet d2282f4275 Start a logging-sys unit-test module
To start ensuring that when `name=__name__` is passed we try to
de-duplicate the `_root_name` and any `leaf_mod: str` since it's already
included in the headers as `{filename}`.

Deats,
- heavily document the de-duplication `str.partition()`s in
  `.log.get_logger()` and provide the end fix by changing the predicate,
  `if rname == 'tractor':` -> `if rname == _root_name`.
  * also toss in some warnings for when we still detect duplicates.
- add todo comments around logging "filters" (vs. our "adapter").
- create the new `test_log_sys.test_root_pkg_not_duplicated()` which
  runs green with the fixes from ^.
- add a ton of test-suite todos both for existing and anticipated
  logging sys feats in the new mod.
2025-09-11 17:05:35 -04:00
Bd 83ce2275b9
Merge pull request #399 from goodboy/oob_cancel_testing
OoB (out-of-band) cancellation testing, proper.
2025-09-11 14:33:52 -04:00
Tyler Goodlet 9f757ffa63 Woops, fix missing `assert` thanks to copilot 2025-09-11 13:13:18 -04:00
Tyler Goodlet 0c6d512ba4 Solve another OoB cancellation case, the bg task one
Such that we are able to (finally) detect when we should
`Context._scope.cancel()` specifically when the `.parent_task` is
**not** blocking on receiving from the underlying `._rx_chan`, since if
the task is blocking on `.receive()` it will call `.cancel()`
implicitly.

This is a lot to explain with very little code actually needed for the
implementation (are we like `trio` yet anyone?? XD) but the main jist is
that `Context._maybe_cancel_and_set_remote_error()` needed the
additional case of calling `._scope.cancel()` whenever we know that
a remote-error/ctxc won't be immediately handled, bc user code is doing
non `Context`-API things, and result in a similar outcome as if that
task was waiting on `Context.wait_for_result()` or `.__aexite__()`.

Impl details,
- add a new `._is_blocked_on_rx_chan()` method which predicates whether
  the (new) `.parent_task` is blocking on `._rx_chan.receive()`.
  * see various stipulations about the current impl and how we might
    need to adjust for the future given `trio`'s commitment to the
    `Task.custom_sleep_data` attr..
- add `.parent_task`, a pub wrapper for `._task`.
- check for `not ._is_blocked_on_rx_chan()` before manually cancelling
  the local `.parent_task`
- minimize the surrounding branch case expressions.

Other,
- tweak a couple logs.
- add a new `.cancel()` pre-started msg.
- mask the `.cancel_called` setter, it's only (been) used for tracing.
- todos around maybe moving the `._nursery` allocation "around" the
  `.start_remote_task()` call and various subsequent tweaks therein.
2025-09-11 13:12:52 -04:00
Tyler Goodlet fc130d06b8 Check off REPL-ing todo add masked usage in `drain_to_final_msg()` 2025-09-11 10:13:04 -04:00
Tyler Goodlet 73423ef2b7 Timeout on `test_peer_spawns_and_cancels_service_subactor`
While working on a fix to the hang case found from
`test_cancel_ctx_with_parent_side_entered_in_bg_task` an initial
solution caused this test to hang indefinitely; solve it with a small
wrapping `_main()` + `trio.fail_after()` entrypoint.

Further suite refinements,
- move the top-most `try:`->`else:` block
- toss in a masked base-exc block for tracing unexpected
  `ctx.wait_for_result()` outcomes.
- tweak the `raise_sub_spawn_error_after` to be an optional `float`
  which scales the `rng_seed: int = 50` msg counter to
  `tell_little_bro()` so that the abs value to the `range()` can be
  changed.
2025-09-11 10:13:04 -04:00
Tyler Goodlet b1f2a6b394 Rename var for and hide the `_open_and_supervise_one_cancels_all_nursery` frame 2025-09-11 10:13:04 -04:00
Tyler Goodlet 9489a2f84d Add timeout around `test_peer_spawns_and_cancels_service_subactor` suite 2025-09-11 10:13:04 -04:00
Tyler Goodlet 92eaed6fec Parametrize with `Portal.cancel_actor()` only case
Such that when `maybe_context.cancel()` is not called (explicitly) and
only the subactor is cancelled by its parent we expect to see a ctxc
raised both from any call to `Context.wait_for_result()` and out of
the `Portal.open_context()` scope, up to the `trio.run()`.

Deats,
- obvi call-n-catch the ctxc (in scope) for the oob-only
  subactor-cancelled case.
- add branches around `trio.run()` entry to match.
2025-09-11 10:13:04 -04:00
Tyler Goodlet 217d54b9d1 Add the minimal OoB cancel edge case from #391
Discovered while writing a `@context` sanity test to verify unmasker
ignore-cases support. Masked code is due to the process of finding the
minimal example causing the original hang discovered in the original
examples script. Details are in the test-fn doc strings and surrounding
comments; more refinement and cleanup coming obviously.

Also moved over the self-cancel todos from the inter-peer tests module.
2025-09-11 10:13:04 -04:00
Bd 34ca02ed11
Merge pull request #391 from goodboy/cancelled_masking_guards
A refined `trio.Cancelled`-unmasking helper
2025-09-11 10:10:41 -04:00
Tyler Goodlet 62a364a1d3 Tweaks from copilot, type fix, typos, language. 2025-09-11 10:01:25 -04:00
Tyler Goodlet 07781e38cd Reduce "ignore cases" script to `trio`-only
Remove all the `tractor` usage (with IPC ctxs) and just get us
a min-reproducing-example with a multi-task-single `trio.Lock`.
The wrapping test suite runs the exact same with an ignore case and
an `.xfail()` for when we let the `trio.WouldBlock` be unmasked.
2025-09-07 18:49:21 -04:00
Tyler Goodlet 9c6b90ef04 Add a ignore-masking-case script + suite
Demonstrating the guilty `trio.Lock.acquire()` impl which puts
a checkpoint inside its `trio.WouldBlock` handler and which will always
appear to mask the "sync path" case on (graceful) cancellation.

This first script draft demos the issue from within a `tractor.context`
ep bc that's where it was orig discovered, however i'm going to factor
out the `tractor` code and instead just use
a `.trionics.maybe_raise_from_masking_exc()` to demo its low-level
ignore-case feature.

Further, this script exposed a previously unhandled remote graceful
cancellation case which hangs:

- parent actor spawns child and opens a >1 ctxs with it,
- the parent then OoB (out-of-band) cancels the child actor (with
  `Portal.cancel_actor()`),
- since the open ctxs raise a ctxc with a `.canceller == parent.uid` the
  `Context._is_self_cancelled()` will eval `True`,
- the `Context._scope` will NOT be cancelled in
  `._maybe_cancel_and_set_remote_error()` resulting in any bg-task which
  is waiting on a `Portal.open_context()` to not be cancelled/unblocked.

So my plan is to factor this ^^ scenario into a standalone unit test
as well as another test which consumes from al low-level `trio`-only
version of **this** script-scenario to sanity check the interaction
of the unmasker-with-ignore-cases usage implicitly around a ctx ep.
2025-09-06 14:03:02 -04:00
Tyler Goodlet 542d4c7840 Ignore `examples/trio/` in docs-examples test suite 2025-09-06 13:39:08 -04:00
Tyler Goodlet 9aebe7d8f9 Only read `_mask_cases` if truthy, allow disabling for xfails 2025-09-05 22:23:51 -04:00
Tyler Goodlet 04c3d5e239 Wrap `send_chan_aclose_masks_beg.py` as test suite
Call it `test_trioisms::test_unmask_aclose_as_checkpoint_on_aexit` and
parametrize all script-mod`.main()` toggles including `.xfails()` for
the `raise_unmasked=False` cases.
2025-09-05 18:46:20 -04:00
Tyler Goodlet 759174729c Prep masking `.aclose()` script for test suite
So we can parametrize in various toggles to `main()` including,
- `child_errors_mid_stream: bool` which now also drives whether an
  additional, and otherwise non-affecting, `_tn` is allocated in
  the `finite_stream_to_rent()` subtask, only in the early stream
  termination case does it seem to produce a masked outcome?
  * see surrounding notes within.
- `raise_unmasked: bool` to toggle whether the embedded unmasker fn
  will actually raise the masked user RTE; this enables demoing the
  masked outcomes via simple switch and makes it easy to wrap them
  as `pytest.xfail()` outcomes.

Also in support,
- use `.trionics.collapse_eg()` around the root tn to ensure when
  unmasking we can catch the EG-unwrapped RTE easily from a test.
- flip stream `msg` logs to `.debug()` to reduce console noise.
- tweak mod's script iface to report/trace unexpected non-RTEs.
2025-09-05 18:46:20 -04:00
10 changed files with 696 additions and 96 deletions

View File

@ -0,0 +1,85 @@
from contextlib import (
asynccontextmanager as acm,
)
from functools import partial
import tractor
import trio
log = tractor.log.get_logger(
name=__name__
)
_lock: trio.Lock|None = None
@acm
async def acquire_singleton_lock(
) -> None:
global _lock
if _lock is None:
log.info('Allocating LOCK')
_lock = trio.Lock()
log.info('TRYING TO LOCK ACQUIRE')
async with _lock:
log.info('ACQUIRED')
yield _lock
log.info('RELEASED')
async def hold_lock_forever(
task_status=trio.TASK_STATUS_IGNORED
):
async with (
tractor.trionics.maybe_raise_from_masking_exc(),
acquire_singleton_lock() as lock,
):
task_status.started(lock)
await trio.sleep_forever()
async def main(
ignore_special_cases: bool,
loglevel: str = 'info',
debug_mode: bool = True,
):
async with (
trio.open_nursery() as tn,
# tractor.trionics.maybe_raise_from_masking_exc()
# ^^^ XXX NOTE, interestingly putting the unmasker
# here does not exhibit the same behaviour ??
):
if not ignore_special_cases:
from tractor.trionics import _taskc
_taskc._mask_cases.clear()
_lock = await tn.start(
hold_lock_forever,
)
with trio.move_on_after(0.2):
await tn.start(
hold_lock_forever,
)
tn.cancel_scope.cancel()
# XXX, manual test as script
if __name__ == '__main__':
tractor.log.get_console_log(level='info')
for case in [True, False]:
log.info(
f'\n'
f'------ RUNNING SCRIPT TRIAL ------\n'
f'ignore_special_cases: {case!r}\n'
)
trio.run(partial(
main,
ignore_special_cases=case,
loglevel='info',
))

View File

@ -9,8 +9,10 @@ import tractor
import trio
log = tractor.log.get_logger(__name__)
tractor.log.get_console_log('info')
log = tractor.log.get_logger(
name=__name__
)
@cm
def teardown_on_exc(
@ -54,6 +56,7 @@ def teardown_on_exc(
async def finite_stream_to_rent(
tx: trio.abc.SendChannel,
child_errors_mid_stream: bool,
raise_unmasked: bool,
task_status: trio.TaskStatus[
trio.CancelScope,
@ -68,20 +71,41 @@ async def finite_stream_to_rent(
# inside the child task!
#
# TODO, uncomment next LoC to see the supprsessed beg[RTE]!
# tractor.trionics.maybe_raise_from_masking_exc(),
tractor.trionics.maybe_raise_from_masking_exc(
raise_unmasked=raise_unmasked,
),
tx as tx, # .aclose() is the guilty masker chkpt!
trio.open_nursery() as _tn,
# XXX, this ONLY matters in the
# `child_errors_mid_stream=False` case oddly!?
# THAT IS, if no tn is opened in that case then the
# test will not fail; it raises the RTE correctly?
#
# -> so it seems this new scope somehow affects the form of
# eventual in the parent EG?
tractor.trionics.maybe_open_nursery(
nursery=(
None
if not child_errors_mid_stream
else True
),
) as _tn,
):
# pass our scope back to parent for supervision\
# control.
task_status.started(_tn.cancel_scope)
cs: trio.CancelScope|None = (
None
if _tn is True
else _tn.cancel_scope
)
task_status.started(cs)
with teardown_on_exc(
raise_from_handler=not child_errors_mid_stream,
):
for i in range(100):
log.info(
log.debug(
f'Child tx {i!r}\n'
)
if (
@ -107,23 +131,31 @@ async def main(
# bug and raises.
#
child_errors_mid_stream: bool,
raise_unmasked: bool = False,
loglevel: str = 'info',
):
tractor.log.get_console_log(level=loglevel)
# the `.aclose()` being checkpoints on these
# is the source of the problem..
tx, rx = trio.open_memory_channel(1)
async with (
tractor.trionics.collapse_eg(),
trio.open_nursery() as tn,
rx as rx,
):
_child_cs = await tn.start(
partial(
finite_stream_to_rent,
child_errors_mid_stream=child_errors_mid_stream,
raise_unmasked=raise_unmasked,
tx=tx,
)
)
async for msg in rx:
log.info(
log.debug(
f'Rent rx {msg!r}\n'
)
@ -139,7 +171,25 @@ async def main(
tn.cancel_scope.cancel()
# XXX, manual test as script
if __name__ == '__main__':
tractor.log.get_console_log(level='info')
for case in [True, False]:
trio.run(main, case)
log.info(
f'\n'
f'------ RUNNING SCRIPT TRIAL ------\n'
f'child_errors_midstream: {case!r}\n'
)
try:
trio.run(partial(
main,
child_errors_mid_stream=case,
# raise_unmasked=True,
loglevel='info',
))
except Exception as _exc:
exc = _exc
log.exception(
'Should have raised an RTE or Cancelled?\n'
)
breakpoint()

View File

@ -95,6 +95,7 @@ def run_example_in_subproc(
and 'integration' not in p[0]
and 'advanced_faults' not in p[0]
and 'multihost' not in p[0]
and 'trio' not in p[0]
)
],
ids=lambda t: t[1],

View File

@ -24,14 +24,10 @@ from tractor._testing import (
)
# XXX TODO cases:
# - [ ] peer cancelled itself - so other peers should
# get errors reflecting that the peer was itself the .canceller?
# - [x] WE cancelled the peer and thus should not see any raised
# `ContextCancelled` as it should be reaped silently?
# => pretty sure `test_context_stream_semantics::test_caller_cancels()`
# already covers this case?
# - [x] INTER-PEER: some arbitrary remote peer cancels via
# Portal.cancel_actor().
# => all other connected peers should get that cancel requesting peer's
@ -44,16 +40,6 @@ from tractor._testing import (
# that also spawned a remote task task in that same peer-parent.
# def test_self_cancel():
# '''
# 2 cases:
# - calls `Actor.cancel()` locally in some task
# - calls LocalPortal.cancel_actor()` ?
# '''
# ...
@tractor.context
async def open_stream_then_sleep_forever(
ctx: Context,
@ -806,7 +792,7 @@ async def basic_echo_server(
ctx: Context,
peer_name: str = 'wittle_bruv',
err_after: int|None = None,
err_after_imsg: int|None = None,
) -> None:
'''
@ -835,8 +821,9 @@ async def basic_echo_server(
await ipc.send(resp)
if (
err_after
and i > err_after
err_after_imsg
and
i > err_after_imsg
):
raise RuntimeError(
f'Simulated error in `{peer_name}`'
@ -978,7 +965,8 @@ async def tell_little_bro(
actor_name: str,
caller: str = '',
err_after: int|None = None,
err_after: float|None = None,
rng_seed: int = 50,
):
# contact target actor, do a stream dialog.
async with (
@ -989,14 +977,18 @@ async def tell_little_bro(
basic_echo_server,
# XXX proxy any delayed err condition
err_after=err_after,
err_after_imsg=(
err_after * rng_seed
if err_after is not None
else None
),
) as (sub_ctx, first),
sub_ctx.open_stream() as echo_ipc,
):
actor: Actor = current_actor()
uid: tuple = actor.uid
for i in range(100):
for i in range(rng_seed):
msg: tuple = (
uid,
i,
@ -1021,13 +1013,13 @@ async def tell_little_bro(
)
@pytest.mark.parametrize(
'raise_sub_spawn_error_after',
[None, 50],
[None, 0.5],
)
def test_peer_spawns_and_cancels_service_subactor(
debug_mode: bool,
raise_client_error: str,
reg_addr: tuple[str, int],
raise_sub_spawn_error_after: int|None,
raise_sub_spawn_error_after: float|None,
):
# NOTE: this tests for the modden `mod wks open piker` bug
# discovered as part of implementing workspace ctx
@ -1041,6 +1033,7 @@ def test_peer_spawns_and_cancels_service_subactor(
# and the server's spawned child should cancel and terminate!
peer_name: str = 'little_bro'
def check_inner_rte(rae: RemoteActorError):
'''
Validate the little_bro's relayed inception!
@ -1134,8 +1127,7 @@ def test_peer_spawns_and_cancels_service_subactor(
)
try:
res = await client_ctx.result(hide_tb=False)
res = await client_ctx.wait_for_result(hide_tb=False)
# in remote (relayed inception) error
# case, we should error on the line above!
if raise_sub_spawn_error_after:
@ -1146,6 +1138,23 @@ def test_peer_spawns_and_cancels_service_subactor(
assert isinstance(res, ContextCancelled)
assert client_ctx.cancel_acked
assert res.canceller == root.uid
assert not raise_sub_spawn_error_after
# cancelling the spawner sub should
# transitively cancel it's sub, the little
# bruv.
print('root cancelling server/client sub-actors')
await spawn_ctx.cancel()
async with tractor.find_actor(
name=peer_name,
) as sub:
assert not sub
# XXX, only for tracing
# except BaseException as _berr:
# berr = _berr
# await tractor.pause(shield=True)
# raise berr
except RemoteActorError as rae:
_err = rae
@ -1174,19 +1183,8 @@ def test_peer_spawns_and_cancels_service_subactor(
raise
# await tractor.pause()
else:
assert not raise_sub_spawn_error_after
# cancelling the spawner sub should
# transitively cancel it's sub, the little
# bruv.
print('root cancelling server/client sub-actors')
await spawn_ctx.cancel()
async with tractor.find_actor(
name=peer_name,
) as sub:
assert not sub
# await tractor.pause()
# await server.cancel_actor()
except RemoteActorError as rae:
@ -1199,7 +1197,7 @@ def test_peer_spawns_and_cancels_service_subactor(
# since we called `.cancel_actor()`, `.cancel_ack`
# will not be set on the ctx bc `ctx.cancel()` was not
# called directly fot this confext.
# called directly for this confext.
except ContextCancelled as ctxc:
_ctxc = ctxc
print(
@ -1239,12 +1237,19 @@ def test_peer_spawns_and_cancels_service_subactor(
# assert spawn_ctx.cancelled_caught
async def _main():
with trio.fail_after(
3 if not debug_mode
else 999
):
await main()
if raise_sub_spawn_error_after:
with pytest.raises(RemoteActorError) as excinfo:
trio.run(main)
trio.run(_main)
rae: RemoteActorError = excinfo.value
check_inner_rte(rae)
else:
trio.run(main)
trio.run(_main)

View File

@ -0,0 +1,239 @@
'''
Define the details of inter-actor "out-of-band" (OoB) cancel
semantics, that is how cancellation works when a cancel request comes
from the different concurrency (primitive's) "layer" then where the
eventual `trio.Task` actually raises a signal.
'''
from functools import partial
# from contextlib import asynccontextmanager as acm
# import itertools
import pytest
import trio
import tractor
from tractor import ( # typing
ActorNursery,
Portal,
Context,
# ContextCancelled,
# RemoteActorError,
)
# from tractor._testing import (
# tractor_test,
# expect_ctxc,
# )
# XXX TODO cases:
# - [ ] peer cancelled itself - so other peers should
# get errors reflecting that the peer was itself the .canceller?
# def test_self_cancel():
# '''
# 2 cases:
# - calls `Actor.cancel()` locally in some task
# - calls LocalPortal.cancel_actor()` ?
#
# things to ensure!
# -[ ] the ctxc raised in a child should ideally show the tb of the
# underlying `Cancelled` checkpoint, i.e.
# `raise scope_error from ctxc`?
#
# -[ ] a self-cancelled context, if not allowed to block on
# `ctx.result()` at some point will hang since the `ctx._scope`
# is never `.cancel_called`; cases for this include,
# - an `open_ctx()` which never starteds before being OoB actor
# cancelled.
# |_ parent task will be blocked in `.open_context()` for the
# `Started` msg, and when the OoB ctxc arrives `ctx._scope`
# will never have been signalled..
# '''
# ...
# TODO, sanity test against the case in `/examples/trio/lockacquire_not_unmasked.py`
# but with the `Lock.acquire()` from a `@context` to ensure the
# implicit ignore-case-non-unmasking.
#
# @tractor.context
# async def acquire_actor_global_lock(
# ctx: tractor.Context,
# ignore_special_cases: bool,
# ):
# async with maybe_unmask_excs(
# ignore_special_cases=ignore_special_cases,
# ):
# await ctx.started('locked')
# # block til cancelled
# await trio.sleep_forever()
@tractor.context
async def sleep_forever(
ctx: tractor.Context,
# ignore_special_cases: bool,
do_started: bool,
):
# async with maybe_unmask_excs(
# ignore_special_cases=ignore_special_cases,
# ):
# await ctx.started('locked')
if do_started:
await ctx.started()
# block til cancelled
print('sleepin on child-side..')
await trio.sleep_forever()
@pytest.mark.parametrize(
'cancel_ctx',
[True, False],
)
def test_cancel_ctx_with_parent_side_entered_in_bg_task(
debug_mode: bool,
loglevel: str,
cancel_ctx: bool,
):
'''
The most "basic" out-of-band-task self-cancellation case where
`Portal.open_context()` is entered in a bg task and the
parent-task (of the containing nursery) calls `Context.cancel()`
without the child knowing; the `Context._scope` should be
`.cancel_called` when the IPC ctx's child-side relays
a `ContextCancelled` with a `.canceller` set to the parent
actor('s task).
'''
async def main():
with trio.fail_after(
2 if not debug_mode else 999,
):
an: ActorNursery
async with (
tractor.open_nursery(
debug_mode=debug_mode,
loglevel='devx',
enable_stack_on_sig=True,
) as an,
trio.open_nursery() as tn,
):
ptl: Portal = await an.start_actor(
'sub',
enable_modules=[__name__],
)
async def _open_ctx_async(
do_started: bool = True,
task_status=trio.TASK_STATUS_IGNORED,
):
# do we expect to never enter the
# `.open_context()` below.
if not do_started:
task_status.started()
async with ptl.open_context(
sleep_forever,
do_started=do_started,
) as (ctx, first):
task_status.started(ctx)
await trio.sleep_forever()
# XXX, this is the key OoB part!
#
# - start the `.open_context()` in a bg task which
# blocks inside the embedded scope-body,
#
# - when we call `Context.cancel()` it **is
# not** from the same task which eventually runs
# `.__aexit__()`,
#
# - since the bg "opener" task will be in
# a `trio.sleep_forever()`, it must be interrupted
# by the `ContextCancelled` delivered from the
# child-side; `Context._scope: CancelScope` MUST
# be `.cancel_called`!
#
print('ASYNC opening IPC context in subtask..')
maybe_ctx: Context|None = await tn.start(partial(
_open_ctx_async,
))
if (
maybe_ctx
and
cancel_ctx
):
print('cancelling first IPC ctx!')
await maybe_ctx.cancel()
# XXX, note that despite `maybe_context.cancel()`
# being called above, it's the parent (bg) task
# which was originally never interrupted in
# the `ctx._scope` body due to missing case logic in
# `ctx._maybe_cancel_and_set_remote_error()`.
#
# It didn't matter that the subactor process was
# already terminated and reaped, nothing was
# cancelling the ctx-parent task's scope!
#
print('cancelling subactor!')
await ptl.cancel_actor()
if maybe_ctx:
try:
await maybe_ctx.wait_for_result()
except tractor.ContextCancelled as ctxc:
assert not cancel_ctx
assert (
ctxc.canceller
==
tractor.current_actor().aid.uid
)
# don't re-raise since it'll trigger
# an EG from the above tn.
if cancel_ctx:
# graceful self-cancel
trio.run(main)
else:
# ctx parent task should see OoB ctxc due to
# `ptl.cancel_actor()`.
with pytest.raises(tractor.ContextCancelled) as excinfo:
trio.run(main)
assert 'root' in excinfo.value.canceller[0]
# def test_parent_actor_cancels_subactor_with_gt1_ctxs_open_to_it(
# debug_mode: bool,
# loglevel: str,
# ):
# '''
# Demos OoB cancellation from the perspective of a ctx opened with
# a child subactor where the parent cancels the child at the "actor
# layer" using `Portal.cancel_actor()` and thus the
# `ContextCancelled.canceller` received by the ctx's parent-side
# task will appear to be a "self cancellation" even though that
# specific task itself was not cancelled and thus
# `Context.cancel_called ==False`.
# '''
# TODO, do we have an existing implied ctx
# cancel test like this?
# with trio.move_on_after(0.5):# as cs:
# await _open_ctx_async(
# do_started=False,
# )
# in-line ctx scope should definitely raise
# a ctxc with `.canceller = 'root'`
# async with ptl.open_context(
# sleep_forever,
# do_started=True,
# ) as pair:

View File

@ -6,11 +6,18 @@ want to see changed.
from contextlib import (
asynccontextmanager as acm,
)
from types import ModuleType
from functools import partial
import pytest
from _pytest import pathlib
from tractor.trionics import collapse_eg
import trio
from trio import TaskStatus
from tractor._testing import (
examples_dir,
)
@pytest.mark.parametrize(
@ -106,8 +113,9 @@ def test_acm_embedded_nursery_propagates_enter_err(
debug_mode: bool,
):
'''
Demo how a masking `trio.Cancelled` could be handled by unmasking from the
`.__context__` field when a user (by accident) re-raises from a `finally:`.
Demo how a masking `trio.Cancelled` could be handled by unmasking
from the `.__context__` field when a user (by accident) re-raises
from a `finally:`.
'''
import tractor
@ -158,13 +166,13 @@ def test_acm_embedded_nursery_propagates_enter_err(
assert len(assert_eg.exceptions) == 1
def test_gatherctxs_with_memchan_breaks_multicancelled(
debug_mode: bool,
):
'''
Demo how a using an `async with sndchan` inside a `.trionics.gather_contexts()` task
will break a strict-eg-tn's multi-cancelled absorption..
Demo how a using an `async with sndchan` inside
a `.trionics.gather_contexts()` task will break a strict-eg-tn's
multi-cancelled absorption..
'''
from tractor import (
@ -190,7 +198,6 @@ def test_gatherctxs_with_memchan_breaks_multicancelled(
f'Closed {task!r}\n'
)
async def main():
async with (
# XXX should ensure ONLY the KBI
@ -211,3 +218,85 @@ def test_gatherctxs_with_memchan_breaks_multicancelled(
with pytest.raises(KeyboardInterrupt):
trio.run(main)
@pytest.mark.parametrize(
'raise_unmasked', [
True,
pytest.param(
False,
marks=pytest.mark.xfail(
reason="see examples/trio/send_chan_aclose_masks.py"
)
),
]
)
@pytest.mark.parametrize(
'child_errors_mid_stream',
[True, False],
)
def test_unmask_aclose_as_checkpoint_on_aexit(
raise_unmasked: bool,
child_errors_mid_stream: bool,
debug_mode: bool,
):
'''
Verify that our unmasker util works over the common case where
a mem-chan's `.aclose()` is included in an `@acm` stack
and it being currently a checkpoint, can `trio.Cancelled`-mask an embedded
exception from user code resulting in a silent failure which
appears like graceful cancellation.
This test suite is mostly implemented as an example script so it
could more easily be shared with `trio`-core peeps as `tractor`-less
minimum reproducing example.
'''
mod: ModuleType = pathlib.import_path(
examples_dir()
/ 'trio'
/ 'send_chan_aclose_masks_beg.py',
root=examples_dir(),
consider_namespace_packages=False,
)
with pytest.raises(RuntimeError):
trio.run(partial(
mod.main,
raise_unmasked=raise_unmasked,
child_errors_mid_stream=child_errors_mid_stream,
))
@pytest.mark.parametrize(
'ignore_special_cases', [
True,
pytest.param(
False,
marks=pytest.mark.xfail(
reason="see examples/trio/lockacquire_not_umasked.py"
)
),
]
)
def test_cancelled_lockacquire_in_ipctx_not_unmasked(
ignore_special_cases: bool,
loglevel: str,
debug_mode: bool,
):
mod: ModuleType = pathlib.import_path(
examples_dir()
/ 'trio'
/ 'lockacquire_not_unmasked.py',
root=examples_dir(),
consider_namespace_packages=False,
)
async def _main():
with trio.fail_after(2):
await mod.main(
ignore_special_cases=ignore_special_cases,
loglevel=loglevel,
debug_mode=debug_mode,
)
trio.run(_main)

View File

@ -442,25 +442,25 @@ class Context:
'''
Records whether cancellation has been requested for this context
by a call to `.cancel()` either due to,
- either an explicit call by some local task,
- an explicit call by some local task,
- or an implicit call due to an error caught inside
the ``Portal.open_context()`` block.
the `Portal.open_context()` block.
'''
return self._cancel_called
@cancel_called.setter
def cancel_called(self, val: bool) -> None:
'''
Set the self-cancelled request `bool` value.
# XXX, to debug who frickin sets it..
# @cancel_called.setter
# def cancel_called(self, val: bool) -> None:
# '''
# Set the self-cancelled request `bool` value.
'''
# to debug who frickin sets it..
# if val:
# from .devx import pause_from_sync
# pause_from_sync()
# '''
# if val:
# from .devx import pause_from_sync
# pause_from_sync()
self._cancel_called = val
# self._cancel_called = val
@property
def canceller(self) -> tuple[str, str]|None:
@ -635,6 +635,71 @@ class Context:
'''
await self.chan.send(Stop(cid=self.cid))
@property
def parent_task(self) -> trio.Task:
'''
This IPC context's "owning task" which is a `trio.Task`
on one of the "sides" of the IPC.
Note that the "parent_" prefix here refers to the local
`trio` task tree using the same interface as
`trio.Nursery.parent_task` whereas for IPC contexts,
a different cross-actor task hierarchy exists:
- a "parent"-side which originally entered
`Portal.open_context()`,
- the "child"-side which was spawned and scheduled to invoke
a function decorated with `@tractor.context`.
This task is thus a handle to mem-domain-distinct/per-process
`Nursery.parent_task` depending on in which of the above
"sides" this context exists.
'''
return self._task
def _is_blocked_on_rx_chan(self) -> bool:
'''
Predicate to indicate whether the owner `._task: trio.Task` is
currently blocked (by `.receive()`-ing) on its underlying RPC
feeder `._rx_chan`.
This knowledge is highly useful when handling so called
"out-of-band" (OoB) cancellation conditions where a peer
actor's task transmitted some remote error/cancel-msg and we
must know whether to signal-via-cancel currently executing
"user-code" (user defined code embedded in `ctx._scope`) or
simply to forward the IPC-msg-as-error **without calling**
`._scope.cancel()`.
In the latter case it is presumed that if the owner task is
blocking for the next IPC msg, it will eventually receive,
process and raise the equivalent local error **without**
requiring `._scope.cancel()` to be explicitly called by the
*delivering OoB RPC-task* (via `_deliver_msg()`).
'''
# NOTE, see the mem-chan meth-impls for *why* this
# logic works,
# `trio._channel.MemoryReceiveChannel.receive[_nowait]()`
#
# XXX realize that this is NOT an
# official/will-be-loudly-deprecated API:
# - https://trio.readthedocs.io/en/stable/reference-lowlevel.html#trio.lowlevel.Task.custom_sleep_data
# |_https://trio.readthedocs.io/en/stable/reference-lowlevel.html#trio.lowlevel.wait_task_rescheduled
#
# orig repo intro in the mem-chan change over patch:
# - https://github.com/python-trio/trio/pull/586#issuecomment-414039117
# |_https://github.com/python-trio/trio/pull/616
# |_https://github.com/njsmith/trio/commit/98c38cef6f62e731bf8c7190e8756976bface8f0
#
return (
self._task.custom_sleep_data
is
self._rx_chan
)
def _maybe_cancel_and_set_remote_error(
self,
error: BaseException,
@ -787,13 +852,27 @@ class Context:
if self._canceller is None:
log.error('Ctx has no canceller set!?')
cs: trio.CancelScope = self._scope
# ?TODO? see comment @ .start_remote_task()`
#
# if not cs:
# from .devx import mk_pdb
# mk_pdb().set_trace()
# raise RuntimeError(
# f'IPC ctx was not be opened prior to remote error delivery !?\n'
# f'{self}\n'
# f'\n'
# f'`Portal.open_context()` must be entered (somewhere) beforehand!\n'
# )
# Cancel the local `._scope`, catch that
# `._scope.cancelled_caught` and re-raise any remote error
# once exiting (or manually calling `.wait_for_result()`) the
# `.open_context()` block.
cs: trio.CancelScope = self._scope
if (
cs
and not cs.cancel_called
# XXX this is an expected cancel request response
# message and we **don't need to raise it** in the
@ -802,8 +881,7 @@ class Context:
# if `._cancel_called` then `.cancel_acked and .cancel_called`
# always should be set.
and not self._is_self_cancelled()
and not cs.cancel_called
and not cs.cancelled_caught
# and not cs.cancelled_caught
):
if (
msgerr
@ -814,7 +892,7 @@ class Context:
not self._cancel_on_msgerr
):
message: str = (
'NOT Cancelling `Context._scope` since,\n'
f'NOT Cancelling `Context._scope` since,\n'
f'Context._cancel_on_msgerr = {self._cancel_on_msgerr}\n\n'
f'AND we got a msg-type-error!\n'
f'{error}\n'
@ -824,13 +902,43 @@ class Context:
# `trio.Cancelled` subtype here ;)
# https://github.com/goodboy/tractor/issues/368
message: str = 'Cancelling `Context._scope` !\n\n'
# from .devx import pause_from_sync
# pause_from_sync()
self._scope.cancel()
else:
message: str = 'NOT cancelling `Context._scope` !\n\n'
cs.cancel()
# TODO, explicit condition for OoB (self-)cancellation?
# - we called `Portal.cancel_actor()` from this actor
# and the peer ctx task delivered ctxc due to it.
# - currently `self._is_self_cancelled()` will be true
# since the ctxc.canceller check will match us even though it
# wasn't from this ctx specifically!
elif (
cs
and self._is_self_cancelled()
and not cs.cancel_called
):
message: str = (
'Cancelling `ctx._scope` due to OoB self-cancel ?!\n'
'\n'
)
# from .devx import mk_pdb
# mk_pdb().set_trace()
# TODO XXX, required to fix timeout failure in
# `test_cancelled_lockacquire_in_ipctx_not_unmaskeed`
#
# XXX NOTE XXX, this is SUPER SUBTLE!
# we only want to cancel our embedded `._scope`
# if the ctx's current/using task is NOT blocked
# on `._rx_chan.receive()` and on some other
# `trio`-checkpoint since in the former case
# any `._remote_error` will be relayed through
# the rx-chan and appropriately raised by the owning
# `._task` directly. IF the owner task is however
# blocking elsewhere we need to interrupt it **now**.
if not self._is_blocked_on_rx_chan():
cs.cancel()
else:
# rx_stats = self._rx_chan.statistics()
message: str = 'NOT cancelling `Context._scope` !\n\n'
fmt_str: str = 'No `self._scope: CancelScope` was set/used ?\n'
if (
@ -854,6 +962,7 @@ class Context:
+
cs_fmt
)
log.cancel(
message
+
@ -946,8 +1055,9 @@ class Context:
'''
side: str = self.side
# XXX for debug via the `@.setter`
self.cancel_called = True
self._cancel_called = True
# ^ XXX for debug via the `@.setter`
# self.cancel_called = True
header: str = (
f'Cancelling ctx from {side!r}-side\n'
@ -2011,6 +2121,9 @@ async def open_context_from_portal(
f'|_{portal.actor}\n'
)
# ?TODO? could we move this to inside the `tn` block?
# -> would allow doing `ctx.parent_task = tn.parent_task` ?
# -> would allow a `if not ._scope: => raise RTE` ?
ctx: Context = await portal.actor.start_remote_task(
portal.channel,
nsf=nsf,
@ -2037,6 +2150,7 @@ async def open_context_from_portal(
scope_err: BaseException|None = None
ctxc_from_child: ContextCancelled|None = None
try:
# from .devx import pause
async with (
collapse_eg(),
trio.open_nursery() as tn,
@ -2059,6 +2173,10 @@ async def open_context_from_portal(
# the dialog, the `Error` msg should be raised from the `msg`
# handling block below.
try:
log.runtime(
f'IPC ctx parent waiting on Started msg..\n'
f'ctx.cid: {ctx.cid!r}\n'
)
started_msg, first = await ctx._pld_rx.recv_msg(
ipc=ctx,
expect_msg=Started,
@ -2067,16 +2185,16 @@ async def open_context_from_portal(
)
except trio.Cancelled as taskc:
ctx_cs: trio.CancelScope = ctx._scope
log.cancel(
f'IPC ctx was cancelled during "child" task sync due to\n\n'
f'.cid: {ctx.cid!r}\n'
f'.maybe_error: {ctx.maybe_error!r}\n'
)
# await pause(shield=True)
if not ctx_cs.cancel_called:
raise
# from .devx import pause
# await pause(shield=True)
log.cancel(
'IPC ctx was cancelled during "child" task sync due to\n\n'
f'{ctx.maybe_error}\n'
)
# OW if the ctx's scope was cancelled manually,
# likely the `Context` was cancelled via a call to
# `._maybe_cancel_and_set_remote_error()` so ensure
@ -2199,7 +2317,7 @@ async def open_context_from_portal(
# documenting it as a definittive example of
# debugging the tractor-runtime itself using it's
# own `.devx.` tooling!
#
#
# await debug.pause()
# CASE 2: context was cancelled by local task calling
@ -2272,13 +2390,16 @@ async def open_context_from_portal(
match scope_err:
case trio.Cancelled():
logmeth = log.cancel
cause: str = 'cancelled'
# XXX explicitly report on any non-graceful-taskc cases
case _:
cause: str = 'errored'
logmeth = log.exception
logmeth(
f'ctx {ctx.side!r}-side exited with {ctx.repr_outcome()!r}\n'
f'ctx {ctx.side!r}-side {cause!r} with,\n'
f'{ctx.repr_outcome()!r}\n'
)
if debug_mode():
@ -2303,6 +2424,7 @@ async def open_context_from_portal(
# told us it's cancelled ;p
if ctxc_from_child is None:
try:
# await pause(shield=True)
await ctx.cancel()
except (
trio.BrokenResourceError,
@ -2459,8 +2581,10 @@ async def open_context_from_portal(
log.cancel(
f'Context cancelled by local {ctx.side!r}-side task\n'
f'c)>\n'
f' |_{ctx._task}\n\n'
f'{repr(scope_err)}\n'
f' |_{ctx.parent_task}\n'
f' .cid={ctx.cid!r}\n'
f'\n'
f'{scope_err!r}\n'
)
# TODO: should we add a `._cancel_req_received`

View File

@ -446,12 +446,12 @@ class ActorNursery:
@acm
async def _open_and_supervise_one_cancels_all_nursery(
actor: Actor,
tb_hide: bool = False,
hide_tb: bool = True,
) -> typing.AsyncGenerator[ActorNursery, None]:
# normally don't need to show user by default
__tracebackhide__: bool = tb_hide
__tracebackhide__: bool = hide_tb
outer_err: BaseException|None = None
inner_err: BaseException|None = None

View File

@ -613,10 +613,9 @@ async def drain_to_final_msg(
# msg: dict = await ctx._rx_chan.receive()
# if res_cs.cancelled_caught:
#
# -[ ] make sure pause points work here for REPLing
# -[x] make sure pause points work here for REPLing
# the runtime itself; i.e. ensure there's no hangs!
# |_from tractor.devx.debug import pause
# await pause()
# |_see masked code below in .cancel_called path
# NOTE: we get here if the far end was
# `ContextCancelled` in 2 cases:
@ -652,6 +651,10 @@ async def drain_to_final_msg(
f'IPC ctx cancelled externally during result drain ?\n'
f'{ctx}'
)
# XXX, for tracing `Cancelled`..
# from tractor.devx.debug import pause
# await pause(shield=True)
# CASE 2: mask the local cancelled-error(s)
# only when we are sure the remote error is
# the source cause of this local task's

View File

@ -72,7 +72,7 @@ _mask_cases: dict[
dict[
int, # inner-frame index into `inspect.getinnerframes()`
# `FrameInfo.function/filename: str`s to match
tuple[str, str],
dict[str, str],
],
] = {
trio.WouldBlock: {
@ -163,6 +163,7 @@ async def maybe_raise_from_masking_exc(
# ^XXX, special case(s) where we warn-log bc likely
# there will be no operational diff since the exc
# is always expected to be consumed.
) -> BoxedMaybeException:
'''
Maybe un-mask and re-raise exception(s) suppressed by a known
@ -263,6 +264,8 @@ async def maybe_raise_from_masking_exc(
if len(masked) < 2:
# don't unmask already known "special" cases..
if (
_mask_cases
and
(cases := _mask_cases.get(type(exc_ctx)))
and
(masker_frame := is_expected_masking_case(
@ -272,7 +275,8 @@ async def maybe_raise_from_masking_exc(
))
):
log.warning(
f'Ignoring already-known/non-ideally-valid masker code @\n'
f'Ignoring already-known, non-ideal-but-valid '
f'masker code @\n'
f'{masker_frame}\n'
f'\n'
f'NOT raising {exc_ctx} from masker {exc_match!r}\n'