It was expecting `AssertionError` as a proceed-in-test signal (by
breaking from a continue loop), but `in_prompt_msg(raise_on_err=True)`
was changed to raise `ValueError`; so instead just use it as a predicate
for the `break`.
Also rework `in_prompt_msg()` to accept the `child: BaseSpawn` as input
instead of `before: str`, remove the casting boilerplate, and adjust all
usage to match.
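A minimal sketch of the reworked predicate, assuming the child object exposes its output so far as bytes on `.before`. `FakeSpawn` and the `parts` parameter are hypothetical stand-ins for illustration, not the project's actual types or signature:

```python
from dataclasses import dataclass


@dataclass
class FakeSpawn:
    """Hypothetical stand-in for the spawned child (only `.before` is used)."""
    before: bytes


def in_prompt_msg(
    child: FakeSpawn,
    parts: list[str],  # assumed parameter: substrings expected in the output
    raise_on_err: bool = False,
) -> bool:
    """Return whether every part appears in the child's output so far."""
    # Decode once here so callers no longer cast `child.before` themselves.
    before: str = child.before.decode()
    for part in parts:
        if part not in before:
            if raise_on_err:
                raise ValueError(f'Could not find {part!r} in:\n{before}')
            return False
    return True


def first_matching(outputs: list[bytes], parts: list[str]) -> int:
    """Loop until the predicate passes, mirroring the `break` usage."""
    for i, out in enumerate(outputs):
        if in_prompt_msg(FakeSpawn(before=out), parts):
            break
    else:
        return -1
    return i
```

Used this way the test loop needs no exception handling at all; `raise_on_err=True` stays available for call sites that want a hard failure.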
The reason for this "duplication" with the `--asyncio` CLI flag (passed
to the child during spawn) is 2-fold:
- allows verifying inside `Actor._from_parent()` that the `trio` runtime was
started via `.start_guest_run()` as well as if the
`Actor._infected_aio` spawn-entrypoint value has been set (by the
`._entry.<spawn-backend>_main()` whenever `--asyncio` is passed)
such that any mismatch can be signaled via an `InternalError`.
- enables checking the `._state._runtime_vars['_is_infected_aio']` value
directly (say from a non-actor/`trio`-thread) instead of calling
`._state.current_actor(err_on_no_runtime=False)` in certain edge
cases.
Impl/testing deats:
- add `._state._runtime_vars['_is_infected_aio'] = False` default.
- raise `InternalError` on any mismatch between the `--asyncio` flag
passed and the `_runtime_vars` value relayed from the parent inside
`Actor._from_parent()`, and include a `Runner.is_guest` assert for good
measure B)
- set and relay `infect_asyncio: bool` via runtime-vars to child in
`ActorNursery.start_actor()`.
- verify `actor.is_infected_aio()`, `actor._infected_aio` and
`_state._runtime_vars['_is_infected_aio']` are all set in test suite's
`asyncio_actor()` endpoint.
Finally this reproduces the issue as it (originally?) exhibited inside
`piker` where the `Actor.lifetime_stack` wasn't closed in cases where
during `infected_aio`-actor cancellation/shutdown `trio` side tasks
which are doing shielded (teardown) work are NOT being watched/waited on
from the `aio_main()` task-closure inside `run_as_asyncio_guest()`!
This is then the root cause of the guest-run being abandoned since if
our `aio_main()` task-closure doesn't know it should allow the run to
finish, it's going to call `loop.close()` eventually resulting in the
`GeneratorExit` thrown into `trio._core._run.unrolled_run()`..
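As a generic Python illustration (not `trio`'s actual code), this is what a generator-driven run loop sees when its host stops driving it before it finishes. `run_loop_sketch()` is a hypothetical toy:

```python
def run_loop_sketch(log: list):
    """Toy generator standing in for a run loop driven step by step."""
    try:
        while True:
            # Hand control back to the host after every "tick".
            yield 'tick'
    except GeneratorExit:
        # Thrown in at the suspended `yield` when the host closes (or
        # garbage collects) the generator before it ran to completion.
        log.append('GeneratorExit')
        raise


def abandon_run() -> list:
    log: list = []
    run = run_loop_sketch(log)
    next(run)  # the host drives a single step..
    run.close()  # ..then walks away: the loop never gets to finish
    return log
```

Any teardown the loop still had pending at that point simply never runs, which matches the leaked-file symptom.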
So, this extends the `test_sigint_closes_lifetime_stack()` suite to
include cases for such shielded `trio`-task ops:
- add a new `trio_side_is_shielded: bool` which will toggle whether to
add a shielded 0.5s `trio.sleep()` loop to `manage_file()` which
should outlive the `asyncio` event-loop shutdown sequence and result
in an abandoned guest-run and thus a leaked file.
- parametrize the existing suite with this case resulting in a total of 16
tests B)
This patch demonstrates the problem with our `aio_main()` task-closure
impl via the now 4 failing tests; a fix is coming in a follow up commit!
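For intuition on the case count: four two-valued axes multiply out to 16 cases. The sketch below is only one combination consistent with that total; `wait_for_ctx`, `bg_aio_task` and all of the example values are assumptions for illustration, and the real parametrization may differ:

```python
from itertools import product

# Assumed axes; only `trio_side_is_shielded` and `send_sigint_to` are
# names taken from the commit notes, the values are illustrative.
AXES: dict = {
    'wait_for_ctx': (False, True),
    'bg_aio_task': (False, True),
    'send_sigint_to': ('child', 'parent&child'),
    'trio_side_is_shielded': (False, True),
}


def build_cases(axes: dict) -> list:
    """Expand every combination of axis values into a kwargs dict."""
    names = list(axes)
    return [
        dict(zip(names, values))
        for values in product(*axes.values())
    ]
```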
Turns out it somehow breaks our `to_asyncio` error relay since obvi
`asyncio`'s runtime seems to specially handle it (prolly via
`isinstance()` ?) and it caused our
`test_aio_cancelled_from_aio_causes_trio_cancelled()` to hang..
Further, obvi `unpack_error()` won't be able to find the type def if not
kept inside `._exceptions`..
So given all that, revert the change/move as well as:
- tweak the aio-from-aio cancel test to timeout.
- do `trio.sleep()` conc with any bg aio task by moving out nursery
block.
- add a `send_sigint_to: str` parameter to
`test_sigint_closes_lifetime_stack()` such that we test the SIGINT
being relayed to just the parent or the child.
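The special handling suspected above can be observed with the stdlib alone: a task that raises any subclass of `asyncio.CancelledError` ends up marked as cancelled rather than as failed with an error. A minimal sketch, where both exception classes are hypothetical stand-ins:

```python
import asyncio


class AsyncioCancelledSketch(asyncio.CancelledError):
    """Hypothetical error type deriving from `asyncio.CancelledError`."""


class PlainRelayError(Exception):
    """Hypothetical error type that does NOT derive from it."""


async def raiser(exc_type: type) -> None:
    await asyncio.sleep(0)
    raise exc_type('raised from the aio side')


async def outcome(exc_type: type) -> str:
    """Report how `asyncio` classifies a task raising `exc_type`."""
    task = asyncio.create_task(raiser(exc_type))
    try:
        await task
    except (asyncio.CancelledError, PlainRelayError):
        pass
    return 'cancelled' if task.cancelled() else 'errored'
```

So an error relayed with such a type looks like a cancellation to `asyncio`, not like an exception to propagate.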
Took me a while to figure out what the heck was going on but, turns out
`asyncio` changed their SIGINT handling in 3.11 as per:
https://docs.python.org/3/library/asyncio-runner.html#handling-keyboard-interruption
I'm not entirely sure if it's the 3.11 changes or possibly wtv further
updates were made in 3.12 but more or less due to the way
our current main task was written the `trio` guest-run was getting
abandoned on SIGINTs sent from the OS to the infected child proc..
Note that much of the bug and soln cases are laid out in very detailed
comment-notes both in the new test and `run_as_asyncio_guest()`, right
above the final "fix" lines.
Add new `test_infected_aio.test_sigint_closes_lifetime_stack()` test suite
which reliably triggers all abandonment issues with multiple cases
of different parent behaviour post-sending-SIGINT-to-child:
1. briefly sleep then raise a KBI in the parent which was originally
demonstrating the file leak not being cleaned up by `Actor.lifetime_stack.close()`
and simulates a ctl-c from the console (relayed in tandem by
the OS to the parent and child processes).
2. do `Context.wait_for_result()` on the child context which would
hang and timeout since the actor runtime would never complete and
thus never relay a `ContextCancelled`.
3. both with and without running an `asyncio` task in the `manage_file`
child actor; originally it seemed that with an aio task scheduled in
the child actor the guest-run abandonment always was the "loud" case
where there seemed to be some actor teardown but with tbs from
python failing to gracefully exit the `trio` runtime..
The (seemingly working) "fix" required 2 lines of code to be run inside
an `asyncio.CancelledError` handler around the call to `await trio_done_fut`:
- `Actor.cancel_soon()` which schedules the actor runtime to cancel on
the next `trio` runner cycle and results in a "self cancellation" of
the actor.
- "pumping the `asyncio` event loop" with a non-0 `.sleep(0.1)` XD
|_ seems that a "shielded" pump with some actual `delay: float >= 0`
did the trick to get `asyncio` to allow the `trio` runner/loop to
fully complete its guest-run without abandonment.
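The shape of that handler can be sketched with plain `asyncio`, no `trio` involved. Here `done_fut` plays the role of `trio_done_fut`, the `cancel_soon` callback is a hypothetical stand-in for `Actor.cancel_soon()`, and the final re-wait on the future is an assumption of this sketch:

```python
import asyncio


async def aio_main(done_fut: asyncio.Future, cancel_soon) -> str:
    """Wait on the guest-run's completion future without abandoning it."""
    try:
        # Shielded so cancelling this task does not cancel the future.
        return await asyncio.shield(done_fut)
    except asyncio.CancelledError:
        # 1. ask the other runtime to cancel itself on its next cycle.
        cancel_soon()
        # 2. "pump" the event loop with a shielded, non-zero sleep so
        #    the other side gets loop time to finish its teardown.
        await asyncio.shield(asyncio.sleep(0.1))
        # Assumption: afterwards wait for the final outcome again.
        return await asyncio.shield(done_fut)


async def demo() -> tuple:
    loop = asyncio.get_running_loop()
    done_fut: asyncio.Future = loop.create_future()
    events: list = []

    def cancel_soon() -> None:
        # Stand-in teardown: takes a moment, then resolves the future.
        events.append('cancel_soon')
        loop.call_later(0.05, done_fut.set_result, 'guest-run-complete')

    task = asyncio.create_task(aio_main(done_fut, cancel_soon))
    await asyncio.sleep(0)  # let `aio_main()` reach its first await
    task.cancel()  # simulates the cancellation delivered on SIGINT
    return events, await task
```

In this toy the main task swallows the cancellation and still returns the completed result, which is the "no abandonment" behaviour the two lines are after.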
Other supporting changes:
- move `._exceptions.AsyncioCancelled`, our renamed
`asyncio.CancelledError` error-sub-type-wrapper, to `.to_asyncio` and make
it derive from `CancelledError` so as to be sure when raised by our
`asyncio` x-> `trio` exception relay machinery that `asyncio` is
getting the specific type it expects during cancellation.
- do "summary status" style logging in `run_as_asyncio_guest()` wherein
we compile the eventual `startup_msg: str` emitted just before waiting
on the `trio_done_fut`.
- shield-wait with `out: Outcome = await asyncio.shield(trio_done_fut)`
even though it seems to do nothing in the SIGINT handling case..(I
presume it might help avoid abandonment in an `asyncio.Task.cancel()`
case maybe?)
Mostly adjustments for the new pld-receiver semantics/shim-layer which
results more often in the direct delivery of `RemoteActorError`s from
IPC API primitives (like `Portal.result()`) instead of being embedded in
an `ExceptionGroup` bundled from an embedded nursery.
Tossed usage of the `debug_mode: bool` fixture to a couple problematic
tests while i was working on them.
Also includes detailed assertion updates to the inter-peer cancellation
suite in terms of,
- `Context.canceller` state correctly matching the true src actor when
expecting a ctxc.
- any rxed `ContextCancelled` should instance match the `Context._local/remote_error`
as should the `.msgdata` and `._ipc_msg`.
It's **almost** there, we're just missing the final translation code to
get from an `asyncio` side task to be able to call
`.devx._debug.wait_for_parent_stdin_hijack()` to do root actor TTY
locking. Then we just need to ensure internals also do the right thing
with `greenback()` for equivalent sync `breakpoint()` style pause
points.
Since i'm deferring this until later, tossing in some xfail tests to
`test_infected_asyncio` with TODOs for the needed implementation as well
as eventual test org.
By "provision" it means we add:
- `greenback` init block to `_run_asyncio_task()` when debug mode is
enabled (but which will currently rte when `asyncio` is detected)
using `.bestow_portal()` around the `asyncio.Task`.
- a call to `_debug.maybe_init_greenback()` in the `run_as_asyncio_guest()`
guest-mode entry point.
- as part of `._debug.Lock.is_main_trio_thread()` whenever the async-lib
is not 'trio' error log the backend name (which is obvi `'asyncio'`
in this use case).
- `trio_typing` is nearly obsolete since `trio >= 0.23`
- `exceptiongroup` is built-in to python 3.11
- `async_generator` primitives have lived in `contextlib` for quite
a while!
Since importing from our top level `conftest.py` is not scalable
or as "future forward thinking" in terms of:
- LoC-wise (it's only one file),
- prevents "external" (aka non-test) example scripts from importing
content easily,
- seemingly(?) can't be used via abs-import if using
a `[tool.pytest.ini_options]` in a `pyproject.toml` vs.
a `pytest.ini`, see:
https://docs.pytest.org/en/8.0.x/reference/customize.html#pyproject-toml
=> Go back to having an internal "testing" pkg like `trio` (kinda) does.
Deats:
- move generic top level helpers into pkg-mod including the new
`expect_ctxc()` (which i needed in the advanced faults testing script).
- move `@tractor_test` into `._testing.pytest` sub-mod.
- adjust all the helper imports to be a `from tractor._testing import <..>`
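A helper in the spirit of `expect_ctxc()` might look like the sketch below. It is not the packaged implementation: `ContextCancelled` is redefined locally as a stand-in and the `expect`/`reraise` parameters are hypothetical:

```python
import asyncio
from contextlib import asynccontextmanager


class ContextCancelled(Exception):
    """Local stand-in for the library's context-cancelled error."""


@asynccontextmanager
async def expect_ctxc(expect: bool, reraise: bool = False):
    """Require the wrapped block to raise `ContextCancelled`."""
    if not expect:
        yield
        return
    try:
        yield
    except ContextCancelled:
        if reraise:
            raise
        return  # swallow the expected error
    # Only reached when the block finished without raising.
    raise RuntimeError('Never raised ctxc?')


async def demo() -> list:
    outcomes: list = []
    # Case 1: the expected error is raised and swallowed.
    async with expect_ctxc(expect=True):
        raise ContextCancelled('peer cancelled us')
    outcomes.append('swallowed')
    # Case 2: the error was expected but never raised.
    try:
        async with expect_ctxc(expect=True):
            pass
    except RuntimeError:
        outcomes.append('flagged-missing')
    return outcomes
```

Living in an importable package means both the test suite and the example scripts can share the same helper.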
Rework `test_ipc_channel_break_during_stream()` and backing script:
- make test(s) pull `debug_mode` from new fixture (which is now
controlled manually from `--tpdb` flag) and drop the previous
parametrized input.
- update logic in ^ test for "which-side-fails" cases to better match
recently updated/stricter cancel/failure semantics in terms of
`ClosedResourceError` vs. `EndOfChannel` expectations.
- handle `ExceptionGroup`s with expected embedded errors in test.
- better pedantics around whether to expect a user simulated KBI.
- for `examples/advanced_faults/ipc_failure_during_stream.py` script:
- generalize ipc breakage in new `break_ipc()` with support for diff
internal `trio` methods and a #TODO for future disti frameworks
- only make one sub-actor task break and the other just stream.
- use new `._testing.expect_ctxc()` around ctx block.
- add a bit of exception handling with `print()`s around ctxc (unused
except if 'msg' break method is set) and eoc cases.
- don't break parent side ipc in loop more than once
after first break, checked via flag var.
- add a `pre_close: bool` flag to control whether
`MsgStream.aclose()` is called *before* any ipc breakage method.
Still TODO:
- drop `pytest.ini` and add the alt section to `pyproject.toml`.
-> currently can't get `--rootdir=` opt to work.. not showing in
console header.
-> ^ also breaks on 'tests' `enable_modules` imports in subactors
during discovery tests?
The seeming cause is that some cases occasionally raise an
`ExceptionGroup` instead of a (collapsed out) single error; in
those cases at least try to check that `.exceptions` has the original
error.
This fixes a previously undetected bug where if an
`.open_channel_from()` spawned task errored the error would not be
propagated to the `trio` side and instead would fail silently with
a console log error. What was most odd is that it only seems easy to
trigger when you put a slight task sleep before the error is raised
(:eyeroll:). This patch adds a few things to address this and just in
general improve iter-task lifetime syncing:
- add `LinkedTaskChannel._trio_exited: bool` a flag set from the `trio`
side when the channel block exits.
- add a `wait_on_aio_task: bool` flag to `translate_aio_errors` which
toggles whether to wait on the `asyncio` task termination event on exit.
- cancel the `asyncio` task if the trio side has ended, when
`._trio_exited == True`.
- always close the `trio` mem channel when the task exits such that
the `asyncio` side can error on any next `.send()` call.
Verify that if the `asyncio` side task cancels (itself) that we raise
that `asyncio.CancelledError` on the `trio` side. In the case where
`trio` initiated the cancel whether or not the `asyncio` side ended up
raising `CancelledError` doesn't really matter to us as long as the far
task did indeed terminate.