Commit Graph

542 Commits (dereg_on_oserror)

Author SHA1 Message Date
Tyler Goodlet c9fda3ff2c Rename `.delete_sockaddr()` -> `.delete_addr()` 2025-09-30 01:09:16 -04:00
Tyler Goodlet 0c2fb98d5e Add stale entry deleted from registrar test
By spawning an actor task that immediately shuts down the transport
server and then sleeps, verify that attempting to connect via the
`._discovery.find_actor()` helper delivers `None` for the `Portal`
value.

Relates to #184 and #216
2025-09-29 23:10:36 -04:00
Tyler Goodlet 9f757ffa63 Woops, fix missing `assert` thanks to copilot 2025-09-11 13:13:18 -04:00
Tyler Goodlet 73423ef2b7 Timeout on `test_peer_spawns_and_cancels_service_subactor`
While working on a fix to the hang case found from
`test_cancel_ctx_with_parent_side_entered_in_bg_task` an initial
solution caused this test to hang indefinitely; solve it with a small
wrapping `_main()` + `trio.fail_after()` entrypoint.

Further suite refinements,
- move the top-most `try:`->`else:` block
- toss in a masked base-exc block for tracing unexpected
  `ctx.wait_for_result()` outcomes.
- tweak the `raise_sub_spawn_error_after` to be an optional `float`
  which scales the `rng_seed: int = 50` msg counter to
  `tell_little_bro()` so that the abs value to the `range()` can be
  changed.
2025-09-11 10:13:04 -04:00
Tyler Goodlet 9489a2f84d Add timeout around `test_peer_spawns_and_cancels_service_subactor` suite 2025-09-11 10:13:04 -04:00
Tyler Goodlet 92eaed6fec Parametrize with `Portal.cancel_actor()` only case
Such that when `maybe_context.cancel()` is not called (explicitly) and
only the subactor is cancelled by its parent we expect to see a ctxc
raised both from any call to `Context.wait_for_result()` and out of
the `Portal.open_context()` scope, up to the `trio.run()`.

Deats,
- obvi call-n-catch the ctxc (in scope) for the oob-only
  subactor-cancelled case.
- add branches around `trio.run()` entry to match.
2025-09-11 10:13:04 -04:00
Tyler Goodlet 217d54b9d1 Add the minimal OoB cancel edge case from #391
Discovered while writing a `@context` sanity test to verify unmasker
ignore-cases support. Masked code is due to the process of finding the
minimal example causing the original hang discovered in the original
examples script. Details are in the test-fn doc strings and surrounding
comments; more refinement and cleanup coming obviously.

Also moved over the self-cancel todos from the inter-peer tests module.
2025-09-11 10:13:04 -04:00
Tyler Goodlet 62a364a1d3 Tweaks from copilot, type fix, typos, language. 2025-09-11 10:01:25 -04:00
Tyler Goodlet 9c6b90ef04 Add a ignore-masking-case script + suite
Demonstrating the guilty `trio.Lock.acquire()` impl which puts
a checkpoint inside its `trio.WouldBlock` handler and which will always
appear to mask the "sync path" case on (graceful) cancellation.

This first script draft demos the issue from within a `tractor.context`
ep bc that's where it was orig discovered, however i'm going to factor
out the `tractor` code and instead just use
a `.trionics.maybe_raise_from_masking_exc()` to demo its low-level
ignore-case feature.

Further, this script exposed a previously unhandled remote graceful
cancellation case which hangs:

- parent actor spawns child and opens a >1 ctxs with it,
- the parent then OoB (out-of-band) cancels the child actor (with
  `Portal.cancel_actor()`),
- since the open ctxs raise a ctxc with a `.canceller == parent.uid` the
  `Context._is_self_cancelled()` will eval `True`,
- the `Context._scope` will NOT be cancelled in
  `._maybe_cancel_and_set_remote_error()` resulting in any bg-task which
  is waiting on a `Portal.open_context()` to not be cancelled/unblocked.

So my plan is to factor this ^^ scenario into a standalone unit test
as well as another test which consumes from al low-level `trio`-only
version of **this** script-scenario to sanity check the interaction
of the unmasker-with-ignore-cases usage implicitly around a ctx ep.
2025-09-06 14:03:02 -04:00
Tyler Goodlet 542d4c7840 Ignore `examples/trio/` in docs-examples test suite 2025-09-06 13:39:08 -04:00
Tyler Goodlet 04c3d5e239 Wrap `send_chan_aclose_masks_beg.py` as test suite
Call it `test_trioisms::test_unmask_aclose_as_checkpoint_on_aexit` and
parametrize all script-mod`.main()` toggles including `.xfails()` for
the `raise_unmasked=False` cases.
2025-09-05 18:46:20 -04:00
Tyler Goodlet 25c5847f2e Drop `tn` input from `maybe_raise_from_masking_exc()`
Including all caller usage throughout. Moving to a non-`except*` impl
means it's never needed as a signal from the caller - we can just catch
the beg outright (like we should have always been doing)..
2025-08-20 12:45:49 -04:00
Tyler Goodlet d17864a432 Adjust test suites to new `maybe_raise_from_masking_exc()` changes 2025-08-20 12:45:49 -04:00
Tyler Goodlet ee32bc433c Add a root-already-cancelled crash handling test
Such that we audit the `shield=root_tn.cancel_scope.cancel_called,`
passed to `await debug._maybe_enter_pm()` in the `open_root_actor()`
exit handler block.
2025-08-20 10:18:52 -04:00
Tyler Goodlet 6e4c76245b Add LoC pattern matches for `test_post_mortem_api` 2025-08-19 14:14:27 -04:00
Tyler Goodlet b74e93ee55 Change one infected-aio test to use `chan` in fn sig 2025-08-18 22:32:51 -04:00
Tyler Goodlet 4a7491bda4 Add "raises-pre-started" `open_channel_from()` test
Verifying that if any exc is raised pre `chan.send_nowait()` (our
currentlly shite version of a `chan.started()`) then that exc is indeed
raised through on the `trio`-parent task side. This case was reproduced
from a `piker.brokers.ib` issue with a similar embedded
`.trionics.maybe_open_context()` call.

Deats,
- call the suite `test_aio_side_raises_before_started`.
- mk the `@context` simply `maybe_open_context(acm_func=open_channel_from)`
  with a `target=raise_before_started` which,
- simply sleeps then immediately raises a RTE.
- expect the RTE from the aio-child-side to propagate all the way up to
  the root-actor's task right up through the `trio.run()`.
2025-08-18 22:32:51 -04:00
Tyler Goodlet 79f502034f Don't hard code runtime-dir, read it with `._state.get_rt_dir()` 2025-08-18 21:30:48 -04:00
Tyler Goodlet 00112edd58 UDS: implicitly create `Address.bindspace: Path`
Since it's merely a local-file-sys subdirectory and there should be no
reason file creation conflicts with other bind spaces.

Also add 2 test suites to match,
- `tests/ipc/test_each_tpt::test_uds_bindspace_created_implicitly` to
  verify the dir creation when DNE.
- `..test_uds_double_listen_raises_connerr` to ensure a double bind
  raises a `ConnectionError` from the src `OSError`.
2025-08-18 21:30:48 -04:00
Tyler Goodlet 4ba3590450 Add `.trionics.maybe_open_context()` locking test
Call it `test_lock_not_corrupted_on_fast_cancel()` and includes
a detailed doc string to explain. Implemented it "cleverly" by having
the target `@acm` cancel its parent nursery after a peer, cache-hitting
task, is already waiting on the task mutex release.
2025-08-18 21:07:12 -04:00
Tyler Goodlet 70664b98de Well then, I guess it just needed, a checkpoint XD
Here I was thinking the bcaster (usage) maybe required a rework but,
NOPE it's just bc a checkpoint was needed in the parent task owning the
`tn` which spawns `get_sub_and_pull()` tasks to ensure the bg allocated
`an`/portal is eventually cancel-called..

Ah well, at least i started a patch for `MsgStream.subscribe()` to make
it multicast revertible.. XD

Anyway, I tossed in some checks & notes related to all that unnecessary
effort since I do think i'll move forward implementing it:
- for the `cache_hit` case always verify that the `bcast` clone is
  unregistered from the common state subs after
  `.subscribe().__aexit__()`.
- do a light check that the implicit `MsgStream._broadcaster` is always
  the only bcrx instance left-leaked into that state.. that is until
  i get the proper de-allocation/reversion from multicast -> unicast
  working.
- put in mega detailed note about the required parent-task checkpoint.
2025-08-18 21:07:12 -04:00
Tyler Goodlet 1c425cbd22 Tool-up `test_resource_cache.test_open_local_sub_to_stream`
Since I recently discovered a very subtle race-case that can sometimes
cause the suite to hang, seemingly due to the `an: ActorNursery`
allocated *behind* the `.trionics.maybe_open_context()` usage; this can
result in never cancelling the 'streamer' subactor despite the `main()`
timeout-guard?

This led me to dig in and find that the underlying issue was 2-fold,

- our `BroadcastReceiver` termination-mgmt semantics in
  `MsgStream.subscribe()` can result in the first subscribing task to
  always keep the `MsgStream._broadcaster` instance allocated; it's
  never `.aclose()`ed, which makes it tough to determine (and thus
  trace) when all subscriber-tasks are actually complete and
  exited-from-`.subscribe()`..

- i was shield waiting `.ipc._server.Server.wait_for_no_more_peers()` in
  `._runtime.async_main()`'s shutdown sequence which would then compound
  the issue resulting in a SIGINT-shielded hang.. the worst kind XD

Actual changes here are just styling, printing, and some mucking with
passing the `an`-ref up to the parent task in the root-actor where i was
doing a conditional `ActorNursery.cancel()` to mk sure that was actually
the problem. Presuming this is fixed the `.pause()` i left unmasked
should never hit.
2025-08-18 21:07:06 -04:00
Tyler Goodlet 88c1c083bd Add timeout to inf-streamer test 2025-08-18 13:31:15 -04:00
Tyler Goodlet b096867d40 Remove lingering seg=False-flags from tests 2025-08-18 12:03:32 -04:00
Tyler Goodlet 0ffcea1033 Adjust `test_trio_prestarted_task_bubbles()` suite to expect non-eg raises 2025-08-18 10:46:37 -04:00
Tyler Goodlet a7bdf0486c Styling tweaks to quadruple streaming test fn 2025-08-18 10:46:37 -04:00
Tyler Goodlet d2ac9ecf95 Resolve `test_cancel_while_childs_child_in_sync_sleep`
Was failing due to the `.fail_after()` timeout being *too short* and
somehow the new interplay of that with strict-exception groups resulting
in the `TooSlowError` never raising but instead an eg with the embedded
`AssertionError`?? I still don't really get it honestly..

I've written up lengthy notes around the different `delay` settings that
can be used to see the diff outcomes, the failing case being the one
i still don't really grok and think is justification for `trio` to
bubble inner `Cancelled`s differently possibly?

For now i've included the original failing case as an `xfail`
parametrization for now which will hopefully drive a follow lowlevel
`trio` test in `test_trioisms`!
2025-08-18 10:46:37 -04:00
Tyler Goodlet dcb1062bb8 Fix cluster suite, chng to new `gather_contexts()`
Namely `test_empty_mngrs_input_raises()` was failing due to
lazy-iterator use as input to `mngrs` which i guess i added support for
a while back (by it doing a `list(mngrs)` internally)? So just change it
to `gather_contexts(mngrs=())` and also tweak the `trio.fail_after(3)`
since it appears that the prior 1sec was causing
too-fast-of-a-cancellation (before the cluster fully spawned) and thus
the expected `ValueError` never to show..

Also, mask the `tractor.trionics.collapse_eg()` usage (again?) in
`open_actor_cluster()` since it seems unnecessary.
2025-08-18 10:46:37 -04:00
Tyler Goodlet 05d865c0f1 WIP tinkering with strict-eg-tns and cluster API
Seems that the way the actor-nursery interacts with the
`.trionics.gather_contexts()` API on cancellation makes our
`.trionics.collapse_eg()` not work as intended?

I need to dig into how `ActorNursery.cancel()` and `.__aexit__()` might
be causing this discrepancy..

Consider this a commit-of-my-index type save for rn.
2025-08-18 10:46:37 -04:00
Tyler Goodlet 8218f0f51f Bit of multi-line styling / name tweaks in cancellation suites 2025-08-18 10:46:37 -04:00
Tyler Goodlet f776c47cb4 Drop msging-err patt from `subactor_breakpoint` ex
Since the `bdb` module was added to the namespace lookup set in
`._exceptions.get_err_type()` we can now relay a RAE-boxed
`bdb.BdbQuit`.
2025-08-18 10:46:37 -04:00
Tyler Goodlet 0ca3d50602 Use `._supervise._shutdown_msg` in tooling test 2025-08-15 16:29:05 -04:00
Tyler Goodlet 547cf5a210 Drop stale comment from inter-peer suite 2025-07-18 00:35:35 -04:00
Tyler Goodlet 4569d11052 Move `.is_multi_cancelled()` to `.trioniics._beg`
Since it's for beg filtering, the current impl should be renamed anyway;
it's not just for filtering cancelled excs.

Deats,
- added a real doc string, links to official eg docs and fixed the
  return typing.
- adjust all internal imports to match.
2025-07-16 15:49:18 -04:00
Tyler Goodlet 35977dcebb Adjust ep-masking-suite for the real-use-case
Namely that the more common-and-pertinent case is when
a `@context`-ep-fn contains the `finally`-footgun but without
a surrounding embedded `tn` (which currently still requires its own
scope embedded `trionics.maybe_raise_from_masking_exc()`) which can't
be compensated-for by `._rpc._invoke()` easily. Instead the test is
composed where the `._invoke()`-internal `tn` is the machinery being
addressed in terms of masking user-code excs with `trio.Cancelled`.

Deats,
- rename the test -> `test_unmasked_remote_exc` to reflect what the
  runtime should actually be addressing/solving.
- drop the embedded `tn` from `sleep_n_chkpt_in_finally()` (for now)
  since that case can't currently easily be addressed without the user
  code using its own `trionics.maybe_raise_from_masking_exc()` inside
  the nursery scope.
- as such drop all `tn` related params/logic/usage from the ep.
- add in a `Cancelled` handler block which checks for RTE masking and
  always prints the occurrence loudly.

Follow up,
- obvi this suite will currently fail until the appropriate adjustment
  is made to `._rpc._invoke()` to do the unmasking; coming next.
- we probably still need a case with an embedded user `tn` where if
  the default strict-eg mode is used then a ctxc from the parent might
  cause a non-graceful `Context.cancel()` outcome?
 |_since the embedded user-`tn` will raise
   `ExceptionGroup[trio.Cancelled]` upward despite the parent nursery's
   scope being the canceller, or will a `collapse_eg()` inside the
   `._invoke()` scope handle this as well?
2025-07-15 07:23:21 -04:00
Tyler Goodlet 63c5b7696a Mv `maybe_raise_from_masking_exc()` to `.trionics`
Factor the `@acm`-closure it out of the
`test_trioisms::test_acm_embedded_nursery_propagates_enter_err` suite
for real use internally.
2025-07-15 07:23:21 -04:00
Tyler Goodlet 5f94f52226 Add ctx-ep suite for `trio`'s *finally-footgun*
Deats are documented within, but basically a subtlety we already track
with `trio`'s masking of excs by a checkpoint-in-`finally` can cause
compounded issues with our `@context` endpoints, mostly in terms of
remote error and cancel-ack relay semantics.
2025-07-15 07:23:21 -04:00
Tyler Goodlet 9ff448faa3 Add `open_crash_handler()` / `repl_fixture` suite
Nicely nailing 2 birds by leveraging the new `repl_fixture` support to
actually avoid use of a `pexpect`-style test B)

Functionality audit summary,
- ensures `open_crash_handler() as bxerr:` adheres to,
  - `raise_on_exit` semantics including only raising from a list of exc-types,
  - `repl_fixture` setup/teardown invocation and that `yield False` blocks REPL
    interaction,
  - delivering a `BoxedMaybeException` with the correct state set post
    crash.
  - all the above outside the actor-runtime existing.

Also luckily enough, this seems to have found a bug for which a fix is
coming right up!
2025-07-14 17:55:18 -04:00
Tyler Goodlet 760b9890c4 Add `debugging/subactor_bp_in_ctx.py` test set
It's been in the debug scripts quite a while without a wrapping test and
will be,
- only the 2nd such REPL test which uses a lower-level `@context` ep-API
- the first official and explicit use of `enable_transports=['uds']`
  a suite.

Deats,
- flip to 'uds' tpt and 'devx' level logging in the script.
- add a new 2-case suite `test_ctxep_pauses_n_maybe_ipc_breaks` which
  validates both the quit-early (via `BdbQuit`) and
  channel-dropped-need-to-ctlc cases from a single test fn.
2025-07-14 13:15:07 -04:00
Tyler Goodlet bbd2ea3e4f Prevent `test_breakpoint_hook_restored` subproc hangs
If the underlying example script fails (say due to a console output
pattern-mismatch, `AssertionError`) the `pexpect` managed subproc with
a `debug_mode=True` crash-handling-REPL engaged will ofc *not terminate*
due to any SIGINT sent by the test harnesss (since we shield from it as
part of normal sub-actor debugger operation). So instead always send
a 'continue' cmd to the active `PdbREPL`'s stdin so it deactivates and
allows the py-script-process to raise and terminate, unblocking the
`pexpect.spawn`'s internal subproc joiner (which would otherwise hang
without manual intervention, blocking downstream tests..).

Also, use the new `PexpectSpawner` type alias after actually importing
future annots.. XD
2025-07-14 00:00:13 -04:00
Tyler Goodlet 6b903f7746 Type alias our `pexpect.spawn()` closure fixture
Such that we can more easily annotate any consumer test's of our
`.tests.devx.conftest.spawn()` fixture which delivers a closure which, when
called in a test fn body, transitively sub-invokes:
`pytest.Pytester.spawn()` -> `pexpect.spawn()`

IMO Expecting `Callable[[str], pexpect.pty_spawn.spawn]]` to be used all
over is a bit too.. verbose?
2025-07-14 00:00:13 -04:00
Tyler Goodlet 2280bad135 Type annot the `testdir` fixture 2025-07-14 00:00:13 -04:00
Tyler Goodlet 1c6660c497 Mk `.devx._debug` a sub-pkg `.devx.debug`
With plans for much factoring of the original module into sub-mods!
Adjust all imports and refs throughout to match.
2025-07-14 00:00:12 -04:00
Tyler Goodlet 3aee702733 Add a `debug_mode`-state reversion test 2025-07-14 00:00:12 -04:00
Tyler Goodlet 37f843a128 Add an `enable_transports` test-suite
Like it sounds, verifying that when that param is passed to the runtime
startup eps (`.open_root_actor()/.open_nursery()`), the appropriate
tpt-protocol is deployed for IPC (both the server and bound endpoints)
in both the root and any sub-actors (as passed down from rent to child
via the `.msg.types.SpawnSpec`).
2025-07-13 15:26:37 -04:00
Tyler Goodlet 29cd2ddbac Drop 'IPC' prefix from `._server` types
We already have the `.ipc` sub-pkg name so it seems a bit
redundant/noisy for a namespace path Bp

Leave an alias for the `Server` rn since it's already used in a few
other internal mods.. will likely rename later if everyone is cool with
it..
2025-07-13 15:26:37 -04:00
Tyler Goodlet 295b06511b Plugin-ize some re-usable `conftest` parts
Namely any CLI driven runtime-config fixtures such as,

- `--spawn-backend` and `start_method`,
- `--tpdb` and `debug_mode`,
- `--tpt-proto` and `tpt_protos`/`tpt_proto`,
- `reg_addr` as driven by the above.

This moves all fixtures and necessary hook funcs (CLI parsing,
configuring and test-gen) to the `._testing.pytest` module and thus
allows any dependent project to leverage these fixtures in their own
test suites after pointing to that plugin mod using,

```python
    # conftest.py
    pytest_plugins: tuple[str] = (
        "tractor._testing.pytest",
    )
```

Also, add a new `._testing.addr` helper mod which now contains
a factored `get_rando_addr()` helper for creating test-sesh unique
tpt-specific registry (or other) IPC endpoint addrs.
2025-07-13 15:26:37 -04:00
Tyler Goodlet 1e6b5b3f0a Start a very basic ipc-server unit test suite
For now it just boots a server, parametrized over all tpt-protos, sin
any actor runtime bootup. Obvi the future todo is ensuring it all works
with a client connecting via the equivalent lowlevel
`.ipc._chan._connect_chan()` API(s).
2025-07-13 15:26:37 -04:00
Tyler Goodlet 36ddb85197 Fix assert on `.devx.maybe_open_crash_handler()` delivered `bxerr` 2025-07-13 15:26:37 -04:00
Tyler Goodlet d6b0ddecd7 Improve bit of tooling for `test_resource_cache.py`
Namely while what I was actually trying to solve was why
`TransportClosed` was getting raised from `Portal.cancel_actor()` but
still useful edge case auditing either way. Also opts into the
`debug_mode` fixture with apprope timeout adjustment B)
2025-07-13 15:26:37 -04:00