Commit Graph

1638 Commits (dec2b1f0f52fad7002420451995912028e45715f)

Author SHA1 Message Date
Tyler Goodlet dec2b1f0f5 Reapply "Port all tests to new `reg_addr` fixture name"
This reverts-the-revert of commit
bc13599e1f which was needed to land pre
`multihomed` feat branch history.
2025-03-20 19:50:31 -04:00
goodboy 3ccbfd7e54
Merge pull request #372 from goodboy/devx_subpkg
Start ` tractor.devx` sub-pkg
2025-03-20 19:48:42 -04:00
Tyler Goodlet 8d318a8ac5 Flip a last `MultiError` to a beg, add todo on `@stream` func 2025-03-20 15:07:27 -04:00
Tyler Goodlet d5eec6eb6c Re-revert back to `.devx` subpkg after rebase.. 2025-03-20 15:07:27 -04:00
Tyler Goodlet a88564549a Yahh, add `.devx` package to installed subpkgs.. 2025-03-20 15:07:27 -04:00
Tyler Goodlet f028181e19 Add `stackscope` as dep, drop legacy `pdb` issue cruft 2025-03-20 15:07:27 -04:00
Tyler Goodlet 3a317c1581 Enable `stackscope` render via root in debug mode
If `stackscope` is importable and debug_mode is enabled then we by
default call and report `.devx.enable_stack_on_sig()` is set B)

This makes debugging unexpected (SIGINT ignoring) hangs a cinch!
2025-03-20 15:07:27 -04:00
Tyler Goodlet 65e49696e7 Woops, fix `_post_mortem()` type sig..
We're passing a `extra_frames_up_when_async=2` now (from prior attempt
to hide `CancelScope.__exit__()` when `shield=True`) and thus both
`debug_func`s must accept it 🤦

On the brighter side found out that the `TypeError` from the call-sig
mismatch was actually being swallowed entirely so add some
`.exception()` msgs for such cases to at least alert the dev they broke
stuff XD
2025-03-20 15:07:27 -04:00
Tyler Goodlet e834297503 Add `shield: bool` support to `.pause()`
It's been on the todo for a while and I've given up trying to properly
hide the `trio.CancelScope.__exit__()` frame for now instead opting to
just `log.pdb()` a big apology XD

Users can obvi still just not use the flag and wrap `tractor.pause()` in
their own cs block if they want to avoid having to hit `'up'` in the pdb
REPL if needed in a cancelled task-scope.

Impl deatz:
- factor orig `.pause()` impl into new `._pause()` so that we can more tersely
  wrap the original content depending on `shield: bool` input; only open
  the cancel-scope when shield is set to avoid aforemented extra strack
  frame annoyance.
- pass through `shield` to underlying `_pause` and `debug_func()` so we
  can actually know when so log our apology.
- add a buncha notes to new `.pause()` wrapper regarding the inability
  to hide the cancel-scope `.__exit__()`, inluding that overriding the
  code in `trio._core._run.CancelScope` doesn't seem to solve the issue
  either..

Unrelated `maybe_wait_for_debugger()` tweaks:
- don't read `Lock.global_actor_in_debug` more then needed, rename local
  read var to `in_debug` (since it can also hold the root actor uid, not
  just sub-actors).
- shield the `await debug_complete.wait()` since ideally we avoid the
  root cancellation child-actors in debug even when the root calls this
  func in a cancelled scope.
2025-03-20 15:07:27 -04:00
Tyler Goodlet e3bb9c914c Mk debugger tests work for arbitrary pre-REPL format
Since this was changed as part of overall project wide logging format
updates, and i ended up changing the both the crash and pause `.pdb()`
msgs to include some multi-line-ascii-"stuff", might as well make the
pre-prompt checks in the test suite more flexible to match.

As such, this exposes 2 new constants inside the `.devx._debug` mod:
- `._pause_msg: str` for the pre `tractor.pause()` header emitted via
  `log.pdb()` and,
- `._crash_msg: str` for the pre `._post_mortem()` equiv when handling
  errors in debug mode.

Adjust the test suite to use these values and thus make us more capable
to absorb changes in the future as well:
- add a new `in_prompt_msg()` predicate, very similar to `assert_before()`
  but minus `assert`s which takes in a `parts: list[str]` to match
  in the pre-prompt stdout.
- delegate to `in_prompt_msg()` in `assert_before()` since it was mostly
  duplicate minus `assert`.
- adjust all previous `<patt> in before` asserts to instead use
  `in_prompt_msg()` with separated pre-prompt-header vs. actor-name
  `parts`.
- use new `._pause/crash_msg` values in all such calls including any
  `assert_before()` cases.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 526add2cae Support `maybe_wait_for_debugger(header_msg: str)`
Allow callers to stick in a header to the `.pdb()` level emitted msg(s)
such that any "waiting status" content is only shown if the caller
actually get's blocked waiting for the debug lock; use it inside the
`._spawn` sub-process reaper call.

Also, return early if `Lock.global_actor_in_debug == None` and thus
only enter the poll loop when actually needed, consequently raise
if we fall through the loop without acquisition.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 1fb4d7318b Fix `.devx.maybe_wait_for_debugger()` polling deats
When entered by the root actor avoid excessive polling cycles by,
- blocking on the `Lock.no_remote_has_tty: trio.Event` and breaking
  *immediately* when set (though we should really also lock
  it from the root right?) to avoid extra loops..
- shielding the `await trio.sleep(poll_delay)` call to avoid any local
  cancellation causing the (presumably root-actor task) caller to move
  on (possibly to cancel its children) and instead to continue
  poll-blocking until the lock is actually released by its user.
- `break` the poll loop immediately if no remote locker is detected.
- use `.pdb()` level for reporting lock state changes.

Also add a #TODO to handle calls by non-root actors as it pertains to
2025-03-20 15:07:27 -04:00
Tyler Goodlet 199ca48cc4 Add `stackscope` tree pprinter triggered by SIGUSR1
Can be optionally enabled via a new `enable_stack_on_sig()` which will
swap in the SIGUSR1 handler. Much thanks to @oremanj for writing this
amazing project, it's thus far helped me fix some very subtle hangs
inside our new IPC-context cancellation machinery that would have
otherwise taken much more manual pdb-ing and hair pulling XD

Full credit for `dump_task_tree()` goes to the original project author
with some minor tweaks as was handed to me via the trio-general matrix
room B)

Slight changes from orig version:
- use a `log.pdb()` emission to pprint to console
- toss in an ex sh CLI cmd to trigger the dump from another terminal
  using `kill` + `pgrep`.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 5b3bcbaa7d Only use `greenback` if actor-runtime is up.. 2025-03-20 15:07:27 -04:00
Tyler Goodlet 8647421ef9 Ignore `greenback` import error if not installed 2025-03-20 15:07:27 -04:00
Tyler Goodlet ba9448d52f Change old `._debug._pause()` name, cherry to #362 re `greenback` 2025-03-20 15:07:27 -04:00
Tyler Goodlet f5c35dca55 Runtime import `.get_root()` in stdin hijacker to avoid import cycle 2025-03-20 15:07:27 -04:00
Tyler Goodlet cebc2cb515 Ignore kbis in `open_crash_handler()` by default 2025-03-20 15:07:27 -04:00
Tyler Goodlet 5042f1fdb8 Comment all `.pause(shield=True)` attempts again, need to solve cancel scope `.__exit__()` frame hiding issue.. 2025-03-20 15:07:27 -04:00
Tyler Goodlet 5912fecdc9 Add shielding support to `.pause()`
Implement it like you'd expect using simply a wrapping
`trio.CancelScope` which is itself shielded by the input `shield: bool`
B)

There's seemingly still some issues with the frame selection when the
REPL engages and not sure how to resolve it yet but at least this does
indeed work for practical purposes. Still needs a test obviously!
2025-03-20 15:07:27 -04:00
Tyler Goodlet cca4f952ed Move `maybe_open_crash_handler()` CLI `--pdb`-driven wrapper to debug mod 2025-03-20 15:07:27 -04:00
Tyler Goodlet ab0c0fb71d Start `.devx.cli` extensions for pop CLI frameworks
Starting of with just a `typer` (and thus transitively `click`)
`typer.Typer.callback` hook which allows passthrough of the `--ll
<loglevel: str>` and `--pdb <debug_mode: bool>` flags for use when
building CLIs that use the runtime Bo

Still needs lotsa refinement and obviously better docs but, the doc
string for `load_runtime_vars()` shows how to use the underlying
`.devx._debug.open_crash_handler()` via a wrapper that can be passed the
`--pdb` flag and then enable debug mode throughout the entire actor
system.
2025-03-20 15:07:27 -04:00
Tyler Goodlet b00ba158f1 Kick off `.devx` subpkg for our dev tools B)
Where `.devx` is "developer experience", a hopefully broad enough subpkg
name for all the slick stuff planned to augment working on the actor
runtime 💥

Move the `._debug` module into the new subpkg and adjust rest of core
code base to reflect import path change. Also add a new
`.devx._debug.open_crash_handler()` manager for wrapping any sync code
outside a `trio.run()` which is handy for eventual CLI addons for
popular frameworks like `click`/`typer`.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 93f489e263 Expose `Channel` @ pkg level, drop `_debug.pp()` alias 2025-03-20 15:07:27 -04:00
Tyler Goodlet fa5f458de0 Move `.to_asyncio` to modern optional value type annots 2025-03-20 15:07:27 -04:00
Tyler Goodlet 6de4a5a9f3 Map `breakpoint()` built-in to new `.pause_from_sync()` ep 2025-03-20 15:07:27 -04:00
Tyler Goodlet ab8bd9b787 Fix frame-selection display on first REPL entry
For whatever reason pdb(p), and in general, will show the frame of the
*next* python instruction/LOC on initial entry (at least using
`.set_trace()`), as such remove the `try/finally` block in the sync
code entrypoint `.pause_from_sync()`, and also since doesn't seem like
we really need it anyway.

Further, and to this end:
- enable hidden frames support in our default config.
- fix/drop/mask all the frame ref-ing/mangling we had prior since it's no
  longer needed as well as manual `Lock` releasing which seems to work
  already by having the `greenback` spawned task do it's normal thing?
- move to no `Union` type annots.
- hide all frames that can add "this is the runtime confusion" to
  traces.
2025-03-20 15:07:27 -04:00
Tyler Goodlet 1deed8dbee ._runtime: log level tweaks, use crit for stale debug lock detection 2025-03-20 15:07:27 -04:00
Tyler Goodlet 36d2aa1852 Add longer "required reading" list B) 2025-03-20 15:07:27 -04:00
Tyler Goodlet f0417d802b First proto: use `greenback` for sync func breakpointing
This works now for supporting a new `tractor.pause_from_sync()`
`tractor`-aware-replacement for `Pdb.set_trace()` from sync functions
which are also scheduled from our runtime. Uses `greenback` to do all
the magic of scheduling the bg `tractor._debug._pause()` task and
engaging the normal TTY locking machinery triggered by `await
tractor.breakpoint()`

Further this starts some public API renaming, making a switch to
`tractor.pause()` from `.breakpoint()` which IMO much better expresses
the semantics of the runtime intervention required to suffice
multi-process "breakpointing"; it also is an alternate name for the same
in computer science more generally: https://en.wikipedia.org/wiki/Breakpoint
It also avoids using the same name as the `breakpoint()` built-in which
is important since there **is alot more going on** when you call our
equivalent API.

Deats of that:
- add deprecation warning for `tractor.breakpoint()`
- add `tractor.pause()` and a shorthand, easier-to-type, alias `.pp()`
  for "pause-point" B)
- add `pause_from_sync()` as the new `breakpoint()`-from-sync-function
  hack which does all the `greenback` stuff for the user.

Still TODO:
- figure out where in the runtime and when to call
  `greenback.ensure_portal()`.
- fix the frame selection issue where
  `trio._core._ki._ki_protection_decorator:wrapper` seems to be always
  shown on REPL start as the selected frame..
2025-03-20 15:07:27 -04:00
Tyler Goodlet 62a0fff2fd Add a debug-mode-breakpoint-causes-hang case!
Only found this by luck more or less (while working on something in
a client project) and it turns out we can actually get to (yet another)
hang state where SIGINT will be ignored by the root actor on teardown..

I've added all the necessary logic flags to reproduce. We obviously need
a follow up bug issue and a test suite to replicate!

It appears as though the following are required based on very light
tinkering:
- infected asyncio mode active
- debug mode active
- the `trio` context must breakpoint *before* `.started()`-ing
- the `asyncio` must **not** error
2025-03-20 15:07:26 -04:00
Tyler Goodlet d65e4bbad7 Add (first-draft) infected-`asyncio` actor task uses debugger example 2025-03-20 15:07:26 -04:00
goodboy ee372933a7
Merge pull request #357 from goodboy/ctx_cancel_semantics_and_overruns
`Context` semantics for cross-actor-task cancellation and overruns
2025-03-20 15:06:57 -04:00
Tyler Goodlet 96cdcd8f39 Pin to exact `trio` version that still has loose egs 2025-03-20 14:24:21 -04:00
Tyler Goodlet bc13599e1f Revert "Port all tests to new `reg_addr` fixture name"
This reverts commit 715348c5c2.
2025-03-19 15:34:30 -04:00
Tyler Goodlet 54576851e9 Add a `debug_mode: bool` fixture via `--tpdb` flag
Allows tests (including any `@tractor_test`s) to subscribe to a CLI flag
`--tpdb` (for "tractor python debugger") which the session can provide
to tests which can then proxy the value to `open_root_actor()` (via
`open_nursery()`) when booting the runtime - thus enabling our debug
mode globally to any subscribers B)

This is real handy if you have some failures but can't determine the
root issue without jumping into a `pdbp` REPL inside a (sub-)actor's
spawned-task.
2025-03-19 15:34:30 -04:00
Tyler Goodlet 2a5ff82061 Only run CI on py3.11 2025-03-19 15:34:30 -04:00
Tyler Goodlet f2d3f0cc21 Backport skipping `examples/multihost/` in tests
This was actually fixed on a downstream dev branch (adding py3.13
support i think?); so backport it here to get us running again on 3.11.
2025-03-19 15:34:30 -04:00
Tyler Goodlet 6b282bfa06 Add `._testing` as subpkg.. 2025-03-19 15:34:30 -04:00
Tyler Goodlet 11bab13a06 Various adjustments to fix breakage after rebase
- Remove `exceptiongroup` import,
- pin to py 3.11 in `setup.py`
- revert any lingering `tractor.devx` imports; sub-pkg is coming in
  a downstream PR!
- remove weird double `@property` lingering from conflict reso..
- modern `pytest` requires conftest mod mods to be  relative imported.
2025-03-19 15:30:59 -04:00
Tyler Goodlet 9a8cd13894 Another cancel-req-invalid log msg fmt tweak 2025-03-16 16:06:26 -04:00
Tyler Goodlet 3706abca71 Adjust advanced faults test(s) for absorbed EoCs
More or less just simplifies to not seeing the stream closure errors and
instead expecting KBIs from the simulated user who 'ctl-cs after hang'.

Toss in a little `stuff_hangin_ctlc()` to the script to wrap all that
and always check stream closure before sending the final KBI.
2025-03-16 16:06:26 -04:00
Tyler Goodlet 771fc33801 Absorb EoCs via `Context.open_stream()` silently
I swear long ago it used to operate this way but, I guess this finalizes
the design decision. It makes a lot more sense to *not* propagate any
`trio.EndOfChannel` raised from a `Context.open_stream() as stream:`
block when that EoC is due to graceful-explicit stream termination.
We use the EoC much like a `StopAsyncIteration` where the error
indicates termination of the stream due to either:
- reception of a stop IPC msg indicating the far end ended the stream
  (gracecfully),
- closure of the underlying `Context._recv_chan` either by the runtime
  or due to user code having called `MsgStream.aclose()`.

User code shouldn't expect to handle EoC outside the block since the
`@acm` having closed should indicate the exactly same lifetime state
(of said stream) ;)

Deats:
- add special EoC handler in `.open_stream()` which silently "absorbs"
  the error only when the stream is already marked as closed (meaning
  the EoC indeed corresponds to IPC closure) with an assert for now
  ensuring the error is the same as set to `MsgStream._eoc`.
- in `MsgStream.receive()` break up the handlers for EoC and
  `trio.ClosedResourceError` since the error instances are saved to
  different variables and we **don't** want to rewrite the exception in
  the eoc case (normally to mask `trio` internals in tbs) bc we need the
  instance to be the exact one for doing checks inside
  `.open_stream().__aexit__()` to absorb it.

Other surrounding "improvements":
- start using the new `Context.maybe_raise()` helper where it can easily
  replace existing equivalent block-sections.
- use new `RemoteActorError.src_uid` as required.
2025-03-16 16:06:26 -04:00
Tyler Goodlet a87df3009f Drop now-deprecated deps on modern `trio`/Python
- `trio_typing` is nearly obsolete since `trio >= 0.23`
- `exceptiongroup` is built-in to python 3.11
- `async_generator` primitives have lived in `contextlib` for quite
  a while!
2025-03-16 16:06:24 -04:00
Tyler Goodlet 05f28c8728 Pin to `trio>=0.24` to avoid `trio_typing` 2025-03-16 15:52:55 -04:00
Tyler Goodlet 85825cdd76 Add `.trionics._broadcast` todos for py 3.12 2025-03-16 15:52:55 -04:00
Tyler Goodlet a5bc113fde Start a `._rpc` module
Since `._runtime` was getting pretty long (> 2k LOC) and much of the RPC
low-level machinery is fairly isolated to a handful of task-funcs, it
makes sense to re-org the RPC task scheduling and driving msg loop to
its own code space.

The move includes:
- `process_messages()` which is the main IPC business logic.
- `try_ship_error_to_remote()` helper, to box local errors for the wire.
- `_invoke()`, the core task scheduler entrypoing used in the msg loop.
- `_invoke_non_context()`, holds impls for non-`@context` task starts.
- `_errors_relayed_via_ipc()` which does all error catch-n-boxing for
   wire-msg shipment using `try_ship_error_to_remote()` internally.

Also inside `._runtime` improve some `Actor` methods docs.
2025-03-16 15:52:53 -04:00
Tyler Goodlet 4f7823cf55 Move `Portal.open_context()` impl to `._context`
Finally, since normally you need the content from `._context.Context`
and surroundings in order to effectively grok `Portal.open_context()`
anyways, might as well move the impl to the ctx module as
`open_context_from_portal()` and just bind it on the `Portal` class def.

Associated/required tweaks:
- avoid circ import on `.devx` by only import
  `.maybe_wait_for_debugger()` when debug mode is set.
- drop `async_generator` usage, not sure why this hadn't already been
  changed to `contextlib`?
- use `@acm` alias throughout `._portal`
2025-03-16 15:32:13 -04:00
Tyler Goodlet 544cb40533 Attempt at better internal traceback hiding
Previously i was trying to approach this using lots of
`__tracebackhide__`'s in various internal funcs but since it's not
exactly straight forward to do this inside core deps like `trio` and the
stdlib, it makes a bit more sense to optionally catch and re-raise
certain classes of errors from their originals using `raise from` syntax
as per:
https://docs.python.org/3/library/exceptions.html#exception-context

Deats:
- litter `._context` methods with `__tracebackhide__`/`hide_tb` which
  were previously being shown but that don't need to be to application
  code now that cancel semantics testing is finished up.
- i originally did the same but later commented it all out in `._ipc`
  since error catch and re-raise instead in higher level layers
  (above the transport) seems to be a much saner approach.
- add catch-n-reraise-from in `MsgStream.send()`/.`receive()` to avoid
  seeing the depths of `trio` and/or our `._ipc` layers on comms errors.

Further this patch adds some refactoring to use the
same remote-error shipper routine from both the actor-core in the RPC
invoker:
- rename it as `try_ship_error_to_remote()` and call it from
  `._invoke()` as well as it's prior usage.
- make it optionally accept `cid: str` a `remote_descr: str` and of
  course a `hide_tb: bool`.

Other misc tweaks:
- add some todo notes around `Actor.load_modules()` debug hooking.
- tweak the zombie reaper log msg and timeout value ;)
2025-03-16 15:30:08 -04:00
Tyler Goodlet 389b305d3b Add (back) a `tractor._testing` sub-pkg
Since importing from our top level `conftest.py` is not scaleable
or as "future forward thinking" in terms of:
- LoC-wise (it's only one file),
- prevents "external" (aka non-test) example scripts from importing
  content easily,
- seemingly(?) can't be used via abs-import if using
  a `[tool.pytest.ini_options]` in a `pyproject.toml` vs.
  a `pytest.ini`, see:
  https://docs.pytest.org/en/8.0.x/reference/customize.html#pyproject-toml)

=> Go back to having an internal "testing" pkg like `trio` (kinda) does.

Deats:
- move generic top level helpers into pkg-mod including the new
  `expect_ctxc()` (which i needed in the advanced faults testing script.
- move `@tractor_test` into `._testing.pytest` sub-mod.
- adjust all the helper imports to be a `from tractor._testing import <..>`

Rework `test_ipc_channel_break_during_stream()` and backing script:
- make test(s) pull `debug_mode` from new fixture (which is now
  controlled manually from `--tpdb` flag) and drop the previous
  parametrized input.
- update logic in ^ test for "which-side-fails" cases to better match
  recently updated/stricter cancel/failure semantics in terms of
  `ClosedResouruceError` vs. `EndOfChannel` expectations.
- handle `ExceptionGroup`s with expected embedded errors in test.
- better pendantics around whether to expect a user simulated KBI.
- for `examples/advanced_faults/ipc_failure_during_stream.py` script:
  - generalize ipc breakage in new `break_ipc()` with support for diff
    internal `trio` methods and a #TODO for future disti frameworks
  - only make one sub-actor task break and the other just stream.
  - use new `._testing.expect_ctxc()` around ctx block.
  - add a bit of exception handling with `print()`s around ctxc (unused
    except if 'msg' break method is set) and eoc cases.
  - don't break parent side ipc in loop any more then once
    after first break, checked via flag var.
  - add a `pre_close: bool` flag to control whether
    `MsgStreama.aclose()` is called *before* any ipc breakage method.

Still TODO:
- drop `pytest.ini` and add the alt section to `pyproject.py`.
 -> currently can't get `--rootdir=` opt to work.. not showing in
   console header.
 -> ^ also breaks on 'tests' `enable_modules` imports in subactors
   during discovery tests?
2025-03-16 15:28:28 -04:00