Commit Graph

736 Commits (339d787cf8d7164df39b1ae7051bd2a76b7c179d)

Author SHA1 Message Date
Tyler Goodlet a4874a3227 Always set the `parent_exit: trio.Event` on exit 2023-01-29 14:55:02 -05:00
Tyler Goodlet de04bbb2bb Don't raise on a broken IPC-context when sending stop msg 2023-01-29 14:55:02 -05:00
Tyler Goodlet 4f977189c0 Handle broken mem chan on `Actor._push_result()`
When backpressure is used and a feeder mem chan breaks during msg
delivery (usually because the IPC allocating task already terminated)
instead of raising we simply warn as we do for the non-backpressure
case.

Also, add a proper `Actor.is_arbiter` test inside `._invoke()` to avoid
doing an arbiter-registry lookup if the current actor **is** the
registrar.
2023-01-29 14:55:02 -05:00
Tyler Goodlet 121a8cc891 Drop `Optional` usage from root mod 2023-01-26 16:00:08 -05:00
Tyler Goodlet c54b8ca4ba Begin deprecation of `arbiter_addr` -> `registry_addr` 2023-01-26 16:00:08 -05:00
Tyler Goodlet 5b8a87d0f6 Slightly better `xonsh` check hack, fix typing 2023-01-26 15:48:15 -05:00
Tyler Goodlet 2e278ceb74 Add a super hacky check for `xonsh`, smh.. 2023-01-26 15:26:43 -05:00
Tyler Goodlet dba8118553 Always attempt prompt redraw on ctl-c in REPL
The stdlib has all sorts of muckery with ignoring SIGINT in the
`Pdb._cmdloop()` but here we just override all that since we don't trust
their decisions about cancellation handling whatsoever. Adds
a `Lock.repl: MultiActorPdb` attr which is set by any task which
acquires root TTY lock indicating (via actor global state) that the
current actor is using the debugger REPL and can be expected to re-draw
the prompt on SIGINT. Further we mask out log messages from any actor
who also has the `shield_sigint_handler()` enabled to avoid logging
noise when debugging.
2023-01-26 12:44:13 -05:00
Tyler Goodlet fca2e7c10e Simplify closed abruptly log msg 2023-01-26 12:44:13 -05:00
Tyler Goodlet 5ed62c5c54 Add note about intermediary-actor in debug issue 2023-01-26 12:44:13 -05:00
Tyler Goodlet 6c8cacc9d1 Adjust all default is `None` annots (per new `mypy`) 2022-12-12 13:18:22 -05:00
Tyler Goodlet 38326e8c15 Avoid error on context double pops 2022-12-11 23:46:33 -05:00
Tyler Goodlet b5192cca8e Always greedily `list`-cast`mngrs` input sequence 2022-12-11 23:20:58 -05:00
Tyler Goodlet c606be8c64 Passthrough runtime kwargs from `open_actor_cluster()` 2022-12-11 19:56:08 -05:00
Tyler Goodlet f2641c8964 Avoid "task never called `.started()`" runtime erros when cancelling 2022-10-14 19:42:23 -04:00
Tyler Goodlet f39414ce12 Drop error-repacking for `.run_in_actor()`s block
If we pack the nursery parent task's error into the `errors` table
directly in the handler, we don't need to specially handle packing that
same error into any exception group raised while handling sub-actor
cancellation; drops some ugly indentation ;)
2022-10-14 19:42:23 -04:00
Tyler Goodlet e298b70edf Drop added `.pdp()` level msgs used duringn dev 2022-10-14 19:42:23 -04:00
Tyler Goodlet 38f9d35dee Fix errors table type annot 2022-10-14 19:42:23 -04:00
Tyler Goodlet 88448f7281 Fix handler type annot 2022-10-14 19:42:23 -04:00
Tyler Goodlet 0956d5f461 Restore the `trio` SIGINT handler, cancel root lock tasks on no-peers
Pretty sure this is the final touch to alleviate all our debug lock
headaches! Instead of trying to revert to the "last" handler (as `pdb`
does internally in the stdlib) we always just revert to the handler
`trio` registers during startup. Further this seems to allow cancelling
the root-side locking task if it's detected as stale IFF we only do this
when the root actor is in a "no more IPC peers" state.

Deatz:
- (always) set `._debug.Lock._trio_handler` as the `trio` version, not
  some last used handler to make sure we're getting the ctrl-c handling
  we want when not in debug mode.
- assign the trio handler in `open_root_actor()`
  `._runtime._async_main()` to be sure it's applied in subactors as well
  as the root.
- only do debug lock blocking and root-side-locking-task cancels when
  a "no peers" condition is detected in the root actor: i.e. no IPC
  channels are detected by the root meaning it's impossible any actor
  has a sane lock-state ongoing for debug mode.
2022-10-14 18:18:01 -04:00
Tyler Goodlet 33f2234baf Hide some stack layers the user doesn't really need to see 2022-10-14 18:18:01 -04:00
Tyler Goodlet 7521bded3d Pack error from the parent task into the actor nursery 2022-10-14 18:16:51 -04:00
Tyler Goodlet 50fe098e06 First pass, swap `MultiError` for `BaseExceptionGroup` 2022-10-14 18:16:51 -04:00
Tyler Goodlet 98056f6ed7 Move logging context map into `log.py` module 2022-10-12 12:46:20 -04:00
Tyler Goodlet b81b6be98a Drop extra log msgs, some old commented code 2022-10-12 12:35:35 -04:00
Tyler Goodlet fb721f36ef Support debug-lock blocking, use on no-more IPC
This is a lingering debugger locking race case we needed to handle:

- child crashes acquires TTY lock in root and attaches to `pdb`
- child IPC goes down such that all channels to the root are broken
  / non-functional.
- root is stuck thinking the child is still in debug even though it
  can't be contacted and the child actor machinery hasn't been
  cancelled by its parent.
- root get's stuck in deadlock with child since it won't send a cancel
  request until the child is finished debugging, but the child can't
  unlock the debugger bc IPC is down.

To avoid this scenario add debug lock blocking list via
`._debug.Lock._blocked: set[tuple]` which holds actor uids for any actor
that is detected by the root as having no transport channel connections
with said root (of which at least one should exist if this sub-actor at
some point acquired the debug lock). The root consequently checks this
list for any actor that tries to (re)acquire the lock and blocks with
a `ContextCancelled`. When a debug condition is tested in
`._runtime._invoke` the context's `._enter_debugger_on_cancel` which
is set to `False` if the actor is on the block list in which case the
post-mortem entry is skipped.

Further this adds a root-locking-task side cancel scope to
`Lock._root_local_task_cs_in_debug` which can be cancelled by the root
runtime when a stale lock is detected after all IPC channels for the
actor have been torn down. NOTE: right now we're NOT doing this since it
seems to cause test failures likely due because it may cause pre-mature
cancellation and maybe needs a bit more experimenting?
2022-10-11 20:00:05 -04:00
Tyler Goodlet 734d8dd663 Move `trio` scope outside first inter-task-chan receive 2022-10-11 20:00:05 -04:00
Tyler Goodlet 1c480e6c92 Add `Context` cancel message and debug toggle flag
In the case of a callee-side context cancelling itself it can be handy
to let the caller-side task know (even if through logging) that the
cancel was due to some known reason. Make `.cancel()` accept such
a message on the callee side and have it included in the
`._runtime._invoke()` raised `ContextCancelled` emission.

Also add a `Context._trigger_debugger_on_cancel: bool` flag which can be
set to `False` to avoid the debugger post-mortem crash mode from
engaging on cross-context tasks which cancel themselves for a known
reason (as is needed for blocked tasks in the debug TTY-lock machinery).
2022-10-11 20:00:05 -04:00
Tyler Goodlet 44b59f3338 Go back to a `global` single-ton nursery per actor
Turns out the lifetime mgmt of separate nurseries per delegate manager
is tricky; a new nursery can't be naively allocated on cache-misses since
it may get closed by some early terminating task instead of by the "last
using" consumer task. In theory if we allocate using the same logic as
that used for the last-task-triggers-exit then this should work?

For now just go back to a single global nursery per `_Cache` which still
avoids use of the internal actor service nursery.
2022-10-09 21:27:23 -04:00
Tyler Goodlet 7a719ac2a7 Use one nursery per unique manager (signature)
Instead of sticking all `trionics.maybe_open_context()` tasks inside the
actor's (root) service nursery, open a unique one per manager function
instance (id).

Further, accept a callable for the `key` such that a user can have
more flexible control on the caching logic and move the
`maybe_open_nursery()` helper out of the portal mod and into this
trionics "managers" module.
2022-10-09 21:27:23 -04:00
Tyler Goodlet d24fae8381 'Rename mp spawn methods to have a `'mp_'` prefix' 2022-10-09 17:54:55 -04:00
Tyler Goodlet 5ab98513b7 Move `@tractor_test` into `conftest.py` 2022-10-09 17:14:20 -04:00
Tyler Goodlet 90f4912580 Organize process spawning into lookup table
Instead of the logic branching create a table `._spawn._methods`
which is used to lookup the desired backend framework (in this case
still only one of `multiprocessing` or `trio`) and make the top level
`.new_proc()` do the lookup and any common logic. Use a `typing.Literal`
to define the lookup table's key set.

Repair and ignore a bunch of type-annot related stuff todo with `mypy`
updates and backend-specific process typing.
2022-10-09 16:51:21 -04:00
Tyler Goodlet 15047341bd Ignore forserver override attrs with `mypy` 2022-10-09 16:14:11 -04:00
Tyler Goodlet e609183242 Expose lifetime stack as class attr, add base test suite 2022-09-15 23:50:15 -04:00
Tyler Goodlet 10eeda2d2b Use built-ins for all data-structure-type annotations 2022-09-15 23:41:28 -04:00
Tyler Goodlet ad19bf2cf1 Remove `tractor.run()` once and for all
It's been deprecated for a while now and all docs and tests have been
changed.

Closes #183
2022-09-15 23:41:28 -04:00
Tyler Goodlet 9aef03772a Expose `Actor` at pkg level, adjust debug type annots 2022-09-15 23:41:28 -04:00
Tyler Goodlet 7548dba8f2 Change to new doc string style 2022-09-15 23:41:28 -04:00
Tyler Goodlet 208d56af2c Make `async_main()` a module func 2022-09-15 23:41:28 -04:00
Tyler Goodlet a3a5bc267e Make `process_messages()` a mod func 2022-09-15 23:41:28 -04:00
Tyler Goodlet d4084b2032 Rename our core module to `_runtime` 2022-09-15 23:41:28 -04:00
Tyler Goodlet bafd10a260 Make `maybe_open_context()` re-entrant safe, use per factory locks 2022-09-15 19:02:02 -04:00
Tyler Goodlet 5ad540c417 Add debug complete event `None`-guard for when already reset 2022-09-15 19:02:02 -04:00
Tyler Goodlet 8f1fe2376a Simplify all hooks to a common `Lock.release()` 2022-08-02 18:14:05 -04:00
Tyler Goodlet 650313dfef Drop legacy handler blocks factored into `_acquire_debug_lock()` 2022-08-02 12:50:27 -04:00
Tyler Goodlet e4006da6f4 Drop `pdbpp` bug notes, add follow up issue #320 note 2022-08-02 12:48:40 -04:00
Tyler Goodlet 7f6169a050 Drop legacy commented/todo remote debug helper block 2022-08-02 12:43:14 -04:00
Tyler Goodlet 02c3b9a672 Put `pygments` back to default 2022-08-02 12:17:34 -04:00
Tyler Goodlet c5c7a9027c Line len lint and drop rpc log msg level again 2022-08-02 12:17:34 -04:00
Tyler Goodlet 937ed99e39 Factor sigint overriding into lock methods 2022-08-02 12:17:28 -04:00
Tyler Goodlet 91f034a136 Move all module vars into a `Lock` type 2022-08-02 12:17:28 -04:00
Tyler Goodlet 6f01c78122 Disable `pygments` highlighting on ctlc tests 2022-08-02 12:17:28 -04:00
Tyler Goodlet c0cd99e374 Timeout on arbiter ping, avoid TCP SYN hangs in CI? 2022-08-02 12:17:28 -04:00
Tyler Goodlet b01daa5319 Factor lock-state release logic into helper
The common logic to both remove our custom SIGINT handler as well
as signal the actor global event that pdb is complete. Call this
whenever we exit a post mortem call and thus any time some rpc task
get's debugged inside `._actor._invoke()`.

Further, we have to manually print the REPL prompt on 3.9 for some wack
reason, so stick a version guard in the sigint handler for that..
2022-08-02 12:17:28 -04:00
Tyler Goodlet bd362a05f0 Run release hook around `next` repl commands as well 2022-08-02 12:17:28 -04:00
Tyler Goodlet b21f2e16ad Always consider the debugger when exiting contexts
When in an uncertain teardown state and in debug mode a context can be
popped from actor runtime before a child finished debugging (the case
when the parent is tearing down but the child hasn't closed/completed
its tty lock IPC exit phase) and the child sends the "stop" message to
unlock the debugger but it's ignored bc the parent has already dropped
the ctx. Instead we call `._debug.maybe_wait_for_deugger()` before these
context removals to avoid the root getting stuck thinking the lock was
never released.

Further, add special `Actor._cancel_task()` handling code inside
`_invoke()` which continues to execute the method despite the IPC
channel to the caller being broken and thus avoiding potential hangs due
to a target (child) actor task remaining alive.
2022-08-02 12:17:28 -04:00
Tyler Goodlet ba7b355d9c Add note about default behaviour of `fancycompleter` 2022-08-02 12:17:28 -04:00
Tyler Goodlet ef8dc0204c Just drop all longlisting for now and leave comments 2022-08-02 12:17:28 -04:00
Tyler Goodlet a101971027 Go back to original longlist code 2022-08-02 12:17:28 -04:00
Tyler Goodlet 835836123b Just don't call longlist on 3.10+ for now 2022-08-02 12:17:28 -04:00
Tyler Goodlet b9eb601265 General typing fixes for `mypy` 2022-08-02 12:17:27 -04:00
Tyler Goodlet 4dcc21234e Only call `.poll()` if a method on the spawn backend 2022-08-02 12:17:27 -04:00
Tyler Goodlet 8b9f342eef Port to new `.lowlevel.open_process()` API 2022-08-02 12:17:27 -04:00
Tyler Goodlet a90ca4b384 Call longlist normally when on py < 3.10 2022-08-02 12:17:06 -04:00
Tyler Goodlet d0dcd55f47 Only report disconnected actors if proc is still alive? 2022-08-02 12:17:06 -04:00
Tyler Goodlet 519f4c300b I dunno, seems like `breakpoint()` needs this? 2022-08-02 12:17:06 -04:00
Tyler Goodlet ff3f5959e9 Always enable debug level logging if mode enabled 2022-08-02 12:16:58 -04:00
Tyler Goodlet abb00531d3 Add help msg for non `__main__` modules as well 2022-08-02 12:16:58 -04:00
Tyler Goodlet 18c525d2f1 Hack around double long list print issue..
See https://github.com/pdbpp/pdbpp/issues/496
2022-08-02 12:16:58 -04:00
Tyler Goodlet e2453fd3da Add spaces before values in log msg 2022-08-02 12:16:58 -04:00
Tyler Goodlet b29def8b5d Add runtime level msg around channel draining 2022-08-02 12:16:58 -04:00
Tyler Goodlet f07e9dbb2f Always undo SIGINT overrides, cancel detached children
Ensure that even when `pdb` resumption methods are called during a crash
where `trio`'s runtime has already terminated (eg. `Event.set()` will
raise) we always revert our sigint handler to the original. Further
inside the handler if we hit a case where a child is in debug and
(thinks it) has the global pdb lock, if it has no IPC connection to
a parent, simply presume tty sync-coordination is now lost and cancel
the child immediately.
2022-08-02 12:16:49 -04:00
Tyler Goodlet c7035be2fc Tolerate double `.remove()`s of stream on portal teardowns 2022-07-27 11:40:02 -04:00
Tyler Goodlet deaca7d6cc Always propagate SIGINT when no locking peer found
A hopefully significant fix here is to always avoid suppressing a SIGINT
when the root actor can not detect an active IPC connections (via
a connected channel) to the supposed debug lock holding actor. In that
case it is most likely that the actor has either terminated or has lost
its connection for debugger control and there is no way the root can
verify the lock is in use; thus we choose to allow KBI cancellation.

Drop the (by comment) `try`-`finally` block in
`_hijoack_stdin_for_child()` around the `_acquire_debug_lock()` call
since all that logic should now be handled internal to that locking
manager. Try to catch a weird error around the `.do_longlist()` method
call that seems to sometimes break on py3.10 and latest `pdbpp`.
2022-07-27 11:40:02 -04:00
Tyler Goodlet d47d0e7c37 Always call pdb hook even if tty locking fails 2022-07-27 11:40:02 -04:00
Tyler Goodlet 0062c96a3c Log cancels with appropriate level 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4be13b7387 Just warn on IPC breaks 2022-07-27 11:40:02 -04:00
Tyler Goodlet 7bb5addd4c Only warn on `trio.BrokenResourceError`s from `_invoke()` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 89b44f8163 Pre-declare disconnected flag 2022-07-27 11:40:02 -04:00
Tyler Goodlet 2819b6a5b2 Avoid attr error XD 2022-07-27 11:40:02 -04:00
Tyler Goodlet f2671ed026 Type annot updates 2022-07-27 11:40:02 -04:00
Tyler Goodlet 41924c86a6 Drop uneeded backframe traceback hide annotation 2022-07-27 11:40:02 -04:00
Tyler Goodlet 206c7c0720 Make `Actor._process_messages()` report disconnects
The method now returns a `bool` which flags whether the transport died
to the caller and allows for reporting a disconnect in the
channel-transport handler task. This is something a user will normally
want to know about on the caller side especially after seeing
a traceback from the peer (if in tree) on console.
2022-07-27 11:40:02 -04:00
Tyler Goodlet bf0ac3116c Only cancel/get-result from a ctx if transport is up
There's no point in sending a cancel message to the remote linked task
and especially no reason to block waiting on a result from that task if
the transport layer is detected to be disconnected. We expect that the
transport shouldn't go down at the layer of the message loop
(reconnection logic should be handled in the transport layer itself) so
if we detect the channel is not connected we don't bother requesting
cancels nor waiting on a final result message.

Why?

- if the connection goes down in error the caller side won't have a way
  to know "how long" it should block to wait for a cancel ack or result
  and causes a potential hang that may require an additional ctrl-c from
  the user especially if using the debugger or if the traceback is not
  seen on console.
- obviously there's no point in waiting for messages when there's no
  transport to deliver them XD

Further, add some more detailed cancel logging detailing the task and
actor ids.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 74b819a857 Typing fixes, simplify `_set_trace()` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 8892204c84 Add notes around py3.10 stdlib bug from `pdb++`
There's a bug that's triggered in the stdlib without latest `pdb++`
installed; add a note for that.

Further inside `wait_for_parent_stdin_hijack()` don't `.started()` until
the interactor stream has been opened to avoid races when debugging this
`._debug.py` module (at the least) since we usually don't want the
spawning (parent) task to resume until we know for sure the tty lock has
been acquired. Also, drop the random checkpoint we had inside
`_breakpoint()`, not sure it was actually adding anything useful since
we're (mostly) carefully shielded throughout this func.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 8f4bbf1cbf Add and use a pdb instance factory 2022-07-27 11:40:02 -04:00
Tyler Goodlet aea8f63bae Drop all the `@cm.__exit__()` override attempts..
None of it worked (you still will see `.__exit__()` frames on debugger
entry - you'd think this would have been solved by now but, shrug) so
instead wrap the debugger entry-point in a `try:` and put the SIGINT
handler restoration inside `MultiActorPdb` teardown hooks.

This seems to restore the UX as it was prior but with also giving the
desired SIGINT override handler behaviour.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 7964a9f6f8 Try overriding `_GeneratorContextManager.__exit__()`; didn't work..
Using either of `@pdb.hideframe` or `__tracebackhide__` on stdlib
methods doesn't seem to work either.. This all seems to have something
to do with async generator usage I think ?
2022-07-27 11:40:02 -04:00
Tyler Goodlet e5195264a1 Handle a context cancel? Might be a noop 2022-07-27 11:40:02 -04:00
Tyler Goodlet 345573e602 Make `mypy` happy 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4e60c17375 Refine the handler for child vs. root cases
This gets very close to avoiding any possible hangs to do with tty
locking and SIGINT handling minus a special case that will be detailed
below.

Summary of implementation changes:

- convert `_mk_pdb()` -> `with _open_pdb() as pdb:` which implicitly
  handles the `bdb.BdbQuit` case such that debugger teardown hooks are
  always called.
- rename the handler to `shield_sigint()` and handle a variety of new
  cases:
  * the root is in debug but hasn't been cancelled -> call
    `Actor.cancel_soon()`
  * the root is in debug but *has* been called (`Actor.cancel_soon()`
    already called) -> raise KBI
  * a child is in debug *and* has a task locking the debugger -> ignore
    SIGINT in child *and* the root actor.
- if the debugger instance is provided to the handler at acquire time,
  on SIGINT handling completion re-print the last pdb++ REPL output so
  that the user realizes they are still actively in debug.
- ignore the unlock case where a race condition of "no task" holding the
  lock causes the `RuntimeError` normally associated with the "wrong
  task" doing so (not sure if this is a `trio` bug?).
- change debug logs to runtime level.

Unhandled case(s):

- a child is maybe in debug mode but does not itself have any task using
  the debugger.
    * ToDo: we need a way to decide what to do with
      "intermediate" child actors who themselves either are not in
      `debug_mode=True` but have children who *are* such that a SIGINT
      won't cause cancellation of that child-as-parent-of-another-child
      **iff** any of their children are in in debug mode.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 6b7b58346f (facepalm) Reraise `BdbQuit` and discard ownerless lock releases 2022-07-27 11:40:02 -04:00
Tyler Goodlet 3cac323421 Add WIP while-debugger-active SIGINT ignore handler 2022-07-27 11:40:02 -04:00
goodboy 4902e184e9
Merge pull request #318 from goodboy/aio_error_propagation
Add context test that opens an inter-task-channel that errors
2022-07-15 12:42:19 -04:00
Tyler Goodlet 05790a20c1 Slight lint fixes 2022-07-15 11:18:48 -04:00
Tyler Goodlet f0d78e1a6e Use local task ref, fixes `mypy` 2022-07-15 10:39:49 -04:00
Tyler Goodlet 0906559ed9 Drop manual stack construction, fix attr typo 2022-07-14 20:43:17 -04:00
Tyler Goodlet 38d03858d7 Fix `asyncio`-task-sync and error propagation
This fixes an previously undetected bug where if an
`.open_channel_from()` spawned task errored the error would not be
propagated to the `trio` side and instead would fail silently with
a console log error. What was most odd is that it only seems easy to
trigger when you put a slight task sleep before the error is raised
(:eyeroll:). This patch adds a few things to address this and just in
general improve iter-task lifetime syncing:

- add `LinkedTaskChannel._trio_exited: bool` a flag set from the `trio`
  side when the channel block exits.
- add a `wait_on_aio_task: bool` flag to `translate_aio_errors` which
  toggles whether to wait the `asyncio` task termination event on exit.
- cancel the `asyncio` task if the trio side has ended, when
  `._trio_exited == True`.
- always close the `trio` mem channel when the task exits such that
  the `asyncio` side can error on any next `.send()` call.
2022-07-14 16:35:41 -04:00