forked from goodboy/tractor
Commit Graph

1258 Commits (8559ad69f3336d36805788e103c07242b4590ab3)

Author SHA1 Message Date
Tyler Goodlet 8559ad69f3 Just don't call longlist on 3.10+ for now 2022-07-27 11:40:03 -04:00
Tyler Goodlet e519df1bd2 Add longer delays around ctl-c loop, don't expect longlist 2022-07-27 11:40:02 -04:00
Tyler Goodlet 24fd87d969 Add sleep around ctl-c iteration loop 2022-07-27 11:40:02 -04:00
Tyler Goodlet 91054a8a42 Pin to specific `pdbpp` master commit 2022-07-27 11:40:02 -04:00
Tyler Goodlet cdc7bf6549 General typing fixes for `mypy` 2022-07-27 11:40:02 -04:00
Tyler Goodlet c865d01e85 Only call `.poll()` if a method on the spawn backend 2022-07-27 11:40:02 -04:00
Tyler Goodlet e1caeeb8de Fix loglevel in subactor test; actually pass the level XD 2022-07-27 11:40:02 -04:00
Tyler Goodlet 7c25aa176f Pin to `trio >= 0.20` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 3b7985292f TOSQUASH: add note around delay 2022-07-27 11:40:02 -04:00
Tyler Goodlet e8fc820b92 Port to new `.lowlevel.open_process()` API 2022-07-27 11:40:02 -04:00
Tyler Goodlet b2fdbc44d1 Guard against asyncio cancelled logged to console 2022-07-27 11:40:02 -04:00
Tyler Goodlet f7823a46b8 Add slight delay 2nd ctlc round.. 2022-07-27 11:40:02 -04:00
Tyler Goodlet f76c809c39 Call longlist normally when on py < 3.10 2022-07-27 11:40:02 -04:00
Tyler Goodlet 9e56881163 Only report disconnected actors if proc is still alive? 2022-07-27 11:40:02 -04:00
Tyler Goodlet 8291ee09b3 TOSQUASH: more loglevel for debug bs 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4a441f0988 Only do `pdbpp` from `git` install on 3.10+ 2022-07-27 11:40:02 -04:00
Tyler Goodlet df0108a0bb I dunno, seems like `breakpoint()` needs this? 2022-07-27 11:40:02 -04:00
Tyler Goodlet 8537e17251 TOSQUASH: debug mode loglevel 2022-07-27 11:40:02 -04:00
Tyler Goodlet 20acb50d94 Add basic module-not-found when opening a ctx eg. 2022-07-27 11:40:02 -04:00
Tyler Goodlet eab895864f Always enable debug level logging if mode enabled 2022-07-27 11:40:02 -04:00
Tyler Goodlet 65a9f69d6c Add help msg for non `__main__` modules as well 2022-07-27 11:40:02 -04:00
Tyler Goodlet 24b6cc0209 Add basic ctl-c testing cases to suite 2022-07-27 11:40:02 -04:00
Tyler Goodlet f488db6d8d Hack around double long list print issue..
See https://github.com/pdbpp/pdbpp/issues/496
2022-07-27 11:40:02 -04:00
Tyler Goodlet c5d335c057 Show full KBI trace for help with CI hangs 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4594fe3501 Move pydantic-click hang example to new dir, skip in test suite 2022-07-27 11:40:02 -04:00
Tyler Goodlet 5f0262fd98 Add spaces before values in log msg 2022-07-27 11:40:02 -04:00
Tyler Goodlet 59e7f29eed Add runtime level msg around channel draining 2022-07-27 11:40:02 -04:00
Tyler Goodlet e2dfd6e99d Always undo SIGINT overrides, cancel detached children
Ensure that even when `pdb` resumption methods are called during a crash
where `trio`'s runtime has already terminated (eg. `Event.set()` will
raise) we always revert our sigint handler to the original. Further
inside the handler if we hit a case where a child is in debug and
(thinks it) has the global pdb lock, if it has no IPC connection to
a parent, simply presume tty sync-coordination is now lost and cancel
the child immediately.
2022-07-27 11:40:02 -04:00
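The "always revert our sigint handler" behaviour described in the commit above can be sketched with the stdlib `signal` module alone. This is a minimal illustration, not the project's actual code: `sigint_override`, `shielding_handler` and `demo` are hypothetical names, and the sketch assumes it runs in the main thread (where `signal.signal()` is permitted).

```python
import signal
from contextlib import contextmanager


@contextmanager
def sigint_override(handler):
    """Install `handler` for SIGINT and always restore the previous one."""
    original = signal.getsignal(signal.SIGINT)
    signal.signal(signal.SIGINT, handler)
    try:
        yield original
    finally:
        # Runs even when the body raises (e.g. because the runtime has
        # already terminated), so the override never outlives the session.
        signal.signal(signal.SIGINT, original)


def demo() -> bool:
    def shielding_handler(signum, frame):
        # Hypothetical override: swallow ctrl-c while a REPL is active.
        pass

    before = signal.getsignal(signal.SIGINT)
    try:
        with sigint_override(shielding_handler):
            # Simulate a resumption method failing during a crash.
            raise RuntimeError("runtime already terminated")
    except RuntimeError:
        pass
    # The original handler is back in place despite the error.
    return signal.getsignal(signal.SIGINT) == before
```

Putting the restore step in `finally` is the design point: the revert does not depend on any other teardown call succeeding first.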
Tyler Goodlet 2f5a6049a4 Readme formatting tweaks 2022-07-27 11:40:02 -04:00
Tyler Goodlet 418e74eee7 Pin to `pdbpp` upstream master, 3.10 problem?
See issues:
- https://github.com/pdbpp/pdbpp/issues/480
- https://github.com/pdbpp/pdbpp/pull/482
2022-07-27 11:40:02 -04:00
Tyler Goodlet c7035be2fc Tolerate double `.remove()`s of stream on portal teardowns 2022-07-27 11:40:02 -04:00
Tyler Goodlet deaca7d6cc Always propagate SIGINT when no locking peer found
A hopefully significant fix here is to always avoid suppressing a SIGINT
when the root actor cannot detect an active IPC connection (via
a connected channel) to the supposed debug lock holding actor. In that
case it is most likely that the actor has either terminated or has lost
its connection for debugger control and there is no way the root can
verify the lock is in use; thus we choose to allow KBI cancellation.

Drop the (by comment) `try`-`finally` block in
`_hijack_stdin_for_child()` around the `_acquire_debug_lock()` call
since all that logic should now be handled internal to that locking
manager. Try to catch a weird error around the `.do_longlist()` method
call that seems to sometimes break on py3.10 and latest `pdbpp`.
2022-07-27 11:40:02 -04:00
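The rule in the commit above (only suppress a SIGINT when the supposed lock holder is still reachable over IPC) boils down to a small predicate. The sketch below is a simplified model under stated assumptions: `Channel`, `should_shield_sigint` and the dict of channels keyed by actor name are hypothetical stand-ins, not the library's real types.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Channel:
    """Hypothetical stand-in for an IPC channel to a peer actor."""
    is_connected: bool


def should_shield_sigint(
    lock_holder: Optional[str],
    channels: Dict[str, Channel],
) -> bool:
    """Return True only when suppressing ctrl-c can be justified."""
    if lock_holder is None:
        # Nobody claims the debug lock: nothing to protect.
        return False
    chan = channels.get(lock_holder)
    # Without a live channel the root cannot verify the lock is in use
    # (the holder likely terminated or lost debugger control), so the
    # KeyboardInterrupt should be allowed to propagate.
    return chan is not None and chan.is_connected
```

For example, `should_shield_sigint("child", {"child": Channel(is_connected=False)})` evaluates to `False`, which corresponds to the "allow KBI cancellation" case.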
Tyler Goodlet d47d0e7c37 Always call pdb hook even if tty locking fails 2022-07-27 11:40:02 -04:00
Tyler Goodlet 0062c96a3c Log cancels with appropriate level 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4be13b7387 Just warn on IPC breaks 2022-07-27 11:40:02 -04:00
Tyler Goodlet 7bb5addd4c Only warn on `trio.BrokenResourceError`s from `_invoke()` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 4fd924cfd2 Make example a subpkg for `python -m <mod>` testing 2022-07-27 11:40:02 -04:00
Tyler Goodlet fe0fd1a1c1 Add example that triggers bug #302 2022-07-27 11:40:02 -04:00
Tyler Goodlet dd23e78de1 Add back in async gen loop 2022-07-27 11:40:02 -04:00
Tyler Goodlet 89b44f8163 Pre-declare disconnected flag 2022-07-27 11:40:02 -04:00
Tyler Goodlet 2819b6a5b2 Avoid attr error XD 2022-07-27 11:40:02 -04:00
Tyler Goodlet f2671ed026 Type annot updates 2022-07-27 11:40:02 -04:00
Tyler Goodlet 41924c86a6 Drop unneeded backframe traceback hide annotation 2022-07-27 11:40:02 -04:00
Tyler Goodlet 206c7c0720 Make `Actor._process_messages()` report disconnects
The method now returns a `bool` which flags whether the transport died
to the caller and allows for reporting a disconnect in the
channel-transport handler task. This is something a user will normally
want to know about on the caller side especially after seeing
a traceback from the peer (if in tree) on console.
2022-07-27 11:40:02 -04:00
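A message loop that reports transport death to its caller through a `bool`, as the commit above describes, might look roughly like the synchronous sketch below. The real method is async and considerably more involved; `process_messages`, `healthy_transport` and `broken_transport` are hypothetical names, and `ConnectionError` stands in for whatever error the real transport raises.

```python
from typing import Callable, Iterator, List, Optional


def process_messages(
    recv: Callable[[], Iterator[Optional[dict]]],
    handled: List[dict],
) -> bool:
    """Consume messages; return True if the transport died mid-loop."""
    disconnected = False  # pre-declared so every exit path can return it
    try:
        for msg in recv():
            if msg is None:
                break  # graceful end-of-stream sentinel
            handled.append(msg)
    except ConnectionError:
        # The transport went away underneath the loop.
        disconnected = True
    return disconnected


def healthy_transport() -> Iterator[Optional[dict]]:
    yield {"cmd": "ping"}
    yield None


def broken_transport() -> Iterator[Optional[dict]]:
    yield {"cmd": "ping"}
    raise ConnectionResetError("peer went away")
```

A caller such as a channel handler task can then branch on the return value to log a disconnect instead of treating every loop exit the same way.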
Tyler Goodlet bf0ac3116c Only cancel/get-result from a ctx if transport is up
There's no point in sending a cancel message to the remote linked task
and especially no reason to block waiting on a result from that task if
the transport layer is detected to be disconnected. We expect that the
transport shouldn't go down at the layer of the message loop
(reconnection logic should be handled in the transport layer itself) so
if we detect the channel is not connected we don't bother requesting
cancels nor waiting on a final result message.

Why?

- if the connection goes down in error the caller side won't have a way
  to know "how long" it should block to wait for a cancel ack or result
  and causes a potential hang that may require an additional ctrl-c from
  the user especially if using the debugger or if the traceback is not
  seen on console.
- obviously there's no point in waiting for messages when there's no
  transport to deliver them XD

Further, add more detailed cancel logging that includes the task and
actor ids.
2022-07-27 11:40:02 -04:00
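The teardown guard from the commit above (skip both the cancel request and the wait for a result when the transport is down) can be sketched as follows. Everything here is illustrative: `FakeChannel`, `teardown_context` and the two callbacks are hypothetical, and the real code is async rather than synchronous.

```python
from typing import Any, Callable, Optional


class FakeChannel:
    """Hypothetical stand-in for a transport-backed channel."""

    def __init__(self, up: bool) -> None:
        self._up = up

    def connected(self) -> bool:
        return self._up


def teardown_context(
    chan: FakeChannel,
    send_cancel: Callable[[], None],
    wait_result: Callable[[], Any],
) -> Optional[Any]:
    """Cancel the remote task and collect its result if reachable."""
    if not chan.connected():
        # No transport means no cancel ack or result can ever arrive;
        # blocking here is what would cause the hang.
        return None
    send_cancel()
    return wait_result()
```

With a disconnected channel neither callback runs, so the caller cannot end up blocked on a message that has no way to be delivered.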
Tyler Goodlet bb732cefd0 Drop high log level in ctx example 2022-07-27 11:40:02 -04:00
Tyler Goodlet 74b819a857 Typing fixes, simplify `_set_trace()` 2022-07-27 11:40:02 -04:00
Tyler Goodlet 8892204c84 Add notes around py3.10 stdlib bug from `pdb++`
There's a bug that's triggered in the stdlib without latest `pdb++`
installed; add a note for that.

Further inside `wait_for_parent_stdin_hijack()` don't `.started()` until
the interactor stream has been opened to avoid races when debugging this
`._debug.py` module (at the least) since we usually don't want the
spawning (parent) task to resume until we know for sure the tty lock has
been acquired. Also, drop the random checkpoint we had inside
`_breakpoint()`, not sure it was actually adding anything useful since
we're (mostly) carefully shielded throughout this func.
2022-07-27 11:40:02 -04:00
Tyler Goodlet 8f4bbf1cbf Add and use a pdb instance factory 2022-07-27 11:40:02 -04:00
Tyler Goodlet 21dccb2e79 A `.open_context()` example that causes a hang!
Finally! I think this may be the root issue we've been seeing in
production in a client project.

No idea yet why this is happening but the fault-causing sequence seems
to be:
- `.open_context()` in a child actor
- enter the debugger via `tractor.breakpoint()`
- continue from that entry via `c` command in REPL
- raise an error just after inside the context task's body

Looking at logging it appears as though the child thinks it has the tty
but no input is accepted on the REPL and a further `ctrl-c` results in
some teardown but also a further hang where both parent and child become
unresponsive..
2022-07-27 11:40:02 -04:00