tractor

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	18c525d2f1	Hack around double long list print issue.. See https://github.com/pdbpp/pdbpp/issues/496	2022-08-02 12:16:58 -04:00
Tyler Goodlet	201c026284	Show full KBI trace for help with CI hangs	2022-08-02 12:16:58 -04:00
Tyler Goodlet	2a61aa099b	Move pydantic-click hang example to new dir, skip in test suite	2022-08-02 12:16:58 -04:00
Tyler Goodlet	e2453fd3da	Add spaces before values in log msg	2022-08-02 12:16:58 -04:00
Tyler Goodlet	b29def8b5d	Add runtime level msg around channel draining	2022-08-02 12:16:58 -04:00
Tyler Goodlet	f07e9dbb2f	Always undo SIGINT overrides, cancel detached children Ensure that even when `pdb` resumption methods are called during a crash where `trio`'s runtime has already terminated (eg. `Event.set()` will raise) we always revert our sigint handler to the original. Further inside the handler if we hit a case where a child is in debug and (thinks it) has the global pdb lock, if it has no IPC connection to a parent, simply presume tty sync-coordination is now lost and cancel the child immediately.	2022-08-02 12:16:49 -04:00
Tyler Goodlet	2f5a6049a4	Readme formatting tweaks	2022-07-27 11:40:02 -04:00
Tyler Goodlet	418e74eee7	Pin to `pdbpp` upstream master, 3.10 problem? See issues: - https://github.com/pdbpp/pdbpp/issues/480 - https://github.com/pdbpp/pdbpp/pull/482	2022-07-27 11:40:02 -04:00
Tyler Goodlet	c7035be2fc	Tolerate double `.remove()`s of stream on portal teardowns	2022-07-27 11:40:02 -04:00
Tyler Goodlet	deaca7d6cc	Always propagate SIGINT when no locking peer found A hopefully significant fix here is to always avoid suppressing a SIGINT when the root actor cannot detect an active IPC connection (via a connected channel) to the supposed debug lock holding actor. In that case it is most likely that the actor has either terminated or has lost its connection for debugger control and there is no way the root can verify the lock is in use; thus we choose to allow KBI cancellation. Drop the (by comment) `try`-`finally` block in `_hijack_stdin_for_child()` around the `_acquire_debug_lock()` call since all that logic should now be handled internal to that locking manager. Try to catch a weird error around the `.do_longlist()` method call that seems to sometimes break on py3.10 and latest `pdbpp`.	2022-07-27 11:40:02 -04:00
Tyler Goodlet	d47d0e7c37	Always call pdb hook even if tty locking fails	2022-07-27 11:40:02 -04:00
Tyler Goodlet	0062c96a3c	Log cancels with appropriate level	2022-07-27 11:40:02 -04:00
Tyler Goodlet	4be13b7387	Just warn on IPC breaks	2022-07-27 11:40:02 -04:00
Tyler Goodlet	7bb5addd4c	Only warn on `trio.BrokenResourceError`s from `_invoke()`	2022-07-27 11:40:02 -04:00
Tyler Goodlet	4fd924cfd2	Make example a subpkg for `python -m <mod>` testing	2022-07-27 11:40:02 -04:00
Tyler Goodlet	fe0fd1a1c1	Add example that triggers bug #302	2022-07-27 11:40:02 -04:00
Tyler Goodlet	dd23e78de1	Add back in async gen loop	2022-07-27 11:40:02 -04:00
Tyler Goodlet	89b44f8163	Pre-declare disconnected flag	2022-07-27 11:40:02 -04:00
Tyler Goodlet	2819b6a5b2	Avoid attr error XD	2022-07-27 11:40:02 -04:00
Tyler Goodlet	f2671ed026	Type annot updates	2022-07-27 11:40:02 -04:00
Tyler Goodlet	41924c86a6	Drop unneeded backframe traceback hide annotation	2022-07-27 11:40:02 -04:00
Tyler Goodlet	206c7c0720	Make `Actor._process_messages()` report disconnects The method now returns a `bool` which flags whether the transport died to the caller and allows for reporting a disconnect in the channel-transport handler task. This is something a user will normally want to know about on the caller side especially after seeing a traceback from the peer (if in tree) on console.	2022-07-27 11:40:02 -04:00
Tyler Goodlet	bf0ac3116c	Only cancel/get-result from a ctx if transport is up There's no point in sending a cancel message to the remote linked task and especially no reason to block waiting on a result from that task if the transport layer is detected to be disconnected. We expect that the transport shouldn't go down at the layer of the message loop (reconnection logic should be handled in the transport layer itself) so if we detect the channel is not connected we don't bother requesting cancels nor waiting on a final result message. Why? - if the connection goes down in error the caller side won't have a way to know "how long" it should block to wait for a cancel ack or result and causes a potential hang that may require an additional ctrl-c from the user especially if using the debugger or if the traceback is not seen on console. - obviously there's no point in waiting for messages when there's no transport to deliver them XD Further, add some more detailed cancel logging detailing the task and actor ids.	2022-07-27 11:40:02 -04:00
Tyler Goodlet	bb732cefd0	Drop high log level in ctx example	2022-07-27 11:40:02 -04:00
Tyler Goodlet	74b819a857	Typing fixes, simplify `_set_trace()`	2022-07-27 11:40:02 -04:00
Tyler Goodlet	8892204c84	Add notes around py3.10 stdlib bug from `pdb++` There's a bug that's triggered in the stdlib without latest `pdb++` installed; add a note for that. Further inside `wait_for_parent_stdin_hijack()` don't `.started()` until the interactor stream has been opened to avoid races when debugging this `._debug.py` module (at the least) since we usually don't want the spawning (parent) task to resume until we know for sure the tty lock has been acquired. Also, drop the random checkpoint we had inside `_breakpoint()`, not sure it was actually adding anything useful since we're (mostly) carefully shielded throughout this func.	2022-07-27 11:40:02 -04:00
Tyler Goodlet	8f4bbf1cbf	Add and use a pdb instance factory	2022-07-27 11:40:02 -04:00
Tyler Goodlet	21dccb2e79	A `.open_context()` example that causes a hang! Finally! I think this may be the root issue we've been seeing in production in a client project. No idea yet why this is happening but the fault-causing sequence seems to be: - `.open_context()` in a child actor - enter the debugger via `tractor.breakpoint()` - continue from that entry via `c` command in REPL - raise an error just after inside the context task's body Looking at logging it appears as though the child thinks it has the tty but no input is accepted on the REPL and a further `ctrl-c` results in some teardown but also a further hang where both parent and child become unresponsive..	2022-07-27 11:40:02 -04:00
Tyler Goodlet	aea8f63bae	Drop all the `@cm.__exit__()` override attempts.. None of it worked (you still will see `.__exit__()` frames on debugger entry - you'd think this would have been solved by now but, shrug) so instead wrap the debugger entry-point in a `try:` and put the SIGINT handler restoration inside `MultiActorPdb` teardown hooks. This seems to restore the UX as it was prior but with also giving the desired SIGINT override handler behaviour.	2022-07-27 11:40:02 -04:00
Tyler Goodlet	7964a9f6f8	Try overriding `_GeneratorContextManager.__exit__()`; didn't work.. Using either of `@pdb.hideframe` or `__tracebackhide__` on stdlib methods doesn't seem to work either.. This all seems to have something to do with async generator usage I think ?	2022-07-27 11:40:02 -04:00
Tyler Goodlet	99c4319940	Fix example name typo	2022-07-27 11:40:02 -04:00
Tyler Goodlet	e5195264a1	Handle a context cancel? Might be a noop	2022-07-27 11:40:02 -04:00
Tyler Goodlet	42f9d10252	Add a pre-started breakpoint example	2022-07-27 11:40:02 -04:00
Tyler Goodlet	345573e602	Make `mypy` happy	2022-07-27 11:40:02 -04:00
Tyler Goodlet	4e60c17375	Refine the handler for child vs. root cases This gets very close to avoiding any possible hangs to do with tty locking and SIGINT handling minus a special case that will be detailed below. Summary of implementation changes: - convert `_mk_pdb()` -> `with _open_pdb() as pdb:` which implicitly handles the `bdb.BdbQuit` case such that debugger teardown hooks are always called. - rename the handler to `shield_sigint()` and handle a variety of new cases: * the root is in debug but hasn't been cancelled -> call `Actor.cancel_soon()` * the root is in debug but has been cancelled (`Actor.cancel_soon()` already called) -> raise KBI * a child is in debug and has a task locking the debugger -> ignore SIGINT in child and the root actor. - if the debugger instance is provided to the handler at acquire time, on SIGINT handling completion re-print the last pdb++ REPL output so that the user realizes they are still actively in debug. - ignore the unlock case where a race condition of "no task" holding the lock causes the `RuntimeError` normally associated with the "wrong task" doing so (not sure if this is a `trio` bug?). - change debug logs to runtime level. Unhandled case(s): - a child is maybe in debug mode but does not itself have any task using the debugger. * ToDo: we need a way to decide what to do with "intermediate" child actors who themselves either are not in `debug_mode=True` but have children who are such that a SIGINT won't cause cancellation of that child-as-parent-of-another-child iff any of their children are in debug mode.	2022-07-27 11:40:02 -04:00
Tyler Goodlet	6b7b58346f	(facepalm) Reraise `BdbQuit` and discard ownerless lock releases	2022-07-27 11:40:02 -04:00
Tyler Goodlet	3cac323421	Add WIP while-debugger-active SIGINT ignore handler	2022-07-27 11:40:02 -04:00
goodboy	4902e184e9	Merge pull request #318 from goodboy/aio_error_propagation Add context test that opens an inter-task-channel that errors	2022-07-15 12:42:19 -04:00
Tyler Goodlet	05790a20c1	Slight lint fixes	2022-07-15 11:18:48 -04:00
Tyler Goodlet	565c603300	Add nooz	2022-07-15 11:17:57 -04:00
Tyler Goodlet	f0d78e1a6e	Use local task ref, fixes `mypy`	2022-07-15 10:39:49 -04:00
Tyler Goodlet	ce01f6b21c	Increase timeout for CI/windows	2022-07-14 20:44:10 -04:00
Tyler Goodlet	0906559ed9	Drop manual stack construction, fix attr typo	2022-07-14 20:43:17 -04:00
Tyler Goodlet	38d03858d7	Fix `asyncio`-task-sync and error propagation This fixes a previously undetected bug where if an `.open_channel_from()` spawned task errored the error would not be propagated to the `trio` side and instead would fail silently with a console log error. What was most odd is that it only seems easy to trigger when you put a slight task sleep before the error is raised (:eyeroll:). This patch adds a few things to address this and just in general improve inter-task lifetime syncing: - add `LinkedTaskChannel._trio_exited: bool` a flag set from the `trio` side when the channel block exits. - add a `wait_on_aio_task: bool` flag to `translate_aio_errors` which toggles whether to wait on the `asyncio` task termination event on exit. - cancel the `asyncio` task if the trio side has ended, when `._trio_exited == True`. - always close the `trio` mem channel when the task exits such that the `asyncio` side can error on any next `.send()` call.	2022-07-14 16:35:41 -04:00
Tyler Goodlet	98de2fab31	Add context test that opens an inter-task-channel that errors	2022-07-14 16:13:12 -04:00
goodboy	80121ed211	Merge pull request #317 from goodboy/drop_msgpack Drop `msgpack`	2022-07-12 13:31:45 -04:00
Tyler Goodlet	41983edc43	Use `str` | `bytes` union for typing msg dump	2022-07-12 11:59:11 -04:00
Tyler Goodlet	5168700fbf	Tolerate non-decode-able bytes	2022-07-12 11:55:55 -04:00
Tyler Goodlet	673c4a8c66	Decode bytes prior to log msg	2022-07-12 11:55:55 -04:00
Tyler Goodlet	932b841176	Allow up to 4 `msgspec` decode failures	2022-07-12 11:55:55 -04:00
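
Commit f07e9dbb2f describes always reverting the SIGINT handler to the original, even when the debugger session ends in a crash. A minimal sketch of that general pattern follows, using only the stdlib `signal` module. It is not the project's implementation: the names `override_sigint` and `ignore_sigint` are hypothetical and chosen for illustration only, and `signal.signal()` has to be called from the main thread.

```python
import signal
from contextlib import contextmanager


@contextmanager
def override_sigint(handler):
    """Temporarily install `handler` for SIGINT (hypothetical helper).

    The previous handler is put back in a `finally:` block, so it is
    restored on normal exit and when the body raises.
    """
    # `signal.signal()` returns the handler that was installed before.
    original = signal.signal(signal.SIGINT, handler)
    try:
        yield original
    finally:
        # Always undo the override, whatever happened inside the block.
        signal.signal(signal.SIGINT, original)


def ignore_sigint(signum, frame):
    """Hypothetical handler: swallow ctrl-c while a REPL is active."""
    return None


if __name__ == "__main__":
    before = signal.getsignal(signal.SIGINT)
    try:
        with override_sigint(ignore_sigint):
            # Simulate a crash while the override is in place.
            raise RuntimeError("simulated crash inside a debug session")
    except RuntimeError:
        pass
    # The original handler is back in place despite the error.
    assert signal.getsignal(signal.SIGINT) is before
```

Putting the restoration in `finally:` keeps the override from outliving the block on any exit path, which is the property the commit message asks for. The additional case it mentions (a resumption call such as `Event.set()` raising after the `trio` runtime has terminated) would need its own handling and is not modelled here.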


1436 Commits (203f95615cc9a16cb6280418e2f339b9a36dae07)