tractor

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	e1d7004aec	Add a `pytest.ini` config	2025-03-14 22:18:31 -04:00
Tyler Goodlet	a97b45d90b	WIP final impl of ctx-cancellation-semantics	2025-03-14 22:18:31 -04:00
Tyler Goodlet	a388d3185b	Few more log msg tweaks in runtime	2025-03-14 22:18:31 -04:00
Tyler Goodlet	4d0df1bb4a	Call `actor.cancel(None)` from root to avoid mismatch with (any future) meth sig changes	2025-03-14 22:18:31 -04:00
Tyler Goodlet	5eb62b3e9b	Tweak broadcast fanout test to never inf loop Since a bug in the new `MsgStream.aclose()` impl's drain block logic was triggering an actual inf loop (by not ever cancelling the streamer child actor), make sure we put a loop limit on the `inf_streamer()` XD Also add a bit more deats to the test `print()`s in each actor and toss in `debug_mode` fixture support.	2025-03-14 22:18:31 -04:00
Tyler Goodlet	1be296c725	Add note that maybe `Context._eoc` should be set by caller?	2025-03-14 22:18:31 -04:00
Tyler Goodlet	9420ea0c14	Tweak `Actor` cancel method signatures Besides improving a bunch more log msg contents similarly as before this changes the cancel method signatures slightly with different arg names: for `.cancel()`: - instead of `requesting_uid: str` take in a `req_chan: Channel` since we can always just read its `.uid: tuple` for logging and further we can then offer the `chan=None` case indicating a "self cancel" (since there's no "requesting channel"). - the semantics of "requesting" here better indicate that the IPC connection is an IPC peer and further (eventually) will allow permission checking against given peers for cancellation requests. - when `chan==None` we also define a meth-internal `requester_type: str` differently for logging content :) - add much more detailed `.cancel()` content around the requester, its type, and any debugger related locking steps. for `._cancel_task()`: - change the `chan` arg to `parent_chan: Channel` since "parent" correctly indicates that the channel is the parent of the locally spawned rpc task to cancel; in fact no other chan should be able to cancel tasks parented/spawned by other channels obvi! - also add more extensive meth-internal `.cancel()` logging with a #TODO around showing only the "relevant/latest" `Context` state vars in such logging content. for `.cancel_rpc_tasks()`: - shorten `requesting_uid` -> `req_uid`. - add `parent_chan: Channel` to be similar as above in `._cancel_task()` (since it's internally delegated to anyway) which replaces the prior `only_chan` and use it to filter to only tasks spawned by this channel (thus as their "parent") as before. - instead of `if tasks:` to enter, invert and `return` early on `if not tasks`, for less indentation B) - add WIP str-repr format (for `.cancel()` emissions) to show a multi-address (maddr) + task func (via the new `Context._nsf`) and report all cancel task targets with it as a "tree"; include #TODO to finalize and implement some utils for all this!
To match, ensure we adjust `process_messages()` self/`Actor` cancel handling blocks to provide the new `kwargs` (now with `dict`-merge syntax) to `._invoke()`.	2025-03-14 22:18:29 -04:00
Tyler Goodlet	9194e5774b	Fix overruns test to avoid return-beats-ctxc race Turns out that py3.11 might be so fast that iterating a EoC-ed `MsgStream` 1k times is faster than a `Context.cancel()` msg transmission from a parent actor to its child (which i guess makes sense). So tweak the test to delay 5ms between stream async-for iteration attempts when the stream is detected to be `.closed: bool` (coming in patch) or `ctx.cancel_called == True`.	2025-03-14 22:16:39 -04:00
Tyler Goodlet	51a3f1bef4	Add `pformat()` of `ActorNursery._children` to logging Such that you see the children entries prior to exit instead of the prior somewhat detail/use-less logging. Also, rename all `anursery` vars to just `an` as is the convention in most examples.	2025-03-14 22:16:37 -04:00
Tyler Goodlet	ca1b8e0224	Set any `._eoc` to the err in `_raise_from_no_key_in_msg()` Since that's what we're now doing in `MsgStream._eoc` internal assignments (coming in future patch), do the same in this exception re-raise-helper and include more extensive doc string detailing all the msg-type-to-raised-error cases. Also expose a `hide_tb: bool` like we have already in `unpack_error()`.	2025-03-14 22:13:14 -04:00
Tyler Goodlet	e403d63eb7	Better logging for cancel requests in IPC msg loop As similarly improved in other parts of the runtime, adds much more pedantic (`.cancel()`) logging content to indicate the src of remote cancellation request particularly for `Actor.cancel()` and `._cancel_task()` cases prior to `._invoke()` task scheduling. Also add detailed case comments and much more info to the "request-to-cancel-already-terminated-RPC-task" log emission to include the `Channel` and `Context.cid` deats. This helped me find the src of a race condition causing a test to fail where a callee ctx task was returning a result before an expected `ctx.cancel()` request arrived B). Adding much more pedantic `.cancel()` msg contents around the requester's deats should ensure these cases are much easier to detect going forward! Also, simplify the `._invoke()` final result/error log msg to only put one of either the final error or returned result above the `Context` pprint.	2025-03-14 22:13:12 -04:00
Tyler Goodlet	3c385c6949	Use `NamespacePath` in `Context` mgmt internals The only case where we can't is in `Portal.run_from_ns()` usage (since we pass a path with `self:<Actor.meth>`) and because `.to_tuple()` internally uses `.load_ref()` which will of course fail on such a path.. So for now impl as, - mk `Actor.start_remote_task()` take a `nsf: NamespacePath` but also offer a `load_nsf: bool = False` such that by default we bypass ref loading (maybe this is fine for perf long run as well?) for the `Actor`/`'self:'` case mentioned above. - mk `.get_context()` take an instance `nsf` obvi. More logging msg format tweaks: - change msg-flow related content to show the `Context._nsf`, which, right, is coming in a follow up commit.. - bunch more `.runtime()` format updates to show `msg: dict` contents and internal primitives with trailing `'\n'` for easier reading. - report import loading `stackscope` in subactors.	2025-03-14 22:11:57 -04:00
Tyler Goodlet	b28df738fe	Drop extra " " when logging actor nursery errors	2025-03-14 21:49:15 -04:00
Tyler Goodlet	5fa040c7db	Add `NamespacePath._ns` todo for `self:<ns.meth>` support	2025-03-14 21:49:15 -04:00
Tyler Goodlet	27b750e907	Emit warning on any `ContextCancelled.canceller == None`	2025-03-14 21:49:15 -04:00
Tyler Goodlet	96150600fb	Make ctx tests support `debug_mode: bool` fixture Such that with `--tpdb` passed (sub)actors will engage the `pdbp` REPL automatically and so that we can use the new `stackscope` support when complex cases hang Bo Also, - simplified some type-annots (ns paths), - doc-ed an inter-peer test func with some ascii msg flows, - added a bottom #TODO for replicating the scenario i hit in `modden` where a separate client actor-tree was hanging on cancelling a `bigd` sub-workspace..	2025-03-14 21:49:15 -04:00
Tyler Goodlet	338ea5529c	.log: more multi-line styling	2025-03-14 16:41:08 -04:00
Tyler Goodlet	6bc67338cf	Better subproc supervisor logging, todo for #320 Given i just similarly revamped a buncha `._runtime` log msg formatting, might as well do something similar inside the spawning machinery such that grokking teardown sequences of each supervising task is much more sane XD Mostly this includes doing similar `'<field>: <value>\n'` multi-line formatting when reporting various subproc supervision steps as well as showing a detailed `trio.Process.__repr__()` as appropriate. Also adds a detailed #TODO according to the needs of #320 for which we're going to need some internal mechanism for intermediary parent actors to determine if a given debug tty locker (sub-actor) is one of their (transitive) children and thus stall the normal cancellation/teardown sequence until that locker is complete.	2025-03-14 16:41:06 -04:00
Tyler Goodlet	fd20004757	_supervise: iter nice expanded multi-line `._children` tups with typing	2025-03-14 16:34:17 -04:00
Tyler Goodlet	ddc2e5f0f8	WIP: solved the modden client hang..	2025-03-14 16:34:10 -04:00
Tyler Goodlet	4b0aa5e379	Baboso! fix `chan.send(None)` indent..	2025-03-14 15:49:37 -04:00
Tyler Goodlet	6a303358df	Improved log msg formatting in core As part of solving some final edge cases to do with inter-peer remote cancellation (particularly a remote cancel from a separate actor tree-client hanging on the request side in `modden`..) I needed less dense, more line-delimited log msg formats when understanding ipc channel and context cancels from console logging; this adds a ton of that to: - `._invoke()` which now does, - better formatting of `Context`-task info as multi-line `'<field>: <value>\n'` messages, - use of `trio.Task` (from `.lowlevel.current_task()`) for full rpc-func namespace-path info, - better "msg flow annotations" with `<=` for understanding `ContextCancelled` flow. - `Actor._stream_handler()` wherein we break down IPC peers reporting better as multi-line `|_<Channel>` log msgs instead of all jammed on one line.. - `._ipc.Channel.send()` use `pformat()` for repr of packet. Also tweak some optional deps imports for debug mode: - add `maybe_import_gb()` for attempting to import `greenback`. - maybe enable `stackscope` tree pprinter on `SIGUSR1` if installed. Add a further stale-debugger-lock guard before removal: - read the `._debug.Lock.global_actor_in_debug: tuple` uid and possibly `maybe_wait_for_debugger()` when the child-user is known to have a live process in our tree. - only cancel `Lock._root_local_task_cs_in_debug: CancelScope` when the disconnected channel maps to the `Lock.global_actor_in_debug`, though not sure this is correct yet? Started adding missing type annots in sections that were modified.	2025-03-14 15:49:36 -04:00
Tyler Goodlet	c85757aee1	Let `pack_error()` take a msg injected `cid: str|None`	2025-03-14 15:31:16 -04:00
Tyler Goodlet	9fc9b10b53	Add `StreamOverrun.sender: tuple` for better handling Since it's generally useful to know who is the cause of an overrun (say bc you want your system to then adjust the writer side to slow tf down) might as well pack an extra `.sender: tuple[str, str]` actor uid field which can be relayed through `RemoteActorError` boxing. Add an extra case for the exc-type to `unpack_error()` to match B)	2025-03-14 14:14:54 -04:00
Tyler Goodlet	a86275996c	Offer `unpack_error(hide_tb: bool)` for `pdbp` REPL config	2025-03-14 14:14:54 -04:00
Tyler Goodlet	b5431c0343	Never mask original `KeyError` in portal-error unwrapper, for now?	2025-03-14 14:14:54 -04:00
Tyler Goodlet	cdee6f9354	Try allowing multi-pops of `_Cache.locks` for now?	2025-03-14 14:14:53 -04:00
Tyler Goodlet	a2f1bcc23f	Use `import <blah> as blah` over `__all__` in `.trionics`	2025-03-14 14:14:53 -04:00
Tyler Goodlet	4aa89bf391	Bump timeout on resource cache test a bitty bit.	2025-03-14 14:14:53 -04:00
Tyler Goodlet	45e9cb4d09	`_root`: drop unused `typing` import	2025-03-14 14:14:53 -04:00
Tyler Goodlet	27c5ffe5a7	Move missing-key-in-msg raiser to `._exceptions` Since we use basically the exact same set of logic in `Portal.open_context()` when expecting the first `'started'` msg, factor and generalize `._streaming._raise_from_no_yield_msg()` into a new `._exceptions._raise_from_no_key_in_msg()` (as per the lingering todo) which obvi requires a more generalized / optional signature including a caller specific `log` obj. Obvi call the new func from all the other modules X)	2025-03-14 14:14:50 -04:00
Tyler Goodlet	914efd80eb	Fmt repr as multi-line style call	2025-03-14 14:14:11 -04:00
Tyler Goodlet	2d2d1ca1c4	Drop unused walrus assign of `re`	2025-03-14 14:14:11 -04:00
Tyler Goodlet	74aa5aa9cd	`StackLevelAdapter._log(stacklevel: int)` for custom levels.. Apparently (and i don't know if this was always broken [i feel like no?] or is a recent change to stdlib's `logging` stuff) we need to increment the `stacklevel` input by one for our custom level methods now? Without this you're going to see the path to the method's-callstack-frame on every emission instead of to the caller's. I first noticed this when debugging the workspace layer spawning in `modden.bigd` and then verified it in other dependent projects.. I guess we should add some tests for this as well XD	2025-03-14 14:14:11 -04:00
Tyler Goodlet	44e386dd99	._child: remove some unused imports..	2025-03-14 13:56:25 -04:00
Tyler Goodlet	13fbcc723f	Guarding for IPC failures in `._runtime._invoke()` Took me longer than i wanted to figure out the source of a failed-response to a remote-cancellation (in this case in `modden` where a client was cancelling a workspace layer.. but disconnects before receiving the ack msg) that was triggering an IPC error when sending the error msg for the cancellation of an `Actor._cancel_task()`, but since this (non-rpc) `._invoke()` task was trying to send to a now disconnected canceller it was resulting in a `BrokenPipeError` (or similar) error. Now, we except for such IPC errors and only raise them when, 1. the transport `Channel` is for sure up (bc ow what's the point of trying to send an error on the thing that caused it..) 2. it's definitely for handling an RPC task Similarly if the entire main invoke `try:` excepts, - we only hide the call-stack frame from the debugger (with `__tracebackhide__: bool`) if it's an RPC task that has a connected channel since we always want to see the frame when debugging internal task or IPC failures. - we don't bother trying to send errors to the context caller (actor) when it's a non-RPC request since failures on actor-runtime-internal tasks shouldn't really ever be reported remotely, only maybe raised locally. Also some other tidying, - this properly corrects for the self-cancel case where an RPC context is cancelled due to a local (runtime) task calling a method like `Actor.cancel_soon()`. We now set our own `.uid` as the `ContextCancelled.canceller` value so that other-end tasks know that the cancellation was due to a self-cancellation by the actor itself. We still need to properly test for this though! - add a more detailed module doc-str. - more explicit imports for `trio` core types throughout.	2025-03-14 13:56:23 -04:00
Tyler Goodlet	315f0fc7eb	More thorough hard kill doc strings	2025-03-14 13:48:35 -04:00
Tyler Goodlet	fea111e882	Tons of interpeer test cleanup Drop all the nested `@acm` blocks and defunct comments from initial validations. Add some todos for cases that are still unclear such as whether the caller / streamer should have `.cancelled_caught == True` in its teardown.	2025-03-14 13:44:09 -04:00
Tyler Goodlet	a1bf4db1e3	Get inter-peer suite passing with all `Context` state checks! Definitely needs some cleaning and refinement but this gets us to stage 1 of being pretty frickin correct i'd say 💃	2025-03-14 13:44:09 -04:00
Tyler Goodlet	bac9523ecf	Adjust test details where `Context.cancel()` is called We can now make asserts on `.cancelled_caught` and `_remote_error` vs. `_local_error`. Expect a runtime error when `Context.open_stream()` is called AFTER `.cancel()` and the remote `ContextCancelled` hasn't arrived (yet). Adjust to `'itself'` string in self-cancel case.	2025-03-14 13:44:09 -04:00
Tyler Goodlet	abe31e9e2c	Fix `Context.result()` call to be in runtime scope	2025-03-14 13:44:09 -04:00
Tyler Goodlet	0222180c11	Tweak `Channel._cancel_called` comment	2025-03-14 13:44:09 -04:00
Tyler Goodlet	7d5fda4485	Be ultra-correct in `Portal.open_context()` This took way too long to get right but hopefully will give us grok-able and correct context exit semantics going forward B) The main fixes were: - always shielding the `MsgStream.aclose()` call on teardown to avoid bubbling a `Cancelled`. - properly absorbing any `ContextCancelled` in cases due to "self cancellation" using the new `Context.canceller` in the logic. - capturing any error raised by the `Context.result()` call in the "normal exit, result received" case and setting it as the `Context._local_error` so that self-cancels can be easily measured via `Context.cancelled_caught` in same way as remote-error caused cancellations. - extremely detailed comments around all of the cancellation-error cases to avoid ever getting confused about the control flow in the future XD	2025-03-14 13:44:08 -04:00
Tyler Goodlet	f5fcd8ca2e	Be mega-pedantic with `ContextCancelled` semantics As part of extremely detailed inter-peer-actor testing, add much more granular `Context` cancellation state tracking via the following (new) fields: - `.canceller: tuple[str, str]` the uid of the actor responsible for the cancellation condition - always set by `Context._maybe_cancel_and_set_remote_error()` and replaces `._cancelled_remote` and `.cancel_called_remote`. If set, this value should normally always match a value from some `ContextCancelled` raised or caught by one side of the context. - `._local_error` which is always set to the locally raised (and caller or callee task's scope-internal) error which caused any eventual cancellation/error condition and thus any closure of the context's per-task-side-`trio.Nursery`. - `.cancelled_caught: bool` is now always `True` whenever the local task catches (or "silently absorbs") a `ContextCancelled` (a `ctxc`) that indeed originated from one of the context's linked tasks or any other context which raised its own `ctxc` in the current `.open_context()` scope. => whenever there is a case that no `ContextCancelled` was raised in the `.open_context().__aexit__()` (eg. `ctx.result()` called after a call to `ctx.cancel()`), we still consider the context as having "caught a cancellation" since the `ctxc` was indeed silently handled by the cancel requester; all other error cases are already represented by mirroring the state of the `._scope: trio.CancelScope` => IOW there should be no case where an error is not raised in the context's scope and `.cancelled_caught: bool == False`, i.e. no case where `._scope.cancelled_caught == False and ._local_error is not None`! - always raise any `ctxc` from `.open_stream()` if `._cancel_called == True` - if the cancellation request has not already resulted in a `._remote_error: ContextCancelled` we raise a `RuntimeError` to indicate improper usage to the guilty side's task code.
- make `._maybe_raise_remote_err()` a sync func and don't raise any `ctxc` which is matched against a `.canceller` determined to be the current actor, aka a "self cancel", and always set the `._local_error` to any such `ctxc`. - `.side: str` taken from inside `.cancel()` and unused as of now since it might be better re-written as a similar `.is_opener() -> bool`? - drop unused `._started_received: bool`.. - TONS and TONS of detailed comments/docs to attempt to explain all the possible cancellation/exit cases and how they should exhibit as either silent closes or raises from the `Context` API! Adjust the `._runtime._invoke()` code to match: - use `ctx._maybe_raise_remote_err()` in `._invoke()`. - adjust to new `.canceller` property. - more type hints. - better `log.cancel()` msging around self-cancels vs. peer-cancels. - always set the `._local_error: BaseException` for the "callee" task just like `Portal.open_context()` now will do B) Prior we were raising any `Context._remote_error` directly and doing (more or less) the same `ContextCancelled` "absorbing" logic (well kinda) in block; instead delegate to the method	2025-03-14 13:42:55 -04:00
Tyler Goodlet	04217f319a	Raise a `MessagingError` from the src error on msging edge cases	2025-03-14 13:42:15 -04:00
Tyler Goodlet	8cb8390201	Move `MessagingError` into `._exceptions` set	2025-03-14 13:42:15 -04:00
Tyler Goodlet	5035617adf	Dump `.msgdata` in `RemoteActorError.__repr__()`	2025-03-14 13:42:15 -04:00
Tyler Goodlet	715348c5c2	Port all tests to new `reg_addr` fixture name	2025-03-14 13:42:15 -04:00
Tyler Goodlet	fdf0c43bfa	Type out the full-fledged streaming ex.	2025-03-14 13:40:19 -04:00
Tyler Goodlet	f895c96600	Add masked super timeout line to `do_hard_kill()` for would-be runtime hackers	2025-03-14 13:40:19 -04:00
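
The `stacklevel` problem described in commit 74aa5aa9cd can be reproduced with the stdlib `logging` module alone. The sketch below is not tractor's `StackLevelAdapter` code: the `RUNTIME` level number, the logger name, the `ListHandler` class and the helpers `runtime_naive()` / `runtime_fixed()` are all hypothetical names chosen for illustration. It assumes Python 3.8+ (where `stacklevel` exists) and shows that a custom-level wrapper which calls `Logger.log()` without bumping `stacklevel` is itself recorded as the emitting function, while passing `stacklevel=2` records the wrapper's caller instead.

```python
import logging

# Hypothetical custom level, for illustration only.
RUNTIME = 15
logging.addLevelName(RUNTIME, "RUNTIME")


class ListHandler(logging.Handler):
    """Collect emitted records so the caller info can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


log = logging.getLogger("stacklevel_demo")
log.setLevel(RUNTIME)
log.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
log.addHandler(handler)


def runtime_naive(msg):
    # Default stacklevel=1: the record points at this wrapper function.
    log.log(RUNTIME, msg)


def runtime_fixed(msg):
    # stacklevel=2 skips one extra frame, so the record points at
    # whoever called this wrapper.
    log.log(RUNTIME, msg, stacklevel=2)


def caller():
    runtime_naive("from the naive wrapper")
    runtime_fixed("from the fixed wrapper")


caller()
names = [record.funcName for record in handler.records]
print(names)
```

The first entry in `names` is the wrapper's own name and the second is the calling function, which is the difference the commit message describes between seeing "the method's-callstack-frame" and the caller's.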

1704 Commits (uv_migration_pre_msgspec_in_runtime)
