tractor

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	fa7e37d6ed	(Event) more pedantic `.cancel_acked: bool` def Changes the condition logic to be more strict and moves it to a private `._is_self_cancelled() -> bool` predicate which can be used elsewhere (instead of having almost similar duplicate checks all over the place..) and allows taking in a specific `remote_error` just for verification purposes (like for tests). Main strictness distinctions are now: - obvi that `.cancel_called` is set (this filters any `Portal.cancel_actor()` or other out-of-band RPC), - the received `ContextCancelled` must have its `.canceller` set to this side's `Actor.uid` (indicating we are the requester). - `.src_actor_uid` must be the same as the `.chan.uid` (so the error must have originated from the opposite side's task). - `ContextCancelled.canceller` should be already set to the `.chan.uid` indicating we received the msg via the runtime calling `._deliver_msg()` -> `_maybe_cancel_and_set_remote_error()` which ensures the error is specifically destined for this ctx-task exactly the same as how `Actor._cancel_task()` sets it from an input `requesting_uid` arg. In support of the above adjust some impl deats: - add `Context._actor: Actor` which is set once in `mk_context()` to avoid issues (particularly in testing) where `current_actor()` raises after the root actor / runtime is already exited. Use `._actor.uid` in both `.cancel_acked` (obvi) and `_maybe_cancel_and_set_remote_error()` when deciding whether to call `._scope.cancel()`. - always cast `.canceller` to `tuple` if not null. - delegate `.cancel_acked` directly to new private predicate (obvi). - always set `._canceller` from any `RemoteActorError.src_actor_uid` or failing over to the `.chan.uid` when a non-remote error (tho that shouldn't ever happen right?). - more extensive doc-string for `.cancel()` detailing the new strictness rules about whether an eventual `.cancel_acked` might be set. Also tossed in even more logging format tweaks by adding a `type_only: bool` to `.repr_outcome()` as desired for simpler output in the `state: <outcome-repr-here>` and `.repr_rpc()` sections of the `.__str__()`.	2024-03-07 20:35:43 -05:00
Tyler Goodlet	364ea91983	Set `._cancel_msg` to RPC `{cmd: 'self._cancel_task', ..}` msg Like how we set `Context._cancel_msg` in `._deliver_msg()` (in which case normally it's an `{'error': ..}` msg), do the same when any RPC task is remotely cancelled via `Actor._cancel_task` where that task doesn't yet have a cancel msg set. This makes it much easier to distinguish between ctx cancellations due to some remote error vs. explicit remote requests via any of `Actor.cancel()`, `Portal.cancel_actor()` or `Context.cancel()`.	2024-03-07 18:24:00 -05:00
Tyler Goodlet	7ae9b5319b	Tweak inter-peer `._scope` state asserts We don't expect `._scope.cancelled_caught` to be set really ever on inter-peer cancellation since no ctx is ever cancelling itself, a peer cancels some other and then bubbles back to all other peers. Also add `ids: lambda` for `error_during_ctxerr_handling` param to `test_peer_canceller()`	2024-03-06 16:09:38 -05:00
Tyler Goodlet	6156ff95f8	Add `shield: bool` support to `.pause()` It's been on the todo for a while and I've given up trying to properly hide the `trio.CancelScope.__exit__()` frame for now instead opting to just `log.pdb()` a big apology XD Users can obvi still just not use the flag and wrap `tractor.pause()` in their own cs block if they want to avoid having to hit `'up'` in the pdb REPL if needed in a cancelled task-scope. Impl deatz: - factor orig `.pause()` impl into new `._pause()` so that we can more tersely wrap the original content depending on `shield: bool` input; only open the cancel-scope when shield is set to avoid aforementioned extra stack frame annoyance. - pass through `shield` to underlying `_pause` and `debug_func()` so we can actually know when to log our apology. - add a buncha notes to new `.pause()` wrapper regarding the inability to hide the cancel-scope `.__exit__()`, including that overriding the code in `trio._core._run.CancelScope` doesn't seem to solve the issue either.. Unrelated `maybe_wait_for_debugger()` tweaks: - don't read `Lock.global_actor_in_debug` more than needed, rename local read var to `in_debug` (since it can also hold the root actor uid, not just sub-actors). - shield the `await debug_complete.wait()` since ideally we avoid the root cancelling child-actors in debug even when the root calls this func in a cancelled scope.	2024-03-06 14:37:54 -05:00
Tyler Goodlet	9e3f41a5b1	Tweak inter-peer tests for new/refined semantics Buncha subtle details changed mostly to do with when `Context.cancel()` gets called on "real" remote errors vs. (peer requested) cancellation and then local side handling of `ContextCancelled`. Specific changes to make tests pass: - due to raciness with `sleeper_ctx.result()` raising the ctxc locally vs. the child-peers receiving similar ctxcs themselves (and then erroring and propagating back to the root parent), we might not see `._remote_error` set during the sub-ctx loops (except for the sleeper itself obvi). - do not expect `.cancel_called`/`.cancel_caught` to be set on any sub-ctx since currently `Context.cancel()` is only called non-shielded and thus is not invoked when `._scope.cancel()` is called as part of each root-side ctx ref/block handling the inter-peer ctxc. - do not expect `Context._scope.cancelled_caught` to be set in most cases (even the sleeper) TODO Outstanding adjustments not fixed yet: -[ ] `_scope.cancelled_caught` checks outside the `.open_context()` blocks.	2024-03-06 10:13:41 -05:00
Tyler Goodlet	7c22f76274	Yahh, add `.devx` package to installed subpkgs..	2024-03-06 09:55:05 -05:00
Tyler Goodlet	04c99c2749	Woops, add `.msg` sub-pkg to install set	2024-03-06 09:48:46 -05:00
Tyler Goodlet	e536057fea	`._entry`: use same msg info in start/terminate log	2024-03-05 12:30:34 -05:00
Tyler Goodlet	c6b4da5788	Tweak `._portal` log content to use `Context.repr_outcome()`	2024-03-05 12:26:33 -05:00
Tyler Goodlet	1f7f84fdfa	Mk debugger tests work for arbitrary pre-REPL format Since this was changed as part of overall project wide logging format updates, and i ended up changing both the crash and pause `.pdb()` msgs to include some multi-line-ascii-"stuff", might as well make the pre-prompt checks in the test suite more flexible to match. As such, this exposes 2 new constants inside the `.devx._debug` mod: - `._pause_msg: str` for the pre `tractor.pause()` header emitted via `log.pdb()` and, - `._crash_msg: str` for the pre `._post_mortem()` equiv when handling errors in debug mode. Adjust the test suite to use these values and thus make us more capable to absorb changes in the future as well: - add a new `in_prompt_msg()` predicate, very similar to `assert_before()` but minus `assert`s which takes in a `parts: list[str]` to match in the pre-prompt stdout. - delegate to `in_prompt_msg()` in `assert_before()` since it was mostly duplicate minus `assert`. - adjust all previous `<patt> in before` asserts to instead use `in_prompt_msg()` with separated pre-prompt-header vs. actor-name `parts`. - use new `._pause/crash_msg` values in all such calls including any `assert_before()` cases.	2024-03-05 12:22:04 -05:00
Tyler Goodlet	a5bdc6db66	Flip rpc tests over to use `ExceptionGroup` on new `trio`	2024-03-05 10:34:32 -05:00
Tyler Goodlet	9a18b57d38	Mega-refactor on `._invoke()` targeting `@context`s Since eventually we want to implement all other RPC "func types" as contexts underneath this starts the rework to move all the other cases into a separate func not only to simplify the main `._invoke()` body but also as a reminder of the intention to do it XD Details of re-factor: - add a new `._invoke_non_context()` which just moves all the old blocks for non-context handling to a single def. - factor what was basically just the `finally:` block handler (doing all the task bookkeeping) into a new `@acm`: `_errors_relayed_via_ipc()` with that content packed into the post-`yield` (also with a `hide_tb: bool` flag added of course). * include a `debug_kbis: bool` for when needed. - since the `@context` block is the only type left in the main `_invoke()` body, de-dent it so it's more grok-able B) Obviously this patch also includes a few improvements regarding context-cancellation-semantics (for the `context` RPC case) on the callee side in order to match previous changes to the `Context` api: - always setting any ctxc as the `Context._local_error`. - using the new convenience `.maybe_raise()` topically (for now). - avoiding any previous reliance on `Context.cancelled_caught` for anything public of meaning. Further included is more logging content updates: - being pedantic in `.cancel()` msgs about whether termination is caused by error or ctxc. - optional `._invoke()` traceback hiding via a `hide_tb: bool`. - simpler log headers throughout instead leveraging new `.__repr__()` on primitives. - buncha `<= <actor-uid>` sent some message emissions. - simplified handshake statuses reporting. Other subsys api changes we need to match: - change to `Channel.transport`. - avoiding any `local_nursery: ActorNursery` waiting when the `._implicit_runtime_started` is set. And yes, lotsa more comments for #TODOs dawg.. since there's always somethin!	2024-03-02 22:12:00 -05:00
Tyler Goodlet	ed10632d97	Avoid `ctx.cancel()` after ctxc rxed in `.open_context()` In the case where the callee side delivers us a ctxc with `.canceller` set we can presume that remote cancellation already has taken place and thus we don't need to do the normal call-`Context.cancel()`-on-error step. Further, in the case where we do call it also handle any `trio.ClosedResourceError` gracefully with a `.warning()`. Also, originally I had added a post-`yield`-maybe-raise to attempt handling any remote ctxc the same as for the local case (i.e. raised from `yield` line) wherein if we get a remote ctxc the same handler branch-path would trigger, thus avoiding different behaviour in that case. I ended up masking it out (but can't member why.. ) as it seems the normal `.result()` call and its internal handling gets the same behaviour? I've left in the heavily commented code in case it ends up being the better way to go; likely making the move to having a single code path in both cases is better even if it is just a matter of deciding whether to swallow the ctxc or not in the `.cancel_acked` case. Further teensie improvements: - obvi improve/simplify log msg contents as in prior patches. - use the new `maybe_wait_for_debugger(header_msg: str)` if/when waiting to exit in debug mode. - another `hide_tb: bool` frame hider flag. - rando type-annot updates of course :)	2024-03-02 17:18:55 -05:00
Tyler Goodlet	299429a278	Deep `Context` refinements Spanning from the pub API, to instance `repr()` customization (for logging/REPL content), to the impl details around the notion of a "final outcome" and surrounding IPC msg draining mechanics during teardown. A few API and field updates: - new `.cancel_acked: bool` to replace what we were mostly using `.cancelled_caught: bool` for but, for purposes of better mapping the semantics of remote cancellation of parallel executing tasks; it's set only when `.cancel_called` is set and a ctxc arrives with a `.canceller` field set to the current actor uid indicating we requested and received acknowledgement from the other side's task that is cancelled gracefully. - strongly document and delegate (and prolly eventually remove as a pub attr) the `.cancelled_caught` property entirely to the underlying `._scope: trio.CancelScope`; the `trio` semantics don't really map well to the "parallel with IPC msging" case in the sense that for us it breaks the concept of the ctx/scope closure having "caught" something instead of having "received" a msg that the other side has "acknowledged" (i.e. which for us is the completion of cancellation). - new `.__repr__()`/`.__str__()` format that tries to tersely yet comprehensively as possible display everything you need to know about the 3 main layers of an SC-linked-IPC-context: * ipc: the transport + runtime layers net-addressing and prot info. * rpc: the specific linked caller-callee task signature details including task and msg-stream instances. * state: current execution and final outcome state of the task pair. * a teensie extra `.repr_rpc` for a condensed rpc signature. - new `.dst_maddr` to get a `libp2p` style "multi-address" (though right now it's just showing the transport layers so maybe we should move it to our `Channel`?) - new public instance-var fields supporting more granular remote cancellation/result/error state: * `.maybe_error: Exception|None` for any final (remote) error/ctxc which computes logic on the values of `._remote_error`/`._local_error` to determine the "final error" (if any) on termination. * `.outcome` to the final error or result (or `None` if un-terminated) * `.repr_outcome()` for a console/logging friendly version of the final result or error as needed for the `.__str__()`. - new private interface bits to support all of ^: * a new "no result yet" sentinel value, `Unresolved`, using a module level class singleton that `._result` is set to (instead of `id(self)`) to both determine if and present when no final result from the callee has-yet-been/was delivered (ever). => really we should get rid of `.result()` and change it to `.wait_for_result()` (or something) * `_final_result_is_set()` predicate to avoid waiting for an already delivered result. * `._maybe_raise()` proto-impl that we should use to replace all the `if re:` blocks it can XD * new `._stream: MsgStream|None` for when a stream is opened to aid with the state repr mentioned above. Tweaks to the termination drain loop `_drain_to_final_msg()`: - obviously (obvi) use all the changes above when determining whether or not a "final outcome" has arrived and thus breaking from the loop ;) * like the `.outcome` `.maybe_error` and `._final_ctx_is_set()` in the `while` pred expression. - drop the `_recv_chan.receive_nowait()` + guard logic since it seems with all the surrounding (and coming soon) changes to `Portal.open_context()` using all the new API stuff (mentioned in first bullet set above) we never hit the case of inf-block? Oh right and obviously a ton of (hopefully improved) logging msg content changes, commented code removal and detailed comment-docs strewn about!	2024-03-01 22:37:32 -05:00
Tyler Goodlet	28fefe4ffe	Make stream draining status logs `.debug()` level	2024-03-01 19:27:10 -05:00
Tyler Goodlet	08a6a51cb8	Add `._implicit_runtime_started` mark, better logs After some deep logging improvements to many parts of `._runtime`, I realized a silly detail where we are always waiting on any opened `local_nursery: ActorNursery` to signal exit from `Actor._stream_handler()` even in the case of being an implicitly opened root actor (`open_root_actor()` wasn't called by user/app code) via `._supervise.open_nursery()`.. So, to address this add a `ActorNursery._implicit_runtime_started: bool` that can be set and then checked to avoid doing the unnecessary `.exited.wait()` (and any subsequent warn logging on an exit timeout) in that special but most common case XD Matching with other subsys log format refinements, improve readability and simplicity of the actor-nursery supervisory log msgs, including: - simplify and/or remove any content that more or less duplicates msg content found in emissions from lower-level primitives and sub-systems (like `._runtime`, `_context`, `_portal` etc.). - add a specific `._open_and_supervise_one_cancels_all_nursery()` handler block for `ContextCancelled` to log with `.cancel()` level noting that the case is a "remote cancellation". - put the nursery-exit and actor-tree shutdown status into a single msg in the `implicit_runtime` case.	2024-03-01 15:44:01 -05:00
Tyler Goodlet	50465d4b34	Spawn naming and log format tweaks - rename `.soft_wait()` -> `.soft_kill()` - rename `.do_hard_kill()` -> `.hard_kill()` - adjust any `trio.Process.__repr__()` log msg contents to have the little tree branch prefix: `'|_'`	2024-03-01 11:37:23 -05:00
Tyler Goodlet	4f69af872c	Add field-first subproc `.info()` to `._entry`	2024-02-29 20:01:39 -05:00
Tyler Goodlet	9bc6a61c93	Add "fancier" remote-error `.__repr__()`-ing Our remote error box types `RemoteActorError`, `ContextCancelled` and `StreamOverrun` needed a console display makeover particularly for logging content and `repr()` in higher level primitives like `Context`. This adds a more "dramatic" str-representation to showcase the underlying boxed traceback content more sensationally (via ascii-art emphasis) as well as support a more terse `.reprol()` (representation for one-line) format that can be used for types that track remote errors/cancels like with `Context._remote_error`. Impl deats: - change `RemoteActorError.__repr__()` formatting to show (sub-type specific) `.msgdata` fields in a multi-line format (similar to our new `.msg.types.Struct` style) followed by some ascii accented delimiter lines to emphasize any `.msgdata["tb_str"]` packed by the remote - for rme and subtypes allow picking the specifically relevant fields via a type defined `.reprol_fields: list[str]` and pick for each subtype: |_ `RemoteActorError.src_actor_uid` |_ `ContextCancelled.canceller` |_ `StreamOverrun.sender` - add `.reprol()` to show a `repr()`-on-one-line formatted string that can be used by other multi-line-field-`repr()` styled composite types as needed in (high level) logging info. - toss in some mod level `_body_fields: list[str]` for summary of such fields (if needed). - add some new rae (remote-actor-error) props: - `.type` around a newly named `.boxed_type` - `.type_str: str` - `.tb_str: str`	2024-02-29 18:56:31 -05:00
Tyler Goodlet	23aa97692e	Fix `Channel.__repr__()` safety, renames to `._transport` Hit a really weird bug in the `._runtime` IPC msg handling loop where it seems that by `str.format()`-ing a `Channel` before initializing it would put the `._MsgTransport._agen()` in an already started state causing an irrecoverable core startup failure.. I presume it's something to do with delegating to the `MsgpackTCPStream.__repr__()` and, something something.. the `.set_msg_transport(stream)` getting called too early such that `.msgstream.__init__()` is called thus init-ing the `._agen()` before necessary? I'm sure there's a design lesson to be learned in here somewhere XD This was discovered while trying to add more "fancy" logging throughout said core for the purposes of cobbling together an init attempt at libp2p style multi-address representations for our IPC primitives. Thus I also tinker here with adding some new fields to `MsgpackTCPStream`: - `layer_key`: int = 4 - `name_key`: str = 'tcp' - `codec_key`: str = 'msgpack' Anyway, just changed it so that if `.msgstream` ain't set then we just return a little "null repr" `str` value thinger. Also renames `Channel.msgstream` internally to `._transport` with appropriate pub `@property`s added such that everything else won't break ;p Also drops `Optional` typing vis-a-vis modern union syntax B)	2024-02-29 18:37:04 -05:00
Tyler Goodlet	1e5810e56c	Make `NamespacePath` kinda support methods.. Obviously we can't deterministically call `.load_ref()` (since you'd have to point to an `id()` or something and presume a particular py-runtime + virt-mem space for it to exist?) but it at least helps with the `str` formatting for logging purposes (like `._cancel_rpc_tasks()`) when `repr`-ing ctxs and their specific "rpc signatures". Maybe in the future getting this working at least for singleton types per process (like `Actor` XD ) will be a thing we can support and make some sense of.. Bo	2024-02-29 17:37:02 -05:00
Tyler Goodlet	b54cb6682c	Add #TODO for generating func-sig type-annots as `str` for pprinting	2024-02-29 17:21:43 -05:00
Tyler Goodlet	3ed309f019	Add test for `modden` sub-spawner-server hangs on cancel As per a lot of the recent refinements to `Context` cancellation, add a new test case to replicate the original hang-on-cancel found with `modden` when using a client actor to spawn a subactor in some other tree where despite `Context.cancel()` being called the requesting client would hang on the opened context with the server. The specific scenario added here is to have, - root actor spawns 2 children: a client and a spawn server. - the spawn server opens with a spawn-request serve loop and begins to wait for the client. - client spawns and connects to the sibling spawn server, requests to spawn a sub-actor, the "little bro", connects to it then does some echo streaming, cancels the request with its sibling (the spawn server) which should in turn cancel the root's-grandchild and result in a cancel-ack back to the client's `.open_context()`. - root ensures that it can also connect to the grandchild (little bro), do the same echo streaming, then ensure everything tears down correctly after cancelling all the children. More refinements to come here obvi in the specific cancellation semantics and possibly causes. Also tweaks the other tests in suite to use the new `Context` properties recently introduced and similarly updated in the previous patch to the ctx-semantics suite.	2024-02-29 15:45:55 -05:00
Tyler Goodlet	d08aeaeafe	Make `@context`-cancelled tests more pedantic In order to match a very significant and coming-soon patch set to the IPC `Context` and `Channel` cancellation semantics with significant but subtle changes to the primitives and runtime logic: - a new set of `Context` state pub meth APIs for checking exact inter-actor-linked-task outcomes such as `.outcome`, `.maybe_error`, and `.cancel_acked`. - trying to move away from `Context.cancelled_caught` usage since the semantics from `trio` don't really map well (in terms of cancel requests and how they result in cancel-scope graceful closure) and `.cancel_acked: bool` is a better approach for IPC req-resp msging. - change test usage to access `._scope.cancelled_caught` directly. - more pedantic ctxc-raising expects around the "type of self cancellation" and final outcome in ctxc cases: - `ContextCancelled` is raised by ctx (`Context.result()`) consumer methods when `Portal.cancel_actor()` is called (since it's an out-of-band request) despite `Channel._cancel_called` being set. - also raised by `.open_context().__aexit__()` on close. - `.outcome` is always `.maybe_error` is always one of `._local/remote_error`.	2024-02-28 19:25:27 -05:00
Tyler Goodlet	c6ee4e5dc1	Add a `pytest.ini` config	2024-02-22 20:37:12 -05:00
Tyler Goodlet	ad5eee5666	WIP final impl of ctx-cancellation-semantics	2024-02-22 18:33:18 -05:00
Tyler Goodlet	fc72d75061	Support `maybe_wait_for_debugger(header_msg: str)` Allow callers to stick in a header to the `.pdb()` level emitted msg(s) such that any "waiting status" content is only shown if the caller actually gets blocked waiting for the debug lock; use it inside the `._spawn` sub-process reaper call. Also, return early if `Lock.global_actor_in_debug == None` and thus only enter the poll loop when actually needed, consequently raise if we fall through the loop without acquisition.	2024-02-22 15:08:10 -05:00
Tyler Goodlet	de1843dc84	Few more log msg tweaks in runtime	2024-02-22 15:06:39 -05:00
Tyler Goodlet	930d498841	Call `actor.cancel(None)` from root to avoid mismatch with (any future) meth sig changes	2024-02-22 14:45:08 -05:00
Tyler Goodlet	5ea112699d	Tweak broadcast fanout test to never inf loop Since a bug in the new `MsgStream.aclose()` impl's drain block logic was triggering an actual inf loop (by not ever cancelling the streamer child actor), make sure we put a loop limit on the `inf_streamer()` XD Also add a bit more deats to the test `print()`s in each actor and toss in `debug_mode` fixture support.	2024-02-22 14:41:28 -05:00
Tyler Goodlet	e244747bc3	Add note that maybe `Context._eoc` should be set by caller?	2024-02-22 14:22:45 -05:00
Tyler Goodlet	5a09ccf459	Tweak `Actor` cancel method signatures Besides improving a bunch more log msg contents similarly as before this changes the cancel method signatures slightly with different arg names: for `.cancel()`: - instead of `requesting_uid: str` take in a `req_chan: Channel` since we can always just read its `.uid: tuple` for logging and further we can then offer the `chan=None` case indicating a "self cancel" (since there's no "requesting channel"). - the semantics of "requesting" here better indicate that the IPC connection is an IPC peer and further (eventually) will allow permission checking against given peers for cancellation requests. - when `chan==None` we also define a meth-internal `requester_type: str` differently for logging content :) - add much more detailed `.cancel()` content around the requester, its type, and any debugger related locking steps. for `._cancel_task()`: - change the `chan` arg to `parent_chan: Channel` since "parent" correctly indicates that the channel is the parent of the locally spawned rpc task to cancel; in fact no other chan should be able to cancel tasks parented/spawned by other channels obvi! - also add more extensive meth-internal `.cancel()` logging with a #TODO around showing only the "relevant/latest" `Context` state vars in such logging content. for `.cancel_rpc_tasks()`: - shorten `requesting_uid` -> `req_uid`. - add `parent_chan: Channel` to be similar as above in `._cancel_task()` (since it's internally delegated to anyway) which replaces the prior `only_chan` and use it to filter to only tasks spawned by this channel (thus as their "parent") as before. - instead of `if tasks:` to enter, invert and `return` early on `if not tasks`, for less indentation B) - add WIP str-repr format (for `.cancel()` emissions) to show a multi-address (maddr) + task func (via the new `Context._nsf`) and report all cancel task targets with it as a "tree"; include #TODO to finalize and implement some utils for all this! To match ensure we adjust `process_messages()` self/`Actor` cancel handling blocks to provide the new `kwargs` (now with `dict`-merge syntax) to `._invoke()`.	2024-02-22 14:22:08 -05:00
Tyler Goodlet	ce1bcf6d36	Fix overruns test to avoid return-beats-ctxc race Turns out that py3.11 might be so fast that iterating a EoC-ed `MsgStream` 1k times is faster than a `Context.cancel()` msg transmission from a parent actor to its child (which i guess makes sense). So tweak the test to delay 5ms between stream async-for iteration attempts when the stream is detected to be `.closed: bool` (coming in patch) or `ctx.cancel_called == True`.	2024-02-21 13:53:25 -05:00
Tyler Goodlet	28ba5e5435	Add `pformat()` of `ActorNursery._children` to logging Such that you see the children entries prior to exit instead of the prior somewhat detail/use-less logging. Also, rename all `anursery` vars to just `an` as is the convention in most examples.	2024-02-21 13:21:28 -05:00
Tyler Goodlet	10adf34be5	Set any `._eoc` to the err in `_raise_from_no_key_in_msg()` Since that's what we're now doing in `MsgStream._eoc` internal assignments (coming in future patch), do the same in this exception re-raise-helper and include more extensive doc string detailing all the msg-type-to-raised-error cases. Also expose a `hide_tb: bool` like we have already in `unpack_error()`.	2024-02-21 13:17:37 -05:00
Tyler Goodlet	82dcaff8db	Better logging for cancel requests in IPC msg loop As similarly improved in other parts of the runtime, adds much more pedantic (`.cancel()`) logging content to indicate the src of remote cancellation request particularly for `Actor.cancel()` and `._cancel_task()` cases prior to `._invoke()` task scheduling. Also add detailed case comments and much more info to the "request-to-cancel-already-terminated-RPC-task" log emission to include the `Channel` and `Context.cid` deats. This helped me find the src of a race condition causing a test to fail where a callee ctx task was returning a result before an expected `ctx.cancel()` request arrived B). Adding much more pedantic `.cancel()` msg contents around the requester's deats should ensure these cases are much easier to detect going forward! Also, simplify the `._invoke()` final result/error log msg to only put one of either the final error or returned result above the `Context` pprint.	2024-02-21 13:05:22 -05:00
Tyler Goodlet	621b252b0c	Use `NamespacePath` in `Context` mgmt internals The only case where we can't is in `Portal.run_from_ns()` usage (since we pass a path with `self:<Actor.meth>`) and because `.to_tuple()` internally uses `.load_ref()` which will of course fail on such a path.. So for now impl as, - mk `Actor.start_remote_task()` take a `nsf: NamespacePath` but also offer a `load_nsf: bool = False` such that by default we bypass ref loading (maybe this is fine for perf long run as well?) for the `Actor`/`'self:'` case mentioned above. - mk `.get_context()` take an instance `nsf` obvi. More logging msg format tweaks: - change msg-flow related content to show the `Context._nsf`, which, right, is coming in a follow up commit.. - bunch more `.runtime()` format updates to show `msg: dict` contents and internal primitives with trailing `'\n'` for easier reading. - report import loading `stackscope` in subactors.	2024-02-20 16:15:48 -05:00
Tyler Goodlet	20a089c331	Drop extra " " when logging actor nursery errors	2024-02-20 15:58:11 -05:00
Tyler Goodlet	df50d78042	Fix `.devx.maybe_wait_for_debugger()` polling deats When entered by the root actor avoid excessive polling cycles by, - blocking on the `Lock.no_remote_has_tty: trio.Event` and breaking immediately when set (though we should really also lock it from the root right?) to avoid extra loops.. - shielding the `await trio.sleep(poll_delay)` call to avoid any local cancellation causing the (presumably root-actor task) caller to move on (possibly to cancel its children) and instead to continue poll-blocking until the lock is actually released by its user. - `break` the poll loop immediately if no remote locker is detected. - use `.pdb()` level for reporting lock state changes. Also add a #TODO to handle calls by non-root actors as it pertains to	2024-02-20 15:57:31 -05:00
Tyler Goodlet	114ec36436	Add `stackscope` as dep, drop legacy `pdb` issue cruft	2024-02-20 15:29:31 -05:00
Tyler Goodlet	179d7d2b04	Add `NamespacePath._ns` todo for `self:<ns.meth>` support	2024-02-20 15:28:11 -05:00
Tyler Goodlet	f568fca98f	Emit warning on any `ContextCancelled.canceller == None`	2024-02-20 15:26:14 -05:00
Tyler Goodlet	6c9bc627d8	Make ctx tests support `debug_mode: bool` fixture Such that with `--tpdb` passed (sub)actors will engage the `pdbp` REPL automatically and so that we can use the new `stackscope` support when complex cases hang Bo Also, - simplified some type-annots (ns paths), - doc-ed an inter-peer test func with some ascii msg flows, - added a bottom #TODO for replicating the scenario i hit in `modden` where a separate client actor-tree was hanging on cancelling a `bigd` sub-workspace..	2024-02-20 15:14:58 -05:00
Tyler Goodlet	1d7cf7d1dd	Enable `stackscope` render via root in debug mode If `stackscope` is importable and debug_mode is enabled then we by default call and report `.devx.enable_stack_on_sig()` is set B) This makes debugging unexpected (SIGINT ignoring) hangs a cinch!	2024-02-20 13:23:16 -05:00
Tyler Goodlet	54a0a0000d	.log: more multi-line styling	2024-02-20 13:22:44 -05:00
Tyler Goodlet	0268b2ce91	Better subproc supervisor logging, todo for #320 Given i just similarly revamped a buncha `._runtime` log msg formatting, might as well do something similar inside the spawning machinery such that groking teardown sequences of each supervising task is much more sane XD Mostly this includes doing similar `'<field>: <value>\n'` multi-line formatting when reporting various subproc supervision steps as well as showing a detailed `trio.Process.__repr__()` as appropriate. Also adds a detailed #TODO according to the needs of #320 for which we're going to need some internal mechanism for intermediary parent actors to determine if a given debug tty locker (sub-actor) is one of their (transitive) children and thus stall the normal cancellation/teardown sequence until that locker is complete.	2024-02-20 13:12:51 -05:00
Tyler Goodlet	81f8e2d4ac	_supervise: iter nice expanded multi-line `._children` tups with typing	2024-02-20 09:18:22 -05:00
Tyler Goodlet	bf0739c194	Add `stackscope` tree pprinter triggered by SIGUSR1 Can be optionally enabled via a new `enable_stack_on_sig()` which will swap in the SIGUSR1 handler. Much thanks to @oremanj for writing this amazing project, it's thus far helped me fix some very subtle hangs inside our new IPC-context cancellation machinery that would have otherwise taken much more manual pdb-ing and hair pulling XD Full credit for `dump_task_tree()` goes to the original project author with some minor tweaks as was handed to me via the trio-general matrix room B) Slight changes from orig version: - use a `log.pdb()` emission to pprint to console - toss in an ex sh CLI cmd to trigger the dump from another terminal using `kill` + `pgrep`.	2024-02-20 09:05:34 -05:00
Tyler Goodlet	5fe3f58ea9	Add a `debug_mode: bool` fixture via `--tpdb` flag Allows tests (including any `@tractor_test`s) to subscribe to a CLI flag `--tpdb` (for "tractor python debugger") which the session can provide to tests which can then proxy the value to `open_root_actor()` (via `open_nursery()`) when booting the runtime - thus enabling our debug mode globally to any subscribers B) This is real handy if you have some failures but can't determine the root issue without jumping into a `pdbp` REPL inside a (sub-)actor's spawned-task.	2024-02-20 08:53:37 -05:00
Tyler Goodlet	3e1d033708	WIP: solved the modden client hang..	2024-02-19 17:00:46 -05:00
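
As a rough illustration of the stricter `.cancel_acked` rules described in commit fa7e37d6ed above, the first three conditions can be modelled with plain dataclasses. Everything here (`CtxModel`, `CtxCancelledModel` and their field names) is a hypothetical stand-in written for this sketch; it is not tractor's actual `Context` or `ContextCancelled` implementation and it does not import tractor.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Uid = Tuple[str, str]  # assumed shape of an actor uid: (name, uuid)


@dataclass
class CtxCancelledModel:
    """Hypothetical stand-in for a received `ContextCancelled`."""
    canceller: Optional[Uid]  # who requested the cancellation
    src_actor_uid: Uid        # which actor the error originated from


@dataclass
class CtxModel:
    """Hypothetical, heavily simplified stand-in for a `Context`."""
    actor_uid: Uid            # this side's `Actor.uid`
    chan_uid: Uid             # the remote peer's `.chan.uid`
    cancel_called: bool = False
    remote_error: Optional[object] = None

    def is_self_cancelled(self, remote_error: Optional[object] = None) -> bool:
        # Allow passing a specific error purely for verification (e.g. tests).
        err = remote_error if remote_error is not None else self.remote_error
        # 1. we must have requested cancellation ourselves
        if not self.cancel_called:
            return False
        # 2. the remote error must be a ctxc naming us as the canceller
        if not isinstance(err, CtxCancelledModel) or err.canceller is None:
            return False
        if tuple(err.canceller) != self.actor_uid:
            return False
        # 3. the error must have originated from the opposite side's task
        return tuple(err.src_actor_uid) == self.chan_uid

    @property
    def cancel_acked(self) -> bool:
        # Delegate directly to the private-style predicate.
        return self.is_self_cancelled()


if __name__ == "__main__":
    me, peer = ("root", "uuid-1"), ("child", "uuid-2")
    ack = CtxCancelledModel(canceller=me, src_actor_uid=peer)
    ctx = CtxModel(actor_uid=me, chan_uid=peer, cancel_called=True, remote_error=ack)
    assert ctx.cancel_acked
    # An out-of-band cancel never sets `cancel_called`, so no ack is reported.
    assert not CtxModel(actor_uid=me, chan_uid=peer, remote_error=ack).cancel_acked
```

Keeping the checks in one predicate mirrors the commit's stated goal of avoiding near-duplicate condition checks spread across call sites.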


1787 Commits (c5a0cfc6399530a4b0aecb93c62db02eb9c0120e) All Branches Search
