tractor

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	c3f455a8ec	Mask tpt-closed handling of `chan.send(return_msg)` A partial revert of commit `c05d08e426` since it seems we already suppress tpt-closed errors lower down in `.ipc.Channel.send()`; given that i'm pretty sure this new handler code should basically never run? Left in a todo to remove the masked content once i'm done more thoroughly testing under `piker`.	2026-02-12 00:51:50 -05:00
Tyler Goodlet	f78e842fba	More `TransportClosed`-handling around IPC-IO For IPC-disconnects-during-teardown edge cases, augment some `._rpc` machinery, - in `._invoke()` around the `await chan.send(return_msg)` where we suppress if the underlying `Channel` already disconnected. - add a disjoint handler in `_errors_relayed_via_ipc()` which just reports-n-reraises the exc (same as prior behaviour). * originally i thought it needed to be handled specially (to avoid being crash handled) but turns out that isn't necessary? * hence the also-added-but-masked-out `debug_filter` / guard expression around the `await debug._maybe_enter_pm()` line. - show the `._invoke()` frame for the moment.	2026-02-12 00:51:50 -05:00
Gud Boi	2ed9e65530	Clear rtvs state on root shutdown.. Fixes the bug discovered in last test update, not sure how this wasn't caught already XD	2026-02-11 22:17:26 -05:00
Gud Boi	8aee24e83f	Fix when root-actor addrs is set as rtvs Move `_root_addrs` assignment to after `async_main()` unblocks (via `.started()`) which now delivers the bind addrs, ensuring correct `UnwrappedAddress` propagation into `._state._runtime_vars` for non-registrar root actors.. Previously for non-registrar root actors the `._state._runtime_vars` entries were being set as `Address` values which ofc IPC serialize incorrectly rn vs. the unwrapped versions, (well until we add a msgspec for their structs anyway) and thus are passed in incorrect form to children/subactors during spawning.. This fixes the issue by waiting for the `.ipc.*` stack to bind-and-resolve any randomly allocated addrs (by the OS) until after the initial `Actor` startup is complete. Deats, - primarily, mv `_root_addrs` assignment from before `root_tn.start()` to after, using started(-ed) `accept_addrs` now delivered from `._runtime.async_main()`.. - update `task_status` type hints to match. - unpack and set the `(accept_addrs, reg_addrs)` tuple from `root_tn.start()` call into `._state._runtime_vars` entries. - improve and embolden comments distinguishing registrar vs non-registrar init paths, ensure typing reflects wrapped vs. unwrapped addrs. Also, - add a masked `mk_pdb().set_trace()` for debugging `raddrs` values being "off". - add TODO about using UDS on linux for root mailbox - rename `trans_bind_addrs` -> `tpt_bind_addrs` for clarity. - expand comment about random port allocation for non-registrar case (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-11 22:17:26 -05:00
Gud Boi	3f0bde1bf8	Use bare `get_logger()` in `.to_asyncio`	2026-02-11 22:02:41 -05:00
Gud Boi	fa1a15dce8	Cleanups per copilot PR review	2026-02-11 21:51:40 -05:00
Gud Boi	ff02939213	Toss in some `colorlog` alts to try	2026-02-11 21:05:16 -05:00
Gud Boi	0b0c83e9da	Drop `name=__name__` from all `get_logger()` calls Use new implicit module-name detection throughout codebase to simplify logger creation and leverage auto-naming from caller mod. Main changes, - drop `name=__name__` arg from all `get_logger()` calls (across 29 modules). - update `get_console_log()` calls to include `name='tractor'` for enabling root logger in test harness and entry points; this ensures logic in `get_logger()` triggers so that all `tractor`-internal logging emits to console. - add info log msg in test `conftest.py` showing test-harness log level Also, - fix `.actor.uid` ref to `.actor.aid.uid` in `._trace`. - adjust a `._context` log msg formatting for clarity. - add TODO comments in `._addr`, `._uds` for when we mv to using `multiaddr`. - add a `RuntimeVars` type hint TODO in `.msg.types` (once we eventually get that all going obvi!) (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-11 21:04:49 -05:00
Gud Boi	5e7c0f264d	Rework `.get_logger()`, better autonaming, deduping Overhaul of the automatic-calling-module-name detection and sub-log creation logic to avoid (or at least warn on) duplication(s) and still handle the common usage of a call with `name=__name__` from a mod's top level scope. Main `get_logger()` changes, - refactor auto-naming logic for implicit `name=None` case such that we handle at least `tractor` internal "bare" calls from internal submods. - factor out the `get_caller_mod()` closure (still inside `get_logger()`) for introspecting caller's module with configurable frame depth. - use `.removeprefix()` instead of `.lstrip()` for stripping pkg-name from mod paths - mv root-logger creation before sub-logger name processing - improve duplicate detection for `pkg_name` in `name` - add `_strict_debug=True`-only-emitted warnings for duplicate pkg/leaf-mod names. - use `print()` fallback for warnings when no actor runtime is up at call time. Surrounding tweaks, - add `.level` property to `StackLevelAdapter` for getting current emit level as lowercase `str`. - mv `_proj_name` def to just above `get_logger()` - use `_curr_actor_no_exc` partial in `_conc_name_getters` to avoid runtime errors - improve comments/doc-strings throughout - keep some masked `breakpoint()` calls for future debugging (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-11 21:04:29 -05:00
Tyler Goodlet	de24bfe052	Mv `load_module_from_path()` to a new `._code_load` submod	2026-02-11 21:03:29 -05:00
Tyler Goodlet	dea4b9fd93	Implicitly name sub-logs by caller's mod That is when no `name` is passed to `get_logger()`, try to introspect the caller's `module.__name__` and use it to infer/get the "namespace path" to that module the same as if using `name=__name__` as in the most common usage. Further, change the `_root_name` to be `pkg_name: str`, a public and more obvious param name, and deprecate the former. This obviously adds the necessary impl to make the new `test_sys_log::test_implicit_mod_name_applied_for_child` test pass. Impl detalles for `get_logger()`, - add `pkg_name` and deprecate `_root_name`, include failover logic and a warning. - implement calling module introspection using `inspect.stack()/getmodule()` to get both the `.__name__` and `.__package__` info alongside adjusted logic to set the `name` when not provided but only when a new `mk_sublog: bool` is set. - tweak the `name` processing for implicitly set case, - rename `sub_name` -> `pkg_path: str` which is the path to the calling module minus that module's name component. - only partition `name` if `pkg_name` is `in` it. - use the `_root_log` for `pkg_name` duplication warnings. Other/related, - add types to various public mod vars missing them. - rename `.log.log` -> `.log._root_log`.	2026-02-11 21:03:07 -05:00
Tyler Goodlet	0e3229f16d	Start a logging-sys unit-test module To start ensuring that when `name=__name__` is passed we try to de-duplicate the `_root_name` and any `leaf_mod: str` since it's already included in the headers as `{filename}`. Deats, - heavily document the de-duplication `str.partition()`s in `.log.get_logger()` and provide the end fix by changing the predicate, `if rname == 'tractor':` -> `if rname == _root_name`. * also toss in some warnings for when we still detect duplicates. - add todo comments around logging "filters" (vs. our "adapter"). - create the new `test_log_sys.test_root_pkg_not_duplicated()` which runs green with the fixes from ^. - add a ton of test-suite todos both for existing and anticipated logging sys feats in the new mod.	2026-02-11 21:03:07 -05:00
Tyler Goodlet	0c6d512ba4	Solve another OoB cancellation case, the bg task one Such that we are able to (finally) detect when we should `Context._scope.cancel()` specifically when the `.parent_task` is not blocking on receiving from the underlying `._rx_chan`, since if the task is blocking on `.receive()` it will call `.cancel()` implicitly. This is a lot to explain with very little code actually needed for the implementation (are we like `trio` yet anyone?? XD) but the main gist is that `Context._maybe_cancel_and_set_remote_error()` needed the additional case of calling `._scope.cancel()` whenever we know that a remote-error/ctxc won't be immediately handled, bc user code is doing non `Context`-API things, and results in a similar outcome as if that task was waiting on `Context.wait_for_result()` or `.__aexit__()`. Impl details, - add a new `._is_blocked_on_rx_chan()` method which predicates whether the (new) `.parent_task` is blocking on `._rx_chan.receive()`. * see various stipulations about the current impl and how we might need to adjust for the future given `trio`'s commitment to the `Task.custom_sleep_data` attr.. - add `.parent_task`, a pub wrapper for `._task`. - check for `not ._is_blocked_on_rx_chan()` before manually cancelling the local `.parent_task` - minimize the surrounding branch case expressions. Other, - tweak a couple logs. - add a new `.cancel()` pre-started msg. - mask the `.cancel_called` setter, it's only (been) used for tracing. - todos around maybe moving the `._nursery` allocation "around" the `.start_remote_task()` call and various subsequent tweaks therein.	2025-09-11 13:12:52 -04:00
Tyler Goodlet	fc130d06b8	Check off REPL-ing todo add masked usage in `drain_to_final_msg()`	2025-09-11 10:13:04 -04:00
Tyler Goodlet	b1f2a6b394	Rename var for and hide the `_open_and_supervise_one_cancels_all_nursery` frame	2025-09-11 10:13:04 -04:00
Tyler Goodlet	62a364a1d3	Tweaks from copilot, type fix, typos, language.	2025-09-11 10:01:25 -04:00
Tyler Goodlet	9aebe7d8f9	Only read `_mask_cases` if truthy, allow disabling for xfails	2025-09-05 22:23:51 -04:00
Tyler Goodlet	e9f3689191	Add "ignore-case-handling" to exc unmasker Since it turns out there's even case(s) in `trio` core that are guilty (of implementing things like checkpoints in exc handlers), this adds a facility for ignoring explicit cases via `inspect.FrameInfo` field matching from the unmasked `exc_ctx` within `maybe_raise_from_masking_exc()`. Impl deats, - use `inspect.getinnerframes()/getmodule()` to extract the equivalent "guilty place in code" which raised the masked error which we'd like to ignore and not unmask. - start a `_mask_cases: dict` which describes the entries to ignore by matching against a specific `FrameInfo`'s fields indexed from `getinnerframes()`. - describe in that table the case i hit with `trio.WouldBlock` being always masked by a `Cancelled` due to the way `trio.Lock.acquire()` implements the blocking case in the would-block handler.. - always call into a new `is_expected_masking_case()` predicate (from `maybe_raise_from_masking_exc()`) on matching `exc_ctx` types.	2025-09-05 14:54:54 -04:00
Tyler Goodlet	93aa39db07	Always pop `._Cache.resources` AFTER `mng.__aexit__()` The correct ordering is to de-alloc the surrounding `service_n` + `trio.Event` after the `mng` teardown ensuring the `mng.__aexit__()` never can hit a ref-error if it touches either (like if a `tn` is passed to `maybe_open_context()`!)	2025-09-05 14:54:41 -04:00
Tyler Goodlet	5ab642bdf0	Drop more `typing.Optional` usage	2025-08-20 12:45:49 -04:00
Tyler Goodlet	ed18ecd064	Drop `tn` arg to `maybe_raise_from_masking_exc()` in `._rpc`	2025-08-20 12:45:49 -04:00
Tyler Goodlet	cec0282953	Add `never_warn_on: dict` support to unmasker Such that key->value pairs can be defined which should never be unmasked where values of - the keys are exc-types which might be masked, and - the values are exc-types which masked the equivalent key. For example, the default includes: - KBI->taskc: a kbi should never be unmasked from its masking `trio.Cancelled`. For the impl, a new `do_warn: bool` in the fn-body determines the primary guard for whether a warning or re-raising is necessary.	2025-08-20 12:45:49 -04:00
Tyler Goodlet	25c5847f2e	Drop `tn` input from `maybe_raise_from_masking_exc()` Including all caller usage throughout. Moving to a non-`except*` impl means it's never needed as a signal from the caller - we can just catch the beg outright (like we should have always been doing)..	2025-08-20 12:45:49 -04:00
Tyler Goodlet	ba793fadd9	Pass `tuple` from `._invoke()` unmasker usage To match the `maybe_raise_from_masking_exc()` sig change.	2025-08-20 12:45:49 -04:00
Tyler Goodlet	6c361a9564	Drop `except*` usage from `._taskc` unmasker That is from `maybe_raise_from_masking_exc()` thus minimizing us to a single `except BaseException` block with logic branching for the beg vs. `unmask_from` exc cases. Also, - raise val-err when `unmask_from` is not a `tuple`. - tweak the exc-note warning format. - drop all pausing from dev work.	2025-08-20 12:45:49 -04:00
Tyler Goodlet	548855b4f5	Comment/docs tweaks per copilot review Add a micro glossary to clarify questioned terms and refine out some patch specific comment regions.	2025-08-20 12:36:08 -04:00
Tyler Goodlet	5322861d6d	Clean out old-commented tn-opens and ipc-server settings checks	2025-08-20 11:35:31 -04:00
Tyler Goodlet	46a2fa7074	Always pass a `tn` to `._server._serve_ipc_eps()` Turns out we weren't despite the optional `stream_handler_nursery` input to `Server.listen_on()`; fail over to the `Server._stream_handler_tn` allocated during server setup in those cases.	2025-08-20 11:30:58 -04:00
Tyler Goodlet	bfe5b2dde6	Hide `collapse_eg()` frame as used from `open_root_actor()`	2025-08-20 10:44:42 -04:00
Tyler Goodlet	a9f06df3fb	Heh, add back `Actor._root_tn`, it has purpose.. Turns out I didn't read my own internals docs/comments and despite it not being used previously, this adds the real use case: a root, per-actor, scope which ensures parent comms are the last conc-thing to be cancelled. Also, the impl changes here make the test from 6410e45 (or wtv it's rebased to) pass, i.e. we can support crash handling in the root actor despite the root-tn having been (self) cancelled. Superficial adjustments, - rename `Actor._service_n` -> `._service_tn` everywhere. - add asserts to `._runtime.async_main()` which ensure that any `.trionics.maybe_open_nursery()` calls against optionally passed `._[root/service]_tn` are allocated-if-not-provided (the `._service_tn`-case being an i-guess-prep-for-the-future-anti-pattern Bp). - obvi adjust all internal usage to match new naming. Serious/real-use-case changes, - add (back) a `Actor._root_tn` which sits a scope "above" the service-tn and is either, + assigned in `._runtime.async_main()` for sub-actors OR, + assigned in `._root.open_root_actor()` for the root actor. THE primary reason to keep this "upper" tn is that during a full-`Actor`-cancellation condition (more details below) we want to ensure that the IPC connection with a sub-actor's parent is the last thing to be cancelled; this is most simply implemented by ensuring that the `Actor._parent_chan: .ipc.Channel` is handled in an upper scope in `_rpc.process_messages()`-subtask-terms. - for the root actor this `root_tn` is allocated in `.open_root_actor()` body and assigned as such. - extend `Actor.cancel_soon()` to be cohesive with this entire teardown "policy" by scheduling a task in the `._root_tn` which, * waits for the `._service_tn` to complete and then, * cancels the `._root_tn.cancel_scope`, * includes "sclangy" console logging throughout.	2025-08-20 10:18:52 -04:00
Tyler Goodlet	561954594e	Add attempt at non-root-parent REPL guarding I masked it bc it doesn't seem to actually work for the case I was testing (`emsd` clobbering a `paperboi` in `piker`..) but figured I'd leave it as a reminder for solving this problem more generally (#320) since this is likely the place in the code for a soln. When i tested it in my case it just resulted in a hang around the `with debug.acquire_debug_lock()` for some reason? Can't remember if the child ended up being able to REPL without issue though..	2025-08-19 14:15:14 -04:00
Tyler Goodlet	28a6354e81	Set `shield` when `.cancel_called` for root crashes Such that we handle them despite a cancellation condition. This is almost always the case, that `root_tn.cancel_scope.cancel_called` is set, by the time the `debug._maybe_enter_pm()` hits. Previously I guess we just weren't actually ever REPL-debugging such cases? TODO, still needs a test obvi!	2025-08-19 14:14:38 -04:00
Tyler Goodlet	d1599449e7	Mk `pause_from_sync()` raise `InternalError` on no `greenback` init	2025-08-19 14:14:27 -04:00
Tyler Goodlet	2d27c94dec	Hide `_maybe_enter_pm()` frame (again?)	2025-08-19 14:14:27 -04:00
Tyler Goodlet	0fafd25f0d	Comment tweaks per copilot review	2025-08-19 12:33:47 -04:00
Tyler Goodlet	961504b657	Support `chan.started_nowait()` in `.open_channel_from()` target That is the `target` can declare a `chan: LinkedTaskChannel` instead of `to_trio`/`from_aio`. To support it, - change `.started()` -> the more appropriate `.started_nowait()` which can be called sync from the aio child task. - adjust the `provide_channels` assert to accept either fn sig declaration (for now). Still needs test(s) obvi..	2025-08-18 22:32:51 -04:00
Tyler Goodlet	bd148300c5	Relay `asyncio` errors via EoC and raise from rent Makes the newly added `test_aio_side_raises_before_started` test pass by ensuring errors raised by any `.to_asyncio.open_channel_from()` spawned child-`asyncio.Task` are relayed by any caught `trio.EndOfChannel` by checking for a new `LinkedTaskChannel._closed_by_aio_task: bool`. Impl deats, - obvi add `LinkedTaskChannel._closed_by_aio_task: bool = False` - in `translate_aio_errors()` always check for the new flag on EOC conditions and in such cases set `chan._trio_to_raise = aio_err` such that the `trio`-parent-task always raises the child's exception directly, OW keep original EoC passthrough in place. - include very detailed per-case comments around the extended handler. - adjust re-raising logic with a new `raise_from` where we only give the `aio_err` priority if it's not already set as to `trio_to_raise`. Also, - hide the `_run_asyncio_task()` frame by def.	2025-08-18 22:32:51 -04:00
Tyler Goodlet	5c7d930a9a	Drop unused `Actor._root_n`..	2025-08-18 22:16:03 -04:00
Tyler Goodlet	c46986504d	Switch nursery to `CancelScope`-status properties Been meaning to do this forever and a recent test hang finally drove me to it Bp Like it sounds, adopt the "cancel-status" properties on `ActorNursery` used already on our `Context` and derived from `trio.CancelScope`: - add new private `._cancel_called` (set in the head of `.cancel()`) & `._cancelled_caught` (set in the tail) instance vars with matching read-only `@properties`. - drop the instance-var and instead delegate a `.cancelled: bool` property to `._cancel_called` and add a usage deprecation warning (since removing it breaks a buncha tests).	2025-08-18 22:16:03 -04:00
Tyler Goodlet	e05a4d3cac	Enforce named-args only to `.open_nursery()`	2025-08-18 22:16:03 -04:00
Tyler Goodlet	5021514a6a	Disable shm resource tracker via flag on 3.13+ As per the newly added support, https://docs.python.org/3/library/multiprocessing.shared_memory.html	2025-08-18 22:04:40 -04:00
Tyler Goodlet	331921f612	Hmm disable CRE case for now, causes test fails So i need to either adjust the tests or figure out if/why this is needed to avoid the crashing in `pikerd` i found when killin the chart during a long backfill with `binance` backend..	2025-08-18 21:30:48 -04:00
Tyler Goodlet	df0d00abf4	Translate CRE's due to socket-close to tpt-closed Just like in the BRE case (for UDS) it seems when a peer closes the (UDS?) socket `trio` instead raises a `ClosedResourceError` which we now catch and re-raise as a `TransportClosed`. This again results in `tpt.send()` calls from the rpc-runtime not raising when it's known that the IPC channel is disconnected.	2025-08-18 21:30:48 -04:00
Tyler Goodlet	a72d1e6c48	Multi-line-style up the UDS fast-connect handler Shift around comments and expressions for better reading, assign `tpt_closed` for easier introspection from REPL during debug oh and fix the `MsgpackTransport.pformat()` to render '\|_peers: 1' .. XD	2025-08-18 21:30:48 -04:00
Tyler Goodlet	5931c59aef	Log "out-of-layer" cancellation in `._rpc._invoke()` Similar to what was just changed for `Context.repr_state`, when the child task is cancelled but by a different "layer" of the runtime (i.e. a `Portal.cancel_actor()` / `SIGINT`-to-process canceller) we don't dump a traceback instead just `log.cancel()` emit.	2025-08-18 21:30:48 -04:00
Tyler Goodlet	ba08052ddf	Handle "out-of-layer" remote `Context` cancellation Such that if the local task hasn't resolved but is `trio.Cancelled` and a `.canceller` was set, we report a `'actor-cancelled'` from `.repr_state: str`. Bit of formatting to avoid needless newlines too!	2025-08-18 21:30:48 -04:00
Tyler Goodlet	00112edd58	UDS: implicitly create `Address.bindspace: Path` Since it's merely a local-file-sys subdirectory and there should be no reason file creation conflicts with other bind spaces. Also add 2 test suites to match, - `tests/ipc/test_each_tpt::test_uds_bindspace_created_implicitly` to verify the dir creation when DNE. - `..test_uds_double_listen_raises_connerr` to ensure a double bind raises a `ConnectionError` from the src `OSError`.	2025-08-18 21:30:48 -04:00
Tyler Goodlet	1d706bddda	Rm `assert` from `Channel.from_addr()`, for UDS we re-created to extract the peer PID	2025-08-18 21:30:48 -04:00
Tyler Goodlet	3c30c559d5	`ipc._uds`: assign `.l/raddr` in `.connect_to()` Using `.get_stream_addrs()` such that we always (can) assign the peer end's PID in the `._raddr`. Also factor common `ConnectionError` re-raising into a `_reraise_as_connerr()`-@cm.	2025-08-18 21:30:48 -04:00
Tyler Goodlet	599020c2c5	Rename all lingering ctx-side bits As before but more thoroughly in comments and var names finally changing all, - caller -> parent - callee -> child	2025-08-18 21:30:48 -04:00


1342 Commits (c3f455a8ecf735c6205b40551a6ae54bb8b94856)
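
Commit `5e7c0f264d` above switches from `.lstrip()` to `.removeprefix()` when stripping the pkg-name from mod paths. A minimal sketch of why that matters, using only the Python stdlib (3.9+): the helper name `strip_pkg_prefix` and the sample mod path are hypothetical and for illustration only, this is not the code from `tractor`'s `get_logger()`.

```python
def strip_pkg_prefix(mod_path: str, pkg_name: str) -> str:
    """Drop a leading `<pkg_name>.` from a dotted module path.

    Hypothetical helper, not part of `tractor`.
    """
    # `.removeprefix()` removes the exact leading substring, once,
    # and returns the input unchanged when the prefix is absent.
    return mod_path.removeprefix(pkg_name + '.')


mod_path = 'tractor.trionics._taskc'

# `.lstrip()` treats its argument as a *set of characters* and keeps
# stripping while the leading char is in that set, so it also eats
# the leading "tr" of the sub-package name.
naive = mod_path.lstrip('tractor.')

# Prefix removal leaves the sub-package path intact.
fixed = strip_pkg_prefix(mod_path, 'tractor')

print(naive)  # ionics._taskc
print(fixed)  # trionics._taskc
```

Any sub-module whose name starts with a character found in the package name would be mangled by the `.lstrip()` form, which is presumably the class of bug the commit avoids.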