Turns out we weren't using it despite the optional
`stream_handler_nursery` input to `Server.listen_on()`; fail over to the
`Server._stream_handler_tn` allocated during server setup in those cases.
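A hedged sketch of the fallback (the `resolve_handler_tn` name and
duck-typing are illustrative, not the actual `Server` impl):

```python
import trio

def resolve_handler_tn(
    server,  # `.ipc._server.Server`-like, duck-typed for this sketch
    stream_handler_nursery: trio.Nursery|None = None,
) -> trio.Nursery:
    '''
    Fall back to the server's own handler tn (allocated during
    server setup) when the caller didn't pass one to `.listen_on()`.

    '''
    return (
        stream_handler_nursery
        or
        server._stream_handler_tn
    )
```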
Turns out I didn't read my own internals docs/comments; despite it
not being used previously, this adds the real use case: a root,
per-actor scope which ensures parent comms are the last conc-thing to
be cancelled.
Also, the impl changes here make the test from 6410e45 (or whatever
it's rebased to) pass, i.e. we can support crash handling in the root
actor despite the root-tn having been (self-)cancelled.
Superficial adjustments,
- rename `Actor._service_n` -> `._service_tn` everywhere.
- add asserts to `._runtime.async_main()` which ensure that any
`.trionics.maybe_open_nursery()` calls against an optionally passed
`._[root/service]_tn` allocate-if-not-provided (the
`._service_tn`-case being an i-guess-prep-for-the-future-anti-pattern
Bp); see the sketch after this list.
- obvi adjust all internal usage to match new naming.
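A hedged sketch of that allocated-if-not-provided assert pattern
(`passed_tn` is a hypothetical stand-in for the optionally passed
`._[root/service]_tn`):

```python
import trio
from tractor.trionics import maybe_open_nursery

async def main():
    passed_tn: trio.Nursery|None = None  # or a caller-provided tn

    async with maybe_open_nursery(passed_tn) as tn:
        # the allocated-if-not-provided semantics,
        if passed_tn is not None:
            assert tn is passed_tn
        else:
            assert isinstance(tn, trio.Nursery)

trio.run(main)
```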
Serious/real-use-case changes,
- add (back) an `Actor._root_tn` which sits a scope "above" the
service-tn and is either,
+ assigned in `._runtime.async_main()` for sub-actors OR,
+ assigned in `._root.open_root_actor()` for the root actor.
**THE primary reason** to keep this "upper" tn is that during
a full-`Actor`-cancellation condition (more details below) we want to
ensure that the IPC connection with a sub-actor's parent is **the last
thing to be cancelled**; this is most simply implemented by ensuring
that the `Actor._parent_chan: .ipc.Channel` is handled in an upper
scope in `_rpc.process_messages()`-subtask-terms.
- for the root actor this `root_tn` is allocated in `.open_root_actor()`
body and assigned as such.
- extend `Actor.cancel_soon()` to be cohesive with this entire teardown
"policy" (sketched below) by scheduling a task in the `._root_tn` which,
* waits for the `._service_tn` to complete and then,
* cancels the `._root_tn.cancel_scope`,
* includes "sclangy" console logging throughout.
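A hedged, self-contained sketch of that 2-layer teardown policy (a
stand-in class, not the real `._runtime.Actor`; the event-based wait
for service-tn completion is illustrative):

```python
import trio

class ActorSketch:
    '''
    Stand-in demoing only the root-tn/service-tn layering and the
    `cancel_soon()` ordering described above.

    '''
    def __init__(self):
        self._root_tn: trio.Nursery|None = None
        self._service_tn: trio.Nursery|None = None
        self._service_exited = trio.Event()

    def cancel_soon(self) -> None:
        # schedule the policy task in the *root* tn so it outlives
        # the service scope being cancelled.
        self._root_tn.start_soon(self._wait_then_cancel_root)

    async def _wait_then_cancel_root(self) -> None:
        self._service_tn.cancel_scope.cancel()
        # wait for the service tn to complete and then..
        await self._service_exited.wait()
        # ..cancel the upper scope; parent comms (handled in
        # root-tn terms) are thus the last thing to go.
        self._root_tn.cancel_scope.cancel()

    async def async_main(self) -> None:
        async with trio.open_nursery() as root_tn:
            self._root_tn = root_tn
            async with trio.open_nursery() as service_tn:
                self._service_tn = service_tn
                await trio.sleep_forever()  # rpc/service tasks live here
            self._service_exited.set()

async def main():
    actor = ActorSketch()
    async with trio.open_nursery() as tn:
        tn.start_soon(actor.async_main)
        await trio.sleep(0.1)  # let the "actor" run a bit
        actor.cancel_soon()

trio.run(main)
```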
Such that we audit the `shield=root_tn.cancel_scope.cancel_called`
passed to `await debug._maybe_enter_pm()` in the `open_root_actor()`
exit handler block.
I masked it because it doesn't seem to actually work for the case I was
testing (`emsd` clobbering a `paperboi` in `piker`..) but figured I'd
leave it as a reminder for solving this problem more generally (#320)
since this is likely the place in the code for a soln.
When I tested it in my case it just resulted in a hang around the `with
debug.acquire_debug_lock()` for some reason? Can't remember if the child
ended up being able to REPL without issue though..
Such that we handle them despite a cancellation condition. It is almost
always the case that `root_tn.cancel_scope.cancel_called` is set by the
time `debug._maybe_enter_pm()` hits. Previously I guess we just weren't
ever actually REPL-debugging such cases?
TODO, still needs a test obvi!
That is, the `target` can declare a `chan: LinkedTaskChannel` instead
of `to_trio`/`from_aio`.
To support it,
- change `.started()` -> the more appropriate `.started_nowait()` which
can be called sync from the aio child task.
- adjust the `provide_channels` assert to accept either fn sig
declaration (for now).
Still needs test(s) obvi..
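A hedged sketch of the new-style `target` sig (the handshake value is
illustrative and the spawn plumbing, i.e. running inside an
`infect_asyncio=True` actor, is presumed):

```python
import asyncio
from tractor.to_asyncio import (
    LinkedTaskChannel,
    open_channel_from,
)

async def aio_child(
    chan: LinkedTaskChannel,  # new style: one param, not `to_trio`/`from_aio`
) -> None:
    # sync-callable from the aio child task, no `await` needed
    chan.started_nowait('ready')
    await asyncio.sleep(0.1)

async def trio_parent() -> None:
    # NOTE: presumes an actor spawned with `infect_asyncio=True`
    async with open_channel_from(aio_child) as (first, chan):
        assert first == 'ready'
```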
Makes the newly added `test_aio_side_raises_before_started` test pass by
ensuring errors raised by any `.to_asyncio.open_channel_from()`-spawned
child-`asyncio.Task` are relayed on any caught `trio.EndOfChannel` by
checking a new `LinkedTaskChannel._closed_by_aio_task: bool` flag.
Impl deats,
- obvi add `LinkedTaskChannel._closed_by_aio_task: bool = False`
- in `translate_aio_errors()` always check for the new flag on EoC
conditions and in such cases set `chan._trio_to_raise = aio_err` such
that the `trio`-parent-task always raises the child's exception
directly, OW keep the original EoC passthrough in place.
- include *very* detailed per-case comments around the extended handler.
- adjust the re-raising logic with a new `raise_from` where we only give
the `aio_err` priority if it's not already set as the `trio_to_raise`.
Also,
- hide the `_run_asyncio_task()` frame by def.
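A hedged sketch of just the flag-check decision (duck-typed `chan`;
the real logic is embedded inside `translate_aio_errors()`):

```python
import trio

def resolve_eoc_to_raise(
    chan,  # `LinkedTaskChannel`-like, duck-typed for this sketch
    eoc: trio.EndOfChannel,
    aio_err: BaseException|None,
) -> BaseException:
    '''
    Decide what the trio-parent task should raise on an EoC
    condition from the linked aio child task.

    '''
    if (
        chan._closed_by_aio_task
        and aio_err is not None
    ):
        # EoC was due to the aio child erroring; relay the
        # child's exc directly to the trio-parent task.
        chan._trio_to_raise = aio_err
        return aio_err

    # OW, original EoC passthrough
    return eoc
```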
Verifying that if any exc is raised pre `chan.send_nowait()` (our
currently shite version of a `chan.started()`) then that exc is indeed
raised through to the `trio`-parent task side. This case was reproduced
from a `piker.brokers.ib` issue with a similar embedded
`.trionics.maybe_open_context()` call.
Deats,
- call the suite `test_aio_side_raises_before_started`.
- mk the `@context` simply `maybe_open_context(acm_func=open_channel_from)`
with a `target=raise_before_started` which,
* simply sleeps then immediately raises a RTE.
- expect the RTE from the aio-child-side to propagate all the way up to
the root-actor's task right up through the `trio.run()`.
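A hedged reconstruction of the suite's shape (the real test lives in
the repo; `pytest` plumbing and actor spawning are elided/presumed):

```python
import asyncio
import tractor
from tractor import to_asyncio
from tractor.trionics import maybe_open_context

async def raise_before_started(
    chan: to_asyncio.LinkedTaskChannel,
) -> None:
    # sleep then raise *before* any started-handshake send
    await asyncio.sleep(0.2)
    raise RuntimeError('raised before started!')

@tractor.context
async def caller(ctx: tractor.Context) -> None:
    async with maybe_open_context(
        acm_func=to_asyncio.open_channel_from,
        kwargs={'target': raise_before_started},
    ) as (cache_hit, (first, chan)):
        raise AssertionError('should never get here!')

# the aio-child's RTE is expected to propagate all the way up,
# e.g. `with pytest.raises(RuntimeError): trio.run(main)`
```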
Been meaning to do this forever and a recent test hang finally drove me
to it Bp
Like it sounds, adopt on `ActorNursery` the "cancel-status" properties
already used on our `Context` and derived from `trio.CancelScope`:
- add new private `._cancel_called` (set in the head of `.cancel()`)
& `._cancelled_caught` (set in the tail) instance vars with matching
read-only `@properties`.
- drop the instance-var and instead delegate a `.cancelled: bool`
property to `._cancel_called` and add a usage deprecation warning
(since removing it breaks a buncha tests).
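A hedged stand-in showing just that new surface (not the real
`ActorNursery` from `._supervise`):

```python
import warnings

class ActorNurserySketch:
    _cancel_called: bool = False
    _cancelled_caught: bool = False

    @property
    def cancel_called(self) -> bool:
        return self._cancel_called

    @property
    def cancelled_caught(self) -> bool:
        return self._cancelled_caught

    @property
    def cancelled(self) -> bool:
        # deprecated alias kept (for now) since removing it
        # breaks a buncha tests..
        warnings.warn(
            '`ActorNursery.cancelled` is deprecated, '
            'use `.cancel_called` instead!',
            DeprecationWarning,
            stacklevel=2,
        )
        return self._cancel_called

    async def cancel(self) -> None:
        self._cancel_called = True  # set in the head
        # ... the actual multi-(sub)actor cancellation ...
        self._cancelled_caught = True  # set in the tail
```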
So I need to either adjust the tests or figure out if/why this is needed
to avoid the crashing in `pikerd` I found when killing the chart during
a long backfill with the `binance` backend..
Just like in the BRE (`BrokenResourceError`) case for UDS, it seems when
a peer closes the (UDS?) socket `trio` instead raises
a `ClosedResourceError`, which we now catch and re-raise as
a `TransportClosed`. This again results in `tpt.send()` calls from the
rpc-runtime **not** raising when it's known that the IPC channel is
disconnected.
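A hedged sketch of the catch-and-re-raise (the exception type is a
stand-in for `tractor`'s graceful `TransportClosed`):

```python
import trio

class TransportClosed(Exception):
    # stand-in; the graceful "tpt closed" error type
    ...

async def tpt_send(stream: trio.abc.SendStream, data: bytes) -> None:
    try:
        await stream.send_all(data)
    except trio.ClosedResourceError as cre:
        # peer closed the (UDS) socket; relay as a graceful
        # transport-closed error which the rpc-runtime can then
        # suppress on known-disconnected chans.
        raise TransportClosed(
            'Peer closed the transport!?'
        ) from cre
```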
Shift around comments and expressions for better reading, assign
`tpt_closed` for easier introspection from the REPL during debug, oh and
fix the `MsgpackTransport.pformat()` to render '|_peers: 1' .. XD
Similar to what was just changed for `Context.repr_state`, when the
child task is cancelled but by a different "layer" of the runtime (i.e.
a `Portal.cancel_actor()` / `SIGINT`-to-process canceller) we don't
dump a traceback, instead we just emit a `log.cancel()`.
Such that if the local task hasn't resolved but is `trio.Cancelled` and
a `.canceller` was set, we report an `'actor-cancelled'` from
`.repr_state: str`. Bit of formatting to avoid needless newlines too!
Since it's merely a local-file-sys subdirectory, there should be no
reason file creation conflicts with other bind spaces.
Also add 2 test suites to match,
- `tests/ipc/test_each_tpt::test_uds_bindspace_created_implicitly` to
verify the dir creation when it DNE.
- `..test_uds_double_listen_raises_connerr` to ensure a double bind
raises a `ConnectionError` from the src `OSError`.
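A hedged sketch of the implicit bindspace creation (fn name/params are
illustrative):

```python
from pathlib import Path

def ensure_bindspace(bindspace: Path) -> Path:
    '''
    Implicitly create the UDS bindspace subdir when it DNE; it's
    merely a local-file-sys subdirectory after all.

    '''
    bindspace.mkdir(
        parents=True,
        exist_ok=True,  # no conflict with other bind spaces
    )
    return bindspace
```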
Using `.get_stream_addrs()` such that we always (*can*) assign the peer
end's PID in the `._raddr`.
Also factor common `ConnectionError` re-raising into
a `_reraise_as_connerr()`-@cm.
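A hedged sketch of the factored-out re-raiser (param names are
illustrative):

```python
from contextlib import contextmanager

@contextmanager
def _reraise_as_connerr(
    src_excs: tuple[type[Exception], ...] = (OSError,),
):
    try:
        yield
    except src_excs as src_err:
        raise ConnectionError(
            f'IPC transport connection failed!?\n'
            f'{src_err!r}'
        ) from src_err
```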
Such that the caller can be responsible for their own (nursery) scoping
as needed and, for the latter fn's case, with
a `trio.Nursery.CancelStatus.encloses()` check to ensure the `tn` is
a valid parent-ish.
Some deats,
- in `gather_contexts()`, mv the `try/finally` outside the nursery block
to ensure we always do the `parent_exit`.
- for `maybe_open_context()` we do a naive task-tree hierarchy audit
(sketched after this list) to ensure the provided scope is not *too*
child-ish (with what APIs `trio` gives us, see above), OW go with the
old approach of using the actor's private service nursery.
Also,
* better report `trio.Cancelled` around the cache-miss `yield`
cases and ensure we **never** unmask triggering key-errors.
* report on any stale-state with the mutex in the `finally` block.
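A hedged sketch of that hierarchy audit, leaning on (private!) `trio`
internals per the `CancelStatus.encloses()` mention above; the attr
names here are from `trio`'s internals and may vary by version:

```python
import trio

def tn_is_parentish(tn: trio.Nursery) -> bool:
    '''
    Naively audit that the provided tn isn't *too* child-ish
    relative to the current task, i.e. its cancel-status encloses
    the current task's.

    '''
    task = trio.lowlevel.current_task()
    return tn._cancel_status.encloses(task._cancel_status)
```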
Call it `test_lock_not_corrupted_on_fast_cancel()` and include
a detailed doc string to explain. Implemented it "cleverly" by having
the target `@acm` cancel its parent nursery after a peer, cache-hitting,
task is already waiting on the task-mutex release.
Since the `await service_n.start()` on key-err can be cancel-masked
(checkpoint interrupted before `_Cache.run_ctx` completes), we need to
always `lock.release()` in a `finally:` to avoid lock-owner-state
corruption and/or inf-hangs in peer cache-hitting tasks.
Deats,
- add a `try/except/finally` around the key-err triggered cache-miss
`service_n.start(_Cache.run_ctx, ..)` call, reporting on any taskc
and always `finally` unlocking (see the sketch after this list).
- fill out some log msg content and use `.debug()` level.
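A hedged sketch of that cache-miss guard (duck-typed args; the real
code logs at `.debug()` level instead of printing):

```python
import trio

async def run_cache_miss(
    service_tn: trio.Nursery,
    lock: trio.Lock,  # must already be acquired by this task
    run_ctx,  # `_Cache.run_ctx`-style fn (takes `task_status`), duck-typed
) -> None:
    try:
        await service_tn.start(run_ctx)
    except trio.Cancelled as taskc:
        # the `.start()` checkpoint can be interrupted before
        # `run_ctx` completes (cancel-masking); report it.
        print(f'Cache-miss run was cancelled:\n{taskc!r}')
        raise
    finally:
        # ALWAYS release to avoid lock-owner-state corruption
        # and/or inf-hangs in peer cache-hitting tasks.
        lock.release()
```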
Here I was thinking the bcaster (usage) maybe required a rework but,
NOPE, it's just bc a checkpoint was needed in the parent task owning the
`tn` which spawns `get_sub_and_pull()` tasks, to ensure the bg-allocated
`an`/portal is eventually cancel-called..
Ah well, at least I started a patch for `MsgStream.subscribe()` to make
it multicast-revertible.. XD
Anyway, I tossed in some checks & notes related to all that unnecessary
effort since I do think I'll move forward implementing it:
- for the `cache_hit` case always verify that the `bcast` clone is
unregistered from the common state subs after
`.subscribe().__aexit__()`.
- do a light check that the implicit `MsgStream._broadcaster` is always
the only bcrx instance left-leaked into that state.. that is, until
I get the proper de-allocation/reversion from multicast -> unicast
working.
- put in mega detailed note about the required parent-task checkpoint.
Since I recently discovered a very subtle race-case that can sometimes
cause the suite to hang, seemingly due to the `an: ActorNursery`
allocated *behind* the `.trionics.maybe_open_context()` usage; this can
result in never cancelling the 'streamer' subactor despite the `main()`
timeout-guard?
This led me to dig in and find that the underlying issue was 2-fold,
- our `BroadcastReceiver` termination-mgmt semantics in
`MsgStream.subscribe()` can result in the first subscribing task
always keeping the `MsgStream._broadcaster` instance allocated; it's
never `.aclose()`ed, which makes it tough to determine (and thus
trace) when all subscriber-tasks are actually complete and
exited-from-`.subscribe()`..
- I was shield-waiting `.ipc._server.Server.wait_for_no_more_peers()` in
`._runtime.async_main()`'s shutdown sequence which would then compound
the issue resulting in a SIGINT-shielded hang.. the worst kind XD
Actual changes here are just styling, printing, and some mucking with
passing the `an`-ref up to the parent task in the root-actor where I was
doing a conditional `ActorNursery.cancel()` to make sure that was
actually the problem. Presuming this is fixed, the `.pause()` I left
unmasked should never hit.
As mentioned in the prior testing commit, it can cause the worst kind of
hangs, the SIGINT-ignoring kind.. Pretty sure there was never any reason
for it outside some esoteric multi-actor debugging case, and pretty sure
that was already solved?