Add `expect_timeout: float` param to `_spawn()`
so individual tests can tune `pexpect` timeouts
instead of relying on the hard-coded 3/10 split.
Deats,
- default to 4s, bump by +6 on non-linux CI.
- use walrus `:=` to capture resolved timeout and assert
`spawned.timeout == timeout` for sanity.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
I started getting annoyed by all the warnings from `pytest` during work
on macos suport in CI, so this replaces all `Actor.uid`/`Channel.uid`
accesses with `.aid.uid` (or `.aid.reprol()` for log msgs) across the
core runtime and IPC subsystems to avoid the noise.
This also provides incentive to start the adjustment to all
`.uid`-holding/tracking internal `dict`-tables/data-structures to
instead use `.msg.types.Aid`. Hopefully that will come a (vibed?) follow
up shortly B)
Deats,
- `._context`: swap all `self._actor.uid`, `self.chan.uid`,
and `portal.actor.uid` refs to `.aid.uid`; use
`.aid.reprol()` for log/error formatting.
- `._rpc`: same treatment for `actor.uid`, `chan.uid` in
log msgs and cancel-scope handling; fix `str(err)` typo
in `ContextCancelled` log.
- `._runtime`: update `chan.uid` -> `chan.aid.uid` in ctx
cache lookups, RPC `Start` msg, registration and
cancel-request handling; improve ctxc log formatting.
- `._spawn`: replace all `subactor.uid` with
`.aid.uid` for child-proc tracking, IPC peer waiting,
debug-lock acquisition, and nursery child dict ops.
- `._supervise`: same for `subactor.uid` in cancel and
portal-wait paths; use `actor.aid.uid` for error dict.
- `._state`: fix `last.uid` -> `last.aid.uid` in
`current_actor()` error msg.
Also,
- `._chan`: make `Channel.aid` a proper `@property` backed
by `._aid` so we can add validation/typing later.
- `.log`: use `current_actor().aid.uuid` instead of
`.uid[1]` for actor-uid log field.
- `.msg.types`: add TODO comment for `Start.aid` field
conversion.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Event on linux i was noticing lotsa false negatives based on sub
teardown race conditions, so this tries to both make way for
(eventually?) expanding the set of suite cases and ensure the current
ones are more reliable on every run.
The main change is to hange the `error_in_child=False` case to use
parent-side-cancellation via a new `trio.move_on_after(timeout)` instead
of `actor.cancel_soon()` (which is now toggled by a new `self_cancel:
bool` but unused rn), and add better teardown assertions.
Low level deats,
- add `rent_cancel`/`self_cancel` params to
`crash_and_clean_tmpdir()` for different cancel paths;
default to `rent_cancel=True` which just sleeps forever
letting the parent's timeout do the work.
- use `trio.move_on_after()` with longer timeouts per
case: 1.6s for error, 1s for cancel.
- use the `.move_on_after()` cancel-scope to assert `.cancel_called`
pnly when `error_in_child=False`, indicating we
parent-graceful-cancelled the sub.
- add `loglevel` fixture, pass to `open_nursery()`.
- log caught `RemoteActorError` via console logger.
- add `ids=` to parametrize for readable test names.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Add a 6s timeout guard around `test_streaming_to_actor_cluster()`
to catch hangs, and nest the `async with` block inside it.
Found this when running `pytest tests/ --tpt-proto uds`.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Sync the inline "infected asyncio" echo-server example
with the new `LinkedTaskChannel` iface from prior commits.
- `to_trio`/`from_trio` params -> `chan: LinkedTaskChannel`
- use `chan.started_nowait()`, `.send_nowait()`, `.get()`
- swap yield order to `(chan, first)`
- update blurb to describe the new unified channel API
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Deliver `(LinkedTaskChannel, Any)` instead of the prior `(first, chan)`
order from `open_channel_from()` to match the type annotation and be
consistent with `trio.open_*_channel()` style where the channel obj
comes first.
- flip `yield first, chan` -> `yield chan, first`
- update type annotation + docstring to match
- swap all unpack sites in tests and examples
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Address valid findings from copilot's PR #413 review
(https://github.com/goodboy/tractor/pull/413
#pullrequestreview-3925876037):
- `.get()` docstring referenced non-existent
`._from_trio` attr, correct to `._to_aio`.
- `.send()` docstring falsely claimed error-raising
on missing `from_trio` arg; reword to describe the
actual `.put_nowait()` enqueue behaviour.
- `.open_channel_from()` return type annotation had
`tuple[LinkedTaskChannel, Any]` but `yield` order
is `(first, chan)`; fix annotation + docstring to
match actual `tuple[Any, LinkedTaskChannel]`.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Convert every remaining `to_trio`/`from_trio` fn-sig style
to the new unified `chan: LinkedTaskChannel` iface added in
prior commit (c46e9ee8).
Deats,
- `to_trio.send_nowait(val)` (1st call) -> `chan.started_nowait(val)`
- `to_trio.send_nowait(val)` (subsequent) -> `chan.send_nowait(val)`
- `await from_trio.get()` -> `await chan.get()`
Converted fns,
- `sleep_and_err()`, `push_from_aio_task()` in
`tests/test_infected_asyncio.py`
- `sync_and_err()` in `tests/test_root_infect_asyncio.py`
- `aio_streamer()` in
`tests/test_child_manages_service_nursery.py`
- `aio_echo_server()` in
`examples/infected_asyncio_echo_server.py`
- `bp_then_error()` in `examples/debugging/asyncio_bp.py`
Also,
- drop stale comments referencing old param names.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
This change is masked out now BUT i'm leaving it in for reference.
I was debugging a multi-actor fault where the primary source actor was
an infected-aio-subactor (`brokerd.ib`) and it seemed like the REPL was only
entering on the `trio` side (at a `.open_channel_from()`) and not
eventually breaking in the `asyncio.Task`. But, since (changing
something?) it seems to be working now, it's just that the `trio` side
seems to sometimes handle before the (source/causing and more
child-ish) `asyncio`-task, which is a bit odd and not expected..
We could likely refine (maybe with an inter-loop-task REPL lock?) this
at some point and ensure a child-`asyncio` task which errors always
grabs the REPL **first**?
Lowlevel deats/further-todos,
- add (masked) `maybe_open_crash_handler()` block around
`asyncio.Task` execution with notes about weird parent-addr
delivery bug in `test_sync_pause_from_aio_task`
* yeah dunno what that's about but made a bug; seems to be IPC
serialization of the `TCPAddress` struct somewhere??
- add inter-loop lock TODO for avoiding aio-task clobbering
trio-tasks when both crash in debug-mode
Also,
- change import from `tractor.devx.debug` to `tractor.devx`
- adjust `get_logger()` call to use new implicit mod-name detection
added to `.log.get_logger()`, i.e. sin `name=__name__`.
- some teensie refinements to `open_channel_from()`:
* swap return type annotation for to `tuple[LinkedTaskChannel, Any]`
(was `Any`).
* update doc-string to clarify started-value delivery
* add err-log before `.pause()` in what should be an unreachable path.
* add todo to swap the `(first, chan)` pair to match that of ctx..
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
With methods to comms similar to those that exist for the `trio` side,
- `.get()` which proxies verbatim to the `._to_aio: asyncio.Queue`,
- `.send_nowait()` which thin-wraps to `._to_trio: trio.MemorySendChannel`.
Obviously the more correct design is to break up the channel type into
a pair of handle types, one for each "side's" task in each event-loop,
that's hopefully coming shortly in a follow up patch B)
Also,
- fill in some missing doc strings, tweak some explanation comments and
update todos.
- adjust the `test_aio_errors_and_channel_propagates_and_closes()` suite
to use the new `chan` fn-sig-API with `.open_channel_from()` including
the new methods for msg comms; ensures everything added here works e2e.
Reorganize existing msg-related test suites under
a new `tests/msg/` subdir (matching `tests/devx/`
and `tests/ipc/` convention) and add unit tests for
the `_`-prefixed field filtering in `pformat()`.
Deats,
- `git mv` `test_ext_types_msgspec` and `test_pldrx_limiting` into
`tests/msg/`.
- add `__init__.py` + `conftest.py` for the new test sub-pkg.
- add new `test_pretty_struct.py` suite with 8 unit tests:
- parametrized field visibility (public shown, `_`-private hidden,
mixed)
- direct `iter_struct_ppfmt_lines()` assertion
- nested struct recursion filtering
- empty struct edge case
- real `MsgDec` via `mk_dec()` hiding `_dec`
- `repr()` integration via `Struct.__repr__()`
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Skip fields starting with `_` in pretty-printed struct output
to avoid cluttering displays with internal/private state (and/or accessing
private properties which have errors Bp).
Deats,
- add `if k[0] == '_': continue` check to skip private fields
- change nested `if isinstance(v, Struct)` to `elif` since we
now have early-continue for private fields
- mv `else:` comment to clarify it handles top-level fields
- fix indentation of `yield` statement to only output
non-private, non-nested fields
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Drop the separate `testing-macos` job and add
`macos-latest` to the existing OS matrix; bump
timeout to 16 min to accommodate macOS runs.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
- add `"Operating System :: MacOS"` classifier.
- add macOS bullet to README's TODO/status section.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
It seems something is up with their VM-img or wtv bc i keep increasing
the subproc timeout and nothing is changing. Since i can't try
a `-xlarge` one without paying i'm just muting this test for now.
- convert all doc-strings to `'''` multiline style.
- rename `nursery` -> `an`, `n` -> `tn` to match
project-wide conventions.
- add type annotations to fn params (fixtures, test
helpers).
- break long lines into multiline style for fn calls,
assertions, and `parametrize` decorator lists.
- add `ids=` to `@pytest.mark.parametrize`.
- use `'` over `"` for string literals.
- add `from typing import Callable` import.
- drop spurious blank lines inside generators.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Via ensuring `all(mark.args)` on wtv expressions are arg-passed to the
mark decorator; use it to skip the `test_subactor_breakpoint` suite when
`ctlc=True` since it seems too unreliable in CI.
There's a very sloppy registrar-actor-bootup syncing approach used in
this fixture (basically just guessing how long to sleep to wait for it
to init and bind the registry socket) using a `global _PROC_SPAWN_WAIT`
that needs to be made more reliable. But, for now i'm just playing along
with what's there to try and make less CI runs flaky by,
- sleeping *another* 1s when run from non-linux CI.
- reporting stdout (if any) alongside stderr on teardown.
- not strictly requiring a `proc.returncode == -2` indicating successful
graceful cancellation via SIGINT; instead we now error-log and only
raise the RTE on `< 0` exit code.
* though i can't think of why this would happen other then an
underlying crash which should propagate.. but i don't think any test
suite does this intentionally rn?
* though i don't think it should ever happen, having a CI run
"error"-fail bc of this isn't all that illuminating, if there is
some weird `.returncode == 0` termination case it's likely not
a failure?
For later, see the new todo list; we should sync to some kind of "ping"
polling of the tpt address if possible which is already easy enough for
TCP reusing an internal closure from `._root.open_root_actor()`.
Namely, after trying to get `test_multi_daemon_subactors` to work for
the `ctlc=True` case (for way too long), give up on that (see
todo/comments) and skip it; the normal case works just fine. Also tweak
the `test_ctxep_pauses_n_maybe_ipc_breaks` pattern matching for
non-`'UDS'` per the previous script commit; we can't use UDS alongside
`pytest`'s tmp dir generation, mega lulz.
To be a null default and set to `0.1` when not passed by the caller so
as to avoid having to pass `0.1` if you wanted the
param-defined-default.
Also,
- in the `spawn()` fixtures's `unset_colors()` closure, add in a masked
`os.environ['NO_COLOR'] = '1'` since i found it while trying to debug
debugger tests.
- always return the `child.before` content from `assert_before()`
helper; again it comes in handy when debugging console matching tests.
It's explained in the comment and i really think it's getting more
hilarious the more i learn about the arbitrary limitations of user space
with this tina platform.