Commit Graph

1110 Commits (moar_timeoutz)

Author SHA1 Message Date
Tyler Goodlet c5c3f7e789 Use `tractor.Context` throughout the runtime core
Instead of tracking feeder mem chans per RPC dialog, store `Context`
instances which (now) hold refs to the underlying RPC-task feeder chans
and track them inside a `Actor._contexts` map. This begins a transition
to making the "context" idea the primitive abstraction for representing
messaging dialogs between tasks in different memory domains (i.e.
usually separate processes).

A slew of changes made this possible:
- change `Actor.get_memchans()` -> `.get_context()`.
- Add new `Context._send_chan` and `._recv_chan` vars.
- implicitly create a new context on every `Actor.send_cmd()` call.
- use the context created by `.send_cmd()` in `Portal.open_context()`
  instead of manually creating one.
- call `Actor.get_context()` inside tasks run from `._invoke()`
  such that feeder chans are implicitly created for callee tasks
  thus fixing the bug #265.

NB: We might change some of the internal semantics to do with *when* the
feeder chans are actually created to denote whether or not a far end
task is actually *read to receive* messages. For example, in the cases
where it **never** will be ready to receive messages (one-way streaming,
a context that never opens a stream, etc.) we will likely want some kind
of error or at least warning to the caller that messages can't be sent
(yet).
2021-12-03 14:49:55 -05:00
Tyler Goodlet 3f6099f161 Add a double started error checking test 2021-12-03 10:08:55 -05:00
Tyler Goodlet 568902a5a9 Add test for #265: "msg sent before stream opened"
This always triggered the mentioned race condition.
We need to figure out the best approach to avoid this case.
2021-12-03 10:08:55 -05:00
Tyler Goodlet f4793af2b9 Error on mal-use of `Context.started()`
Previously we were ignoring a race where the callee an opened task
context could enter `Context.open_stream()` before calling `.started().
Disallow this as well as calling `.started()` more then once.
2021-12-03 10:08:55 -05:00
goodboy ae6d751d71
Merge pull request #267 from goodboy/acked_remote_cancels
Acked remote cancels
2021-12-03 09:51:41 -05:00
Tyler Goodlet 94a3cc532c Add nooz 2021-12-02 18:09:07 -05:00
Tyler Goodlet 08e9593306 Suppress broken resources errors in `Portal.cancel_actor()` 2021-12-02 15:29:04 -05:00
Tyler Goodlet 14f84571fb Don't cancel receive streams inside `.cancel_actor()`
We don't need to any more presuming you get ideal remote cancellation
conditions where the remote actor should teardown and kill the streams
from its end.
2021-12-02 15:29:04 -05:00
Tyler Goodlet e561a4908f Appease mypy 2021-12-02 15:29:04 -05:00
Tyler Goodlet a29924f330 Don't assume exception order from nursery 2021-12-02 08:45:58 -05:00
Tyler Goodlet 46070f99de Factor soft-wait logic into a helper, use with mp 2021-12-02 08:18:04 -05:00
Tyler Goodlet d81eb1a51e Finally, deterministic remote cancellation support
On msg loop termination we now check and see if a channel is associated
with a child-actor registered in some local task's nursery. If so, we
attempt to wait on channel closure initiated from the child side (by
draining the underlying msg stream) so as to avoid closing it too early
resulting in the child not relaying its termination status response. This
means we now support the ideal case in 2-general's where we get back the
ack to the closure request instead of just ignoring it and timing out XD

The main implementation detail is that when `Portal.cancel_actor()`
remotely calls `Actor.cancel()` we actually wait for the RPC response
from that request before allowing the channel shutdown sequence to
engage. The new msg stream draining support enables this.

Also, factor child-to-parent error propagation logic into a helper func
and improve some docs (yeah yeah y'all don't like the ''', i don't
care - it makes my eyes not hurt).
2021-12-02 08:18:04 -05:00
Tyler Goodlet d817f1a658 Add a nursery "exited" signal
Use a `trio.Event` to enable nursery closure detection such that core
runtime tasks can be notified when a local nursery exits and allow
shutdown protocols to operate without close-before-terminate issues
(such as IPC channel closure during remote peer cancellation).
2021-12-02 08:18:04 -05:00
Tyler Goodlet a23afb0bb8 Set channel cancel called flag on cancel requests 2021-12-02 08:18:04 -05:00
Tyler Goodlet 1976e61d1a Add `.drain()` support to msg streams
Enables "draining" the last set of messages after a channel/stream has
been terminated mostly for the purposes of receiving a final ACK to
a remote cancel command. Also, add an internal `Channel._cancel_called`
flag which can be set by `Portal.cancel_actor()`.
2021-12-02 08:18:04 -05:00
Tyler Goodlet 0ac3397dbb Only soft-acquire debug lock if a proc was spawned 2021-12-02 08:17:03 -05:00
Tyler Goodlet 62b2867e07 Tweak doc strings 2021-12-02 08:16:49 -05:00
Tyler Goodlet bf6958cdbe Handle cancelled-before-proc-created spawn case
It's definitely possible to have a nursery spawn task be cancelled
before a `trio.Process` handle is ever returned; we now handle this
case as a cancelled-during-spawn scenario. Zombie collection logic
also is bypassed in this case.
2021-12-02 08:16:05 -05:00
goodboy d05885d650
Merge pull request #266 from goodboy/faster_daemon_cancels
Faster graceful daemon cancels
2021-11-30 09:29:13 -05:00
Tyler Goodlet 77fc705b1f Add nooz 2021-11-29 22:52:19 -05:00
Tyler Goodlet 16a3321a38 Increase timeout for windows.. 2021-11-29 21:52:30 -05:00
Tyler Goodlet 7eb465a699 Graceful cancel actors before hard reaping 2021-11-29 16:03:23 -05:00
Tyler Goodlet 121f7fd844 Draft test that shows a slow daemon cancellation
Currently if the spawn task is waiting on a daemon actor it is likely in
`await proc.wait()`, however, if the actor nursery is subsequently
cancelled this checkpoint will be abandoned and the hard proc reaping
sequence will execute which results in a up to 3 second wait before
a "hard" system signal is sent to the child.  Ideally such
a cancelled-during-daemon-actor-wait condition is instead handled by
first trying to cancel the remote actor using `Portal.cancel_actor()` (a
"graceful" remote cancel request) which should (presuming normal runtime
operation) result in an immediate collection of the process after normal
actor (remotely triggered) runtime cancellation.
2021-11-29 16:03:14 -05:00
goodboy ac821bdd94
Merge pull request #264 from goodboy/runinactor_none_result
Fix `Portal.run_in_actor()` returns `None` result
2021-11-29 09:21:24 -05:00
Tyler Goodlet f6de7e0afd Factor out msg unwrapping into a func 2021-11-29 08:46:35 -05:00
Tyler Goodlet 0e7234aa68 Cache the return message instead of the value
Thanks to @richardsheridan for pointing out the limitations of using
*any* kind of value as the result-cached-flag and how it might cause
problems for anyone returning pickled blob-data. This changes the
`Portal` internal result value tracking to stash the full message from
which the value can be retrieved by any `Portal.result()` caller.
The internal change is that `Portal._return_once()` now returns a tuple
of the message *and* its value.
2021-11-29 07:44:44 -05:00
Tyler Goodlet 83da92d4cb Add nooz 2021-11-28 19:16:47 -05:00
Tyler Goodlet 57e98b25e7 Increase timeout, windows... 2021-11-20 13:08:19 -05:00
Tyler Goodlet 095c94b1d2 Fix `Portal.run_in_actor()` returns `None` bug
Fixes the issue where if the main remote task returns `None`,
`Portal.result()` would erroneously wait again on the underlying feeder
mem chan since `None` was being used as the cache flag. Instead set the
flag as the channel uid and consider the result collected when set to
anything else (since it would be odd to return that value from a remote
task when you already can read it as part of portal/channel apis).
2021-11-20 13:02:08 -05:00
Tyler Goodlet f32ccd76aa Add `Portal.result()` is None test case
This demonstrates a bug where if the remote `.run_in_actor()` task
returns `None` then multiple calls to `Portal.result()` will hang
forever...
2021-11-20 13:02:08 -05:00
goodboy b527fdbe1a
Merge pull request #263 from goodboy/early_deth_fixes
Early deth fixes
2021-11-08 21:23:09 -05:00
Tyler Goodlet 6b0366fe04 Guard against TCP server never started on cancel 2021-11-07 23:49:32 -05:00
Tyler Goodlet dbe5d96d66 Fix missing yield in lock acquirer 2021-11-07 23:48:05 -05:00
goodboy 08fa55a8c3
Merge pull request #260 from goodboy/clusters_and_hot_tips
Clusters and hot tips
2021-11-04 12:02:14 -04:00
Tyler Goodlet 546e1b2fa3 Drop unecessary partial 2021-11-04 10:41:25 -04:00
Tyler Goodlet 94a6fefede Add `open_actor_cluster()` eg. to readme 2021-11-02 15:42:19 -04:00
Tyler Goodlet 74f460eba7 Make auto generated child names <parent_name>.<name> 2021-11-02 15:40:15 -04:00
Tyler Goodlet 4cbb8641de Add an `open_actor_cluster()` usage example 2021-11-02 15:37:36 -04:00
Tyler Goodlet 7efb7da300 Start a hot tips for devs doc 2021-11-02 15:08:20 -04:00
goodboy 2c12d39617
Merge pull request #259 from goodboy/alpha3
Alpha3
2021-11-02 14:47:28 -04:00
Tyler Goodlet 6a063b3814 Bump release date by a day 2021-11-02 12:59:26 -04:00
Tyler Goodlet 9da1abeecd Super naive attempt to skip 3.10 on windows 2021-11-02 12:19:58 -04:00
Tyler Goodlet 3452e18e6d Toss 3.10 into CI 2021-11-01 14:12:42 -04:00
Tyler Goodlet 8fdc548676 Alpha3 version bump and release notes 2021-11-01 14:02:45 -04:00
goodboy 5dbe8e4b14
Merge pull request #241 from goodboy/trionics
Trionics
2021-10-27 13:09:11 -04:00
goodboy 9c13827a14
Merge pull request #256 from overclockworked64/241-news-fragment
Add a news fragment
2021-10-27 12:38:15 -04:00
overclockworked64 6da76949fd
Fix the syntax and point to the new package 2021-10-27 17:03:25 +02:00
overclockworked64 49dd230b4f
Add a newline 2021-10-25 20:01:21 +02:00
overclockworked64 c7f59bd483
Add a news fragment 2021-10-25 19:17:42 +02:00
Tyler Goodlet 083b73ad4a Test: don't grab debug lock if not in mode 2021-10-25 10:22:41 -04:00