Commit Graph

715 Commits (a5d27ebcf5950e22f26ff7ff26252fd58bf88e98)

Author SHA1 Message Date
goodboy 4da16325f3
Merge pull request #144 from goodboy/dereg_on_channel_aclose
Fix for dereg failure on manual stream close leading to an internal nursery composition rework.
2020-08-13 13:56:47 -04:00
Tyler Goodlet 451170bb63 Pass explicit kwargs to new discovery test funcs 2020-08-13 13:26:08 -04:00
Tyler Goodlet ec5d443ee5 Always log actor errors 2020-08-13 11:55:22 -04:00
Tyler Goodlet 863a4b7933 Update copyright date 2020-08-13 11:55:03 -04:00
Tyler Goodlet 0c8dcd0ec5 Use allocated arbiter port in local reg test 2020-08-13 11:54:37 -04:00
Tyler Goodlet 1ae0efb033 Make rpc_module_paths a list 2020-08-13 11:53:45 -04:00
Tyler Goodlet 8a995beb6a Docs fixes 2020-08-08 22:29:57 -04:00
Tyler Goodlet 292513b353 Module define default accept addr 2020-08-08 20:58:04 -04:00
Tyler Goodlet b3eba00c3a Appease the great mypy 2020-08-08 20:57:43 -04:00
Tyler Goodlet 42be410076 Handle mp accept_addr 2020-08-08 20:27:43 -04:00
Tyler Goodlet acd5b80f4c Add close channel test with remote arbiter 2020-08-08 15:17:04 -04:00
Tyler Goodlet c821690834 Actor cancellation is now more latent; loosen timing 2020-08-08 15:16:10 -04:00
Tyler Goodlet 7f74182a8a Never allow more than info logging in daemon; causes blocking 2020-08-08 15:15:43 -04:00
Tyler Goodlet 8477d21499 Restructure actor runtime nursery scoping
In an effort to acquire more deterministic actor cancellation,
this adds a clearer and more resilient (whilst possibly a bit
slower) internal nursery structure with explicit semantics for
clarifying the task-scope shutdown sequence.

Namely, on cancellation, the explicit steps are now:
- cancel all currently running rpc tasks and wait
  for them to complete
- cancel the channel server and wait for it to complete
- cancel the msg loop for the channel with the immediate parent
- de-register with arbiter if possible
- wait on remaining connections to release
- exit process

To accomplish this add a new nursery called the "service nursery" which
spawns all rpc tasks **instead of using** the "root nursery". The root
is now used solely for async launching the msg loop for the primary
channel with the parent such that it is (nearly) the last thing torn
down on cancellation.

In the future it should also be possible to have `self.cancel()` return
a result to the parent once the runtime is sure that the rest of the
shutdown is atomic; this would allow for a true unbounded shield in
`Portal.cancel_actor()`. This will likely require that the error
handling blocks in `Actor._async_main()` are moved "inside" the root
nursery block such that the msg loop with the parent truly is the last
thing to terminate.
2020-08-08 14:55:41 -04:00
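
The teardown order listed in the commit above can be sketched as follows. This is only an illustration of the ordering, not the project's code: it uses stdlib `asyncio` tasks in place of `trio` nurseries so that it runs without third-party dependencies, and every name in it (`run_forever`, `cancel_and_wait`, `actor_shutdown_demo`, the log strings) is hypothetical.

```python
import asyncio


async def run_forever(name, log):
    """Stand-in for a long-lived runtime task; records when it is cancelled."""
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        log.append(f"{name} cancelled")
        raise


async def cancel_and_wait(tasks):
    """Cancel every task in the group, then wait until all have finished."""
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)


async def actor_shutdown_demo():
    log = []
    # "Service" group: the rpc tasks (hypothetical stand-ins).
    rpc_tasks = [
        asyncio.ensure_future(run_forever(f"rpc task {i}", log))
        for i in range(2)
    ]
    channel_server = asyncio.ensure_future(run_forever("channel server", log))
    # "Root" group: only the msg loop with the parent lives here.
    parent_msg_loop = asyncio.ensure_future(run_forever("parent msg loop", log))
    await asyncio.sleep(0)  # give every task a chance to start

    await cancel_and_wait(rpc_tasks)          # 1. rpc tasks first
    await cancel_and_wait([channel_server])   # 2. then the channel server
    await cancel_and_wait([parent_msg_loop])  # 3. parent msg loop (nearly) last
    log.append("de-registered with arbiter")  # 4. placeholder for the real step
    log.append("remaining connections released")  # 5. placeholder as well
    return log


SHUTDOWN_LOG = asyncio.run(actor_shutdown_demo())
print("\n".join(SHUTDOWN_LOG))
```

Keeping the parent msg loop in its own group is what lets it outlive the rpc tasks and the channel server, which is the point of the separate "service nursery" described in the commit message.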
Tyler Goodlet 90c7fa6963 Allow shielding in `open_portal()` 2020-08-08 14:47:52 -04:00
Tyler Goodlet 532429aec9 Harden `trio` spawner process waiting
Always shield waiting for the process and always run
``trio.Process.__aexit__()`` on teardown. This enforces
that shutdown happens due to cancellation triggered inside
the sub-actor instead of the process being killed externally
by the parent.
2020-08-08 14:43:25 -04:00
Tyler Goodlet fe45d99f65 Allow opening a portal through an existing channel 2020-08-07 12:02:06 -04:00
Tyler Goodlet ae8488a578 Always shield de-register step with arbiter 2020-08-07 11:36:26 -04:00
Tyler Goodlet 3a868fec30 Cancel root nursery to trigger failure
The real issue is if the root nursery gets cancelled prior to
de-registration with the arbiter. This doesn't seem easy to
reproduce by side effect of a KBI however that is how it was
discovered in practise.
2020-08-07 11:34:17 -04:00
Tyler Goodlet d2d8860dad Add test for dereg failure on manual stream close
There was code from the last de-registration fix PR that I had commented
(to do with shielding arbiter dereg steps in `Actor._async_main()`) because
the block didn't seem to make a difference under infinite streaming
tests. Turns out it **for sure** is needed under certain conditions (likely
if the actor's root nursery is cancelled prior to actor nursery exit).
This was an attempt to simulate the failure mode if you manually close the
stream **before** cancelling the containing **actor**.

More tests to come I guess.
2020-08-07 09:16:01 -04:00
Guillermo Rodriguez 8da45eedf4
Merge pull request #143 from goodboy/ensure_deregister
Ensure actors de-register with arbiter when cancelled during infinite streaming.
2020-08-04 12:19:02 -03:00
Tyler Goodlet 09ae51900d Better clarify uid comment 2020-08-04 09:52:49 -04:00
Tyler Goodlet 4f92cfe74f Don't `.aclose` `trio` processes until the very end
Trio will kill subprocesses via `Process.__aexit__()` using a `finally:`
block (which, yes, will get triggered on cancellation) so we avoid that
until true process "tear down" since subactors do many things during
graceful shutdown (such as de-registering from the name discovery
system). Oddly this only seems to be an issue during cancellation of
infinite stream consumption.

Resolves #141
2020-08-03 18:57:00 -04:00
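
The reason an early `__aexit__()` is a problem can be shown with a small sketch: cleanup placed in a `finally:` block runs as soon as the enclosing task is cancelled. The example below assumes stdlib `asyncio` as a stand-in for `trio`, and `fake_process` is a hypothetical context manager, not `trio.Process`.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def fake_process(log):
    """Hypothetical process handle whose exit path 'kills' the child."""
    log.append("spawned")
    try:
        yield
    finally:
        # Runs on normal exit *and* on cancellation of the enclosing task.
        log.append("killed")


async def consume_forever(log):
    async with fake_process(log):
        await asyncio.sleep(3600)  # stands in for consuming an infinite stream


async def cancel_demo():
    log = []
    task = asyncio.ensure_future(consume_forever(log))
    await asyncio.sleep(0)  # let the task enter the context manager
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return log


CANCEL_LOG = asyncio.run(cancel_demo())
print(CANCEL_LOG)
```

In this sketch the "kill" happens immediately on cancellation, leaving the child no time for graceful steps such as de-registering, which is why the commit defers closing the process until the very end.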
Tyler Goodlet ae9016c06a Log on KBI cancelled termination 2020-08-03 18:46:18 -04:00
Tyler Goodlet a24c6bfdd2 Correctly catch cancelled nursery case (purely for logging) 2020-08-03 18:44:50 -04:00
Tyler Goodlet 56b81f07e5 Return `Dict[Tuple, Tuple]` from `.get_registry()` 2020-08-03 18:42:23 -04:00
Tyler Goodlet fbd68d2d91 Allow for tuple keys with std `msgpack` 2020-08-03 18:41:21 -04:00
Tyler Goodlet a5279f80a7 Actually reproduce the de-registration problem
This truly reproduces #141. It turns out the problem only occurs when
we're cancelled in the middle of consuming "infinite streams".
Good news is this tests a lot of edge cases :)
2020-08-03 18:28:09 -04:00
Tyler Goodlet 699bfd1857 Run unreg on cancel tests with remote arbiter as well 2020-08-03 15:41:41 -04:00
Tyler Goodlet 639299e6eb Expose a `.get_registry()` method on the arbiter 2020-08-03 15:40:41 -04:00
Tyler Goodlet 2ccaa94c60 Move daemon fixture up to conftest 2020-08-03 15:39:54 -04:00
Tyler Goodlet 0d9483376d Test cancel with SIGINT on non-windows as well 2020-08-03 13:01:56 -04:00
Tyler Goodlet cd2d8c217a Test that subactors deregister on cancel 2020-08-03 12:53:03 -04:00
goodboy a399bd3033
Merge pull request #133 from guilledk/drop_cloudpickle
Drop cloudpickle dependency
2020-07-29 18:24:27 -04:00
Guillermo Rodriguez 3e29fcf1ea
Docstring to the top!, and redundant spaces goodbye! 2020-07-29 15:39:38 -03:00
Guillermo Rodriguez a565d38251
Merge pull request #2 from goodboy/start_up_sequence_trickery
Start up sequence trickery
2020-07-29 15:02:51 -03:00
Tyler Goodlet da56d0f043 Add slight delays to SIGINT tests on mp 2020-07-29 13:27:15 -04:00
Tyler Goodlet 8f17c89cf9 Skip **every** quad test for mp on ci 2020-07-29 10:26:19 -04:00
Tyler Goodlet 9a40291d4a Repair startup sequence around parent state transfer
In order to have reliable subactor startup we need the following
sequence to take place:
- connect to the parent actor, handshake and receive runtime state
- load exposed modules into memory
- start the channel server up fully using the provided bind address
- finally, start processing new messages from the parent

Add a bunch more comments to clarify all this.
2020-07-28 22:25:22 -04:00
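
The startup ordering described in the commit above might look roughly like this. It is a sketch under stated assumptions: the parent channel is modelled as an `asyncio.Queue`, the channel server is a stub, and the state dict layout is hypothetical (only the names `rpc_module_paths` and `bind_addr` come from commit messages in this log).

```python
import asyncio
import importlib


async def child_startup(parent_chan, log):
    # 1. Connect to the parent, handshake and receive runtime state.
    state = await parent_chan.get()
    log.append("received runtime state")

    # 2. Load the exposed modules into memory.
    modules = {
        path: importlib.import_module(path)
        for path in state["rpc_module_paths"]
    }
    log.append("loaded modules: " + ", ".join(sorted(modules)))

    # 3. Start the channel server on the provided bind address (stubbed here).
    host, port = state["bind_addr"]
    log.append(f"channel server listening on {host}:{port}")

    # 4. Only now start processing new messages from the parent.
    while True:
        msg = await parent_chan.get()
        if msg is None:  # hypothetical sentinel meaning "no more messages"
            break
        log.append("processed " + msg)


async def startup_demo():
    log = []
    chan = asyncio.Queue()
    # The parent may queue a request before the child is fully up.
    chan.put_nowait(
        {"rpc_module_paths": ["json", "math"], "bind_addr": ("127.0.0.1", 9000)}
    )
    chan.put_nowait("rpc request")
    chan.put_nowait(None)
    await child_startup(chan, log)
    return log


STARTUP_LOG = asyncio.run(startup_demo())
print("\n".join(STARTUP_LOG))
```

Because the request is only read in step 4, it cannot be handled before the modules are loaded and the server is up, even though the parent sent it early.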
Guillermo Rodriguez 0a5691e0a8
Removed arbiter_addr local, and bind_addr is now passed through channel, in early child actor init. 2020-07-28 11:55:11 -03:00
Guillermo Rodriguez 8b44ec7a5d
Actually dropping the cloudpickle dependency from setup.py 2020-07-27 21:10:04 -03:00
Guillermo Rodriguez ef053eb070
Added named arguments to child init, and now passing less of them. 2020-07-27 21:05:00 -03:00
Guillermo Rodriguez e5dbf14ec3
Only await params in trio mode 2020-07-27 15:20:55 -03:00
Guillermo Rodriguez 2a407be532
Now passing additional initialization parameters through channel early after handshake. 2020-07-27 14:55:37 -03:00
goodboy 2cc4d7ce04
Merge pull request #135 from goodboy/fix_win_ci_again
Fix windows CI, again.
2020-07-27 13:19:01 -04:00
Tyler Goodlet 5715fd4599 Skip streaming tests 2020-07-27 12:20:46 -04:00
Tyler Goodlet e8a38e4d15 Fix cancelled type handling 2020-07-27 11:15:05 -04:00
goodboy ed96672136
Merge pull request #128 from goodboy/flaky_tests
Drop trio-run-in-process, use pure trio process spawner, test out of channel ctrl-c subactor cancellation
2020-07-26 23:59:58 -04:00
Tyler Goodlet 3c7ec72f8e Fix SIGINT test names 2020-07-26 23:37:44 -04:00
Tyler Goodlet 5a27065a10 Finally tame the super flaky tests
- ease up on first stream test run deadline
- skip streaming tests in CI for mp backend, period
- give up on > 1 depth nested spawning with mp
- completely give up on slow spawning on windows
2020-07-26 22:53:40 -04:00