As mentioned in prior commits there's currently a bug in Python that
makes async gens **not** task safe. Since this is the core cause of almost
all recent problems, instead implement our own async iterator derivative of
`trio.abc.ReceiveChannel` by wrapping a `trio._channel.MemoryReceiveChannel`.
This fits more natively with the memory channel API in ``trio`` and adds
potentially more flexibility for possible bidirectional inter-actor streaming
in the future.
Huge thanks to @oremanj and of course @njsmith for guidance on this one!
For now stop `.aclose()`-ing all async gens on portal close since it can
cause hangs and other weird behaviour if another task operates on the
same instance.
See https://bugs.python.org/issue32526.
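Roughly, the wrapping idea looks like the following sketch (the class, helper
names and buffer size here are illustrative, not the actual implementation):

```python
import trio


class StreamReceiveChannel(trio.abc.ReceiveChannel):
    """Illustrative wrapper: pull messages from an underlying memory
    receive channel so consumers get proper ``trio`` checkpoint and
    cancellation semantics instead of sharing a (non task safe) async gen.
    """

    def __init__(self, rx: trio.abc.ReceiveChannel):
        self._rx = rx

    async def receive(self):
        # `trio.abc.ReceiveChannel` provides `__aiter__`/`__anext__` in
        # terms of `receive()`, so async-for iteration comes for free
        return await self._rx.receive()

    async def aclose(self):
        await self._rx.aclose()


async def example():
    send_chan, recv_chan = trio.open_memory_channel(2**6)
    stream = StreamReceiveChannel(recv_chan)
    async with send_chan:
        await send_chan.send('msg')
    async for msg in stream:  # terminates on `trio.EndOfChannel`
        print(msg)
```

Since the iteration protocol lives on the channel object rather than inside a
generator frame, multiple tasks can safely receive from (or close) the same
stream instance.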
Turns out you get a bad situation if the target actor whose task you're
trying to cancel has already died (e.g. from an external
`KeyboardInterrupt` or other error) and so we need to eventually bail on
the RPC request. Also don't bother closing the channel created in
`open_portal()` manually since the cancel scope should take care of all
that.
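The bail-out amounts to wrapping the remote cancel request in a timeout; a
rough sketch only, where the helper, the channel methods and the timeout
value are all made up for illustration:

```python
import trio


async def request_remote_cancel(chan, cid) -> bool:
    """Hypothetical helper: ask the target actor to cancel task `cid` but
    give up if no ack arrives, e.g. because the actor already died."""
    with trio.move_on_after(0.5) as cs:
        await chan.send({'cmd': 'cancel_task', 'cid': cid})
        await chan.recv()  # wait for the ack
    if cs.cancelled_caught:
        # no reply in time; the target probably died from an external
        # `KeyboardInterrupt` or other error, so stop waiting
        return False
    return True
```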
Use the new `Actor.cancel_task()` API to remotely cancel streaming
tasks spawned by a portal. This guarantees that if an actor is
cancelled all its (remote) portal spawned tasks will be as well.
On portal teardown cancel only the async generator calls (though
eventually we should cancel all RPC requests in general) and don't
close the channel since it may have been passed
in from some other context that wishes to keep it connected. In
`open_portal()` run the message loop shielded so that if the local
task is cancelled, messaging will continue until the internal scope
is cancelled at end of block.
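The shielding pattern looks roughly like this (a sketch only; the loop body
and the portal object are stand-ins):

```python
from contextlib import asynccontextmanager

import trio


async def _msg_loop(channel, task_status=trio.TASK_STATUS_IGNORED):
    # the shield keeps messaging alive even if the portal user's task is
    # cancelled; only cancelling `cs` itself stops the loop
    with trio.CancelScope(shield=True) as cs:
        task_status.started(cs)
        async for msg in channel:  # assumes an async-iterable channel
            pass  # dispatch `msg` here


@asynccontextmanager
async def open_portal_sketch(channel):
    async with trio.open_nursery() as nursery:
        loop_cs = await nursery.start(_msg_loop, channel)
        try:
            yield 'portal'  # stand-in for the real `Portal`
        finally:
            # teardown at end of block: this is the only thing that can
            # cancel the shielded loop
            loop_cs.cancel()
```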
Use the new custom error types throughout the actor and portal
primitives and set a few new rules:
- internal errors are any error not raised by an rpc task and are
**not** forwarded to portals but instead are raised directly in
the msg loop.
- portals always re-raise a "main task" error for every call to
``Portal.result()``.
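For the second rule the portal behaviour boils down to something like this
(a toy sketch; the error type name is a stand-in):

```python
class RemoteActorError(Exception):
    """Stand-in for the custom error type wrapping a remote traceback."""


class PortalSketch:
    """Toy portal: a received "main task" error is stored once and then
    re-raised on *every* `.result()` call instead of being swallowed."""

    def __init__(self):
        self._final_result = None
        self._remote_error = None

    async def result(self):
        if self._remote_error is not None:
            raise RemoteActorError(self._remote_error)
        return self._final_result
```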
This is purely for documentation purposes for now as it should be
obvious a bunch of the signatures aren't using the correct "generics"
syntax (e.g. the use of `(str, int)` instead of `typing.Tuple[str, int]`)
in a bunch of places. We're also not using a type checker yet and besides,
`trio` doesn't really expose a lot of its internal types very well.
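For reference, the intended fix is just the standard `typing` generics, e.g.
(hypothetical signature):

```python
from typing import Tuple


def locate_actor(name: str) -> Tuple[str, int]:
    """Illustrative only: the return annotation should be
    `Tuple[str, int]`, not the invalid `(str, int)` tuple literal."""
    return ('127.0.0.1', 0)
```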
This ensures that internal errors received from a remote actor are
indeed raised even in the `MainProcess` **before** comms tasks are
cancelled. Internal error in this case means any error packet received
on a channel that doesn't have a `cid` header. RPC errors (which **do**
have a `cid` header) are still forwarded to the consuming caller as usual.
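In other words the routing rule is keyed purely on the presence of a `cid`;
a sketch where the callback and error type are made up:

```python
class InternalActorError(Exception):
    """Stand-in for the internal (non-RPC) error type."""


async def route_error(msg: dict, send_to_caller) -> None:
    if 'cid' in msg:
        # RPC error: forward to whichever caller is waiting on this cid
        await send_to_caller(msg['cid'], msg['error'])
    else:
        # internal error: raise right here in the msg loop (even in the
        # `MainProcess`) before any comms tasks are torn down
        raise InternalActorError(msg['error'])
```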
Start a forkserver once in the main (parent-most) process
and pass ipc info (fds) to subprocesses manually such that embedded
calls to `multiprocessing.Process.start()` just work. Note that this
relies on our overridden version of the stdlib's
`multiprocessing.forkserver` module.
Resolves #6
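The gist, stripped of tractor's overridden module, is something like the
following (it pokes at private stdlib attributes and is a simplification,
not the actual spawning code):

```python
import multiprocessing.forkserver as forkserver


def get_fs_info():
    # parent-most process: start the fork server exactly once and grab
    # its connection details to hand down to children
    forkserver._forkserver.ensure_running()
    return (
        forkserver._forkserver._forkserver_address,
        forkserver._forkserver._forkserver_alive_fd,
    )


def seed_fs_info(address, alive_fd):
    # subprocess side: re-seed the module level singleton so that nested
    # `multiprocessing.Process.start()` calls reuse the parent's server
    # instead of trying to launch a new one
    forkserver._forkserver._forkserver_address = address
    forkserver._forkserver._forkserver_alive_fd = alive_fd
```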
Stop worrying about a "main task" in each actor and instead add an
additional `ActorNursery.run_in_actor()` method which wraps calls
to create an actor and run a lone RPC task inside it. Note this
adjusts the public API of `ActorNursery.start_actor()` to drop
its `main` kwarg.
The dirty deets of making this possible:
- each spawned RPC task is now tracked with a specific cancel scope such
that when the actor is cancelled all ongoing responders are cancelled
before any IPC/channel machinery is closed (turns out that spawning
new actors from `outlive_main=True` actors was probably borked before
finally getting this working).
- make each initial RPC response be a packet which describes the
`functype` (e.g. `{'functype': 'asyncfunction'}`) allowing for async
calls/submissions by client actors (this was required to make
`run_in_actor()` work - `Portal._submit()` is the new async method).
- hooray we can stop faking "main task" results for daemon actors
- add better handling/raising of internal errors caught in the bowels of
the `Actor` itself.
- drop the rpc spawning nursery; just use the `Actor._root_nursery`
- only wait on `_no_more_peers` if there are existing peer channels that
are actually still connected.
- `ActorNursery.__aexit__()` now implicitly waits on `Portal.result()` for
each `run_in_actor()` spawned actor.
- handle cancelling partial started actors which haven't yet connected
back to the parent
Resolves #24
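Usage ends up looking roughly like this (exact signatures, e.g. the
positional `name` argument and the `tractor.run()` entry point, may differ
from what this commit actually lands):

```python
import tractor


async def add(x, y):
    return x + y


async def main():
    async with tractor.open_nursery() as nursery:
        # spawn a fresh actor and run a single RPC task inside it
        portal = await nursery.run_in_actor('adder', add, x=1, y=2)
        # explicit fetch shown for clarity; the nursery's `__aexit__`
        # otherwise awaits (and re-raises errors from) it implicitly
        assert await portal.result() == 3


if __name__ == '__main__':
    tractor.run(main)
```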