This took a ton of tinkering and a rework of the actor nursery tear down
logic. The main changes include:
- each subprocess is now spawned from inside a trio task
from one of two containing nurseries created in the body of
`tractor.open_nursery()`: one for `run_in_actor()` processes and one for
`start_actor()` "daemons". This is to address the need for
`trio_run_in_process.open_in_process()` opening a nursery which must
be closed from the same task that opened it. Using this same approach
for `multiprocessing` seems to work well. The nurseries are waited in
order (rip actors then daemon actors) during tear down, which avoids
the recursive re-entry of `ActorNursery.wait()` that had to be handled
previously.
- pull out all the nested functions / closures that were in
`ActorNursery.wait()` and move them into the `_spawn` module so that
the process shutdown logic takes place in each containing task's
code path. This allows for vastly simplifying `.wait()` to just contain an
event trigger which initiates process waiting / result collection.
Likely `.wait()` should just be removed since it can no longer be used
to synchronously wait on the actor nursery.
- drop `ActorNursery.__aenter__()` / `.__aexit__()` and move this
"supervisor" tear down logic into the closing block of `open_nursery()`.
This not only makes the code more comprehensible, it also
makes our nursery implementation look more like the one in `trio`.
Resolves #93
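The two-stage tear down described above can be sketched roughly as follows. This is only an illustration and not the actual `tractor` code: it uses stdlib `asyncio` as a stand-in for `trio`, the subprocesses are simulated, and `supervise()` and `open_nursery_sketch()` are hypothetical names. Each task owns one "process" for its whole lifetime, and waiting is reduced to setting an event.

```python
import asyncio


async def supervise(name, release, log):
    # Each task "spawns" its (simulated) subprocess and is also the
    # one that reaps it, so open and close happen in the same task.
    log.append(f"spawn {name}")
    await release.wait()
    log.append(f"reap {name}")


async def open_nursery_sketch():
    log = []
    rip_release = asyncio.Event()
    daemon_release = asyncio.Event()

    # One group for run_in_actor() style children, one for daemons.
    rip_tasks = [
        asyncio.create_task(supervise(f"rip-{i}", rip_release, log))
        for i in range(2)
    ]
    daemon_tasks = [
        asyncio.create_task(supervise("daemon-0", daemon_release, log))
    ]
    await asyncio.sleep(0)  # let every task reach its wait point

    # Tear down: the trigger only sets an event; the rip group is
    # waited on first, then the daemon group.
    rip_release.set()
    await asyncio.gather(*rip_tasks)
    daemon_release.set()
    await asyncio.gather(*daemon_tasks)
    return log
```

Running `asyncio.run(open_nursery_sketch())` returns the spawn and reap order, with the daemon reaped last.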
Get a few more things working:
- fail reliably when remote module loading goes awry
- do a real hacky job of module loading using `sys.path` stuffsies
- we're still totally borked when trying to spin up and quickly cancel
a bunch of subactors...
It's a small move forward I guess.
Prepend the actor and task names in each log emission. This makes
debugging much more sane since you can see which process and
running task the log message originates from!
Resolves #13
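One way to get this kind of prefix with the stdlib is a `logging.Filter` that stamps each record. The sketch below is an assumption about the general approach, not the project's logging setup: `ActorTaskFilter`, `make_logger()` and the format string are hypothetical, the process name stands in for the actor name, and `asyncio` task names stand in for `trio` task names.

```python
import asyncio
import io
import logging
import multiprocessing


class ActorTaskFilter(logging.Filter):
    """Attach the process ("actor") and task names to every record."""

    def filter(self, record):
        record.actor = multiprocessing.current_process().name
        try:
            task = asyncio.current_task()
        except RuntimeError:  # no event loop is running
            task = None
        record.task = task.get_name() if task else "no-task"
        return True


def make_logger(stream):
    logger = logging.getLogger("actor_log_sketch")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(actor)s %(task)s %(message)s")
    )
    handler.addFilter(ActorTaskFilter())
    logger.handlers = [handler]
    return logger


def demo():
    stream = io.StringIO()
    logger = make_logger(stream)

    async def worker():
        logger.info("hello")

    async def main():
        await asyncio.create_task(worker(), name="streamer")

    asyncio.run(main())
    return stream.getvalue()
```

Every line emitted through this logger starts with the process name followed by the name of the task that logged it.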
If a nursery fails to cancel (some sub-actors presumably) then hard kill
the whole process tree to avoid hangs during a catastrophic failure.
This logic may get factored out (and changed) as we introduce custom
supervisor strategies.
`trio.MultiError` isn't an `Exception` (derived instead from
`BaseException`) so we have to specially catch it in the task
invocation machinery and ship it upwards (like regular errors)
since nurseries running in sub-actors can raise them.
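A minimal illustration of why the special case is needed. `FakeMultiError` is a hypothetical stand-in for `trio.MultiError` (so the sketch runs without `trio`), and `invoke()` is a made-up simplification of the task invocation path.

```python
class FakeMultiError(BaseException):
    """Stand-in for an error type that derives from BaseException."""


def invoke_naive(func):
    try:
        return ("return", func())
    except Exception as err:  # does NOT catch BaseException subclasses
        return ("error", type(err).__name__)


def invoke(func):
    try:
        return ("return", func())
    except (Exception, FakeMultiError) as err:
        # Catch the special type explicitly so it can be shipped
        # upwards like any regular error.
        return ("error", type(err).__name__)


def raises_multi():
    raise FakeMultiError()
```

With `invoke_naive()` the error escapes the handler entirely, while `invoke()` turns it into an error result that can be relayed.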
Add `@tractor.stream` which must be used to denote non-async-generator
streaming functions which use the `tractor.Context` API to push values.
This makes the denotation more explicit and allows enforcing the
declaration of the `ctx` argument in definitions.
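The idea of a marking decorator that also enforces the `ctx` argument can be sketched like this. It is a hedged illustration only: the `stream()` function below and the `_is_stream` marker attribute are hypothetical and may not match how `@tractor.stream` is implemented.

```python
import inspect


def stream(func):
    """Mark `func` as a streaming function; require a `ctx` parameter."""
    params = inspect.signature(func).parameters
    if "ctx" not in params:
        raise TypeError(
            f"streaming function {func.__name__!r} must declare "
            "a `ctx` argument"
        )
    func._is_stream = True  # hypothetical marker checked at invoke time
    return func


@stream
async def push_values(ctx, count):
    for i in range(count):
        await ctx.send_yield(i)  # assumes a Context-like `ctx` object
```

A function decorated without a `ctx` parameter fails at definition time rather than when it is first invoked remotely.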
This begins moving toward explicitly decorated "streaming functions"
instead of checking for a `ctx` arg in the signature.
- provide each context with its task's top level `trio.CancelScope`
such that tasks can cancel themselves explicitly if needed via calling
`Context.cancel_scope()`
- make `Actor.cancel_task()` a private method (`_cancel_task()`) and
handle remote rpc calls specially such that the caller does not need
to provide the `chan` argument; non-primitive types can't be passed on
the wire and we don't want the client actor to require knowledge of
the channel instance the request is associated with. This also ties into
how we're tracking tasks right now (`Actor._rpc_tasks` is keyed by the
call id, a UUID, *plus* the channel).
- make `_do_handshake` a private actor method
- use UUID version 4
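The channel handling for remote cancel requests can be pictured as below. This is a simplified sketch with hypothetical names (`ActorSketch`, `handle_rpc()`, the message layout); only `_rpc_tasks`, `_cancel_task()`, `chan` and the UUID4 call id come from the notes above.

```python
import uuid


class ActorSketch:
    def __init__(self):
        # Keyed by (chan, cid): the call id alone is not the full key.
        self._rpc_tasks = {}

    def start_task(self, chan, cancel):
        cid = str(uuid.uuid4())  # UUID version 4 call id
        self._rpc_tasks[(chan, cid)] = cancel
        return cid

    def _cancel_task(self, cid, chan):
        cancel = self._rpc_tasks.pop((chan, cid), None)
        if cancel is None:
            return False
        cancel()
        return True

    def handle_rpc(self, chan, msg):
        if msg["cmd"] == "_cancel_task":
            # The caller only sends primitives (the cid); the receiving
            # side fills in the channel the request arrived on.
            return self._cancel_task(chan=chan, **msg["kwargs"])
        raise ValueError(f"unknown command {msg['cmd']!r}")
```

A request arriving on a different channel than the one that started the task does not match any entry, so it cancels nothing.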
Add full support for using the "spawn" process starting method as per:
https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
Add a `spawn_method` argument to `tractor.run()` for specifying the
desired method explicitly. By default use the "fastest" method available.
On *nix systems this is the original "forkserver" method.
This should be the solution to getting windows support!
Resolves #60
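Choosing a start method with the stdlib might look like the sketch below. `pick_start_method()` is a hypothetical helper and its preference order ("forkserver", then "spawn") is an assumption based on the description above, not the exact logic behind `spawn_method`.

```python
import multiprocessing


def pick_start_method(requested=None):
    """Return a usable start method, honouring an explicit request."""
    available = multiprocessing.get_all_start_methods()
    if requested is not None:
        if requested not in available:
            raise ValueError(
                f"start method {requested!r} is not supported here; "
                f"choose one of {available}"
            )
        return requested
    # Assumed default order: forkserver where it exists (*nix),
    # otherwise spawn, which every platform provides.
    for method in ("forkserver", "spawn"):
        if method in available:
            return method
    return available[0]


# A context object exposes Process, Queue, etc. bound to that method.
ctx = multiprocessing.get_context(pick_start_method())
```

On Windows only "spawn" is available, so the same helper falls through to it without any platform check.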
As mentioned in prior commits there's currently a bug in Python that
makes async gens **not** task safe. Since this is the core cause of almost
all recent problems, instead implement our own async iterator derivative of
`trio.abc.ReceiveChannel` by wrapping a `trio._channel.MemoryReceiveChannel`.
This fits more natively with the memory channel API in `trio` and adds
potentially more flexibility for possible bidirectional inter-actor streaming
in the future.
Huge thanks to @oremanj and of course @njsmith for guidance on this one!
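The shape of such an async iterator, as opposed to an async generator, is roughly the following. To keep the sketch runnable with only the stdlib, an `asyncio.Queue` stands in for the `trio` memory receive channel; `ReceiveStreamSketch` and the `_END` sentinel are hypothetical.

```python
import asyncio

_END = object()  # hypothetical end-of-stream marker


class ReceiveStreamSketch:
    """Plain async iterator wrapping a queue (no async generator)."""

    def __init__(self, queue):
        self._queue = queue
        self._closed = False

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self._closed:
            raise StopAsyncIteration
        item = await self._queue.get()
        if item is _END:
            self._closed = True
            raise StopAsyncIteration
        return item

    async def aclose(self):
        # Only flips a flag, so there is no generator frame to unwind.
        self._closed = True


async def demo():
    queue = asyncio.Queue()
    for value in (1, 2, 3):
        queue.put_nowait(value)
    queue.put_nowait(_END)
    return [item async for item in ReceiveStreamSketch(queue)]
```

Because the object holds no suspended generator frame, closing it does not involve resuming a frame that another task may be running.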
For now stop `.aclose()`-ing all async gens on portal close since it can
cause hangs and other weird behaviour if another task operates on the
same instance.
See https://bugs.python.org/issue32526.
Use an inner function / closure to properly process required arguments
at call time as is recommended in the `wrapt` docs. Do async gen and
arg introspection at decorate time and raise appropriate type errors.
Turns out you get a bad situation if the target actor whose task you're
trying to cancel has already died (eg. from an external
`KeyboardInterrupt` or other error) and so we need to eventually bail on
the RPC request. Also don't bother closing the channel created in
`open_portal()` manually since the cancel scope should take care of all
that.
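Bailing on a cancel request that never gets answered amounts to putting a timeout around it. The sketch below shows the pattern with stdlib `asyncio` as a stand-in (in `trio` a cancel scope with a deadline would play this role); `request_cancel()` and the timeout value are hypothetical.

```python
import asyncio


async def request_cancel(send_cancel, timeout=0.05):
    """Return True if the peer acknowledged, False if we gave up."""
    try:
        await asyncio.wait_for(send_cancel(), timeout)
        return True
    except asyncio.TimeoutError:
        # The target is probably already dead; stop waiting.
        return False


async def demo():
    async def dead_peer():
        await asyncio.Event().wait()  # never answers

    async def live_peer():
        return None  # acknowledges immediately

    return (
        await request_cancel(dead_peer, 0.01),
        await request_cancel(live_peer, 0.01),
    )
```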
- when calling the async gen func provided by the user wrap it in
`@async_generator.aclosing` to ensure correct teardown at cancel time
- expect the gen to yield a dict with topic keys and data values
- add a `packetizer` function argument to the api allowing a user
to format the data to be published in whatever way desired
- support using the decorator without the parentheses (using default
arguments)
- use a `wrapt` "adapter" to override the signature presented to the
`_actor._invoke` inspection machinery
- handle the default case where `tasks` isn't provided; allow only one
concurrent publisher task
- store task locks in an actor local variable
- add a comprehensive doc string
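A few of the points above (optional parentheses, decorate-time checks, topic dicts and the `packetizer` hook) can be sketched with the stdlib alone. This is not the `wrapt` based implementation: `pub()` here is a simplified hypothetical, it omits the task locks, and the `publish(topics, send)` signature is an assumption.

```python
import asyncio
import functools
import inspect


def pub(func=None, *, packetizer=None):
    if func is None:
        # Used as `@pub(...)`: return the real decorator.
        return functools.partial(pub, packetizer=packetizer)
    if not inspect.isasyncgenfunction(func):
        raise TypeError(
            f"{func.__name__} must be an async generator function"
        )

    @functools.wraps(func)
    async def publish(topics, send):
        agen = func()
        try:
            # Each yielded value is a dict of {topic: data}.
            async for published in agen:
                for topic, data in published.items():
                    if topic not in topics:
                        continue
                    if packetizer is not None:
                        packet = packetizer(topic, data)
                    else:
                        packet = {topic: data}
                    await send(packet)
        finally:
            await agen.aclose()  # ensure teardown at cancel time

    return publish


@pub
async def ticks():
    yield {"a": 1, "b": 2}
    yield {"a": 3}


@pub(packetizer=lambda topic, data: (topic, data))
async def tuples():
    yield {"a": 1}


async def collect(publisher, topics):
    out = []

    async def send(packet):
        out.append(packet)

    await publisher(topics, send)
    return out
```

Here `collect(ticks, {"a"})` gathers only the packets for topic `"a"`, and the second publisher shows a custom packet format.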
Use the new `Actor.cancel_task()` api to remotely cancel streaming
tasks spawned by a portal. This guarantees that if an actor is
cancelled all its (remote) portal spawned tasks will be as well.
On portal teardown only cancel all async
generator calls (though we should cancel all RPC requests in general
eventually) and don't close the channel since it may have been passed
in from some other context that wishes to keep it connected. In
`open_portal()` run the message loop shielded so that if the local
task is cancelled, messaging will continue until the internal scope
is cancelled at end of block.
Enable cancelling specific tasks from a peer actor such that when
an actor task or the actor itself is cancelled, remotely spawned tasks
can also be cancelled. In much the same way that you'd expect a node
(task) in the `trio` task tree to cancel any subtasks, actors should
be able to cancel any tasks they spawn in separate processes.
To enable this:
- track rpc tasks in a flat dict keyed by (chan, cid)
- store an `is_complete` event to enable waiting on specific
tasks to complete
- allow for shielding the msg loop inside an internal cancel scope
if requested by the caller; there was an issue with `open_portal()`
where the channel would be torn down because the current task was
cancelled but we still need messaging to continue until the portal
block is exited
- throw an error if the arbiter tries to find itself for now
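The flat task table with a completion event can be sketched as follows, again with stdlib `asyncio` standing in for `trio`. `run_rpc()` and `cancel_rpc()` are hypothetical helpers; only the `(chan, cid)` key and the `is_complete` event come from the list above.

```python
import asyncio


async def run_rpc(registry, key, coro_fn):
    is_complete = asyncio.Event()

    async def runner():
        try:
            await coro_fn()
        finally:
            # Signal completion however the task ended.
            is_complete.set()
            registry.pop(key, None)

    task = asyncio.create_task(runner())
    registry[key] = (task, is_complete)
    return task


async def cancel_rpc(registry, key):
    entry = registry.get(key)
    if entry is None:
        return False
    task, is_complete = entry
    task.cancel()
    await is_complete.wait()  # wait for this specific task only
    return True


async def demo():
    registry = {}
    key = ("chan-0", "cid-0")  # flat (chan, cid) key

    async def forever():
        await asyncio.Event().wait()

    await run_rpc(registry, key, forever)
    await asyncio.sleep(0)  # let the task start before cancelling it
    cancelled = await cancel_rpc(registry, key)
    return cancelled, key in registry
```

The caller gets back control only after the targeted task has actually finished unwinding, and the entry is gone from the table.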