Add a `--spawn-backend` option which can be set to one of {'mp',
'trio_run_in_process'} which will either run the test suite using the
`multiprocessing` or `trio-run-in-process` backend respectively.
Currently trying to run both in the same session can result in hangs
seemingly due to a lack of cleanup of forkservers / resource trackers
from `multiprocessing` which cause broken pipe errors on occasion (no
idea on the details).
For `test_cancellation.py::test_nested_multierrors`, use less nesting
when mp is used since it breaks if we push it too hard with the
whole recursive subprocess spawning thing...
Set `trio-run-in-process` as the default on *nix systems and
`multiprocessing`'s spawn method on Windows. Enable overriding the
default choice using `tractor._spawn.try_set_start_method()`. Allows
for easy runs of the test suite using a user chosen backend.
It seems that mixing the two backends in the test suite results in hangs
due to lingering forkservers and resource managers from
`multiprocessing`? Likely we'll need either 2 separate CI runs to work
or someway to be sure that these lingering servers are killed in between
tests.
This took a ton of tinkering and a rework of the actor nursery tear down
logic. The main changes include:
- each subprocess is now spawned from inside a trio task
from one of two containing nurseries created in the body of
`tractor.open_nursery()`: one for `run_in_actor()` processes and one for
`start_actor()` "daemons". This is to address the need for
`trio-run-in_process.open_in_process()` opening a nursery which must
be closed from the same task that opened it. Using this same approach
for `multiprocessing` seems to work well. The nurseries are waited in
order (rip actors then daemon actors) during tear down which allows
for avoiding the recursive re-entry of `ActorNursery.wait()` handled
prior.
- pull out all the nested functions / closures that were in
`ActorNursery.wait()` and move into the `_spawn` module such that
that process shutdown logic takes place in each containing task's
code path. This allows for vastly simplifying `.wait()` to just contain an
event trigger which initiates process waiting / result collection.
Likely `.wait()` should just be removed since it can no longer be used
to synchronously wait on the actor nursery.
- drop `ActorNursery.__aenter__()` / `.__atexit__()` and move this
"supervisor" tear down logic into the closing block of `open_nursery()`.
This not only cleans makes the code more comprehensible it also
makes our nursery implementation look more like the one in `trio`.
Resolves#93
Get a few more things working:
- fail reliably when remote module loading goes awry
- do a real hacky job of module loading using `sys.path` stuffsies
- we're still totally borked when trying to spin up and quickly cancel
a bunch of subactors...
It's a small move forward I guess.
Well...after enough `# type: ignore`s `mypy` is happy, and after enough clicking of *rerun build* the windows CI passed so I think this is prolly gtg peeps!
Prepend the actor and task names in each log emission. This makes
debugging much more sane since you can see from which process and
running task the log message originates from!
Resolves#13
Another step toward having a complete test for #89.
Subactor breadth still seems to cause the most havoc and is why I've
kept that value to just 2 for now.
If a nursery fails to cancel (some sub-actors presumably) then hard kill
the whole process tree to avoid hangs during a catastrophic failure.
This logic may get factored out (and changed) as we introduce custom
supervisor strategies.
Add a test to verify that `trio.MultiError`s are properly propagated up
a simple actor nursery tree. We don't have any exception marshalling
between processes (yet) so we can't validate much more then a simple
2-depth tree. This satisfies the final bullet in #43.
Note I've limited the number of subactors per layer to around 5 since
any more then this seems to break the `multiprocessing` forkserver;
zombie subprocesses seem to be blocking teardown somehow...
Also add a single depth fast fail test just to verify that it's the
nested spawning that triggers this forkserver bug.
`trio.MultiError` isn't an `Exception` (derived instead from
`BaseException`) so we have to specially catch it in the task
invocation machinery and ship it upwards (like regular errors)
since nurseries running in sub-actors can raise them.