Commit Graph

453 Commits (f72eabd42aa32426c4db087ac325ace26bf5cde5)

Author SHA1 Message Date
Tyler Goodlet 14114547e8 Expose `@context` decorator at top level 2021-07-06 08:23:29 -04:00
Tyler Goodlet e3955bb62b Add initial bi-directional streaming
This mostly adds the api described in
https://github.com/goodboy/tractor/issues/53#issuecomment-806258798

The first draft summary:
- formalize bidir steaming using the `trio.Channel` style interface
  which we derive as a `MsgStream` type.
- add `Portal.open_context()` which provides a `trio.Nursery.start()`
  remote task invocation style for setting up and tearing down tasks
  contexts in remote actors.
- add a distinct `'started'` message to the ipc protocol to facilitate
  `Context.start()` with a first return value.
- for our `ReceiveMsgStream` type, don't cancel the remote task in
  `.aclose()`; this is now done explicitly by the surrounding `Context`
   usage: `Context.cancel()`.
- streams in either direction still use a `'yield'` message keeping the
  proto mostly symmetric without having to worry about which side is the
  caller / portal opener.
- subtlety: only allow sending a `'stop'` message during a 2-way
  streaming context from `ReceiveStream.aclose()`, detailed comment
  with explanation is included.

Relates to #53
2021-07-06 08:23:29 -04:00
Tyler Goodlet 6aab16f877 Drop added logging around root cancel 2021-07-04 11:00:08 -04:00
Tyler Goodlet caa70245e0 Try remapping all broken errs wholesale on windows 2021-07-04 10:47:15 -04:00
Tyler Goodlet 3f75732b02 Remap windows specific connection reset error 2021-07-04 10:25:19 -04:00
Tyler Goodlet 1edf5c2f06 Specially remap TCP 104-connection-reset to `TransportClosed`
Since we currently have no real "discovery protocol" between process
trees, the current naive approach is to check via a connect and drop to
see if a TCP server is bound to a particular address during root actor
startup. This was a historical decision and had no real grounding beyond
taking a simple approach to get something working when the project
was first started.

This  is obviously problematic from an error handling perspective since
we need to be able to avoid such quick connect-and-drops from cancelling
an "arbiter"'s (registry actor's) channel-msg loop machinery (which
would propagate and cancel the actor).

For now we map this particular TCP error, which gets remapped by `trio`
as a `trio.BrokenResourceError` to our own internal `TransportClosed`
which is swallowed by channel message loop processing and indicates
a graceful teardown of the far end actor.
2021-07-03 18:57:54 -04:00
Tyler Goodlet a2d400583f Fix tuple type 2021-07-02 18:10:06 -04:00
Tyler Goodlet 32b4ae0603 Accept transport closed error during handshake and msg loop 2021-07-02 11:38:24 -04:00
Tyler Goodlet 80e100f818 Add our own "transport closed" signal
This change some super old (and bad) code from the project's very early
days. For some redic reason i must have thought masking `trio`'s
internal stream / transport errors and a TCP EOF as `StopAsyncIteration`
somehow a good idea. The reality is you probably
want to know the difference between an unexpected transport error
and a simple EOF lol. This begins to resolve that by adding our own
special `TransportClosed` error to signal the "graceful" termination of
a channel's underlying transport. Oh, and this builds on the `msgspec`
integration which helped shed light on the core issues here B)
2021-07-02 11:36:22 -04:00
Tyler Goodlet 73e123bac7 Fix line length 2021-05-07 11:21:40 -04:00
Tyler Goodlet 1584c547cd Drop run and rpc_module_paths from discovery tests 2021-05-07 11:21:40 -04:00
Tyler Goodlet 87971de1d9 Re-raise any sidestepped `trio.Cancelled` 2021-05-06 12:05:17 -04:00
Tyler Goodlet 9f38406e85 Appease mypy 2021-05-06 12:05:17 -04:00
Tyler Goodlet c4b42000eb Shield around root actor cancel 2021-05-06 12:05:17 -04:00
Tyler Goodlet 607c48f1ac Distinctly separate and harden mp spawning
It's clear now that special attention is needed to handle the case where
a spawned `multiprocessing` proc is started but then the parent is
cancelled before the child can connect back; in this case we need to be
sure to kill the near-zombie child asap. This may end up being the
solution to other resiliency issues seen around mp with nested process
trees too. More testing is needed to be sure.

Relates to #84 #89 #134 #146
2021-05-06 12:05:17 -04:00
Tyler Goodlet fc36e73628 Comment out `MsgStream` for now 2021-04-28 16:40:38 -04:00
Tyler Goodlet f59346d854 Add func type checking to `.run_in_actor()` 2021-04-28 12:23:08 -04:00
Tyler Goodlet 86fc418050 Error on bad registry pops 2021-04-28 12:23:08 -04:00
Tyler Goodlet 83af295b45 Fix func type checking 2021-04-28 12:23:08 -04:00
Tyler Goodlet ad9256bcdb Drop stream exhaustion; no longer needed 2021-04-28 12:23:08 -04:00
Tyler Goodlet 3e19fd311b Move debugger locking to new stream api 2021-04-28 12:23:08 -04:00
Tyler Goodlet 80c96cab01 Add a warning for soon to be deprecated `ctx` use in `@stream` func 2021-04-28 12:23:08 -04:00
Tyler Goodlet 36251357b3 Add a new one-way stream API
NB: this is a breaking change removing support for `Portal.run()` being
able to invoke remote streaming functions and instead replacing the
method call with an async context manager api `Portal.open_stream_from()`
This style explicitly defines stream teardown at the call site instead
of expecting the user to handle tricky things correctly themselves: eg.
`async_geneartor.aclosing()`. Going forward `Portal.run()` can be used
only for invoking async functions.
2021-04-28 12:23:08 -04:00
Tyler Goodlet 81f3558494 Formatting 2021-04-28 12:23:08 -04:00
Tyler Goodlet 897ab79946 Add a no runtime error 2021-04-28 12:23:08 -04:00
Tyler Goodlet 7f38b7225d Aggregate and organize streaming components
Move receive stream into streaming modules and rebrand as a "message
stream".  Factor out cancellation mechanics in `.aclose()` into the
`Context` type which will soon provide the api for for cancelling portal
invocations.  Comment-stage a few methods on both types in anticipation
of a new bi-directional streaming api.  Add a `MsgStream` bidirectional
channel type which will be the eventual type yielded from
`Context.open_stream()`.  Adjust the response/dialog types to be the set
`{'asyncfun', 'asyncgen', 'context'}`. OH, and add async func checking
in `Portal.run()` to catch and error on sync funcs early.
2021-04-28 12:23:08 -04:00
Tyler Goodlet d0eacc3fd6 Appease mypy 2021-04-27 12:08:30 -04:00
Tyler Goodlet 89ce1a63e4 Only accept asyncfunc response type 2021-04-27 12:08:30 -04:00
Tyler Goodlet 5798ef6796 Enforce async funcs on callee side, convert arbiter methods 2021-04-27 12:08:30 -04:00
Tyler Goodlet c2a1612bf5 Drop sync function support
You can always wrap a sync function in an async one and there seems to
be no good reason to support invoking them directly especially since
cancellation won't work without some thread hackery. If it's requested
we'll point users to `trio-parallel`.

Resolves #77
2021-04-27 12:08:30 -04:00
Tyler Goodlet be22a2526a Add `Actor.cancel_soon()` for sync self destruct
Add a sync method that can be used to cancel the current actor from
a synchronous context. This is useful in debugging situations where
sync debugger code may need to kill the process tree.

Also, make the internal "lifetime stack" a global var; easier to manage
from client code that may was to add callbacks prior to the actor
runtime being fully setup.
2021-04-27 11:35:28 -04:00
Tyler Goodlet 47565cfbf3 Use root as default name from `tractor.run()` 2021-02-25 08:51:28 -05:00
Tyler Goodlet cd636b270e Update debug tests to expect 'root' actor name 2021-02-24 13:38:20 -05:00
Tyler Goodlet 983e66b31b Add second implicit-runtime-boot branch 2021-02-24 13:13:45 -05:00
Tyler Goodlet b285db4c58 Factor OCA supervisor into new func 2021-02-24 13:13:38 -05:00
Tyler Goodlet 5ffd2d2ab3 Ignore type checks on stdlib overrides 2021-02-21 14:08:23 -05:00
Tyler Goodlet 7888ef6f01 Fix more stdlib typing issues with latest mypy 2021-02-21 12:48:03 -05:00
Tyler Goodlet 109066dda9 Support sync code breakpointing via built-in
Override `breakpoint()` for sync code making it work
properly with `trio` as per:

https://github.com/python-trio/trio/issues/1155#issuecomment-742964018

Relates to #193
2021-02-21 12:36:00 -05:00
Tyler Goodlet 9f4e497b9c Don't shield proc waits 2021-01-14 18:21:26 -05:00
Tyler Goodlet e546ead2ff Pub sub internals type fixes 2021-01-14 18:20:59 -05:00
Tyler Goodlet 3df001f3a9 Fix msg pub global lock sharing
Using `None` as the default key for a `@msg.pub` can cause conflicts if
there is more then one "taskless" (no tasks={,} passed) pub offered on
an actor... So instead use the first trio "task name" (usually just the
function name) instead thus avoiding this very hard to debug and
understand problem.

Probably should throw in a test but I'm super lazy today.
2021-01-14 18:20:49 -05:00
Tyler Goodlet 5ed5d18ccb Begin rpc_module_paths deprecation 2021-01-08 22:08:45 -05:00
Tyler Goodlet 32b10681a1 Drop tractor.run() from @tractor_test 2021-01-08 20:56:03 -05:00
Tyler Goodlet 41a4de5af2 Use actual task name lel 2021-01-08 20:55:42 -05:00
Tyler Goodlet 59421d9f3a Fix some borked tests 2021-01-08 20:55:11 -05:00
Tyler Goodlet 333ddcf93f Can we ever really appease mypy? 2021-01-03 11:18:31 -05:00
Tyler Goodlet 0bb2163b0c Implicitly open root actor on first nursery use. 2021-01-02 21:39:30 -05:00
Tyler Goodlet bd3059f01b Allow for error bypass 2021-01-02 21:39:30 -05:00
Tyler Goodlet 803152ead5 Use explicit named args 2021-01-02 21:39:30 -05:00
Tyler Goodlet e6245671b0 Use runtime level on attach 2021-01-02 21:38:55 -05:00
goodboy bfe500060f
Merge pull request #181 from goodboy/drop_tractor_run
Deprecate `tractor.run()`
2020-12-28 12:53:04 -05:00
Tyler Goodlet 723fb17394 Add deprecation warning to run() 2020-12-27 13:29:30 -05:00
Tyler Goodlet f05534e472 Re-org root actor startup into context manager
This begins the move to dropping support for `tractor.run()` which we
don't really need since the runtime is started (as it always has been)
from a new sub-task / nursery. Instead this introduces starting the
actor tree through a `open_root_actor()` async context manager which
we'll likely implicitly call (from the root) on the first use of an
actor nursery.

Drop `_actor._start_actor()` and factor its contents into this new api.
Make `run()` and `run_daemon()` use `open_root_actor()` until we decide
to remove them.

Relates to #168 and #177
2020-12-27 13:29:30 -05:00
Tyler Goodlet b040cdc0c9 Add null byte guard from mainline 2020-12-27 13:28:54 -05:00
Tyler Goodlet 6b650c0fe6 Add a "runtime" log level 2020-12-26 15:45:45 -05:00
Tyler Goodlet 0d05a727b6 Use error log level by default 2020-12-25 15:28:32 -05:00
Tyler Goodlet c28ffd8b1c Don't exception log multi-cancels 2020-12-25 15:23:59 -05:00
Tyler Goodlet 5d7a4e2b12 Denoise some common teardown "errors" to warnings. 2020-12-25 15:10:20 -05:00
Tyler Goodlet 8522f90000 Add type annots to exceptions mod
Also add a `is_multi_cancelled()` predicate to test for
`trio.MultiError`s that contain entirely cancel signals.

Resolves #125
2020-12-25 15:07:36 -05:00
Tyler Goodlet 4bf9b27f57 Drop all .statespace refs; it was a silly idea 2020-12-22 19:33:16 -05:00
Tyler Goodlet 9fd3c42eb1 Port inter-process method calls to `Portal.run_from_ns()` 2020-12-22 10:39:47 -05:00
Tyler Goodlet 7134f35d6e Add `Portal.run_from_ns()`
It turns out in order to maintain our sneaky little "call an `Actor`
method in this remote process" we still need the ability to invoke
functions from a namespace. We're currently using a "self" namespace as
a way to do this for internal inter-process method calling.  Either way,
I see no reason not to keep a public method for this invoke style (we
just won't market it) since it is still how the machinery works
underneath.
2020-12-22 10:39:47 -05:00
Tyler Goodlet a668f714d5 Allow passing function refs to `Portal.run()`
This resolves and completes #69 allowing all RPC invocation APIs to pass
function references directly instead of explicit `str` names for the
target namespace and function (this is still done implicitly
underneath).  This brings us closer to `trio`'s task running API as well
as acknowledges that any inter-host RPC system (and API) will likely
need to be implemented on top of local RPC primitives anyway. Even if
this ends up **not** being true we can always go to "function stubs" as
part of our IAC protocol or, add a new method to do explicit namespace
calls: `.run_from_module()` or whatever everyone votes on.

Resolves #69

Further, this commit drops `Actor.statespace` from the entire system
since a user can easily get this same functionality using module
level variables. Fix docs to match all these changes (luckily mostly
already done due to example scripts referencing).
2020-12-21 09:09:55 -05:00
Tyler Goodlet 0d67ce4abc Fix collections type import for py3.10 2020-12-18 17:58:07 -05:00
Tyler Goodlet 797bcc1df2 Handle early timeouts on last debugger test 2020-12-17 13:35:45 -05:00
Tyler Goodlet 201771a521 'Fix mypy, change interal type name to `ReceiveStream`, settle on `.shield()`' 2020-12-17 12:01:49 -05:00
Tyler Goodlet 15ead6b561 Add a way to shield a stream's underlying channel
Add a ``tractor._portal.StreamReceiveChannel.shield_channel()`` context
manager which allows for avoiding the closing of an IPC stream's
underlying channel for the purposes of task re-spawning. Sometimes you
might want to cancel a task consuming a stream but not tear down the IPC
between actors (the default). A common use can might be where the task's
"setup" work might need to be redone but you want to keep the
established portal / channel in tact despite the task restart.

Includes a test.
2020-12-16 21:42:28 -05:00
Tyler Goodlet d497078eb7 Appease 3.8 mypy 2020-12-11 20:04:56 -05:00
Tyler Goodlet e51c2620e5 End the `pdb` SIGINT handling madness
Turns out this is a lower level issue in terms of the stdlib's default
`pdb.Pdb` settings and how they conflict with `trio`s cancellation and
KBI handling. The details are hashed out more thoroughly in
python-trio/trio#1155. Maybe we can get a fix in trio so things are
solved under our feet :)
2020-12-11 00:15:09 -05:00
Tyler Goodlet 12f425137c Drop duplicate project-package name in msg header 2020-11-03 12:15:49 -05:00
Tyler Goodlet 1580cc6fa0 Add explanation to module load error 2020-10-15 23:16:56 -04:00
Tyler Goodlet 5822d38ae4 Set _is_root runtime var in _main() 2020-10-15 23:16:54 -04:00
Tyler Goodlet 3b8684f655 Always call `Actor.cancel()` at end of root's main task
It's simpler and the only real logical difference is logging messages.
This should also give us an overall consistent tear down sequence.
2020-10-14 13:59:57 -04:00
Tyler Goodlet 02a9cac557 Drop remaining warn()s 2020-10-14 13:48:14 -04:00
Tyler Goodlet f60321a35a Always cancel service nursery last
The channel server should be torn down *before* the rpc
task/service nursery. Do this explicitly even in the root's main task
to avoid a strange hang I found in the pubsub tests. Start dropping
the `warnings.warn()` usage.
2020-10-14 13:46:05 -04:00
goodboy 7115d6c3bd
Merge pull request #129 from goodboy/multiproc_debug
Wen? Multiprocessing-native debugger now!
2020-10-14 09:14:03 -04:00
Tyler Goodlet e3c26943ba Support debug mode only on the trio backend 2020-10-13 14:20:44 -04:00
Tyler Goodlet 08ff989631 Add some comments 2020-10-13 11:59:18 -04:00
Tyler Goodlet 573b8fef73 Add better actor cancellation tracking
Add `Actor._cancel_called` and `._cancel_complete` making it possible to
determine whether the actor has started the cancellation sequence and
whether that sequence has fully completed. This allows for blocking in
internal machinery tasks as necessary. Also, always trigger the end of
ongoing rpc tasks even if the last task errors; there's no guarantee the
trio cancellation semantics will guarantee us a nice internal "state"
without this.
2020-10-13 11:48:52 -04:00
Tyler Goodlet c375a2d028 mypy fixes 2020-10-13 11:03:55 -04:00
Tyler Goodlet c41e5c8313 Fix missing await 2020-10-13 00:45:29 -04:00
Tyler Goodlet 79c38b04e7 Report `trio.Cancelled` when exhausting portals..
For reliable remote cancellation we need to "report" `trio.Cancelled`s
(just like any other error) when exhausting a portal such that the
caller can make decisions about cancelling the respective actor if need
be.

Resolves #156
2020-10-12 23:28:36 -04:00
Tyler Goodlet 07112089d0 Add mention subactor uid during locking 2020-10-07 05:53:26 -04:00
Tyler Goodlet d43d367153 Facepalm: tty locking from root doesn't require an extra task 2020-10-05 11:58:58 -04:00
Tyler Goodlet 83a45119e9 Add "root mailbox" contact info passing
Every subactor in the tree now receives the socket (or whatever the
mailbox type ends up being) during startup and can call the new
`tractor._discovery.get_root()` function to get a portal to the current
root actor in their tree. The main reason for adding this atm is to
support nested child actors gaining access to the root's tty lock for
debugging.

Also, when a channel disconnects from a message loop, might as well kill
all its rpc tasks.
2020-10-05 11:58:58 -04:00
Tyler Goodlet a2151cdd4d Allow re-entrant breakpoints during pdb stepping 2020-10-05 11:58:58 -04:00
Tyler Goodlet 9067bb2a41 Shorten arbiter contact timeout 2020-10-05 11:58:58 -04:00
Tyler Goodlet 29ed065dc4 Ack our inability to hard kill sub-procs 2020-09-28 13:56:42 -04:00
Tyler Goodlet fc2cb610b9 Make "hard kill" just a `Process.terminate()`
It's not like any of this code is really being used anyway since we
aren't indefinitely blocking for cancelled subactors to terminate (yet).
Drop the `do_hard_kill()` bit for now and just rely on the underlying
process api. Oh, and mark the nursery as cancelled asap.
2020-09-28 13:49:45 -04:00
Tyler Goodlet 5dd2d35fc5 Huh, maybe we don't need to block SIGINT
Seems like the request task cancel scope is actually solving all the
deadlock issues and masking SIGINT isn't changing much behaviour at all.
I think let's keep it unmasked for now in case it does turn out useful
in cancelling from unrecoverable states while in debug.
2020-09-28 13:11:22 -04:00
Tyler Goodlet 25e93925b0 Add a cancel scope around child debugger requests
This is needed in order to avoid the deadlock condition where
a child actor is waiting on the root actor's tty lock but it's parent
(possibly the root) is waiting on it to terminate after sending a cancel
request. The solution is simple: create a cancel scope around the
request in the child and always cancel it when a cancel request from the
parent arrives.
2020-09-28 13:02:33 -04:00
Tyler Goodlet 363498b882 Disable SIGINT handling in child processes
There seems to be no good reason not too since our cancellation
machinery/protocol should do this work when the root receives the
signal. This also (hopefully) helps with some debugging race condition
stuff.
2020-09-28 09:24:36 -04:00
Tyler Goodlet f1b242f913 Block SIGINT handling while in the debugger
This seems to prevent a certain class of bugs to do with the root actor
cancelling local tasks and getting into deadlock while children are
trying to acquire the tty lock. I'm not sure it's the best idea yet
since you're pretty much guaranteed to get "stuck" if a child activates
the debugger after the root has been cancelled (at least "stuck" in
terms of SIGINT being ignored). That kinda race condition seems to still
exist somehow: a child can "beat" the root to activating the tty lock
and the parent is stuck waiting on the child to terminate via its
nursery.
2020-09-28 08:54:21 -04:00
Tyler Goodlet 76e1c83161 Add matrix room link 2020-09-24 11:12:45 -04:00
Tyler Goodlet 9e1d9a8ce1 Add an internal context stack
This aids with tearing down resources **after** the crash handling and
debugger have completed. Leaving this internal for now but should
eventually get a public convenience function like
`tractor.context_stack()`.
2020-09-24 10:12:33 -04:00
Tyler Goodlet 09daba4c9c Explicitly handle `debug_mode` flag correctly 2020-09-24 10:12:33 -04:00
Tyler Goodlet 8b6e9f5530 Port to new debug api, set `_is_root` state flag on startup 2020-09-24 10:12:33 -04:00
Tyler Goodlet 150179bfe4 Support entering post mortem on crashes in root actor 2020-09-24 10:12:33 -04:00
Tyler Goodlet 291ecec070 Maybe not sticky by default 2020-09-24 10:12:33 -04:00
Tyler Goodlet bd157e05ef Port to service nursery 2020-09-24 10:12:33 -04:00
Tyler Goodlet fd5fb9241a Sparsen some lines 2020-09-24 10:12:33 -04:00
Tyler Goodlet ebb21b9ba3 Support re-entrant breakpoints
Keep an actor local (bool) flag which determines if there is already
a running debugger instance for the current process. If another task
tries to enter in this case, simply ignore it since allowing entry may
result in a deadlock where the new task will be sync waiting on the
parent stdio lock (a case that will never arrive due to the current
debugger's active use of it).

In the future we may want to allow FIFO queueing of local tasks where
instead of ignoring re-entrant breakpoints we allow tasks to async wait
for debugger release, though not sure the implications of that since
you'd likely want to support switching the debugger to the new task and
that could cause deadlocks where tasks are inter-dependent. It may be
more sane to just error on multiple breakpoint requests within an actor.
2020-09-24 10:12:33 -04:00
Tyler Goodlet f9ef3fc5de Cleanups and more comments 2020-09-24 10:12:33 -04:00
Tyler Goodlet 68773d51fd Always expose the debug module 2020-09-24 10:12:33 -04:00
Tyler Goodlet abaa2f5da0 Drop uneeded `parent_chan_cs()` cancel call 2020-09-24 10:12:33 -04:00
Tyler Goodlet 8eb9a742dd Add multi-process debugging support using `pdbpp`
This is the first step in addressing #113 and the initial support
of #130. Basically this allows (sub)processes to engage the `pdbpp`
debug machinery which read/writes the root actor's tty but only in
a FIFO semaphored way such that no two processes are using it
simultaneously. That means you can have multiple actors enter a trace or
crash and run the debugger in a sensible way without clobbering each
other's access to stdio. It required adding some "tear down hooks" to
a custom `pdbpp.Pdb` type such that we release a child's lock on the
parent on debugger exit (in this case when either of the "continue" or
"quit" commands are issued to the debugger console).

There's some code left commented in anticipation of full support for
issue #130 where we're need to actually capture and feed stdin to the
target (remote) actor which won't necessarily being running on the same
host.
2020-09-24 10:12:10 -04:00
Tyler Goodlet b06d4b023e Add support for "debug mode"
When enabled a crashed actor will connect to the parent with `pdb`
in post mortem mode.
2020-09-24 10:12:10 -04:00
Tyler Goodlet b11e91375c Initial attempt at multi-actor debugging
Allow entering and attaching to a `pdb` instance in a child process.
The current hackery is to have the child make an rpc to the parent and
ask it to hijack stdin, once complete the child enters a `pdb` blocking
method. The parent then relays all stdin input to the child thus
controlling the "remote" debugger.

A few things were added to accomplish this:
- tracking the mapping of subactors to their parent nurseries
- in the root actor, cancelling all nurseries under the root `trio` task
  on cancellation (i.e. `Actor.cancel()`)
- pass a "runtime vars" map down the actor tree for propagating global state
2020-09-24 10:12:10 -04:00
Tyler Goodlet 8c97f7bbb3 Create runtime variables 2020-09-24 10:12:10 -04:00
Tyler Goodlet ec5d443ee5 Always log actor errors 2020-08-13 11:55:22 -04:00
Tyler Goodlet 1ae0efb033 Make rpc_module_paths a list 2020-08-13 11:53:45 -04:00
Tyler Goodlet 8a995beb6a Docs fixes 2020-08-08 22:29:57 -04:00
Tyler Goodlet 292513b353 Module define default accept addr 2020-08-08 20:58:04 -04:00
Tyler Goodlet b3eba00c3a Appease the great mypy 2020-08-08 20:57:43 -04:00
Tyler Goodlet 42be410076 Handle mp accept_addr 2020-08-08 20:27:43 -04:00
Tyler Goodlet 8477d21499 Restructure actor runtime nursery scoping
In an effort acquire more deterministic actor cancellation,
this adds a clearer and more resilient (whilst possibly a bit
slower) internal nursery structure with explicit semantics for
clarifying the task-scope shutdown sequence.

Namely, on cancellation, the explicit steps are now:
- cancel all currently running rpc tasks and wait
  for them to complete
- cancel the channel server and wait for it to complete
- cancel the msg loop for the channel with the immediate parent
- de-register with arbiter if possible
- wait on remaining connections to release
- exit process

To accomplish this add a new nursery called the "service nursery" which
spawns all rpc tasks **instead of using** the "root nursery". The root
is now used solely for async launching the msg loop for the primary
channel with the parent such that it is (nearly) the last thing torn
down on cancellation.

In the future it should also be possible to have `self.cancel()` return
a result to the parent once the runtime is sure that the rest of the
shutdown is atomic; this would allow for a true unbounded shield in
`Portal.cancel_actor()`. This will likely require that the error
handling blocks in `Actor._async_main()` are moved "inside" the root
nursery block such that the msg loop with the parent truly is the last
thing to terminate.
2020-08-08 14:55:41 -04:00
Tyler Goodlet 90c7fa6963 Allow shielding in `open_portal()` 2020-08-08 14:47:52 -04:00
Tyler Goodlet 532429aec9 Harden `trio` spawner process waiting
Always shield waiting for he process and always run
``trio.Process.__aexit__()`` on teardown. This enforces
that shutdown happens to due cancellation triggered inside
the sub-actor instead of the process being killed externally
by the parent.
2020-08-08 14:43:25 -04:00
Tyler Goodlet fe45d99f65 Allow opening a portal through an existing channel 2020-08-07 12:02:06 -04:00
Tyler Goodlet ae8488a578 Always shield de-register step with arbiter 2020-08-07 11:36:26 -04:00
Tyler Goodlet 09ae51900d Better clarify uid comment 2020-08-04 09:52:49 -04:00
Tyler Goodlet 4f92cfe74f Don't `.aclose` `trio` processes until the very end
Trio will kill subprocesses via `Process.__aexit__()` using a `finally:`
block (which, yes, will get triggered on cancellation) so we avoid that
until true process "tear down" since subactors do many things during
graceful shutdown (such as de-registering from the name discovery
system). Oddly this only seems to be an issue during cancellation of
infinite stream consumption.

Resolves #141
2020-08-03 18:57:00 -04:00
Tyler Goodlet ae9016c06a Log on KBI cancelled termination 2020-08-03 18:46:18 -04:00
Tyler Goodlet a24c6bfdd2 Correctly catch cancelled nursery case (purely for logging) 2020-08-03 18:44:50 -04:00
Tyler Goodlet 56b81f07e5 Return `Dict[Tuple, Tuple]` from `.get_registry()` 2020-08-03 18:42:23 -04:00
Tyler Goodlet fbd68d2d91 Allow for tuple keys with std `msgpack` 2020-08-03 18:41:21 -04:00
Tyler Goodlet 639299e6eb Expose a `.get_registry()` method on the arbiter 2020-08-03 15:40:41 -04:00
Guillermo Rodriguez 3e29fcf1ea
Docstring to the top\!, and redundant spaces goodbye\! 2020-07-29 15:39:38 -03:00
Tyler Goodlet 9a40291d4a Repair startup sequence around parent state transfer
In order to have reliable subactor startup we need the following
sequence to take place:
- connect to the parent actor, handshake and receive runtime state
- load exposed modules into memory
- start the channel server up fully using the provided bind address
- finally, start processing new messages from the parent

Add a bunch more comments to clarify all this.
2020-07-28 22:25:22 -04:00
Guillermo Rodriguez 0a5691e0a8
Removed arbiter_addr local, and bind_addr is now passed through channel, in early child actor init. 2020-07-28 11:55:11 -03:00
Guillermo Rodriguez ef053eb070
Added named arguments to child init, and now passing less of them. 2020-07-27 21:05:00 -03:00
Guillermo Rodriguez e5dbf14ec3
Onlt await params in trio mode 2020-07-27 15:20:55 -03:00
Guillermo Rodriguez 2a407be532
Now passing additional initialization parameters through channel early after handshake. 2020-07-27 14:55:37 -03:00
Tyler Goodlet 3c7ec72f8e Fix SIGINT test names 2020-07-26 23:37:44 -04:00
Tyler Goodlet dddbeb0e71 Run Windows on trio and mp backends
The new pure trio spawning backend uses `subprocess` internally which is
also supported on windows so let's run it in CI.
2020-07-25 13:41:48 -04:00
Tyler Goodlet 7c3928f0bf Oh mypy.. 2020-07-24 17:31:24 -04:00
Tyler Goodlet d3acb8d061 Wait on proc before killing stdio 2020-07-24 17:08:52 -04:00
Tyler Goodlet efde3a5773 Simplify the `_child.py` script
We don't really need stdin for anything but passing the entry point and
detaching it seemed to just cause errors on cancellation teardown.
2020-07-24 17:08:52 -04:00
Tyler Goodlet aa620fe61d Use `trio.Process.__aexit__()` and pass the actor uid
Using the context manager interface does some extra teardown beyond simply
calling `.wait()`. Pass the subactor's "uid" on the exec line for
debugging purposes when monitoring the process tree from the OS.
Hard code the child script module path to avoid a double import warning.
2020-07-24 17:08:52 -04:00
Tyler Goodlet 4516febe26 Make sure to wait trio processes on teardown 2020-07-24 17:08:52 -04:00
Tyler Goodlet 0b305fd78a Change spawn method name in `Actor.load_modules()` 2020-07-24 17:08:52 -04:00
Tyler Goodlet 0936bdc592 Add back subactor logging 2020-07-24 17:08:52 -04:00
Guillermo Rodriguez 56463a08df First attempt at removing trip & updating hazmat -> lowlevel 2020-07-24 17:08:52 -04:00
Tyler Goodlet 7c73775474 Force keyword only args in actor spawn methods 2020-07-24 17:06:43 -04:00
Tyler Goodlet 8fbdfd6a3a Add an obnoxious error message on internal failures 2020-07-24 17:06:23 -04:00
Tyler Goodlet 1706791313 Drop entrypoints from `Actor` 2020-07-24 17:04:22 -04:00
Tyler Goodlet 8e32199509 Get entry points reorg without asyncio compat
This is an edit to factor out changes needed for the `asyncio` in guest mode
integration (which currently isn't tested well) so that later more pertinent
changes (which are tested well) can be rebased off of this branch and
merged into mainline sooner. The *infect_asyncio* branch will need to be
rebased onto this branch as well before merge to mainline.
2020-07-24 17:02:03 -04:00
Tyler Goodlet 8054bc7c70 Support "infected asyncio" actors
This is an initial solution for #120.

Allow spawning `asyncio` based actors which run `trio` in guest
mode. This enables spawning `tractor` actors on top of the `asyncio`
event loop whilst still leveraging the SC focused internal actor
supervision machinery. Add a `tractor.to_syncio.run()` api to allow
spawning tasks on the `asyncio` loop from an embedded (remote) `trio`
task and return or stream results all the way back through the `tractor`
IPC system using a very similar api to portals.

One outstanding problem is getting SC around calls to
`asyncio.create_task()`. Currently a task that crashes isn't able to
easily relay the error to the embedded `trio` task without us fully
enforcing the portals based message protocol (which seems superfluous
given the error ref is in process). Further experiments using `anyio`
task groups may alleviate this.
2020-07-24 16:48:06 -04:00
Tyler Goodlet 30f8dd8be4 Pass a `Channel` to `LocalPortal` for compat purposes 2020-02-09 01:59:39 -05:00
Tyler Goodlet 596aca8097 Alias __mp_main__ at import time 2020-02-09 01:07:14 -05:00
Tyler Goodlet 00fc734580 Fix missing `_ctx` define when on Windows 2020-02-07 20:01:41 -05:00
Tyler Goodlet e671cb4f3b Fixup _spawn.py comments to incorporate trip 2020-01-31 12:05:15 -05:00
Tyler Goodlet 8264b7d136 Drop old module loading from abspath cruft 2020-01-31 12:04:46 -05:00
Tyler Goodlet d64508e1a6 Add more detailed docs around nursery logic
The logic in the `ActorNursery` block is critical to cancellation
semantics and in particular, understanding how supervisor strategies are
invoked. Stick in a bunch of explanatory comments to clear up these
details and also prepare to introduce more supervisor strats besides
the current one-cancels-all approach.
2020-01-31 09:50:25 -05:00
Tyler Goodlet 6348121d23 Do __main__ fixups like ``mulitprocessing does``
Instead of hackery trying to map modules manually from the filesystem
let Python do all the work by simply copying what ``multiprocessing``
does to "fixup the __main__ module" in spawned subprocesses. The new
private module ``_mp_fixup_main.py`` is simply cherry picked code from
``multiprocessing.spawn`` which does just that. We only need these
"fixups" when using a backend other then ``multiprocessing``; for
now just when using ``trio_run_in_process``.
2020-01-29 21:14:48 -05:00
Tyler Goodlet 2a4307975d Fix that thing where the first example in your docs is supposed to work
Thanks to @salotz for pointing out that the first example in the docs
was broken. Though it's somewhat embarrassing this might also explain
the problem in #79 and certain issues in #59...

The solution here is to import the target RPC module using the its
unique basename and absolute filepath in the sub-actor that requires it.
Special handling for `__main__` and `__mp_main__` is needed since the
spawned subprocess will have no knowledge about these parent-
-state-specific module variables. Solution: map the modules name to the
respective module file basename in the child process since the module
variables will of course have different values in children.
2020-01-29 12:16:14 -05:00
Tyler Goodlet 43cca122f5 Handle windows in `@tractor_test` as well 2020-01-26 23:44:47 -05:00
Tyler Goodlet b4cb7439a1 Drop useless fork error branch 2020-01-26 22:46:48 -05:00
Tyler Goodlet e57811a602 Fork isn't present on windows... 2020-01-26 22:35:42 -05:00
Tyler Goodlet ecced3d09a Allow choosing the spawn backend per test session
Add a `--spawn-backend` option which can be set to one of {'mp',
'trio_run_in_process'} which will either run the test suite using the
`multiprocessing` or `trio-run-in-process` backend respectively.
Currently trying to run both in the same session can result in hangs
seemingly due to a lack of cleanup of forkservers / resource trackers
from `multiprocessing` which cause broken pipe errors on occasion (no
idea on the details).

For `test_cancellation.py::test_nested_multierrors`, use less nesting
when mp is used since it breaks if we push it too hard with the
whole recursive subprocess spawning thing...
2020-01-26 21:36:08 -05:00
Tyler Goodlet 27c9760f96 Be explicit about the spawning backend default
Set `trio-run-in-process` as the default on *nix systems and
`multiprocessing`'s spawn method on Windows. Enable overriding the
default choice using `tractor._spawn.try_set_start_method()`. Allows
for easy runs of the test suite using a user chosen backend.
2020-01-26 21:13:29 -05:00
Tyler Goodlet bc259b7eab Use trip as default in all tests for now 2020-01-24 00:54:19 -05:00
Tyler Goodlet d9803ca906 Be explicit with the real name for trip 2020-01-24 00:47:01 -05:00
Tyler Goodlet 4837595e36 Fake out mypy again 2020-01-23 01:32:02 -05:00
Tyler Goodlet 4c5a60d06a Don't import trip on Windows 2020-01-23 01:23:26 -05:00
Tyler Goodlet ddbf55768f Try out trip as the default spawn_method on unix for now 2020-01-23 01:15:46 -05:00
Tyler Goodlet 4b0554b61f Type checker fixes 2020-01-21 10:28:32 -05:00
Tyler Goodlet 6c45416016 Drop ActorNusery.wait(); it's no longer necessary really 2020-01-21 10:27:53 -05:00
Tyler Goodlet c074aea030 Support TRIP for process launching
This took a ton of tinkering and a rework of the actor nursery tear down
logic. The main changes include:

- each subprocess is now spawned from inside a trio task
from one of two containing nurseries created in the body of
`tractor.open_nursery()`: one for `run_in_actor()` processes and one for
`start_actor()` "daemons". This is to address the need for
`trio-run-in_process.open_in_process()` opening a nursery which must
be closed from the same task that opened it. Using this same approach
for `multiprocessing` seems to work well. The nurseries are waited in
order (rip actors then daemon actors) during tear down which allows
for avoiding the recursive re-entry of `ActorNursery.wait()` handled
prior.

- pull out all the nested functions / closures that were in
`ActorNursery.wait()` and move into the `_spawn` module such that
that process shutdown logic takes place in each containing task's
code path. This allows for vastly simplifying `.wait()` to just contain an
event trigger which initiates process waiting / result collection.
Likely `.wait()` should just be removed since it can no longer be used
to synchronously wait on the actor nursery.

- drop `ActorNursery.__aenter__()` / `.__atexit__()` and move this
"supervisor" tear down logic into the closing block of `open_nursery()`.
This not only cleans makes the code more comprehensible it also
makes our nursery implementation look more like the one in `trio`.

Resolves #93
2020-01-21 10:27:53 -05:00
Tyler Goodlet 91c3716968 Do module abspath loading in actor init 2020-01-21 10:27:53 -05:00
Tyler Goodlet afa640dcab More trip WIP stuff working.. kinda
Get a few more things working:
- fail reliably when remote module loading goes awry
- do a real hacky job of module loading using `sys.path` stuffsies
- we're still totally borked when trying to spin up and quickly cancel
a bunch of subactors...

It's a small move forward I guess.
2020-01-21 10:27:53 -05:00
Tyler Goodlet 1b7cdfe512 WIP trying out trio_run_in_process 2020-01-21 10:27:53 -05:00
Tyler Goodlet 698951c515 More mypy apeasement on 3.7 2020-01-15 21:06:13 -05:00
Tyler Goodlet e2c9477122 Allow overriding the root logger name
Handy if other dependent projects want to use the logging system but
also want to slap their own root "branding" onto the record prefix.
2019-12-20 16:37:17 -05:00
Tyler Goodlet 79c152fe38 Make latest mpypy happy 2019-12-10 00:55:03 -05:00
Tyler Goodlet 14bfef0df7 Update types for log adapter 2019-12-09 22:10:15 -05:00
Tyler Goodlet cf73283586 Make info object a mapping type
Make the info object a `Mapping` to play nicer with static type
checking. Simplify the task or actor context method lookup using a dict.
2019-12-09 00:03:22 -05:00
Tyler Goodlet 52efbfc2cd Log task and actor names where possible
Prepend the actor and task names in each log emission. This makes
debugging much more sane since you can see from which process and
running task the log message originates from!

Resolves #13
2019-12-01 23:26:25 -05:00
Tyler Goodlet d2a01e8b81 Drop use of `trio.Event.clear()`
Just spin up new events instead; because apparently they're
so cheap (rolls eyes).

Resolves #78
2019-11-23 11:29:23 -05:00
Tyler Goodlet f977d37cee Add nursery self-destruct logic on cancel failure
If a nursery fails to cancel (some sub-actors presumably) then hard kill
the whole process tree to avoid hangs during a catastrophic failure.
This logic may get factored out (and changed) as we introduce custom
supervisor strategies.
2019-11-22 17:11:48 -05:00
Tyler Goodlet 5e056bae71 Expose trio exceptions to `RemoteActorError` 2019-10-30 00:32:10 -04:00
Tyler Goodlet 95e8f3d306 Propagate `trio.MultiError`s up the actor tree
`trio.MultiError` isn't an `Exception` (derived instead from
`BaseException`) so we have to specially catch it in the task
invocation machinery and ship it upwards (like regular errors)
since nurseries running in sub-actors can raise them.
2019-10-28 00:47:06 -04:00
Tyler Goodlet da4796749f Continue hacking the forkserver in Python 3.8
They got all fancy and added shared memory segment tracking and then
had to "generalize" the tracker name...hooray

Fixes #81
2019-10-15 22:37:47 -04:00
Tyler Goodlet 7da95a806d Rename override module 2019-10-14 12:58:10 -04:00
Tyler Goodlet f885b02c73 Validate stream functions at decorate time 2019-03-29 19:10:32 -04:00
Tyler Goodlet 5c0ae47cf5 Fix type annotation 2019-03-26 08:03:12 -04:00
Tyler Goodlet e51f84af90 Require explicit marking of non async gen streaming funcs
Add `@tractor.stream` which must be used to denote non async generator
streaming functions which use the `tractor.Context` API to push values.
This enforces a more explicit denotation as well as allows enforcing the
declaration of the `ctx` argument in definitions.
2019-03-25 21:36:13 -04:00
Tyler Goodlet 4ee35038fb Move discovery functions to their own module 2019-03-24 11:37:11 -04:00
Tyler Goodlet 2aa6ffce60 Provide each task's cancel scope to every `Context`
This begins moving toward explicitly decorated "streaming functions"
instead of checking for a `ctx` arg in the signature.

- provide each context with its task's top level `trio.CancelScope`
  such that tasks can cancel themselves explictly if needed via calling
  `Context.cancel_scope()`
- make `Actor.cancel_task()` a private method (`_cancel_task()`) and
  handle remote rpc calls specially such that the caller does not need
  to provide the `chan` argument; non-primitive types can't be passed on
  the wire and we don't want the client actor be require knowledge of
  the channel instance the request is associated with. This also ties into
  how we're tracking tasks right now (`Actor._rpc_tasks` is keyed by the
  call id, a UUID, *plus* the channel).
- make `_do_handshake` a private actor method
- use UUID version 4
2019-03-23 23:31:26 -04:00
Tyler Goodlet 4e078368fc Propagate `tractor.run()` logging level to subactors 2019-03-18 21:32:08 -04:00
Tyler Goodlet de8d69c58b Expose `Context` at top level 2019-03-15 19:40:34 -04:00
goodboy 29ffbfe6ca
Merge pull request #63 from chrizzFTD/update_tests_for_windows
Update tests for windows
2019-03-14 21:06:37 -04:00
Christian López Barrón b992dc19e3 moved assert statement for name on try_set_start_method after its autoset 2019-03-13 21:32:45 +11:00
Tyler Goodlet 63d067792c Rename `StreamQueue` to `MsgpackStream`
Prepares for other possible interchange formats plus it wasn't really
a queue, just a TCP stream wrapper + `msgpack` interchange.
2019-03-12 01:22:46 -04:00
Tyler Goodlet b70f4eafcb Flip tests to use `start_method` kwarg 2019-03-08 20:06:16 -05:00
Tyler Goodlet c3daf73112 Document the mp start method more explicitly 2019-03-08 20:01:42 -05:00
Tyler Goodlet dc5cc040e6 Try to support waiting on Windows processes
This pokes around a little in `trio` hazmat but it *should
work* as it piggy backs on the new cross platform subprocess support.

Relates to #59
2019-03-06 21:24:23 -05:00
Tyler Goodlet 483ae42a46 Add a `spawn_method` dynamic fixture 2019-03-06 00:36:37 -05:00
Tyler Goodlet 7014a07986 Add "spawn" start method support
Add full support for using the "spawn" process starting method as per:
https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

Add a  `spawn_method` argument to `tractor.run()` for specifying the
desired method explicitly. By default use the "fastest" method available.
On *nix systems this is the original "forkserver" method.

This should be the solution to getting windows support!

Resolves #60
2019-03-06 00:29:07 -05:00
Tyler Goodlet d75739e9c7 Factor process creation into a separate factory
Make a `_spawn` module for encapsulating all the `multiprocessing`
"spawn method" stuff and factor current forkserver steps into it.
2019-03-05 18:52:19 -05:00