forked from goodboy/tractor
1
0
Fork 0
Commit Graph

333 Commits (cb2d2ed9d5efb3bef6cc494bb4c9ae3e2fec90c8)

Author SHA1 Message Date
Tyler Goodlet 3e19fd311b Move debugger locking to new stream api 2021-04-28 12:23:08 -04:00
Tyler Goodlet 80c96cab01 Add a warning for soon to be deprecated `ctx` use in `@stream` func 2021-04-28 12:23:08 -04:00
Tyler Goodlet 36251357b3 Add a new one-way stream API
NB: this is a breaking change removing support for `Portal.run()` being
able to invoke remote streaming functions and instead replacing the
method call with an async context manager api `Portal.open_stream_from()`
This style explicitly defines stream teardown at the call site instead
of expecting the user to handle tricky things correctly themselves: eg.
`async_geneartor.aclosing()`. Going forward `Portal.run()` can be used
only for invoking async functions.
2021-04-28 12:23:08 -04:00
Tyler Goodlet 81f3558494 Formatting 2021-04-28 12:23:08 -04:00
Tyler Goodlet 897ab79946 Add a no runtime error 2021-04-28 12:23:08 -04:00
Tyler Goodlet 7f38b7225d Aggregate and organize streaming components
Move receive stream into streaming modules and rebrand as a "message
stream".  Factor out cancellation mechanics in `.aclose()` into the
`Context` type which will soon provide the api for for cancelling portal
invocations.  Comment-stage a few methods on both types in anticipation
of a new bi-directional streaming api.  Add a `MsgStream` bidirectional
channel type which will be the eventual type yielded from
`Context.open_stream()`.  Adjust the response/dialog types to be the set
`{'asyncfun', 'asyncgen', 'context'}`. OH, and add async func checking
in `Portal.run()` to catch and error on sync funcs early.
2021-04-28 12:23:08 -04:00
Tyler Goodlet d0eacc3fd6 Appease mypy 2021-04-27 12:08:30 -04:00
Tyler Goodlet 89ce1a63e4 Only accept asyncfunc response type 2021-04-27 12:08:30 -04:00
Tyler Goodlet 5798ef6796 Enforce async funcs on callee side, convert arbiter methods 2021-04-27 12:08:30 -04:00
Tyler Goodlet c2a1612bf5 Drop sync function support
You can always wrap a sync function in an async one and there seems to
be no good reason to support invoking them directly especially since
cancellation won't work without some thread hackery. If it's requested
we'll point users to `trio-parallel`.

Resolves #77
2021-04-27 12:08:30 -04:00
Tyler Goodlet be22a2526a Add `Actor.cancel_soon()` for sync self destruct
Add a sync method that can be used to cancel the current actor from
a synchronous context. This is useful in debugging situations where
sync debugger code may need to kill the process tree.

Also, make the internal "lifetime stack" a global var; easier to manage
from client code that may was to add callbacks prior to the actor
runtime being fully setup.
2021-04-27 11:35:28 -04:00
Tyler Goodlet 47565cfbf3 Use root as default name from `tractor.run()` 2021-02-25 08:51:28 -05:00
Tyler Goodlet cd636b270e Update debug tests to expect 'root' actor name 2021-02-24 13:38:20 -05:00
Tyler Goodlet 983e66b31b Add second implicit-runtime-boot branch 2021-02-24 13:13:45 -05:00
Tyler Goodlet b285db4c58 Factor OCA supervisor into new func 2021-02-24 13:13:38 -05:00
Tyler Goodlet 5ffd2d2ab3 Ignore type checks on stdlib overrides 2021-02-21 14:08:23 -05:00
Tyler Goodlet 7888ef6f01 Fix more stdlib typing issues with latest mypy 2021-02-21 12:48:03 -05:00
Tyler Goodlet 109066dda9 Support sync code breakpointing via built-in
Override `breakpoint()` for sync code making it work
properly with `trio` as per:

https://github.com/python-trio/trio/issues/1155#issuecomment-742964018

Relates to #193
2021-02-21 12:36:00 -05:00
Tyler Goodlet 9f4e497b9c Don't shield proc waits 2021-01-14 18:21:26 -05:00
Tyler Goodlet e546ead2ff Pub sub internals type fixes 2021-01-14 18:20:59 -05:00
Tyler Goodlet 3df001f3a9 Fix msg pub global lock sharing
Using `None` as the default key for a `@msg.pub` can cause conflicts if
there is more then one "taskless" (no tasks={,} passed) pub offered on
an actor... So instead use the first trio "task name" (usually just the
function name) instead thus avoiding this very hard to debug and
understand problem.

Probably should throw in a test but I'm super lazy today.
2021-01-14 18:20:49 -05:00
Tyler Goodlet 5ed5d18ccb Begin rpc_module_paths deprecation 2021-01-08 22:08:45 -05:00
Tyler Goodlet 32b10681a1 Drop tractor.run() from @tractor_test 2021-01-08 20:56:03 -05:00
Tyler Goodlet 41a4de5af2 Use actual task name lel 2021-01-08 20:55:42 -05:00
Tyler Goodlet 59421d9f3a Fix some borked tests 2021-01-08 20:55:11 -05:00
Tyler Goodlet 333ddcf93f Can we ever really appease mypy? 2021-01-03 11:18:31 -05:00
Tyler Goodlet 0bb2163b0c Implicitly open root actor on first nursery use. 2021-01-02 21:39:30 -05:00
Tyler Goodlet bd3059f01b Allow for error bypass 2021-01-02 21:39:30 -05:00
Tyler Goodlet 803152ead5 Use explicit named args 2021-01-02 21:39:30 -05:00
Tyler Goodlet e6245671b0 Use runtime level on attach 2021-01-02 21:38:55 -05:00
goodboy bfe500060f
Merge pull request #181 from goodboy/drop_tractor_run
Deprecate `tractor.run()`
2020-12-28 12:53:04 -05:00
Tyler Goodlet 723fb17394 Add deprecation warning to run() 2020-12-27 13:29:30 -05:00
Tyler Goodlet f05534e472 Re-org root actor startup into context manager
This begins the move to dropping support for `tractor.run()` which we
don't really need since the runtime is started (as it always has been)
from a new sub-task / nursery. Instead this introduces starting the
actor tree through a `open_root_actor()` async context manager which
we'll likely implicitly call (from the root) on the first use of an
actor nursery.

Drop `_actor._start_actor()` and factor its contents into this new api.
Make `run()` and `run_daemon()` use `open_root_actor()` until we decide
to remove them.

Relates to #168 and #177
2020-12-27 13:29:30 -05:00
Tyler Goodlet b040cdc0c9 Add null byte guard from mainline 2020-12-27 13:28:54 -05:00
Tyler Goodlet 6b650c0fe6 Add a "runtime" log level 2020-12-26 15:45:45 -05:00
Tyler Goodlet 0d05a727b6 Use error log level by default 2020-12-25 15:28:32 -05:00
Tyler Goodlet c28ffd8b1c Don't exception log multi-cancels 2020-12-25 15:23:59 -05:00
Tyler Goodlet 5d7a4e2b12 Denoise some common teardown "errors" to warnings. 2020-12-25 15:10:20 -05:00
Tyler Goodlet 8522f90000 Add type annots to exceptions mod
Also add a `is_multi_cancelled()` predicate to test for
`trio.MultiError`s that contain entirely cancel signals.

Resolves #125
2020-12-25 15:07:36 -05:00
Tyler Goodlet 4bf9b27f57 Drop all .statespace refs; it was a silly idea 2020-12-22 19:33:16 -05:00
Tyler Goodlet 9fd3c42eb1 Port inter-process method calls to `Portal.run_from_ns()` 2020-12-22 10:39:47 -05:00
Tyler Goodlet 7134f35d6e Add `Portal.run_from_ns()`
It turns out in order to maintain our sneaky little "call an `Actor`
method in this remote process" we still need the ability to invoke
functions from a namespace. We're currently using a "self" namespace as
a way to do this for internal inter-process method calling.  Either way,
I see no reason not to keep a public method for this invoke style (we
just won't market it) since it is still how the machinery works
underneath.
2020-12-22 10:39:47 -05:00
Tyler Goodlet a668f714d5 Allow passing function refs to `Portal.run()`
This resolves and completes #69 allowing all RPC invocation APIs to pass
function references directly instead of explicit `str` names for the
target namespace and function (this is still done implicitly
underneath).  This brings us closer to `trio`'s task running API as well
as acknowledges that any inter-host RPC system (and API) will likely
need to be implemented on top of local RPC primitives anyway. Even if
this ends up **not** being true we can always go to "function stubs" as
part of our IAC protocol or, add a new method to do explicit namespace
calls: `.run_from_module()` or whatever everyone votes on.

Resolves #69

Further, this commit drops `Actor.statespace` from the entire system
since a user can easily get this same functionality using module
level variables. Fix docs to match all these changes (luckily mostly
already done due to example scripts referencing).
2020-12-21 09:09:55 -05:00
Tyler Goodlet 0d67ce4abc Fix collections type import for py3.10 2020-12-18 17:58:07 -05:00
Tyler Goodlet 797bcc1df2 Handle early timeouts on last debugger test 2020-12-17 13:35:45 -05:00
Tyler Goodlet 201771a521 'Fix mypy, change interal type name to `ReceiveStream`, settle on `.shield()`' 2020-12-17 12:01:49 -05:00
Tyler Goodlet 15ead6b561 Add a way to shield a stream's underlying channel
Add a ``tractor._portal.StreamReceiveChannel.shield_channel()`` context
manager which allows for avoiding the closing of an IPC stream's
underlying channel for the purposes of task re-spawning. Sometimes you
might want to cancel a task consuming a stream but not tear down the IPC
between actors (the default). A common use can might be where the task's
"setup" work might need to be redone but you want to keep the
established portal / channel in tact despite the task restart.

Includes a test.
2020-12-16 21:42:28 -05:00
Tyler Goodlet d497078eb7 Appease 3.8 mypy 2020-12-11 20:04:56 -05:00
Tyler Goodlet e51c2620e5 End the `pdb` SIGINT handling madness
Turns out this is a lower level issue in terms of the stdlib's default
`pdb.Pdb` settings and how they conflict with `trio`s cancellation and
KBI handling. The details are hashed out more thoroughly in
python-trio/trio#1155. Maybe we can get a fix in trio so things are
solved under our feet :)
2020-12-11 00:15:09 -05:00
Tyler Goodlet 12f425137c Drop duplicate project-package name in msg header 2020-11-03 12:15:49 -05:00
Tyler Goodlet 1580cc6fa0 Add explanation to module load error 2020-10-15 23:16:56 -04:00
Tyler Goodlet 5822d38ae4 Set _is_root runtime var in _main() 2020-10-15 23:16:54 -04:00
Tyler Goodlet 3b8684f655 Always call `Actor.cancel()` at end of root's main task
It's simpler and the only real logical difference is logging messages.
This should also give us an overall consistent tear down sequence.
2020-10-14 13:59:57 -04:00
Tyler Goodlet 02a9cac557 Drop remaining warn()s 2020-10-14 13:48:14 -04:00
Tyler Goodlet f60321a35a Always cancel service nursery last
The channel server should be torn down *before* the rpc
task/service nursery. Do this explicitly even in the root's main task
to avoid a strange hang I found in the pubsub tests. Start dropping
the `warnings.warn()` usage.
2020-10-14 13:46:05 -04:00
goodboy 7115d6c3bd
Merge pull request #129 from goodboy/multiproc_debug
Wen? Multiprocessing-native debugger now!
2020-10-14 09:14:03 -04:00
Tyler Goodlet e3c26943ba Support debug mode only on the trio backend 2020-10-13 14:20:44 -04:00
Tyler Goodlet 08ff989631 Add some comments 2020-10-13 11:59:18 -04:00
Tyler Goodlet 573b8fef73 Add better actor cancellation tracking
Add `Actor._cancel_called` and `._cancel_complete` making it possible to
determine whether the actor has started the cancellation sequence and
whether that sequence has fully completed. This allows for blocking in
internal machinery tasks as necessary. Also, always trigger the end of
ongoing rpc tasks even if the last task errors; there's no guarantee the
trio cancellation semantics will guarantee us a nice internal "state"
without this.
2020-10-13 11:48:52 -04:00
Tyler Goodlet c375a2d028 mypy fixes 2020-10-13 11:03:55 -04:00
Tyler Goodlet c41e5c8313 Fix missing await 2020-10-13 00:45:29 -04:00
Tyler Goodlet 79c38b04e7 Report `trio.Cancelled` when exhausting portals..
For reliable remote cancellation we need to "report" `trio.Cancelled`s
(just like any other error) when exhausting a portal such that the
caller can make decisions about cancelling the respective actor if need
be.

Resolves #156
2020-10-12 23:28:36 -04:00
Tyler Goodlet 07112089d0 Add mention subactor uid during locking 2020-10-07 05:53:26 -04:00
Tyler Goodlet d43d367153 Facepalm: tty locking from root doesn't require an extra task 2020-10-05 11:58:58 -04:00
Tyler Goodlet 83a45119e9 Add "root mailbox" contact info passing
Every subactor in the tree now receives the socket (or whatever the
mailbox type ends up being) during startup and can call the new
`tractor._discovery.get_root()` function to get a portal to the current
root actor in their tree. The main reason for adding this atm is to
support nested child actors gaining access to the root's tty lock for
debugging.

Also, when a channel disconnects from a message loop, might as well kill
all its rpc tasks.
2020-10-05 11:58:58 -04:00
Tyler Goodlet a2151cdd4d Allow re-entrant breakpoints during pdb stepping 2020-10-05 11:58:58 -04:00
Tyler Goodlet 9067bb2a41 Shorten arbiter contact timeout 2020-10-05 11:58:58 -04:00
Tyler Goodlet 29ed065dc4 Ack our inability to hard kill sub-procs 2020-09-28 13:56:42 -04:00
Tyler Goodlet fc2cb610b9 Make "hard kill" just a `Process.terminate()`
It's not like any of this code is really being used anyway since we
aren't indefinitely blocking for cancelled subactors to terminate (yet).
Drop the `do_hard_kill()` bit for now and just rely on the underlying
process api. Oh, and mark the nursery as cancelled asap.
2020-09-28 13:49:45 -04:00
Tyler Goodlet 5dd2d35fc5 Huh, maybe we don't need to block SIGINT
Seems like the request task cancel scope is actually solving all the
deadlock issues and masking SIGINT isn't changing much behaviour at all.
I think let's keep it unmasked for now in case it does turn out useful
in cancelling from unrecoverable states while in debug.
2020-09-28 13:11:22 -04:00
Tyler Goodlet 25e93925b0 Add a cancel scope around child debugger requests
This is needed in order to avoid the deadlock condition where
a child actor is waiting on the root actor's tty lock but it's parent
(possibly the root) is waiting on it to terminate after sending a cancel
request. The solution is simple: create a cancel scope around the
request in the child and always cancel it when a cancel request from the
parent arrives.
2020-09-28 13:02:33 -04:00
Tyler Goodlet 363498b882 Disable SIGINT handling in child processes
There seems to be no good reason not too since our cancellation
machinery/protocol should do this work when the root receives the
signal. This also (hopefully) helps with some debugging race condition
stuff.
2020-09-28 09:24:36 -04:00
Tyler Goodlet f1b242f913 Block SIGINT handling while in the debugger
This seems to prevent a certain class of bugs to do with the root actor
cancelling local tasks and getting into deadlock while children are
trying to acquire the tty lock. I'm not sure it's the best idea yet
since you're pretty much guaranteed to get "stuck" if a child activates
the debugger after the root has been cancelled (at least "stuck" in
terms of SIGINT being ignored). That kinda race condition seems to still
exist somehow: a child can "beat" the root to activating the tty lock
and the parent is stuck waiting on the child to terminate via its
nursery.
2020-09-28 08:54:21 -04:00
Tyler Goodlet 76e1c83161 Add matrix room link 2020-09-24 11:12:45 -04:00
Tyler Goodlet 9e1d9a8ce1 Add an internal context stack
This aids with tearing down resources **after** the crash handling and
debugger have completed. Leaving this internal for now but should
eventually get a public convenience function like
`tractor.context_stack()`.
2020-09-24 10:12:33 -04:00
Tyler Goodlet 09daba4c9c Explicitly handle `debug_mode` flag correctly 2020-09-24 10:12:33 -04:00
Tyler Goodlet 8b6e9f5530 Port to new debug api, set `_is_root` state flag on startup 2020-09-24 10:12:33 -04:00
Tyler Goodlet 150179bfe4 Support entering post mortem on crashes in root actor 2020-09-24 10:12:33 -04:00
Tyler Goodlet 291ecec070 Maybe not sticky by default 2020-09-24 10:12:33 -04:00
Tyler Goodlet bd157e05ef Port to service nursery 2020-09-24 10:12:33 -04:00
Tyler Goodlet fd5fb9241a Sparsen some lines 2020-09-24 10:12:33 -04:00
Tyler Goodlet ebb21b9ba3 Support re-entrant breakpoints
Keep an actor local (bool) flag which determines if there is already
a running debugger instance for the current process. If another task
tries to enter in this case, simply ignore it since allowing entry may
result in a deadlock where the new task will be sync waiting on the
parent stdio lock (a case that will never arrive due to the current
debugger's active use of it).

In the future we may want to allow FIFO queueing of local tasks where
instead of ignoring re-entrant breakpoints we allow tasks to async wait
for debugger release, though not sure the implications of that since
you'd likely want to support switching the debugger to the new task and
that could cause deadlocks where tasks are inter-dependent. It may be
more sane to just error on multiple breakpoint requests within an actor.
2020-09-24 10:12:33 -04:00
Tyler Goodlet f9ef3fc5de Cleanups and more comments 2020-09-24 10:12:33 -04:00
Tyler Goodlet 68773d51fd Always expose the debug module 2020-09-24 10:12:33 -04:00
Tyler Goodlet abaa2f5da0 Drop uneeded `parent_chan_cs()` cancel call 2020-09-24 10:12:33 -04:00
Tyler Goodlet 8eb9a742dd Add multi-process debugging support using `pdbpp`
This is the first step in addressing #113 and the initial support
of #130. Basically this allows (sub)processes to engage the `pdbpp`
debug machinery which read/writes the root actor's tty but only in
a FIFO semaphored way such that no two processes are using it
simultaneously. That means you can have multiple actors enter a trace or
crash and run the debugger in a sensible way without clobbering each
other's access to stdio. It required adding some "tear down hooks" to
a custom `pdbpp.Pdb` type such that we release a child's lock on the
parent on debugger exit (in this case when either of the "continue" or
"quit" commands are issued to the debugger console).

There's some code left commented in anticipation of full support for
issue #130 where we're need to actually capture and feed stdin to the
target (remote) actor which won't necessarily being running on the same
host.
2020-09-24 10:12:10 -04:00
Tyler Goodlet b06d4b023e Add support for "debug mode"
When enabled a crashed actor will connect to the parent with `pdb`
in post mortem mode.
2020-09-24 10:12:10 -04:00
Tyler Goodlet b11e91375c Initial attempt at multi-actor debugging
Allow entering and attaching to a `pdb` instance in a child process.
The current hackery is to have the child make an rpc to the parent and
ask it to hijack stdin, once complete the child enters a `pdb` blocking
method. The parent then relays all stdin input to the child thus
controlling the "remote" debugger.

A few things were added to accomplish this:
- tracking the mapping of subactors to their parent nurseries
- in the root actor, cancelling all nurseries under the root `trio` task
  on cancellation (i.e. `Actor.cancel()`)
- pass a "runtime vars" map down the actor tree for propagating global state
2020-09-24 10:12:10 -04:00
Tyler Goodlet 8c97f7bbb3 Create runtime variables 2020-09-24 10:12:10 -04:00
Tyler Goodlet ec5d443ee5 Always log actor errors 2020-08-13 11:55:22 -04:00
Tyler Goodlet 1ae0efb033 Make rpc_module_paths a list 2020-08-13 11:53:45 -04:00
Tyler Goodlet 8a995beb6a Docs fixes 2020-08-08 22:29:57 -04:00
Tyler Goodlet 292513b353 Module define default accept addr 2020-08-08 20:58:04 -04:00
Tyler Goodlet b3eba00c3a Appease the great mypy 2020-08-08 20:57:43 -04:00
Tyler Goodlet 42be410076 Handle mp accept_addr 2020-08-08 20:27:43 -04:00
Tyler Goodlet 8477d21499 Restructure actor runtime nursery scoping
In an effort acquire more deterministic actor cancellation,
this adds a clearer and more resilient (whilst possibly a bit
slower) internal nursery structure with explicit semantics for
clarifying the task-scope shutdown sequence.

Namely, on cancellation, the explicit steps are now:
- cancel all currently running rpc tasks and wait
  for them to complete
- cancel the channel server and wait for it to complete
- cancel the msg loop for the channel with the immediate parent
- de-register with arbiter if possible
- wait on remaining connections to release
- exit process

To accomplish this add a new nursery called the "service nursery" which
spawns all rpc tasks **instead of using** the "root nursery". The root
is now used solely for async launching the msg loop for the primary
channel with the parent such that it is (nearly) the last thing torn
down on cancellation.

In the future it should also be possible to have `self.cancel()` return
a result to the parent once the runtime is sure that the rest of the
shutdown is atomic; this would allow for a true unbounded shield in
`Portal.cancel_actor()`. This will likely require that the error
handling blocks in `Actor._async_main()` are moved "inside" the root
nursery block such that the msg loop with the parent truly is the last
thing to terminate.
2020-08-08 14:55:41 -04:00
Tyler Goodlet 90c7fa6963 Allow shielding in `open_portal()` 2020-08-08 14:47:52 -04:00
Tyler Goodlet 532429aec9 Harden `trio` spawner process waiting
Always shield waiting for he process and always run
``trio.Process.__aexit__()`` on teardown. This enforces
that shutdown happens to due cancellation triggered inside
the sub-actor instead of the process being killed externally
by the parent.
2020-08-08 14:43:25 -04:00
Tyler Goodlet fe45d99f65 Allow opening a portal through an existing channel 2020-08-07 12:02:06 -04:00
Tyler Goodlet ae8488a578 Always shield de-register step with arbiter 2020-08-07 11:36:26 -04:00