forked from goodboy/tractor
442 lines
13 KiB
ReStructuredText
442 lines
13 KiB
ReStructuredText
|logo| ``tractor``: next-gen Python parallelism
|
|
|
|
|gh_actions|
|
|
|docs|
|
|
|
|
``tractor`` is a `structured concurrent`_, multi-processing_ runtime built on trio_.
|
|
|
|
Fundamentally ``tractor`` gives you parallelism via ``trio``-"*actors*":
|
|
our nurseries_ let you spawn new Python processes which each run a ``trio``
|
|
scheduled runtime - a call to ``trio.run()``.
|
|
|
|
We believe the system adhere's to the `3 axioms`_ of an "`actor model`_"
|
|
but likely *does not* look like what *you* probably think an "actor
|
|
model" looks like, and that's *intentional*.
|
|
|
|
The first step to grok ``tractor`` is to get the basics of ``trio`` down.
|
|
A great place to start is the `trio docs`_ and this `blog post`_.
|
|
|
|
|
|
Features
|
|
--------
|
|
- **It's just** a ``trio`` API
|
|
- *Infinitely nesteable* process trees
|
|
- Built-in inter-process streaming APIs
|
|
- A (first ever?) "native" multi-core debugger UX for Python using `pdb++`_
|
|
- Support for a swappable, OS specific, process spawning layer
|
|
- A modular transport stack, allowing for custom serialization,
|
|
communications protocols, and environment specific IPC primitives
|
|
- `structured concurrency`_ from the ground up
|
|
|
|
|
|
Run a func in a process
|
|
-----------------------
|
|
Use ``trio``'s style of focussing on *tasks as functions*:
|
|
|
|
.. code:: python
|
|
|
|
"""
|
|
Run with a process monitor from a terminal using::
|
|
|
|
$TERM -e watch -n 0.1 "pstree -a $$" \
|
|
& python examples/parallelism/single_func.py \
|
|
&& kill $!
|
|
|
|
"""
|
|
import os
|
|
|
|
import tractor
|
|
import trio
|
|
|
|
|
|
async def burn_cpu():
|
|
|
|
pid = os.getpid()
|
|
|
|
# burn a core @ ~ 50kHz
|
|
for _ in range(50000):
|
|
await trio.sleep(1/50000/50)
|
|
|
|
return os.getpid()
|
|
|
|
|
|
async def main():
|
|
|
|
async with tractor.open_nursery() as n:
|
|
|
|
portal = await n.run_in_actor(burn_cpu)
|
|
|
|
# burn rubber in the parent too
|
|
await burn_cpu()
|
|
|
|
# wait on result from target function
|
|
pid = await portal.result()
|
|
|
|
# end of nursery block
|
|
print(f"Collected subproc {pid}")
|
|
|
|
|
|
if __name__ == '__main__':
|
|
trio.run(main)
|
|
|
|
|
|
This runs ``burn_cpu()`` in a new process and reaps it on completion
|
|
of the nursery block.
|
|
|
|
If you only need to run a sync function and retreive a single result, you
|
|
might want to check out `trio-parallel`_.
|
|
|
|
|
|
Zombie safe: self-destruct a process tree
|
|
-----------------------------------------
|
|
``tractor`` tries to protect you from zombies, no matter what.
|
|
|
|
.. code:: python
|
|
|
|
"""
|
|
Run with a process monitor from a terminal using::
|
|
|
|
$TERM -e watch -n 0.1 "pstree -a $$" \
|
|
& python examples/parallelism/we_are_processes.py \
|
|
&& kill $!
|
|
|
|
"""
|
|
from multiprocessing import cpu_count
|
|
import os
|
|
|
|
import tractor
|
|
import trio
|
|
|
|
|
|
async def target():
|
|
print(
|
|
f"Yo, i'm '{tractor.current_actor().name}' "
|
|
f"running in pid {os.getpid()}"
|
|
)
|
|
|
|
await trio.sleep_forever()
|
|
|
|
|
|
async def main():
|
|
|
|
async with tractor.open_nursery() as n:
|
|
|
|
for i in range(cpu_count()):
|
|
await n.run_in_actor(target, name=f'worker_{i}')
|
|
|
|
print('This process tree will self-destruct in 1 sec...')
|
|
await trio.sleep(1)
|
|
|
|
# raise an error in root actor/process and trigger
|
|
# reaping of all minions
|
|
raise Exception('Self Destructed')
|
|
|
|
|
|
if __name__ == '__main__':
|
|
try:
|
|
trio.run(main)
|
|
except Exception:
|
|
print('Zombies Contained')
|
|
|
|
|
|
If you can create zombie child processes (without using a system signal)
|
|
it **is a bug**.
|
|
|
|
|
|
"Native" multi-process debugging
|
|
--------------------------------
|
|
Using the magic of `pdb++`_ and our internal IPC, we've
|
|
been able to create a native feeling debugging experience for
|
|
any (sub-)process in your ``tractor`` tree.
|
|
|
|
.. code:: python
|
|
|
|
from os import getpid
|
|
|
|
import tractor
|
|
import trio
|
|
|
|
|
|
async def breakpoint_forever():
|
|
"Indefinitely re-enter debugger in child actor."
|
|
while True:
|
|
yield 'yo'
|
|
await tractor.breakpoint()
|
|
|
|
|
|
async def name_error():
|
|
"Raise a ``NameError``"
|
|
getattr(doggypants)
|
|
|
|
|
|
async def main():
|
|
"""Test breakpoint in a streaming actor.
|
|
"""
|
|
async with tractor.open_nursery(
|
|
debug_mode=True,
|
|
loglevel='error',
|
|
) as n:
|
|
|
|
p0 = await n.start_actor('bp_forever', enable_modules=[__name__])
|
|
p1 = await n.start_actor('name_error', enable_modules=[__name__])
|
|
|
|
# retreive results
|
|
stream = await p0.run(breakpoint_forever)
|
|
await p1.run(name_error)
|
|
|
|
|
|
if __name__ == '__main__':
|
|
trio.run(main)
|
|
|
|
|
|
You can run this with::
|
|
|
|
>>> python examples/debugging/multi_daemon_subactors.py
|
|
|
|
And, yes, there's a built-in crash handling mode B)
|
|
|
|
We're hoping to add a respawn-from-repl system soon!
|
|
|
|
|
|
SC compatible bi-directional streaming
|
|
--------------------------------------
|
|
Yes, you saw it here first; we provide 2-way streams
|
|
with reliable, transitive setup/teardown semantics.
|
|
|
|
Our nascent api is remniscent of ``trio.Nursery.start()``
|
|
style invocation:
|
|
|
|
.. code:: python
|
|
|
|
import trio
|
|
import tractor
|
|
|
|
|
|
@tractor.context
|
|
async def simple_rpc(
|
|
|
|
ctx: tractor.Context,
|
|
data: int,
|
|
|
|
) -> None:
|
|
'''Test a small ping-pong 2-way streaming server.
|
|
|
|
'''
|
|
# signal to parent that we're up much like
|
|
# ``trio_typing.TaskStatus.started()``
|
|
await ctx.started(data + 1)
|
|
|
|
async with ctx.open_stream() as stream:
|
|
|
|
count = 0
|
|
async for msg in stream:
|
|
|
|
assert msg == 'ping'
|
|
await stream.send('pong')
|
|
count += 1
|
|
|
|
else:
|
|
assert count == 10
|
|
|
|
|
|
async def main() -> None:
|
|
|
|
async with tractor.open_nursery() as n:
|
|
|
|
portal = await n.start_actor(
|
|
'rpc_server',
|
|
enable_modules=[__name__],
|
|
)
|
|
|
|
# XXX: this syntax requires py3.9
|
|
async with (
|
|
|
|
portal.open_context(
|
|
simple_rpc,
|
|
data=10,
|
|
) as (ctx, sent),
|
|
|
|
ctx.open_stream() as stream,
|
|
):
|
|
|
|
assert sent == 11
|
|
|
|
count = 0
|
|
# receive msgs using async for style
|
|
await stream.send('ping')
|
|
|
|
async for msg in stream:
|
|
assert msg == 'pong'
|
|
await stream.send('ping')
|
|
count += 1
|
|
|
|
if count >= 9:
|
|
break
|
|
|
|
|
|
# explicitly teardown the daemon-actor
|
|
await portal.cancel_actor()
|
|
|
|
|
|
if __name__ == '__main__':
|
|
trio.run(main)
|
|
|
|
|
|
See original proposal and discussion in `#53`_ as well
|
|
as follow up improvements in `#223`_ that we'd love to
|
|
hear your thoughts on!
|
|
|
|
.. _#53: https://github.com/goodboy/tractor/issues/53
|
|
.. _#223: https://github.com/goodboy/tractor/issues/223
|
|
|
|
|
|
Worker poolz are easy peasy
|
|
---------------------------
|
|
The initial ask from most new users is *"how do I make a worker
|
|
pool thing?"*.
|
|
|
|
``tractor`` is built to handle any SC (structured concurrent) process
|
|
tree you can imagine; a "worker pool" pattern is a trivial special
|
|
case.
|
|
|
|
We have a `full worker pool re-implementation`_ of the std-lib's
|
|
``concurrent.futures.ProcessPoolExecutor`` example for reference.
|
|
|
|
You can run it like so (from this dir) to see the process tree in
|
|
real time::
|
|
|
|
$TERM -e watch -n 0.1 "pstree -a $$" \
|
|
& python examples/parallelism/concurrent_actors_primes.py \
|
|
&& kill $!
|
|
|
|
This uses no extra threads, fancy semaphores or futures; all we need
|
|
is ``tractor``'s IPC!
|
|
|
|
|
|
.. _full worker pool re-implementation: https://github.com/goodboy/tractor/blob/master/examples/parallelism/concurrent_actors_primes.py
|
|
|
|
Install
|
|
-------
|
|
From PyPi::
|
|
|
|
pip install tractor
|
|
|
|
|
|
From git::
|
|
|
|
pip install git+git://github.com/goodboy/tractor.git
|
|
|
|
|
|
Under the hood
|
|
--------------
|
|
``tractor`` is an attempt to pair trionic_ `structured concurrency`_ with
|
|
distributed Python. You can think of it as a ``trio``
|
|
*-across-processes* or simply as an opinionated replacement for the
|
|
stdlib's ``multiprocessing`` but built on async programming primitives
|
|
from the ground up.
|
|
|
|
Don't be scared off by this description. ``tractor`` **is just** ``trio``
|
|
but with nurseries for process management and cancel-able streaming IPC.
|
|
If you understand how to work with ``trio``, ``tractor`` will give you
|
|
the parallelism you may have been needing.
|
|
|
|
|
|
Wait, huh?! I thought "actors" have messages, and mailboxes and stuff?!
|
|
***********************************************************************
|
|
Let's stop and ask how many canon actor model papers have you actually read ;)
|
|
|
|
From our experience many "actor systems" aren't really "actor models"
|
|
since they **don't adhere** to the `3 axioms`_ and pay even less
|
|
attention to the problem of *unbounded non-determinism* (which was the
|
|
whole point for creation of the model in the first place).
|
|
|
|
From the author's mouth, **the only thing required** is `adherance to`_
|
|
the `3 axioms`_, *and that's it*.
|
|
|
|
``tractor`` adheres to said base requirements of an "actor model"::
|
|
|
|
In response to a message, an actor may:
|
|
|
|
- send a finite number of new messages
|
|
- create a finite number of new actors
|
|
- designate a new behavior to process subsequent messages
|
|
|
|
|
|
**and** requires *no further api changes* to accomplish this.
|
|
|
|
If you want do debate this further please feel free to chime in on our
|
|
chat or discuss on one of the following issues *after you've read
|
|
everything in them*:
|
|
|
|
- https://github.com/goodboy/tractor/issues/210
|
|
- https://github.com/goodboy/tractor/issues/18
|
|
|
|
|
|
Let's clarify our parlance
|
|
**************************
|
|
Whether or not ``tractor`` has "actors" underneath should be mostly
|
|
irrelevant to users other then for referring to the interactions of our
|
|
primary runtime primitives: each Python process + ``trio.run()``
|
|
+ surrounding IPC machinery. These are our high level, base
|
|
*runtime-units-of-abstraction* which both *are* (as much as they can
|
|
be in Python) and will be referred to as our *"actors"*.
|
|
|
|
The main goal of ``tractor`` is is to allow for highly distributed
|
|
software that, through the adherence to *structured concurrency*,
|
|
results in systems which fail in predictable, recoverable and maybe even
|
|
understandable ways; being an "actor model" is just one way to describe
|
|
properties of the system.
|
|
|
|
|
|
What's on the TODO:
|
|
-------------------
|
|
Help us push toward the future.
|
|
|
|
- (Soon to land) ``asyncio`` support allowing for "infected" actors where
|
|
`trio` drives the `asyncio` scheduler via the astounding "`guest mode`_"
|
|
- Typed messaging protocols (ex. via ``msgspec``)
|
|
- Erlang-style supervisors via composed context managers
|
|
|
|
|
|
Feel like saying hi?
|
|
--------------------
|
|
This project is very much coupled to the ongoing development of
|
|
``trio`` (i.e. ``tractor`` gets most of its ideas from that brilliant
|
|
community). If you want to help, have suggestions or just want to
|
|
say hi, please feel free to reach us in our `matrix channel`_. If
|
|
matrix seems too hip, we're also mostly all in the the `trio gitter
|
|
channel`_!
|
|
|
|
.. _nurseries: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#nurseries-a-structured-replacement-for-go-statements
|
|
.. _actor model: https://en.wikipedia.org/wiki/Actor_model
|
|
.. _trio: https://github.com/python-trio/trio
|
|
.. _multi-processing: https://en.wikipedia.org/wiki/Multiprocessing
|
|
.. _trionic: https://trio.readthedocs.io/en/latest/design.html#high-level-design-principles
|
|
.. _async sandwich: https://trio.readthedocs.io/en/latest/tutorial.html#async-sandwich
|
|
.. _structured concurrent: https://trio.discourse.group/t/concise-definition-of-structured-concurrency/228
|
|
.. _3 axioms: https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=162s
|
|
.. _adherance to: https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=1821s
|
|
.. _trio gitter channel: https://gitter.im/python-trio/general
|
|
.. _matrix channel: https://matrix.to/#/!tractor:matrix.org
|
|
.. _pdb++: https://github.com/pdbpp/pdbpp
|
|
.. _guest mode: https://trio.readthedocs.io/en/stable/reference-lowlevel.html?highlight=guest%20mode#using-guest-mode-to-run-trio-on-top-of-other-event-loops
|
|
.. _messages: https://en.wikipedia.org/wiki/Message_passing
|
|
.. _trio docs: https://trio.readthedocs.io/en/latest/
|
|
.. _blog post: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
|
|
.. _structured concurrency: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
|
|
.. _unrequirements: https://en.wikipedia.org/wiki/Actor_model#Direct_communication_and_asynchrony
|
|
.. _async generators: https://www.python.org/dev/peps/pep-0525/
|
|
.. _trio-parallel: https://github.com/richardsheridan/trio-parallel
|
|
|
|
|
|
.. |gh_actions| image:: https://img.shields.io/endpoint.svg?url=https%3A%2F%2Factions-badge.atrox.dev%2Fgoodboy%2Ftractor%2Fbadge&style=popout-square
|
|
:target: https://actions-badge.atrox.dev/goodboy/tractor/goto
|
|
|
|
.. |docs| image:: https://readthedocs.org/projects/tractor/badge/?version=latest
|
|
:target: https://tractor.readthedocs.io/en/latest/?badge=latest
|
|
:alt: Documentation Status
|
|
|
|
.. |logo| image:: _static/tractor_logo_side.svg
|
|
:width: 250
|
|
:align: middle
|