|logo| ``tractor``: next-gen Python parallelism
|gh_actions|
|docs|
``tractor`` is a `structured concurrent`_, multi-processing_ runtime built on trio_.
Fundamentally ``tractor`` gives you parallelism via ``trio``-"*actors*":
our nurseries_ let you spawn new Python processes which each run a ``trio``
scheduled runtime - a call to ``trio.run()``.

We believe the system adheres to the `3 axioms`_ of an "`actor model`_"
but likely *does not* look like what *you* probably think an "actor
model" looks like, and that's *intentional*.

The first step to grok ``tractor`` is to get the basics of ``trio`` down.
A great place to start is the `trio docs`_ and this `blog post`_.
Features
--------
- **It's just** a ``trio`` API
- *Infinitely nestable* process trees
- Built-in inter-process streaming APIs
- A (first ever?) "native" multi-core debugger UX for Python using `pdb++`_
- Support for a swappable, OS specific, process spawning layer
- A modular transport stack, allowing for custom serialization,
  communications protocols, and environment specific IPC primitives
- `structured concurrency`_ from the ground up
Run a func in a process
-----------------------
Use ``trio``'s style of focussing on *tasks as functions*:

.. code:: python

    """
    Run with a process monitor from a terminal using::

        $TERM -e watch -n 0.1 "pstree -a $$" \
            & python examples/parallelism/single_func.py \
            && kill $!

    """
    import os

    import tractor
    import trio


    async def burn_cpu():

        pid = os.getpid()

        # burn a core @ ~ 50kHz
        for _ in range(50000):
            await trio.sleep(1/50000/50)

        return os.getpid()


    async def main():

        async with tractor.open_nursery() as n:

            portal = await n.run_in_actor(burn_cpu)

            # burn rubber in the parent too
            await burn_cpu()

            # wait on result from target function
            pid = await portal.result()

        # end of nursery block
        print(f"Collected subproc {pid}")


    if __name__ == '__main__':
        trio.run(main)

This runs ``burn_cpu()`` in a new process and reaps it on completion
of the nursery block.

If you only need to run a sync function and retrieve a single result, you
might want to check out `trio-parallel`_.

Zombie safe: self-destruct a process tree
-----------------------------------------
``tractor`` tries to protect you from zombies, no matter what.

.. code:: python

    """
    Run with a process monitor from a terminal using::

        $TERM -e watch -n 0.1 "pstree -a $$" \
            & python examples/parallelism/we_are_processes.py \
            && kill $!

    """
    from multiprocessing import cpu_count
    import os

    import tractor
    import trio


    async def target():
        print(
            f"Yo, i'm '{tractor.current_actor().name}' "
            f"running in pid {os.getpid()}"
        )

        await trio.sleep_forever()


    async def main():

        async with tractor.open_nursery() as n:

            for i in range(cpu_count()):
                await n.run_in_actor(target, name=f'worker_{i}')

            print('This process tree will self-destruct in 1 sec...')
            await trio.sleep(1)

            # you could have done this yourself
            raise Exception('Self Destructed')


    if __name__ == '__main__':
        try:
            trio.run(main)
        except Exception:
            print('Zombies Contained')

If you can create zombie child processes (without using a system signal)
it **is a bug**.
"Native" multi-process debugging
--------------------------------
Using the magic of `pdb++`_ and our internal IPC, we've
been able to create a native feeling debugging experience for
any (sub-)process in your ``tractor`` tree.

.. code:: python

    from os import getpid

    import tractor
    import trio


    async def breakpoint_forever():
        "Indefinitely re-enter debugger in child actor."
        while True:
            yield 'yo'
            await tractor.breakpoint()


    async def name_error():
        "Raise a ``NameError``"
        # ``doggypants`` is intentionally undefined
        getattr(doggypants)


    async def main():
        """Test breakpoint in a streaming actor.
        """
        async with tractor.open_nursery(
            debug_mode=True,
            loglevel='error',
        ) as n:

            p0 = await n.start_actor('bp_forever', enable_modules=[__name__])
            p1 = await n.start_actor('name_error', enable_modules=[__name__])

            # retrieve results
            stream = await p0.run(breakpoint_forever)
            await p1.run(name_error)


    if __name__ == '__main__':
        trio.run(main)

You can run this with::

    $ python examples/debugging/multi_daemon_subactors.py

And, yes, there's a built-in crash handling mode B)

We're hoping to add a respawn-from-repl system soon!

Worker poolz are easy peasy
---------------------------
The initial ask from most new users is *"how do I make a worker
pool thing?"*.
``tractor`` is built to handle any SC (structured concurrent) process
tree you can imagine; a "worker pool" pattern is a trivial special
case.
We have a `full worker pool re-implementation`_ of the std-lib's
``concurrent.futures.ProcessPoolExecutor`` example for reference.

You can run it like so (from this dir) to see the process tree in
real time::

    $TERM -e watch -n 0.1 "pstree -a $$" \
        & python examples/parallelism/concurrent_actors_primes.py \
        && kill $!

This uses no extra threads, fancy semaphores or futures; all we need
is ``tractor``'s IPC!
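
For comparison, here is a minimal sketch of the std-lib pattern that the
example re-implements: a ``concurrent.futures.ProcessPoolExecutor``
mapping a primality check over a list of numbers. It uses only the
standard library and no ``tractor`` APIs; the ``NUMBERS`` list and the
``check_all()`` helper are arbitrary names chosen for illustration, not
taken from the linked example.

.. code:: python

    import concurrent.futures
    import math

    # arbitrary inputs, chosen only for illustration
    NUMBERS = [2, 15, 97, 7919, 7921]


    def is_prime(n):
        "Trial division up to the integer square root of ``n``."
        if n < 2:
            return False
        if n % 2 == 0:
            return n == 2
        for i in range(3, math.isqrt(n) + 1, 2):
            if n % i == 0:
                return False
        return True


    def check_all(numbers):
        "Fan the checks out over a pool of worker processes."
        with concurrent.futures.ProcessPoolExecutor() as executor:
            return list(zip(numbers, executor.map(is_prime, numbers)))


    if __name__ == '__main__':
        for number, prime in check_all(NUMBERS):
            print(f'{number} is prime: {prime}')

The executor hides process start-up and teardown behind futures, whereas
the ``tractor`` version expresses the same fan-out with a nursery and
IPC.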
.. _full worker pool re-implementation: https://github.com/goodboy/tractor/blob/master/examples/parallelism/concurrent_actors_primes.py
Install
-------
From PyPI::

    pip install tractor

From git::

    pip install git+https://github.com/goodboy/tractor.git

Under the hood
--------------
``tractor`` is an attempt to pair trionic_ `structured concurrency`_ with
distributed Python. You can think of it as a ``trio``
*-across-processes* or simply as an opinionated replacement for the
stdlib's ``multiprocessing`` but built on async programming primitives
from the ground up.
Don't be scared off by this description. ``tractor`` **is just** ``trio``
but with nurseries for process management and cancel-able streaming IPC.
If you understand how to work with ``trio``, ``tractor`` will give you
the parallelism you may have been needing.

Wait, huh?! I thought "actors" have messages, and mailboxes and stuff?!
***********************************************************************
Let's stop and ask: how many canon actor model papers have you actually read ;)

From our experience many "actor systems" aren't really "actor models"
since they **don't adhere** to the `3 axioms`_ and pay even less
attention to the problem of *unbounded non-determinism* (which was the
whole point for creation of the model in the first place).

From the author's mouth, **the only thing required** is `adherence to <adherance to_>`_
the `3 axioms`_, *and that's it*.

``tractor`` adheres to said base requirements of an "actor model"::

    In response to a message, an actor may:

    - send a finite number of new messages
    - create a finite number of new actors
    - designate a new behavior to process subsequent messages

**and** requires *no further API changes* to accomplish this.
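
To make the three axioms concrete, here is a toy, single-process sketch
in plain Python. It is **not** ``tractor``'s API or implementation
(``tractor``'s actors are processes running ``trio.run()`` plus IPC);
``ToySystem``, ``ToyActor`` and the behavior functions are hypothetical
names used only for illustration.

.. code:: python

    from collections import deque


    class ToySystem:
        "Hypothetical single-threaded scheduler delivering queued messages."

        def __init__(self):
            self.queue = deque()  # pending (actor, message) pairs
            self.log = []         # record of what the actors did

        def spawn(self, behavior):
            return ToyActor(self, behavior)

        def run(self):
            # deliver messages one at a time until none are left
            while self.queue:
                actor, msg = self.queue.popleft()
                actor.receive(msg)


    class ToyActor:

        def __init__(self, system, behavior):
            self.system = system
            self.behavior = behavior

        def send(self, msg):
            self.system.queue.append((self, msg))

        def receive(self, msg):
            # a behavior may return a replacement behavior (axiom 3)
            self.behavior = self.behavior(self, msg) or self.behavior


    def echo(actor, msg):
        actor.system.log.append(f'echo: {msg}')


    def tired(actor, msg):
        actor.system.log.append(f'ignored: {msg}')


    def greeter(actor, msg):
        child = actor.system.spawn(echo)  # axiom 2: create a new actor
        child.send(f'hello {msg}')        # axiom 1: send a new message
        return tired                      # axiom 3: designate a new behavior


    def demo():
        system = ToySystem()
        root = system.spawn(greeter)
        root.send('world')
        root.send('again')
        system.run()
        return system.log

Calling ``demo()`` handles the first message with ``greeter`` and the
second with ``tired``, since the first message swapped the root actor's
behavior.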

If you want to debate this further please feel free to chime in on our
chat or discuss on one of the following issues *after you've read
everything in them*:

- https://github.com/goodboy/tractor/issues/210
- https://github.com/goodboy/tractor/issues/18

Let's clarify our parlance
**************************
Whether or not ``tractor`` has "actors" underneath should be mostly
irrelevant to users other than for referring to the interactions of our
primary runtime primitives: each Python process + ``trio.run()``
+ surrounding IPC machinery. These are our high level, base
*runtime-units-of-abstraction* which both *are* (as much as they can
be in Python) and will be referred to as our *"actors"*.

The main goal of ``tractor`` is to allow for highly distributed
software that, through the adherence to *structured concurrency*,
results in systems which fail in predictable, recoverable and maybe even
understandable ways; being an "actor model" is just one way to describe
properties of the system.
What's on the TODO:
-------------------
Help us push toward the future.
- (Soon to land) ``asyncio`` support allowing for "infected" actors where
  ``trio`` drives the ``asyncio`` scheduler via the astounding "`guest mode`_"
- Typed messaging protocols (ex. via ``msgspec``)
- Erlang-style supervisors via composed context managers
Feel like saying hi?
--------------------
This project is very much coupled to the ongoing development of
``trio`` (i.e. ``tractor`` gets most of its ideas from that brilliant
community). If you want to help, have suggestions or just want to
say hi, please feel free to reach us in our `matrix channel`_. If
matrix seems too hip, we're also mostly all in the `trio gitter
channel`_!

.. _nurseries: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#nurseries-a-structured-replacement-for-go-statements
.. _actor model: https://en.wikipedia.org/wiki/Actor_model
.. _trio: https://github.com/python-trio/trio
.. _multi-processing: https://en.wikipedia.org/wiki/Multiprocessing
.. _trionic: https://trio.readthedocs.io/en/latest/design.html#high-level-design-principles
.. _async sandwich: https://trio.readthedocs.io/en/latest/tutorial.html#async-sandwich
.. _structured concurrent: https://trio.discourse.group/t/concise-definition-of-structured-concurrency/228
.. _3 axioms: https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=162s
.. .. _3 axioms: https://en.wikipedia.org/wiki/Actor_model#Fundamental_concepts
.. _adherance to: https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=1821s
.. _trio gitter channel: https://gitter.im/python-trio/general
.. _matrix channel: https://matrix.to/#/!tractor:matrix.org
.. _pdb++: https://github.com/pdbpp/pdbpp
.. _guest mode: https://trio.readthedocs.io/en/stable/reference-lowlevel.html?highlight=guest%20mode#using-guest-mode-to-run-trio-on-top-of-other-event-loops
.. _messages: https://en.wikipedia.org/wiki/Message_passing
.. _trio docs: https://trio.readthedocs.io/en/latest/
.. _blog post: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
.. _structured concurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _unrequirements: https://en.wikipedia.org/wiki/Actor_model#Direct_communication_and_asynchrony
.. _async generators: https://www.python.org/dev/peps/pep-0525/
.. _trio-parallel: https://github.com/richardsheridan/trio-parallel

.. |gh_actions| image:: https://img.shields.io/endpoint.svg?url=https%3A%2F%2Factions-badge.atrox.dev%2Fgoodboy%2Ftractor%2Fbadge&style=popout-square
    :target: https://actions-badge.atrox.dev/goodboy/tractor/goto

.. |docs| image:: https://readthedocs.org/projects/tractor/badge/?version=latest
    :target: https://tractor.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

.. |logo| image:: _static/tractor_logo_side.svg
    :width: 250
    :align: middle