|logo| ``tractor``: next-gen Python parallelism

|gh_actions|
|docs|

``tractor`` is a `structured concurrent`_, multi-processing_ runtime built on trio_.

Fundamentally ``tractor`` gives you parallelism via ``trio``-"*actors*":
our nurseries_ let you spawn new Python processes which each run a ``trio``
scheduled runtime - a call to ``trio.run()``.

We believe the system adheres to the `3 axioms`_ of an "`actor model`_"
but likely *does not* look like what *you* probably think an "actor
model" looks like, and that's *intentional*.

The first step to grok ``tractor`` is to get the basics of ``trio`` down.
A great place to start is the `trio docs`_ and this `blog post`_.

Features
--------
- **It's just** a ``trio`` API
- *Infinitely nestable* process trees
- Built-in inter-process streaming APIs
- A (first ever?) "native" multi-core debugger UX for Python using `pdb++`_
- Support for a swappable, OS specific, process spawning layer
- A modular transport stack, allowing for custom serialization,
  communications protocols, and environment specific IPC primitives
- `structured concurrency`_ from the ground up

Run a func in a process
-----------------------
Use ``trio``'s style of focussing on *tasks as functions*:

.. code:: python

    """
    Run with a process monitor from a terminal using::

        $TERM -e watch -n 0.1 "pstree -a $$" \
            & python examples/parallelism/single_func.py \
            && kill $!

    """
    import os

    import tractor
    import trio


    async def burn_cpu():
        pid = os.getpid()

        # burn a core @ ~ 50kHz
        for _ in range(50000):
            await trio.sleep(1/50000/50)

        # report which process did the work
        return pid


    async def main():
        async with tractor.open_nursery() as n:

            portal = await n.run_in_actor(burn_cpu)

            # burn rubber in the parent too
            await burn_cpu()

            # wait on result from target function
            pid = await portal.result()

        # end of nursery block
        print(f"Collected subproc {pid}")


    if __name__ == '__main__':
        trio.run(main)

This runs ``burn_cpu()`` in a new process and reaps it on completion
of the nursery block.

If you only need to run a sync function and retrieve a single result, you
might want to check out `trio-parallel`_.

Zombie safe: self-destruct a process tree
-----------------------------------------
``tractor`` tries to protect you from zombies, no matter what.

.. code:: python

    """
    Run with a process monitor from a terminal using::

        $TERM -e watch -n 0.1 "pstree -a $$" \
            & python examples/parallelism/we_are_processes.py \
            && kill $!

    """
    from multiprocessing import cpu_count
    import os

    import tractor
    import trio


    async def target():
        print(
            f"Yo, i'm '{tractor.current_actor().name}' "
            f"running in pid {os.getpid()}"
        )

        await trio.sleep_forever()


    async def main():
        async with tractor.open_nursery() as n:

            for i in range(cpu_count()):
                await n.run_in_actor(target, name=f'worker_{i}')

            print('This process tree will self-destruct in 1 sec...')
            await trio.sleep(1)

            # you could have done this yourself
            raise Exception('Self Destructed')


    if __name__ == '__main__':
        try:
            trio.run(main)
        except Exception:
            print('Zombies Contained')

If you can create zombie child processes (without using a system signal)
it **is a bug**.
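
For a sense of what the "you could have done this yourself" comment is
getting at, here is a minimal stdlib-only sketch of doing the teardown by
hand. It is *not* how ``tractor`` reaps its children. The helper names
``reaped_children`` and ``self_destruct`` are hypothetical, and the children
are plain ``subprocess`` sleepers standing in for actors.

```python
import subprocess
import sys
from contextlib import contextmanager

# child command: a Python process that just sleeps (stand-in for a worker)
SLEEPER = [sys.executable, "-c", "import time; time.sleep(60)"]


@contextmanager
def reaped_children(count):
    """Spawn ``count`` sleeping children and always reap them on exit."""
    procs = []
    try:
        for _ in range(count):
            procs.append(subprocess.Popen(SLEEPER))
        yield procs
    finally:
        # runs on normal exit *and* when the body raises
        for proc in procs:
            proc.terminate()
        for proc in procs:
            # collecting the exit status is what avoids a zombie
            proc.wait()


def self_destruct(count=2):
    """Raise inside the block and return the children's exit codes."""
    procs = []
    try:
        with reaped_children(count) as procs:
            raise Exception("Self Destructed")
    except Exception:
        print("Zombies Contained")
    return [proc.returncode for proc in procs]
```

Every child has a recorded exit code by the time ``self_destruct()``
returns, which is the property the nursery block above gives you without
the hand-written ``try``/``finally``.
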
"Native" multi-process debugging
--------------------------------
Using the magic of `pdb++`_ and our internal IPC, we've
been able to create a native-feeling debugging experience for
any (sub-)process in your ``tractor`` tree.

.. code:: python

    from os import getpid

    import tractor
    import trio


    async def breakpoint_forever():
        "Indefinitely re-enter debugger in child actor."
        while True:
            yield 'yo'
            await tractor.breakpoint()


    async def name_error():
        "Raise a ``NameError``"
        # ``doggypants`` is deliberately undefined
        getattr(doggypants)


    async def main():
        """Test breakpoint in a streaming actor.
        """
        async with tractor.open_nursery(
            debug_mode=True,
            loglevel='error',
        ) as n:

            p0 = await n.start_actor('bp_forever', enable_modules=[__name__])
            p1 = await n.start_actor('name_error', enable_modules=[__name__])

            # retrieve results
            stream = await p0.run(breakpoint_forever)
            await p1.run(name_error)


    if __name__ == '__main__':
        trio.run(main)

You can run this with::

    $ python examples/debugging/multi_daemon_subactors.py

And, yes, there's a built-in crash handling mode B)

We're hoping to add a respawn-from-repl system soon!

Worker poolz are easy peasy
---------------------------
The initial ask from most new users is *"how do I make a worker
pool thing?"*.

``tractor`` is built to handle any SC (structured concurrent) process
tree you can imagine; a "worker pool" pattern is a trivial special
case.

We have a `full worker pool re-implementation`_ of the std-lib's
``concurrent.futures.ProcessPoolExecutor`` example for reference.

You can run it like so (from this dir) to see the process tree in
real time::

    $TERM -e watch -n 0.1 "pstree -a $$" \
        & python examples/parallelism/concurrent_actors_primes.py \
        && kill $!

This uses no extra threads, fancy semaphores or futures; all we need
is ``tractor``'s IPC!

.. _full worker pool re-implementation: https://github.com/goodboy/tractor/blob/master/examples/parallelism/concurrent_actors_primes.py
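
For comparison, the std-lib baseline that the linked example re-implements
looks roughly like the sketch below. This is a condensed take on the
prime-checking example from the ``concurrent.futures`` docs, not the
``tractor`` version. The number list and the ``find_primes`` helper are
assumptions made for illustration.

```python
import concurrent.futures
import math

# small illustrative inputs (an assumption; not the values used upstream)
NUMBERS = [2, 15, 97, 7919, 7920]


def is_prime(n):
    """Trial division up to the integer square root of ``n``."""
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    for i in range(3, math.isqrt(n) + 1, 2):
        if n % i == 0:
            return False
    return True


def find_primes(numbers):
    """Check each number in a pool of worker processes."""
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # ``map`` yields results in the same order as the inputs
        return dict(zip(numbers, executor.map(is_prime, numbers)))


def main():
    for number, prime in find_primes(NUMBERS).items():
        print(f"{number} is prime: {prime}")
```

Call ``main()`` from under an ``if __name__ == '__main__':`` guard so that
worker processes can import the module safely.
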

Install
-------
From PyPI::

    pip install tractor

From git::

    pip install git+https://github.com/goodboy/tractor.git
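
To confirm that either command worked, one option is to ask the interpreter
which version it can see. The sketch below uses only the std-lib's
``importlib.metadata``; the helper names ``installed_version`` and
``describe`` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of ``dist_name``, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def describe(dist_name="tractor"):
    """Build a one-line, human readable install report."""
    found = installed_version(dist_name)
    if found is None:
        return f"{dist_name} is not installed in this environment"
    return f"{dist_name} {found} is installed"
```

Running ``print(describe())`` in the same virtualenv you installed into
reports what that environment actually contains.
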

Under the hood
--------------
``tractor`` is an attempt to pair trionic_ `structured concurrency`_ with
distributed Python. You can think of it as a ``trio``
*-across-processes* or simply as an opinionated replacement for the
stdlib's ``multiprocessing`` but built on async programming primitives
from the ground up.

Don't be scared off by this description. ``tractor`` **is just** ``trio``
but with nurseries for process management and cancel-able streaming IPC.
If you understand how to work with ``trio``, ``tractor`` will give you
the parallelism you may have been needing.

Wait, huh?! I thought "actors" have messages, and mailboxes and stuff?!
-----------------------------------------------------------------------
Let's stop and ask: how many canon actor model papers have you actually read?

From the author's mouth, **the only thing required** is `adherence to <adherance to_>`_
the `3 axioms`_, *and that's it*.

To get more fired up on the matter, please read issue 1 and issue 2.

*News flash*: many "actor systems" people create aren't really "actor
models" since they don't adhere to the `3 axioms`_. Despite not looking
like one from the outside, ``tractor`` **does seem to adhere** to the
base requirements to be considered an "actor model".

If you want to debate this further please feel free to chime in on our
chat or discuss on one of the above issues **after you've read
everything in them**.
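
As the axioms are usually summarized, an actor responding to a message may
create more actors, send messages to actors it has addresses for, and
designate how it will handle the next message it receives (the linked talk
is the authoritative phrasing). The toy below is a single-threaded sketch of
just those three moves. Every name in it (``ToySystem``, ``counter``,
``collector``, ``spawner``, ``demo``) is hypothetical and none of it reflects
how ``tractor`` is built.

```python
from collections import deque


class ToySystem:
    """Single-threaded toy scheduler; an illustration only."""

    def __init__(self):
        self.behaviors = {}     # address -> current behavior callable
        self.pending = deque()  # undelivered (address, message) pairs

    def spawn(self, behavior):
        # axiom 1: create a new actor and hand back its address
        address = len(self.behaviors)
        self.behaviors[address] = behavior
        return address

    def send(self, address, message):
        # axiom 2: send a message to an address you know
        self.pending.append((address, message))

    def run(self):
        # deliver messages one at a time until none remain
        while self.pending:
            address, message = self.pending.popleft()
            next_behavior = self.behaviors[address](self, message)
            if next_behavior is not None:
                # axiom 3: the actor chose how to handle its next message
                self.behaviors[address] = next_behavior


def counter(count):
    """Behavior holding ``count``; its state is simply the next behavior."""
    def behavior(system, message):
        kind, reply_to = message
        if kind == "inc":
            return counter(count + 1)
        if kind == "get":
            system.send(reply_to, count)
        return None
    return behavior


def collector(received):
    """Behavior that records every message it gets."""
    def behavior(system, message):
        received.append(message)
    return behavior


def spawner(system, reply_to):
    """Create a counter, bump it twice, then ask it to report."""
    child = system.spawn(counter(0))
    system.send(child, ("inc", None))
    system.send(child, ("inc", None))
    system.send(child, ("get", reply_to))


def demo():
    system = ToySystem()
    received = []
    sink = system.spawn(collector(received))
    boss = system.spawn(spawner)
    system.send(boss, sink)
    system.run()
    return received
```

Note that nothing here needs an explicit "mailbox" class or message type
hierarchy, which is the point the section above is making.
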

Let's keep our parlance simple
******************************
The main goal of ``tractor`` besides the above feature set is to
allow for highly distributed software that, through the adherence to
*structured concurrency*, results in systems which fail in predictable,
recoverable and maybe even understandable ways.

Whether or not ``tractor`` has "actors" underneath should be mostly
irrelevant to users other than for referring to the interactions of
our primary runtime primitives: a Python process + ``trio.run()`` +
surrounding IPC machinery as *single-units-of-abstraction*.

What's on the TODO:
-------------------
Help us push toward the future.

- (Soon to land) ``asyncio`` support allowing for "infected" actors where
  ``trio`` drives the ``asyncio`` scheduler via the astounding "`guest mode`_"
- Typed messaging protocols (ex. via ``msgspec``)
- Erlang-style supervisors via composed context managers

Feel like saying hi?
--------------------
This project is very much coupled to the ongoing development of
``trio`` (i.e. ``tractor`` gets most of its ideas from that brilliant
community). If you want to help, have suggestions or just want to
say hi, please feel free to reach us in our `matrix channel`_. If
matrix seems too hip, we're also mostly all in the `trio gitter
channel`_!

.. _nurseries: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#nurseries-a-structured-replacement-for-go-statements
.. _actor model: https://en.wikipedia.org/wiki/Actor_model
.. _trio: https://github.com/python-trio/trio
.. _multi-processing: https://en.wikipedia.org/wiki/Multiprocessing
.. _trionic: https://trio.readthedocs.io/en/latest/design.html#high-level-design-principles
.. _async sandwich: https://trio.readthedocs.io/en/latest/tutorial.html#async-sandwich
.. _structured concurrent: https://trio.discourse.group/t/concise-definition-of-structured-concurrency/228
.. _3 axioms: https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=162s
.. _adherance to: https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=1821s
.. _trio gitter channel: https://gitter.im/python-trio/general
.. _matrix channel: https://matrix.to/#/!tractor:matrix.org
.. _pdb++: https://github.com/pdbpp/pdbpp
.. _guest mode: https://trio.readthedocs.io/en/stable/reference-lowlevel.html?highlight=guest%20mode#using-guest-mode-to-run-trio-on-top-of-other-event-loops
.. _messages: https://en.wikipedia.org/wiki/Message_passing
.. _trio docs: https://trio.readthedocs.io/en/latest/
.. _blog post: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
.. _structured concurrency: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
.. _unrequirements: https://en.wikipedia.org/wiki/Actor_model#Direct_communication_and_asynchrony
.. _async generators: https://www.python.org/dev/peps/pep-0525/
.. _trio-parallel: https://github.com/richardsheridan/trio-parallel
.. |gh_actions| image:: https://img.shields.io/endpoint.svg?url=https%3A%2F%2Factions-badge.atrox.dev%2Fgoodboy%2Ftractor%2Fbadge&style=popout-square
    :target: https://actions-badge.atrox.dev/goodboy/tractor/goto

.. |docs| image:: https://readthedocs.org/projects/tractor/badge/?version=latest
    :target: https://tractor.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

.. |logo| image:: _static/tractor_logo_side.svg
    :width: 250
    :align: middle