Compare commits

...

30 Commits

Author SHA1 Message Date
Tyler Goodlet 3a31c9d338 to_asyncio: mask error logging, not sure it adds that much 2023-09-26 10:32:01 -04:00
Tyler Goodlet 3dc57e384e Always no-raise try-to-pop registry addrs 2023-09-15 14:20:12 -04:00
Tyler Goodlet 687852f368 Add stale entry deleted from registrar test
By spawning an actor task that immediately shuts down the transport
server and then sleeps, verify that attempting to connect via the
`._discovery.find_actor()` helper delivers `None` for the `Portal`
value.

Relates to #184 and #216
2023-08-28 12:20:12 -04:00
Tyler Goodlet d83d991f21 Handle stale registrar entries; detect and delete
In cases where an actor's transport server task (by default handling new
TCP connections) terminates early but does not de-register from the
pertaining registry (aka the registrar) actor's address table, the
trying-to-connect client actor will get a connection error on that
address. In the case where client handles a (local) `OSError` (meaning
the target actor address is likely being contacted over `localhost`)
exception, make a further call to the registrar to delete the stale
entry and `yield None` gracefully indicating to calling code that no
`Portal` can be delivered to the target address.

This issue was originally discovered in `piker` where the `emsd`
(clearing engine) actor would sometimes crash on rapid client
re-connects and then leave a `pikerd` stale entry. With this fix new
clients will attempt connect via an endpoint which will re-spawn the
`emsd` when a `None` portal is delivered (via `maybe_spawn_em()`).
2023-08-28 11:26:36 -04:00
Tyler Goodlet 1cf712cfac Add `Arbiter.delete_sockaddr()` to remove addrs
Since stale addrs can be leaked where the actor transport server task
crashes but doesn't (successfully) unregister from the registrar, we
need a remote way to remove such entries; hence this new (registrar)
method.

To implement this make use of the `bidict` lib for the `._registry`
table thus making it super simple to do reverse uuid lookups from an
input socket-address.
2023-08-21 19:07:14 -04:00
Tyler Goodlet 22c14e235e Expose `Channel` @ pkg level, drop `_debug.pp()` alias 2023-08-18 10:18:25 -04:00
Tyler Goodlet 1102843087 Teensie tidy up on actor doc string 2023-08-18 10:10:36 -04:00
Tyler Goodlet e03bec5efc Move `.to_asyncio` to modern optional value type annots 2023-07-21 15:08:46 -04:00
Tyler Goodlet bee2c36072 Make `NamespacePath` work on object refs
Detect if the input ref is a non-func (like an `object` instance) in
which case grab its type name using `type()`. Wrap all the name-getting
into a new `_mk_fqpn()` static meth: gets the "fully qualified path
name" and returns path and name in tuple; port other methds to use it.
Refine and update the docs B)
2023-07-12 13:07:30 -04:00
Tyler Goodlet b36b3d522f Map `breakpoint()` built-in to new `.pause_from_sync()` ep 2023-07-07 15:35:52 -04:00
Tyler Goodlet 4ace8f6037 Fix frame-selection display on first REPL entry
For whatever reason pdb(p), and in general, will show the frame of the
*next* python instruction/LOC on initial entry (at least using
`.set_trace()`), as such remove the `try/finally` block in the sync
code entrypoint `.pause_from_sync()`, and also since doesn't seem like
we really need it anyway.

Further, and to this end:
- enable hidden frames support in our default config.
- fix/drop/mask all the frame ref-ing/mangling we had prior since it's no
  longer needed as well as manual `Lock` releasing which seems to work
  already by having the `greenback` spawned task do it's normal thing?
- move to no `Union` type annots.
- hide all frames that can add "this is the runtime confusion" to
  traces.
2023-07-07 14:51:44 -04:00
Tyler Goodlet 98a7326c85 ._runtime: log level tweaks, use crit for stale debug lock detection 2023-07-07 14:49:23 -04:00
Tyler Goodlet 46972df041 .log: more correct handling for `get_logger(__name__)` usage 2023-07-07 14:48:37 -04:00
Tyler Goodlet 565d7c3ee5 Add longer "required reading" list B) 2023-07-07 14:47:42 -04:00
Tyler Goodlet ac695a05bf Updates from latest `piker.data._sharedmem` changes 2023-06-22 17:16:17 -04:00
Tyler Goodlet fc56971a2d First proto: use `greenback` for sync func breakpointing
This works now for supporting a new `tractor.pause_from_sync()`
`tractor`-aware-replacement for `Pdb.set_trace()` from sync functions
which are also scheduled from our runtime. Uses `greenback` to do all
the magic of scheduling the bg `tractor._debug._pause()` task and
engaging the normal TTY locking machinery triggered by `await
tractor.breakpoint()`

Further this starts some public API renaming, making a switch to
`tractor.pause()` from `.breakpoint()` which IMO much better expresses
the semantics of the runtime intervention required to suffice
multi-process "breakpointing"; it also is an alternate name for the same
in computer science more generally: https://en.wikipedia.org/wiki/Breakpoint
It also avoids using the same name as the `breakpoint()` built-in which
is important since there **is alot more going on** when you call our
equivalent API.

Deats of that:
- add deprecation warning for `tractor.breakpoint()`
- add `tractor.pause()` and a shorthand, easier-to-type, alias `.pp()`
  for "pause-point" B)
- add `pause_from_sync()` as the new `breakpoint()`-from-sync-function
  hack which does all the `greenback` stuff for the user.

Still TODO:
- figure out where in the runtime and when to call
  `greenback.ensure_portal()`.
- fix the frame selection issue where
  `trio._core._ki._ki_protection_decorator:wrapper` seems to be always
  shown on REPL start as the selected frame..
2023-06-21 16:08:18 -04:00
Tyler Goodlet ee87cf0e29 Add a debug-mode-breakpoint-causes-hang case!
Only found this by luck more or less (while working on something in
a client project) and it turns out we can actually get to (yet another)
hang state where SIGINT will be ignored by the root actor on teardown..

I've added all the necessary logic flags to reproduce. We obviously need
a follow up bug issue and a test suite to replicate!

It appears as though the following are required based on very light
tinkering:
- infected asyncio mode active
- debug mode active
- the `trio` context must breakpoint *before* `.started()`-ing
- the `asyncio` must **not** error
2023-06-21 14:07:31 -04:00
Tyler Goodlet ebcb275cd8 Add (first-draft) infected-`asyncio` actor task uses debugger example 2023-06-21 14:07:31 -04:00
Tyler Goodlet f745da9fb2 Add `numpy` for testing optional integrated shm API layer 2023-06-15 12:20:20 -04:00
Tyler Goodlet 4f442efbd7 Pass `str` dtype for `use_str` case 2023-06-15 12:20:20 -04:00
Tyler Goodlet f9a84f0732 Allocate size-specced "empty" sequence from default values by type 2023-06-15 12:20:20 -04:00
Tyler Goodlet e0bf964ff0 Mod define `_USE_POSIX`, add a of of todos 2023-06-15 12:20:20 -04:00
Tyler Goodlet a9fc4c1b91 Parametrize rw test with variable frame sizes
Demonstrates fixed size frame-oriented reads by the child where the
parent only transmits a "read" stream msg on "frame fill events" such
that the child incrementally reads the shm list data (much like in
a real-time-buffered streaming system).
2023-06-15 12:20:20 -04:00
Tyler Goodlet b52ff270c5 Add `ShmList` slice support in `.__getitem__()` 2023-06-15 12:20:20 -04:00
Tyler Goodlet 1713ecd9f8 Rename token type to `NDToken` in the style of `nptyping` 2023-06-15 12:20:20 -04:00
Tyler Goodlet edb82fdd78 Don't require runtime (for now), type annot fixing 2023-06-15 12:20:20 -04:00
Tyler Goodlet 339d787cf8 Add repetitive attach to existing segment test 2023-06-15 12:20:20 -04:00
Tyler Goodlet c32b21b4b1 Add initial readers-writer shm list tests 2023-06-15 12:20:20 -04:00
Tyler Goodlet 71477290fc Add `ShmList` wrapping the stdlib's `ShareableList`
First attempt at getting `multiprocessing.shared_memory.ShareableList`
working; we wrap the stdlib type with a readonly attr and a `.key` for
cross-actor lookup. Also, rename all `numpy` specific routines to have
a `ndarray` suffix in the func names.
2023-06-15 12:20:20 -04:00
Tyler Goodlet 9716d86825 Initial module import from `piker.data._sharemem`
More or less a verbatim copy-paste minus some edgy variable naming and
internal `piker` module imports. There is a bunch of OHLC related
defaults that need to be dropped and we need to adjust to an optional
dependence on `numpy` by supporting shared lists as per the mp docs.
2023-06-15 12:20:20 -04:00
16 changed files with 1574 additions and 167 deletions

View File

@ -3,8 +3,8 @@
|gh_actions|
|docs|
``tractor`` is a `structured concurrent`_, multi-processing_ runtime
built on trio_.
``tractor`` is a `structured concurrent`_, (optionally
distributed_) multi-processing_ runtime built on trio_.
Fundamentally, ``tractor`` gives you parallelism via
``trio``-"*actors*": independent Python processes (aka
@ -17,11 +17,20 @@ protocol" constructed on top of multiple Pythons each running a ``trio``
scheduled runtime - a call to ``trio.run()``.
We believe the system adheres to the `3 axioms`_ of an "`actor model`_"
but likely *does not* look like what *you* probably think an "actor
model" looks like, and that's *intentional*.
but likely **does not** look like what **you** probably *think* an "actor
model" looks like, and that's **intentional**.
The first step to grok ``tractor`` is to get the basics of ``trio`` down.
A great place to start is the `trio docs`_ and this `blog post`_.
Where do i start!?
------------------
The first step to grok ``tractor`` is to get an intermediate
knowledge of ``trio`` and **structured concurrency** B)
Some great places to start are,
- the seminal `blog post`_
- obviously the `trio docs`_
- wikipedia's nascent SC_ page
- the fancy diagrams @ libdill-docs_
Features
@ -593,6 +602,7 @@ matrix seems too hip, we're also mostly all in the the `trio gitter
channel`_!
.. _structured concurrent: https://trio.discourse.group/t/concise-definition-of-structured-concurrency/228
.. _distributed: https://en.wikipedia.org/wiki/Distributed_computing
.. _multi-processing: https://en.wikipedia.org/wiki/Multiprocessing
.. _trio: https://github.com/python-trio/trio
.. _nurseries: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#nurseries-a-structured-replacement-for-go-statements
@ -611,8 +621,9 @@ channel`_!
.. _trio docs: https://trio.readthedocs.io/en/latest/
.. _blog post: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
.. _structured concurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _SC: https://en.wikipedia.org/wiki/Structured_concurrency
.. _libdill-docs: https://sustrik.github.io/libdill/structured-concurrency.html
.. _structured chadcurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _structured concurrency: https://en.wikipedia.org/wiki/Structured_concurrency
.. _unrequirements: https://en.wikipedia.org/wiki/Actor_model#Direct_communication_and_asynchrony
.. _async generators: https://www.python.org/dev/peps/pep-0525/
.. _trio-parallel: https://github.com/richardsheridan/trio-parallel

View File

@ -0,0 +1,117 @@
import asyncio
import trio
import tractor
from tractor import to_asyncio
async def aio_sleep_forever():
await asyncio.sleep(float('inf'))
async def bp_then_error(
to_trio: trio.MemorySendChannel,
from_trio: asyncio.Queue,
raise_after_bp: bool = True,
) -> None:
# sync with ``trio``-side (caller) task
to_trio.send_nowait('start')
# NOTE: what happens here inside the hook needs some refinement..
# => seems like it's still `._debug._set_trace()` but
# we set `Lock.local_task_in_debug = 'sync'`, we probably want
# some further, at least, meta-data about the task/actoq in debug
# in terms of making it clear it's asyncio mucking about.
breakpoint()
# short checkpoint / delay
await asyncio.sleep(0.5)
if raise_after_bp:
raise ValueError('blah')
# TODO: test case with this so that it gets cancelled?
else:
# XXX NOTE: this is required in order to get the SIGINT-ignored
# hang case documented in the module script section!
await aio_sleep_forever()
@tractor.context
async def trio_ctx(
ctx: tractor.Context,
bp_before_started: bool = False,
):
# this will block until the ``asyncio`` task sends a "first"
# message, see first line in above func.
async with (
to_asyncio.open_channel_from(
bp_then_error,
raise_after_bp=not bp_before_started,
) as (first, chan),
trio.open_nursery() as n,
):
assert first == 'start'
if bp_before_started:
await tractor.breakpoint()
await ctx.started(first)
n.start_soon(
to_asyncio.run_task,
aio_sleep_forever,
)
await trio.sleep_forever()
async def main(
bps_all_over: bool = False,
) -> None:
async with tractor.open_nursery() as n:
p = await n.start_actor(
'aio_daemon',
enable_modules=[__name__],
infect_asyncio=True,
debug_mode=True,
loglevel='cancel',
)
async with p.open_context(
trio_ctx,
bp_before_started=bps_all_over,
) as (ctx, first):
assert first == 'start'
if bps_all_over:
await tractor.breakpoint()
# await trio.sleep_forever()
await ctx.cancel()
assert 0
# TODO: case where we cancel from trio-side while asyncio task
# has debugger lock?
# await p.cancel_actor()
if __name__ == '__main__':
# works fine B)
trio.run(main)
# will hang and ignores SIGINT !!
# NOTE: you'll need to send a SIGQUIT (via ctl-\) to kill it
# manually..
# trio.run(main, True)

View File

@ -6,3 +6,4 @@ mypy
trio_typing
pexpect
towncrier
numpy

View File

@ -41,6 +41,9 @@ setup(
],
install_requires=[
# discovery subsys
'bidict',
# trio related
# proper range spec:
# https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/#id5

View File

@ -219,7 +219,8 @@ def daemon(
arb_addr: tuple[str, int],
):
'''
Run a daemon actor as a "remote arbiter".
Run a daemon actor as a "remote registrar" and/or plain ol
separate actor (service) tree.
'''
if loglevel in ('trace', 'debug'):

View File

@ -1,6 +1,7 @@
"""
Actor "discovery" testing
"""
'''
Discovery subsystem via a "registrar" actor scenarios.
'''
import os
import signal
import platform
@ -127,7 +128,10 @@ async def unpack_reg(actor_or_portal):
else:
msg = await actor_or_portal.run_from_ns('self', 'get_registry')
return {tuple(key.split('.')): val for key, val in msg.items()}
return {
tuple(key.split('.')): val
for key, val in msg.items()
}
async def spawn_and_check_registry(
@ -283,37 +287,41 @@ async def close_chans_before_nursery(
async with tractor.open_nursery() as tn:
portal1 = await tn.start_actor(
name='consumer1', enable_modules=[__name__])
name='consumer1',
enable_modules=[__name__],
)
portal2 = await tn.start_actor(
'consumer2', enable_modules=[__name__])
'consumer2',
enable_modules=[__name__],
)
# TODO: compact this back as was in last commit once
# 3.9+, see https://github.com/goodboy/tractor/issues/207
async with portal1.open_stream_from(
stream_forever
) as agen1:
async with portal2.open_stream_from(
async with (
portal1.open_stream_from(
stream_forever
) as agen2:
async with trio.open_nursery() as n:
n.start_soon(streamer, agen1)
n.start_soon(cancel, use_signal, .5)
try:
await streamer(agen2)
finally:
# Kill the root nursery thus resulting in
# normal arbiter channel ops to fail during
# teardown. It doesn't seem like this is
# reliably triggered by an external SIGINT.
# tractor.current_actor()._root_nursery.cancel_scope.cancel()
) as agen1,
portal2.open_stream_from(
stream_forever
) as agen2,
):
async with trio.open_nursery() as n:
n.start_soon(streamer, agen1)
n.start_soon(cancel, use_signal, .5)
try:
await streamer(agen2)
finally:
# Kill the root nursery thus resulting in
# normal arbiter channel ops to fail during
# teardown. It doesn't seem like this is
# reliably triggered by an external SIGINT.
# tractor.current_actor()._root_nursery.cancel_scope.cancel()
# XXX: THIS IS THE KEY THING that
# happens **before** exiting the
# actor nursery block
# XXX: THIS IS THE KEY THING that
# happens **before** exiting the
# actor nursery block
# also kill off channels cuz why not
await agen1.aclose()
await agen2.aclose()
# also kill off channels cuz why not
await agen1.aclose()
await agen2.aclose()
finally:
with trio.CancelScope(shield=True):
await trio.sleep(1)
@ -331,10 +339,12 @@ def test_close_channel_explicit(
use_signal,
arb_addr,
):
"""Verify that closing a stream explicitly and killing the actor's
'''
Verify that closing a stream explicitly and killing the actor's
"root nursery" **before** the containing nursery tears down also
results in subactor(s) deregistering from the arbiter.
"""
'''
with pytest.raises(KeyboardInterrupt):
trio.run(
partial(
@ -347,16 +357,18 @@ def test_close_channel_explicit(
@pytest.mark.parametrize('use_signal', [False, True])
def test_close_channel_explicit_remote_arbiter(
def test_close_channel_explicit_remote_registrar(
daemon,
start_method,
use_signal,
arb_addr,
):
"""Verify that closing a stream explicitly and killing the actor's
'''
Verify that closing a stream explicitly and killing the actor's
"root nursery" **before** the containing nursery tears down also
results in subactor(s) deregistering from the arbiter.
"""
'''
with pytest.raises(KeyboardInterrupt):
trio.run(
partial(
@ -366,3 +378,51 @@ def test_close_channel_explicit_remote_arbiter(
remote_arbiter=True,
),
)
@tractor.context
async def kill_transport(
ctx: tractor.Context,
) -> None:
await ctx.started()
actor: tractor.Actor = tractor.current_actor()
actor.cancel_server()
await trio.sleep_forever()
# @pytest.mark.parametrize('use_signal', [False, True])
def test_stale_entry_is_deleted(
daemon,
start_method,
arb_addr,
):
'''
Ensure that when a stale entry is detected in the registrar's table
that the `find_actor()` API takes care of deleting the stale entry
and not delivering a bad portal.
'''
async def main():
name: str = 'transport_fails_actor'
regport: tractor.Portal
tn: tractor.ActorNursery
async with (
tractor.open_nursery() as tn,
tractor.get_registrar(*arb_addr) as regport,
):
ptl: tractor.Portal = await tn.start_actor(
name,
enable_modules=[__name__],
)
async with ptl.open_context(
kill_transport,
) as (first, ctx):
async with tractor.find_actor(name) as maybe_portal:
assert maybe_portal is None
await ptl.cancel_actor()
trio.run(main)

167
tests/test_shm.py 100644
View File

@ -0,0 +1,167 @@
"""
Shared mem primitives and APIs.
"""
import uuid
# import numpy
import pytest
import trio
import tractor
from tractor._shm import (
open_shm_list,
attach_shm_list,
)
@tractor.context
async def child_attach_shml_alot(
ctx: tractor.Context,
shm_key: str,
) -> None:
await ctx.started(shm_key)
# now try to attach a boatload of times in a loop..
for _ in range(1000):
shml = attach_shm_list(
key=shm_key,
readonly=False,
)
assert shml.shm.name == shm_key
await trio.sleep(0.001)
def test_child_attaches_alot():
async def main():
async with tractor.open_nursery() as an:
# allocate writeable list in parent
key = f'shml_{uuid.uuid4()}'
shml = open_shm_list(
key=key,
)
portal = await an.start_actor(
'shm_attacher',
enable_modules=[__name__],
)
async with (
portal.open_context(
child_attach_shml_alot,
shm_key=shml.key,
) as (ctx, start_val),
):
assert start_val == key
await ctx.result()
await portal.cancel_actor()
trio.run(main)
@tractor.context
async def child_read_shm_list(
ctx: tractor.Context,
shm_key: str,
use_str: bool,
frame_size: int,
) -> None:
# attach in child
shml = attach_shm_list(
key=shm_key,
# dtype=str if use_str else float,
)
await ctx.started(shml.key)
async with ctx.open_stream() as stream:
async for i in stream:
print(f'(child): reading shm list index: {i}')
if use_str:
expect = str(float(i))
else:
expect = float(i)
if frame_size == 1:
val = shml[i]
assert expect == val
print(f'(child): reading value: {val}')
else:
frame = shml[i - frame_size:i]
print(f'(child): reading frame: {frame}')
@pytest.mark.parametrize(
'use_str',
[False, True],
ids=lambda i: f'use_str_values={i}',
)
@pytest.mark.parametrize(
'frame_size',
[1, 2**6, 2**10],
ids=lambda i: f'frame_size={i}',
)
def test_parent_writer_child_reader(
use_str: bool,
frame_size: int,
):
async def main():
async with tractor.open_nursery(
# debug_mode=True,
) as an:
portal = await an.start_actor(
'shm_reader',
enable_modules=[__name__],
debug_mode=True,
)
# allocate writeable list in parent
key = 'shm_list'
seq_size = int(2 * 2 ** 10)
shml = open_shm_list(
key=key,
size=seq_size,
dtype=str if use_str else float,
readonly=False,
)
async with (
portal.open_context(
child_read_shm_list,
shm_key=key,
use_str=use_str,
frame_size=frame_size,
) as (ctx, sent),
ctx.open_stream() as stream,
):
assert sent == key
for i in range(seq_size):
val = float(i)
if use_str:
val = str(val)
# print(f'(parent): writing {val}')
shml[i] = val
# only on frame fills do we
# signal to the child that a frame's
# worth is ready.
if (i % frame_size) == 0:
print(f'(parent): signalling frame full on {val}')
await stream.send(i)
else:
print(f'(parent): signalling final frame on {val}')
await stream.send(i)
await portal.cancel_actor()
trio.run(main)

View File

@ -21,7 +21,6 @@ tractor: structured concurrent ``trio``-"actors".
from exceptiongroup import BaseExceptionGroup
from ._clustering import open_actor_cluster
from ._ipc import Channel
from ._context import (
Context,
context,
@ -32,6 +31,7 @@ from ._streaming import (
)
from ._discovery import (
get_arbiter,
get_registrar,
find_actor,
wait_for_actor,
query_actor,
@ -48,6 +48,8 @@ from ._exceptions import (
)
from ._debug import (
breakpoint,
pause,
pause_from_sync,
post_mortem,
)
from . import msg
@ -55,31 +57,36 @@ from ._root import (
run_daemon,
open_root_actor,
)
from ._ipc import Channel
from ._portal import Portal
from ._runtime import Actor
__all__ = [
'Actor',
'BaseExceptionGroup',
'Channel',
'Context',
'ContextCancelled',
'ModuleNotExposed',
'MsgStream',
'BaseExceptionGroup',
'Portal',
'RemoteActorError',
'breakpoint',
'context',
'current_actor',
'find_actor',
'query_actor',
'get_arbiter',
'get_registrar',
'is_root_process',
'msg',
'open_actor_cluster',
'open_nursery',
'open_root_actor',
'pause',
'post_mortem',
'pause_from_sync',
'query_actor',
'run_daemon',
'stream',

View File

@ -30,7 +30,6 @@ from functools import (
from contextlib import asynccontextmanager as acm
from typing import (
Any,
Optional,
Callable,
AsyncIterator,
AsyncGenerator,
@ -40,7 +39,10 @@ from types import FrameType
import pdbp
import tractor
import trio
from trio_typing import TaskStatus
from trio_typing import (
TaskStatus,
# Task,
)
from .log import get_logger
from ._discovery import get_root
@ -69,10 +71,10 @@ class Lock:
'''
repl: MultiActorPdb | None = None
# placeholder for function to set a ``trio.Event`` on debugger exit
# pdb_release_hook: Optional[Callable] = None
# pdb_release_hook: Callable | None = None
_trio_handler: Callable[
[int, Optional[FrameType]], Any
[int, FrameType | None], Any
] | int | None = None
# actor-wide variable pointing to current task name using debugger
@ -83,23 +85,23 @@ class Lock:
# and must be cancelled if this actor is cancelled via IPC
# request-message otherwise deadlocks with the parent actor may
# ensure
_debugger_request_cs: Optional[trio.CancelScope] = None
_debugger_request_cs: trio.CancelScope | None = None
# NOTE: set only in the root actor for the **local** root spawned task
# which has acquired the lock (i.e. this is on the callee side of
# the `lock_tty_for_child()` context entry).
_root_local_task_cs_in_debug: Optional[trio.CancelScope] = None
_root_local_task_cs_in_debug: trio.CancelScope | None = None
# actor tree-wide actor uid that supposedly has the tty lock
global_actor_in_debug: Optional[tuple[str, str]] = None
global_actor_in_debug: tuple[str, str] = None
local_pdb_complete: Optional[trio.Event] = None
no_remote_has_tty: Optional[trio.Event] = None
local_pdb_complete: trio.Event | None = None
no_remote_has_tty: trio.Event | None = None
# lock in root actor preventing multi-access to local tty
_debug_lock: trio.StrictFIFOLock = trio.StrictFIFOLock()
_orig_sigint_handler: Optional[Callable] = None
_orig_sigint_handler: Callable | None = None
_blocked: set[tuple[str, str]] = set()
@classmethod
@ -110,6 +112,7 @@ class Lock:
)
@classmethod
@pdbp.hideframe # XXX NOTE XXX see below in `.pause_from_sync()`
def unshield_sigint(cls):
# always restore ``trio``'s sigint handler. see notes below in
# the pdb factory about the nightmare that is that code swapping
@ -129,10 +132,6 @@ class Lock:
if owner:
raise
# actor-local state, irrelevant for non-root.
cls.global_actor_in_debug = None
cls.local_task_in_debug = None
try:
# sometimes the ``trio`` might already be terminated in
# which case this call will raise.
@ -143,6 +142,11 @@ class Lock:
cls.unshield_sigint()
cls.repl = None
# actor-local state, irrelevant for non-root.
cls.global_actor_in_debug = None
cls.local_task_in_debug = None
class TractorConfig(pdbp.DefaultConfig):
'''
@ -151,7 +155,7 @@ class TractorConfig(pdbp.DefaultConfig):
'''
use_pygments: bool = True
sticky_by_default: bool = False
enable_hidden_frames: bool = False
enable_hidden_frames: bool = True
# much thanks @mdmintz for the hot tip!
# fixes line spacing issue when resizing terminal B)
@ -228,26 +232,23 @@ async def _acquire_debug_lock_from_root_task(
to the ``pdb`` repl.
'''
task_name = trio.lowlevel.current_task().name
task_name: str = trio.lowlevel.current_task().name
we_acquired: bool = False
log.runtime(
f"Attempting to acquire TTY lock, remote task: {task_name}:{uid}"
)
we_acquired = False
try:
log.runtime(
f"entering lock checkpoint, remote task: {task_name}:{uid}"
)
we_acquired = True
# NOTE: if the surrounding cancel scope from the
# `lock_tty_for_child()` caller is cancelled, this line should
# unblock and NOT leave us in some kind of
# a "child-locked-TTY-but-child-is-uncontactable-over-IPC"
# condition.
await Lock._debug_lock.acquire()
we_acquired = True
if Lock.no_remote_has_tty is None:
# mark the tty lock as being in use so that the runtime
@ -374,7 +375,7 @@ async def wait_for_parent_stdin_hijack(
This function is used by any sub-actor to acquire mutex access to
the ``pdb`` REPL and thus the root's TTY for interactive debugging
(see below inside ``_breakpoint()``). It can be used to ensure that
(see below inside ``_pause()``). It can be used to ensure that
an intermediate nursery-owning actor does not clobber its children
if they are in debug (see below inside
``maybe_wait_for_debugger()``).
@ -440,17 +441,29 @@ def mk_mpdb() -> tuple[MultiActorPdb, Callable]:
return pdb, Lock.unshield_sigint
async def _breakpoint(
async def _pause(
debug_func,
debug_func: Callable | None = None,
release_lock_signal: trio.Event | None = None,
# TODO:
# shield: bool = False
task_status: TaskStatus[trio.Event] = trio.TASK_STATUS_IGNORED
) -> None:
'''
Breakpoint entry for engaging debugger instance sync-interaction,
from async code, executing in actor runtime (task).
A pause point (more commonly known as a "breakpoint") interrupt
instruction for engaging a blocking debugger instance to
conduct manual console-based-REPL-interaction from within
`tractor`'s async runtime, normally from some single-threaded
and currently executing actor-hosted-`trio`-task in some
(remote) process.
NOTE: we use the semantics "pause" since it better encompasses
the entirety of the necessary global-runtime-state-mutation any
actor-task must access and lock in order to get full isolated
control over the process tree's root TTY:
https://en.wikipedia.org/wiki/Breakpoint
'''
__tracebackhide__ = True
@ -559,10 +572,23 @@ async def _breakpoint(
Lock.repl = pdb
try:
# block here one (at the appropriate frame *up*) where
# ``breakpoint()`` was awaited and begin handling stdio.
log.debug("Entering the synchronous world of pdb")
debug_func(actor, pdb)
# breakpoint()
if debug_func is None:
# assert release_lock_signal, (
# 'Must pass `release_lock_signal: trio.Event` if no '
# 'trace func provided!'
# )
print(f"{actor.uid} ENTERING WAIT")
task_status.started()
# with trio.CancelScope(shield=True):
# await release_lock_signal.wait()
else:
# block here one (at the appropriate frame *up*) where
# ``breakpoint()`` was awaited and begin handling stdio.
log.debug("Entering the synchronous world of pdb")
debug_func(actor, pdb)
except bdb.BdbQuit:
Lock.release()
@ -583,7 +609,7 @@ async def _breakpoint(
def shield_sigint_handler(
signum: int,
frame: 'frame', # type: ignore # noqa
# pdb_obj: Optional[MultiActorPdb] = None,
# pdb_obj: MultiActorPdb | None = None,
*args,
) -> None:
@ -597,7 +623,7 @@ def shield_sigint_handler(
'''
__tracebackhide__ = True
uid_in_debug = Lock.global_actor_in_debug
uid_in_debug: tuple[str, str] | None = Lock.global_actor_in_debug
actor = tractor.current_actor()
# print(f'{actor.uid} in HANDLER with ')
@ -615,14 +641,14 @@ def shield_sigint_handler(
else:
raise KeyboardInterrupt
any_connected = False
any_connected: bool = False
if uid_in_debug is not None:
# try to see if the supposed (sub)actor in debug still
# has an active connection to *this* actor, and if not
# it's likely they aren't using the TTY lock / debugger
# and we should propagate SIGINT normally.
chans = actor._peers.get(tuple(uid_in_debug))
chans: list[tractor.Channel] = actor._peers.get(tuple(uid_in_debug))
if chans:
any_connected = any(chan.connected() for chan in chans)
if not any_connected:
@ -635,7 +661,7 @@ def shield_sigint_handler(
return do_cancel()
# only set in the actor actually running the REPL
pdb_obj = Lock.repl
pdb_obj: MultiActorPdb | None = Lock.repl
# root actor branch that reports whether or not a child
# has locked debugger.
@ -693,7 +719,7 @@ def shield_sigint_handler(
)
return do_cancel()
task = Lock.local_task_in_debug
task: str | None = Lock.local_task_in_debug
if (
task
and pdb_obj
@ -708,8 +734,8 @@ def shield_sigint_handler(
# elif debug_mode():
else: # XXX: shouldn't ever get here?
print("WTFWTFWTF")
raise KeyboardInterrupt
raise RuntimeError("WTFWTFWTF")
# raise KeyboardInterrupt("WTFWTFWTF")
# NOTE: currently (at least on ``fancycompleter`` 0.9.2)
# it looks to be that the last command that was run (eg. ll)
@ -737,21 +763,18 @@ def shield_sigint_handler(
# https://github.com/goodboy/tractor/issues/130#issuecomment-663752040
# https://github.com/prompt-toolkit/python-prompt-toolkit/blob/c2c6af8a0308f9e5d7c0e28cb8a02963fe0ce07a/prompt_toolkit/patch_stdout.py
# XXX LEGACY: lol, see ``pdbpp`` issue:
# https://github.com/pdbpp/pdbpp/issues/496
def _set_trace(
actor: tractor.Actor | None = None,
pdb: MultiActorPdb | None = None,
):
__tracebackhide__ = True
actor = actor or tractor.current_actor()
actor: tractor.Actor = actor or tractor.current_actor()
# start 2 levels up in user code
frame: Optional[FrameType] = sys._getframe()
frame: FrameType | None = sys._getframe()
if frame:
frame = frame.f_back # type: ignore
frame: FrameType = frame.f_back # type: ignore
if (
frame
@ -771,12 +794,76 @@ def _set_trace(
Lock.local_task_in_debug = 'sync'
pdb.set_trace(frame=frame)
# undo_
breakpoint = partial(
_breakpoint,
# TODO: allow pausing from sync code, normally by remapping
# python's builtin breakpoint() hook to this runtime aware version.
def pause_from_sync() -> None:
print("ENTER SYNC PAUSE")
import greenback
__tracebackhide__ = True
actor: tractor.Actor = tractor.current_actor()
# task_can_release_tty_lock = trio.Event()
# spawn bg task which will lock out the TTY, we poll
# just below until the release event is reporting that task as
# waiting.. not the most ideal but works for now ;)
greenback.await_(
actor._service_n.start(partial(
_pause,
debug_func=None,
# release_lock_signal=task_can_release_tty_lock,
))
)
db, undo_sigint = mk_mpdb()
Lock.local_task_in_debug = 'sync'
# db.config.enable_hidden_frames = True
# we entered the global ``breakpoint()`` built-in from sync
# code?
frame: FrameType | None = sys._getframe()
# print(f'FRAME: {str(frame)}')
# assert not db._is_hidden(frame)
frame: FrameType = frame.f_back # type: ignore
# print(f'FRAME: {str(frame)}')
# if not db._is_hidden(frame):
# pdbp.set_trace()
# db._hidden_frames.append(
# (frame, frame.f_lineno)
# )
db.set_trace(frame=frame)
# NOTE XXX: see the `@pdbp.hideframe` decoration
# on `Lock.unshield_sigint()`.. I have NO CLUE why
# the next instruction's def frame is being shown
# in the tb but it seems to be something wonky with
# the way `pdb` core works?
# undo_sigint()
# Lock.global_actor_in_debug = actor.uid
# Lock.release()
# task_can_release_tty_lock.set()
# using the "pause" semantics instead since
# that better covers actually somewhat "pausing the runtime"
# for this particular paralell task to do debugging B)
pause = partial(
_pause,
_set_trace,
)
pp = pause # short-hand for "pause point"
async def breakpoint(**kwargs):
log.warning(
'`tractor.breakpoint()` is deprecated!\n'
'Please use `tractor.pause()` instead!\n'
)
await pause(**kwargs)
def _post_mortem(
@ -801,7 +888,7 @@ def _post_mortem(
post_mortem = partial(
_breakpoint,
_pause,
_post_mortem,
)
@ -883,8 +970,7 @@ async def maybe_wait_for_debugger(
# will make the pdb repl unusable.
# Instead try to wait for pdb to be released before
# tearing down.
sub_in_debug = None
sub_in_debug: tuple[str, str] | None = None
for _ in range(poll_steps):
@ -904,13 +990,15 @@ async def maybe_wait_for_debugger(
debug_complete = Lock.no_remote_has_tty
if (
(debug_complete and
not debug_complete.is_set())
debug_complete
and sub_in_debug is not None
and not debug_complete.is_set()
):
log.debug(
log.pdb(
'Root has errored but pdb is in use by '
f'child {sub_in_debug}\n'
'Waiting on tty lock to release..')
'Waiting on tty lock to release..'
)
await debug_complete.wait()

View File

@ -35,7 +35,7 @@ from ._state import current_actor, _runtime_vars
@acm
async def get_arbiter(
async def get_registrar(
host: str,
port: int,
@ -56,11 +56,14 @@ async def get_arbiter(
# (likely a re-entrant call from the arbiter actor)
yield LocalPortal(actor, Channel((host, port)))
else:
async with _connect_chan(host, port) as chan:
async with (
_connect_chan(host, port) as chan,
open_portal(chan) as arb_portal,
):
yield arb_portal
async with open_portal(chan) as arb_portal:
yield arb_portal
get_arbiter = get_registrar
@acm
@ -101,7 +104,10 @@ async def query_actor(
# TODO: return portals to all available actors - for now just
# the last one that registered
if name == 'arbiter' and actor.is_arbiter:
if (
name == 'arbiter'
and actor.is_arbiter
):
raise RuntimeError("The current actor is the arbiter")
yield sockaddr if sockaddr else None
@ -112,7 +118,7 @@ async def find_actor(
name: str,
arbiter_sockaddr: tuple[str, int] | None = None
) -> AsyncGenerator[Optional[Portal], None]:
) -> AsyncGenerator[Portal | None, None]:
'''
Ask the arbiter to find actor(s) by name.
@ -120,17 +126,49 @@ async def find_actor(
known to the arbiter.
'''
async with query_actor(
name=name,
arbiter_sockaddr=arbiter_sockaddr,
) as sockaddr:
actor = current_actor()
async with get_arbiter(
*arbiter_sockaddr or actor._arb_addr
) as arb_portal:
sockaddr = await arb_portal.run_from_ns(
'self',
'find_actor',
name=name,
)
# TODO: return portals to all available actors - for now just
# the last one that registered
if (
name == 'arbiter'
and actor.is_arbiter
):
raise RuntimeError("The current actor is the arbiter")
if sockaddr:
async with _connect_chan(*sockaddr) as chan:
async with open_portal(chan) as portal:
yield portal
else:
yield None
try:
async with _connect_chan(*sockaddr) as chan:
async with open_portal(chan) as portal:
yield portal
return
# most likely we were unable to connect the
# transport and there is likely a stale entry in
# the registry actor's table, thus we need to
# instruct it to clear that stale entry and then
# more silently (pretend there was no reason but
# to) indicate that the target actor can't be
# contacted at that addr.
except OSError:
# NOTE: ensure we delete the stale entry from the
# registar actor.
uid: tuple[str, str] = await arb_portal.run_from_ns(
'self',
'delete_sockaddr',
sockaddr=sockaddr,
)
yield None
@acm

View File

@ -89,7 +89,7 @@ async def open_root_actor(
# https://github.com/python-trio/trio/issues/1155#issuecomment-742964018
builtin_bp_handler = sys.breakpointhook
orig_bp_path: str | None = os.environ.get('PYTHONBREAKPOINT', None)
os.environ['PYTHONBREAKPOINT'] = 'tractor._debug._set_trace'
os.environ['PYTHONBREAKPOINT'] = 'tractor._debug.pause_from_sync'
# attempt to retreive ``trio``'s sigint handler and stash it
# on our debugger lock state.
@ -235,9 +235,10 @@ async def open_root_actor(
BaseExceptionGroup,
) as err:
entered = await _debug._maybe_enter_pm(err)
if not entered and not is_multi_cancelled(err):
if (
not (await _debug._maybe_enter_pm(err))
and not is_multi_cancelled(err)
):
logger.exception("Root actor crashed:")
# always re-raise

View File

@ -40,6 +40,7 @@ from contextlib import ExitStack
import warnings
from async_generator import aclosing
from bidict import bidict
from exceptiongroup import BaseExceptionGroup
import trio # type: ignore
from trio_typing import TaskStatus
@ -95,6 +96,10 @@ async def _invoke(
treat_as_gen: bool = False
failed_resp: bool = False
if _state.debug_mode():
import greenback
await greenback.ensure_portal()
# possibly a traceback (not sure what typing is for this..)
tb = None
@ -448,17 +453,18 @@ class Actor:
(swappable) network protocols.
Each "actor" is ``trio.run()`` scheduled "runtime" composed of many
concurrent tasks in a single thread. The "runtime" tasks conduct
a slew of low(er) level functions to make it possible for message
passing between actors as well as the ability to create new actors
(aka new "runtimes" in new processes which are supervised via
a nursery construct). Each task which sends messages to a task in
a "peer" (not necessarily a parent-child, depth hierarchy)) is able
to do so via an "address", which maps IPC connections across memory
boundaries, and task request id which allows for per-actor
tasks to send and receive messages to specific peer-actor tasks with
which there is an ongoing RPC/IPC dialog.
Each "actor" is ``trio.run()`` scheduled "runtime" composed of
many concurrent tasks in a single thread. The "runtime" tasks
conduct a slew of low(er) level functions to make it possible
for message passing between actors as well as the ability to
create new actors (aka new "runtimes" in new processes which
are supervised via a nursery construct). Each task which sends
messages to a task in a "peer" (not necessarily a parent-child,
depth hierarchy) is able to do so via an "address", which maps
IPC connections across memory boundaries, and a task request id
which allows for per-actor tasks to send and receive messages
to specific peer-actor tasks with which there is an ongoing
RPC/IPC dialog.
'''
# ugh, we need to get rid of this and replace with a "registry" sys
@ -756,6 +762,7 @@ class Actor:
# deliver response to local caller/waiter
await self._push_result(chan, cid, msg)
log.runtime('Waiting on actor nursery to exit..')
await local_nursery.exited.wait()
if disconnected:
@ -810,7 +817,7 @@ class Actor:
db_cs
and not db_cs.cancel_called
):
log.warning(
log.critical(
f'STALE DEBUG LOCK DETECTED FOR {uid}'
)
# TODO: figure out why this breaks tests..
@ -1768,10 +1775,10 @@ class Arbiter(Actor):
def __init__(self, *args, **kwargs) -> None:
self._registry: dict[
self._registry: bidict[
tuple[str, str],
tuple[str, int],
] = {}
] = bidict({})
self._waiters: dict[
str,
# either an event to sync to receiving an actor uid (which
@ -1862,4 +1869,26 @@ class Arbiter(Actor):
) -> None:
uid = (str(uid[0]), str(uid[1]))
self._registry.pop(uid)
entry: tuple = self._registry.pop(uid, None)
if entry is None:
log.warning(f'Request to de-register {uid} failed?')
async def delete_sockaddr(
self,
sockaddr: tuple[str, int],
) -> tuple[str, str]:
uid: tuple | None = self._registry.inverse.pop(
sockaddr,
None,
)
if uid:
log.warning(
f'Deleting registry entry for {sockaddr}@{uid}!'
)
else:
log.warning(
f'No registry entry for {sockaddr}@{uid}!'
)
return uid

833
tractor/_shm.py 100644
View File

@ -0,0 +1,833 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
SC friendly shared memory management geared at real-time
processing.
Support for ``numpy`` compatible array-buffers is provided but is
considered optional within the context of this runtime-library.
"""
from __future__ import annotations
from sys import byteorder
import time
from typing import Optional
from multiprocessing import shared_memory as shm
from multiprocessing.shared_memory import (
SharedMemory,
ShareableList,
)
from msgspec import Struct
import tractor
from .log import get_logger
_USE_POSIX = getattr(shm, '_USE_POSIX', False)
if _USE_POSIX:
from _posixshmem import shm_unlink
try:
import numpy as np
from numpy.lib import recfunctions as rfn
import nptyping
except ImportError:
pass
log = get_logger(__name__)
def disable_mantracker():
'''
Disable all ``multiprocessing``` "resource tracking" machinery since
it's an absolute multi-threaded mess of non-SC madness.
'''
from multiprocessing import resource_tracker as mantracker
# Tell the "resource tracker" thing to fuck off.
class ManTracker(mantracker.ResourceTracker):
def register(self, name, rtype):
pass
def unregister(self, name, rtype):
pass
def ensure_running(self):
pass
# "know your land and know your prey"
# https://www.dailymotion.com/video/x6ozzco
mantracker._resource_tracker = ManTracker()
mantracker.register = mantracker._resource_tracker.register
mantracker.ensure_running = mantracker._resource_tracker.ensure_running
mantracker.unregister = mantracker._resource_tracker.unregister
mantracker.getfd = mantracker._resource_tracker.getfd
disable_mantracker()
class SharedInt:
'''
Wrapper around a single entry shared memory array which
holds an ``int`` value used as an index counter.
'''
def __init__(
self,
shm: SharedMemory,
) -> None:
self._shm = shm
@property
def value(self) -> int:
return int.from_bytes(self._shm.buf, byteorder)
@value.setter
def value(self, value) -> None:
self._shm.buf[:] = value.to_bytes(self._shm.size, byteorder)
def destroy(self) -> None:
if _USE_POSIX:
# We manually unlink to bypass all the "resource tracker"
# nonsense meant for non-SC systems.
name = self._shm.name
try:
shm_unlink(name)
except FileNotFoundError:
# might be a teardown race here?
log.warning(f'Shm for {name} already unlinked?')
class NDToken(Struct, frozen=True):
'''
Internal represenation of a shared memory ``numpy`` array "token"
which can be used to key and load a system (OS) wide shm entry
and correctly read the array by type signature.
This type is msg safe.
'''
shm_name: str # this servers as a "key" value
shm_first_index_name: str
shm_last_index_name: str
dtype_descr: tuple
size: int # in struct-array index / row terms
# TODO: use nptyping here on dtypes
@property
def dtype(self) -> list[tuple[str, str, tuple[int, ...]]]:
return np.dtype(
list(
map(tuple, self.dtype_descr)
)
).descr
def as_msg(self):
return self.to_dict()
@classmethod
def from_msg(cls, msg: dict) -> NDToken:
if isinstance(msg, NDToken):
return msg
# TODO: native struct decoding
# return _token_dec.decode(msg)
msg['dtype_descr'] = tuple(map(tuple, msg['dtype_descr']))
return NDToken(**msg)
# _token_dec = msgspec.msgpack.Decoder(NDToken)
# TODO: this api?
# _known_tokens = tractor.ActorVar('_shm_tokens', {})
# _known_tokens = tractor.ContextStack('_known_tokens', )
# _known_tokens = trio.RunVar('shms', {})
# TODO: this should maybe be provided via
# a `.trionics.maybe_open_context()` wrapper factory?
# process-local store of keys to tokens
_known_tokens: dict[str, NDToken] = {}
def get_shm_token(key: str) -> NDToken | None:
'''
Convenience func to check if a token
for the provided key is known by this process.
Returns either the ``numpy`` token or a string for a shared list.
'''
return _known_tokens.get(key)
def _make_token(
key: str,
size: int,
dtype: np.dtype,
) -> NDToken:
'''
Create a serializable token that can be used
to access a shared array.
'''
return NDToken(
shm_name=key,
shm_first_index_name=key + "_first",
shm_last_index_name=key + "_last",
dtype_descr=tuple(np.dtype(dtype).descr),
size=size,
)
class ShmArray:
'''
A shared memory ``numpy.ndarray`` API.
An underlying shared memory buffer is allocated based on
a user specified ``numpy.ndarray``. This fixed size array
can be read and written to by pushing data both onto the "front"
or "back" of a set index range. The indexes for the "first" and
"last" index are themselves stored in shared memory (accessed via
``SharedInt`` interfaces) values such that multiple processes can
interact with the same array using a synchronized-index.
'''
def __init__(
self,
shmarr: np.ndarray,
first: SharedInt,
last: SharedInt,
shm: SharedMemory,
# readonly: bool = True,
) -> None:
self._array = shmarr
# indexes for first and last indices corresponding
# to fille data
self._first = first
self._last = last
self._len = len(shmarr)
self._shm = shm
self._post_init: bool = False
# pushing data does not write the index (aka primary key)
self._write_fields: list[str] | None = None
dtype = shmarr.dtype
if dtype.fields:
self._write_fields = list(shmarr.dtype.fields.keys())[1:]
# TODO: ringbuf api?
@property
def _token(self) -> NDToken:
return NDToken(
shm_name=self._shm.name,
shm_first_index_name=self._first._shm.name,
shm_last_index_name=self._last._shm.name,
dtype_descr=tuple(self._array.dtype.descr),
size=self._len,
)
@property
def token(self) -> dict:
"""Shared memory token that can be serialized and used by
another process to attach to this array.
"""
return self._token.as_msg()
@property
def index(self) -> int:
return self._last.value % self._len
@property
def array(self) -> np.ndarray:
'''
Return an up-to-date ``np.ndarray`` view of the
so-far-written data to the underlying shm buffer.
'''
a = self._array[self._first.value:self._last.value]
# first, last = self._first.value, self._last.value
# a = self._array[first:last]
# TODO: eventually comment this once we've not seen it in the
# wild in a long time..
# XXX: race where first/last indexes cause a reader
# to load an empty array..
if len(a) == 0 and self._post_init:
raise RuntimeError('Empty array race condition hit!?')
# breakpoint()
return a
def ustruct(
self,
fields: Optional[list[str]] = None,
# type that all field values will be cast to
# in the returned view.
common_dtype: np.dtype = float,
) -> np.ndarray:
array = self._array
if fields:
selection = array[fields]
# fcount = len(fields)
else:
selection = array
# fcount = len(array.dtype.fields)
# XXX: manual ``.view()`` attempt that also doesn't work.
# uview = selection.view(
# dtype='<f16',
# ).reshape(-1, 4, order='A')
# assert len(selection) == len(uview)
u = rfn.structured_to_unstructured(
selection,
# dtype=float,
copy=True,
)
# unstruct = np.ndarray(u.shape, dtype=a.dtype, buffer=shm.buf)
# array[:] = a[:]
return u
# return ShmArray(
# shmarr=u,
# first=self._first,
# last=self._last,
# shm=self._shm
# )
def last(
self,
length: int = 1,
) -> np.ndarray:
'''
Return the last ``length``'s worth of ("row") entries from the
array.
'''
return self.array[-length:]
def push(
self,
data: np.ndarray,
field_map: Optional[dict[str, str]] = None,
prepend: bool = False,
update_first: bool = True,
start: int | None = None,
) -> int:
'''
Ring buffer like "push" to append data
into the buffer and return updated "last" index.
NB: no actual ring logic yet to give a "loop around" on overflow
condition, lel.
'''
length = len(data)
if prepend:
index = (start or self._first.value) - length
if index < 0:
raise ValueError(
f'Array size of {self._len} was overrun during prepend.\n'
f'You have passed {abs(index)} too many datums.'
)
else:
index = start if start is not None else self._last.value
end = index + length
if field_map:
src_names, dst_names = zip(*field_map.items())
else:
dst_names = src_names = self._write_fields
try:
self._array[
list(dst_names)
][index:end] = data[list(src_names)][:]
# NOTE: there was a race here between updating
# the first and last indices and when the next reader
# tries to access ``.array`` (which due to the index
# overlap will be empty). Pretty sure we've fixed it now
# but leaving this here as a reminder.
if (
prepend
and update_first
and length
):
assert index < self._first.value
if (
index < self._first.value
and update_first
):
assert prepend, 'prepend=True not passed but index decreased?'
self._first.value = index
elif not prepend:
self._last.value = end
self._post_init = True
return end
except ValueError as err:
if field_map:
raise
# should raise if diff detected
self.diff_err_fields(data)
raise err
def diff_err_fields(
self,
data: np.ndarray,
) -> None:
# reraise with any field discrepancy
our_fields, their_fields = (
set(self._array.dtype.fields),
set(data.dtype.fields),
)
only_in_ours = our_fields - their_fields
only_in_theirs = their_fields - our_fields
if only_in_ours:
raise TypeError(
f"Input array is missing field(s): {only_in_ours}"
)
elif only_in_theirs:
raise TypeError(
f"Input array has unknown field(s): {only_in_theirs}"
)
# TODO: support "silent" prepends that don't update ._first.value?
def prepend(
self,
data: np.ndarray,
) -> int:
end = self.push(data, prepend=True)
assert end
def close(self) -> None:
self._first._shm.close()
self._last._shm.close()
self._shm.close()
def destroy(self) -> None:
if _USE_POSIX:
# We manually unlink to bypass all the "resource tracker"
# nonsense meant for non-SC systems.
shm_unlink(self._shm.name)
self._first.destroy()
self._last.destroy()
def flush(self) -> None:
# TODO: flush to storage backend like markestore?
...
def open_shm_ndarray(
size: int,
key: str | None = None,
dtype: np.dtype | None = None,
append_start_index: int | None = None,
readonly: bool = False,
) -> ShmArray:
'''
Open a memory shared ``numpy`` using the standard library.
This call unlinks (aka permanently destroys) the buffer on teardown
and thus should be used from the parent-most accessor (process).
'''
# create new shared mem segment for which we
# have write permission
a = np.zeros(size, dtype=dtype)
a['index'] = np.arange(len(a))
shm = SharedMemory(
name=key,
create=True,
size=a.nbytes
)
array = np.ndarray(
a.shape,
dtype=a.dtype,
buffer=shm.buf
)
array[:] = a[:]
array.setflags(write=int(not readonly))
token = _make_token(
key=key,
size=size,
dtype=dtype,
)
# create single entry arrays for storing an first and last indices
first = SharedInt(
shm=SharedMemory(
name=token.shm_first_index_name,
create=True,
size=4, # std int
)
)
last = SharedInt(
shm=SharedMemory(
name=token.shm_last_index_name,
create=True,
size=4, # std int
)
)
# Start the "real-time" append-updated (or "pushed-to") section
# after some start index: ``append_start_index``. This allows appending
# from a start point in the array which isn't the 0 index and looks
# something like,
# -------------------------
# | | i
# _________________________
# <-------------> <------->
# history real-time
#
# Once fully "prepended", the history section will leave the
# ``ShmArray._start.value: int = 0`` and the yet-to-be written
# real-time section will start at ``ShmArray.index: int``.
# this sets the index to nearly 2/3rds into the the length of
# the buffer leaving at least a "days worth of second samples"
# for the real-time section.
if append_start_index is None:
append_start_index = round(size * 0.616)
last.value = first.value = append_start_index
shmarr = ShmArray(
array,
first,
last,
shm,
)
assert shmarr._token == token
_known_tokens[key] = shmarr.token
# "unlink" created shm on process teardown by
# pushing teardown calls onto actor context stack
stack = tractor.current_actor().lifetime_stack
stack.callback(shmarr.close)
stack.callback(shmarr.destroy)
return shmarr
def attach_shm_ndarray(
token: tuple[str, str, tuple[str, str]],
readonly: bool = True,
) -> ShmArray:
'''
Attach to an existing shared memory array previously
created by another process using ``open_shared_array``.
No new shared mem is allocated but wrapper types for read/write
access are constructed.
'''
token = NDToken.from_msg(token)
key = token.shm_name
if key in _known_tokens:
assert NDToken.from_msg(_known_tokens[key]) == token, "WTF"
# XXX: ugh, looks like due to the ``shm_open()`` C api we can't
# actually place files in a subdir, see discussion here:
# https://stackoverflow.com/a/11103289
# attach to array buffer and view as per dtype
_err: Optional[Exception] = None
for _ in range(3):
try:
shm = SharedMemory(
name=key,
create=False,
)
break
except OSError as oserr:
_err = oserr
time.sleep(0.1)
else:
if _err:
raise _err
shmarr = np.ndarray(
(token.size,),
dtype=token.dtype,
buffer=shm.buf
)
shmarr.setflags(write=int(not readonly))
first = SharedInt(
shm=SharedMemory(
name=token.shm_first_index_name,
create=False,
size=4, # std int
),
)
last = SharedInt(
shm=SharedMemory(
name=token.shm_last_index_name,
create=False,
size=4, # std int
),
)
# make sure we can read
first.value
sha = ShmArray(
shmarr,
first,
last,
shm,
)
# read test
sha.array
# Stash key -> token knowledge for future queries
# via `maybe_opepn_shm_array()` but only after we know
# we can attach.
if key not in _known_tokens:
_known_tokens[key] = token
# "close" attached shm on actor teardown
tractor.current_actor().lifetime_stack.callback(sha.close)
return sha
def maybe_open_shm_ndarray(
key: str, # unique identifier for segment
size: int,
dtype: np.dtype | None = None,
append_start_index: int = 0,
readonly: bool = True,
) -> tuple[ShmArray, bool]:
'''
Attempt to attach to a shared memory block using a "key" lookup
to registered blocks in the users overall "system" registry
(presumes you don't have the block's explicit token).
This function is meant to solve the problem of discovering whether
a shared array token has been allocated or discovered by the actor
running in **this** process. Systems where multiple actors may seek
to access a common block can use this function to attempt to acquire
a token as discovered by the actors who have previously stored
a "key" -> ``NDToken`` map in an actor local (aka python global)
variable.
If you know the explicit ``NDToken`` for your memory segment instead
use ``attach_shm_array``.
'''
try:
# see if we already know this key
token = _known_tokens[key]
return (
attach_shm_ndarray(
token=token,
readonly=readonly,
),
False, # not newly opened
)
except KeyError:
log.warning(f"Could not find {key} in shms cache")
if dtype:
token = _make_token(
key,
size=size,
dtype=dtype,
)
else:
try:
return (
attach_shm_ndarray(
token=token,
readonly=readonly,
),
False,
)
except FileNotFoundError:
log.warning(f"Could not attach to shm with token {token}")
# This actor does not know about memory
# associated with the provided "key".
# Attempt to open a block and expect
# to fail if a block has been allocated
# on the OS by someone else.
return (
open_shm_ndarray(
key=key,
size=size,
dtype=dtype,
append_start_index=append_start_index,
readonly=readonly,
),
True,
)
class ShmList(ShareableList):
'''
Carbon copy of ``.shared_memory.ShareableList`` with a few
enhancements:
- readonly mode via instance var flag `._readonly: bool`
- ``.__getitem__()`` accepts ``slice`` inputs
- exposes the underlying buffer "name" as a ``.key: str``
'''
def __init__(
self,
sequence: list | None = None,
*,
name: str | None = None,
readonly: bool = True
) -> None:
self._readonly = readonly
self._key = name
return super().__init__(
sequence=sequence,
name=name,
)
@property
def key(self) -> str:
return self._key
@property
def readonly(self) -> bool:
return self._readonly
def __setitem__(
self,
position,
value,
) -> None:
# mimick ``numpy`` error
if self._readonly:
raise ValueError('assignment destination is read-only')
return super().__setitem__(position, value)
def __getitem__(
self,
indexish,
) -> list:
# NOTE: this is a non-writeable view (copy?) of the buffer
# in a new list instance.
if isinstance(indexish, slice):
return list(self)[indexish]
return super().__getitem__(indexish)
# TODO: should we offer a `.array` and `.push()` equivalent
# to the `ShmArray`?
# currently we have the following limitations:
# - can't write slices of input using traditional slice-assign
# syntax due to the ``ShareableList.__setitem__()`` implementation.
# - ``list(shmlist)`` returns a non-mutable copy instead of
# a writeable view which would be handier numpy-style ops.
def open_shm_list(
key: str,
sequence: list | None = None,
size: int = int(2 ** 10),
dtype: float | int | bool | str | bytes | None = float,
readonly: bool = True,
) -> ShmList:
if sequence is None:
default = {
float: 0.,
int: 0,
bool: True,
str: 'doggy',
None: None,
}[dtype]
sequence = [default] * size
shml = ShmList(
sequence=sequence,
name=key,
readonly=readonly,
)
# "close" attached shm on actor teardown
try:
actor = tractor.current_actor()
actor.lifetime_stack.callback(shml.shm.close)
actor.lifetime_stack.callback(shml.shm.unlink)
except RuntimeError:
log.warning('tractor runtime not active, skipping teardown steps')
return shml
def attach_shm_list(
key: str,
readonly: bool = False,
) -> ShmList:
return ShmList(
name=key,
readonly=readonly,
)

View File

@ -193,15 +193,39 @@ def get_logger(
'''
log = rlog = logging.getLogger(_root_name)
if name and name != _proj_name:
if (
name
and name != _proj_name
):
# handling for modules that use ``get_logger(__name__)`` to
# avoid duplicate project-package token in msg output
rname, _, tail = name.partition('.')
if rname == _root_name:
name = tail
# NOTE: for handling for modules that use ``get_logger(__name__)``
# we make the following stylistic choice:
# - always avoid duplicate project-package token
# in msg output: i.e. tractor.tractor _ipc.py in header
# looks ridiculous XD
# - never show the leaf module name in the {name} part
# since in python the {filename} is always this same
# module-file.
sub_name: None | str = None
rname, _, sub_name = name.partition('.')
pkgpath, _, modfilename = sub_name.rpartition('.')
# NOTE: for tractor itself never include the last level
# module key in the name such that something like: eg.
# 'tractor.trionics._broadcast` only includes the first
# 2 tokens in the (coloured) name part.
if rname == 'tractor':
sub_name = pkgpath
if _root_name in sub_name:
duplicate, _, sub_name = sub_name.partition('.')
if not sub_name:
log = rlog
else:
log = rlog.getChild(sub_name)
log = rlog.getChild(name)
log.level = rlog.level
# add our actor-task aware adapter which will dynamically look up
@ -254,3 +278,7 @@ def get_console_log(
def get_loglevel() -> str:
return _default_loglevel
# global module logger for tractor itself
log = get_logger('tractor')

View File

@ -43,38 +43,62 @@ Built-in messaging patterns, types, APIs and helpers.
# - https://github.com/msgpack/msgpack-python#packingunpacking-of-custom-data-type
from __future__ import annotations
from inspect import isfunction
from pkgutil import resolve_name
class NamespacePath(str):
'''
A serializeable description of a (function) Python object location
described by the target's module path and namespace key meant as
a message-native "packet" to allows actors to point-and-load objects
by absolute reference.
A serializeable description of a (function) Python object
location described by the target's module path and namespace
key meant as a message-native "packet" to allows actors to
point-and-load objects by an absolute ``str`` (and thus
serializable) reference.
'''
_ref: object = None
_ref: object | type | None = None
def load_ref(self) -> object:
def load_ref(self) -> object | type:
if self._ref is None:
self._ref = resolve_name(self)
return self._ref
def to_tuple(
self,
@staticmethod
def _mk_fqnp(ref: type | object) -> tuple[str, str]:
'''
Generate a minial ``str`` pair which describes a python
object's namespace path and object/type name.
) -> tuple[str, str]:
ref = self.load_ref()
return ref.__module__, getattr(ref, '__name__', '')
In more precise terms something like:
- 'py.namespace.path:object_name',
- eg.'tractor.msg:NamespacePath' will be the ``str`` form
of THIS type XD
'''
if (
isinstance(ref, object)
and not isfunction(ref)
):
name: str = type(ref).__name__
else:
name: str = getattr(ref, '__name__')
# fully qualified namespace path, tuple.
fqnp: tuple[str, str] = (
ref.__module__,
name,
)
return fqnp
@classmethod
def from_ref(
cls,
ref,
ref: type | object,
) -> NamespacePath:
return cls(':'.join(
(ref.__module__,
getattr(ref, '__name__', ''))
))
fqnp: tuple[str, str] = cls._mk_fqnp(ref)
return cls(':'.join(fqnp))
def to_tuple(self) -> tuple[str, str]:
return self._mk_fqnp(self.load_ref())

View File

@ -28,7 +28,6 @@ from typing import (
Callable,
AsyncIterator,
Awaitable,
Optional,
)
import trio
@ -65,9 +64,9 @@ class LinkedTaskChannel(trio.abc.Channel):
_trio_exited: bool = False
# set after ``asyncio.create_task()``
_aio_task: Optional[asyncio.Task] = None
_aio_err: Optional[BaseException] = None
_broadcaster: Optional[BroadcastReceiver] = None
_aio_task: asyncio.Task | None = None
_aio_err: BaseException | None = None
_broadcaster: BroadcastReceiver | None = None
async def aclose(self) -> None:
await self._from_aio.aclose()
@ -188,7 +187,7 @@ def _run_asyncio_task(
cancel_scope = trio.CancelScope()
aio_task_complete = trio.Event()
aio_err: Optional[BaseException] = None
aio_err: BaseException | None = None
chan = LinkedTaskChannel(
aio_q, # asyncio.Queue
@ -217,7 +216,7 @@ def _run_asyncio_task(
try:
result = await coro
except BaseException as aio_err:
log.exception('asyncio task errored')
# log.exception('asyncio task errored:')
chan._aio_err = aio_err
raise
@ -263,7 +262,7 @@ def _run_asyncio_task(
'''
nonlocal chan
aio_err = chan._aio_err
task_err: Optional[BaseException] = None
task_err: BaseException | None = None
# only to avoid ``asyncio`` complaining about uncaptured
# task exceptions
@ -301,7 +300,7 @@ def _run_asyncio_task(
elif task_err is None:
assert aio_err
aio_err.with_traceback(aio_err.__traceback__)
log.error('infected task errorred')
# log.error('infected task errorred')
# XXX: alway cancel the scope on error
# in case the trio task is blocking
@ -329,11 +328,11 @@ async def translate_aio_errors(
'''
trio_task = trio.lowlevel.current_task()
aio_err: Optional[BaseException] = None
aio_err: BaseException | None = None
# TODO: make thisi a channel method?
def maybe_raise_aio_err(
err: Optional[Exception] = None
err: Exception | None = None
) -> None:
aio_err = chan._aio_err
if (
@ -357,7 +356,7 @@ async def translate_aio_errors(
# relay cancel through to called ``asyncio`` task
assert chan._aio_task
chan._aio_task.cancel(
msg=f'the `trio` caller task was cancelled: {trio_task.name}'
msg=f'`trio`-side caller task cancelled: {trio_task.name}'
)
raise
@ -367,7 +366,7 @@ async def translate_aio_errors(
trio.ClosedResourceError,
# trio.BrokenResourceError,
):
aio_err = chan._aio_err
aio_err: BaseException = chan._aio_err
if (
task.cancelled() and
type(aio_err) is CancelledError