@ -0,0 +1,973 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Variant - 1 " main-thread forkserver " spawn backend ( today ' s
working impl ) + the generic fork - from - main - interp - worker - thread
primitives it ' s built on.
Spawn - method key : ` ' main_thread_forkserver ' ` . The legacy
` ' subint_forkserver ' ` key currently aliases here too — see
` tractor . spawn . _subint_forkserver ` for the future variant - 2
( subint - isolated - child runtime , gated on
[ jcrist / msgspec #1026](https://github.com/jcrist/msgspec/issues/1026))
that key is reserved for .
Background
- - - - - - - - - -
Two empirical CPython properties drive the design :
1. * * ` os . fork ( ) ` from a non - main sub - interpreter is refused by
CPython . * * ` PyOS_AfterFork_Child ( ) ` →
` _PyInterpreterState_DeleteExceptMain ( ) ` gates on the calling
thread ' s tstate belonging to the main interpreter and aborts
the forked child otherwise ( ` Fatal Python error : not main
interpreter ` ) . Full source - level walkthrough :
` ai / conc - anal / subint_fork_blocked_by_cpython_post_fork_issue . md ` .
2. * * ` os . fork ( ) ` from a regular ` threading . Thread ` attached to
the * main * interpreter — i . e . a worker thread that has never
entered a subint — works cleanly . * * Empirically validated
across four scenarios by
` ai / conc - anal / subint_fork_from_main_thread_smoketest . py ` on
py3 .14 .
The fork - from - main - thread primitives below codify property ( 2 )
into a reusable surface : spawn a worker thread , fork in it ,
retrieve the child pid back to the caller trio task , and offer a
` trio . Process ` - shaped shim around the raw pid so the existing
` soft_kill ` / ` hard_reap ` patterns from ` _spawn . py ` keep working
unchanged .
Design rationale — why a forkserver , and why in - process
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Two design questions worth pinning down up front , since the
naming intentionally evokes the stdlib ` multiprocessing . forkserver `
for comparison :
* * ( 1 ) Why a forkserver pattern at all , vs . forking directly
from the trio task ? * *
` os . fork ( ) ` is fundamentally hostile to trio : trio owns
file descriptors , signal - wakeup - fds , threadpools , and an
event loop with non - trivial post - fork lifecycle invariants
( see python - trio / trio #1614 et al.). Forking a trio-running
thread duplicates all that state into the child , which then
either needs surgical reset ( fragile ) or has to immediately
` exec ( ) ` ( defeats the point of fork - without - exec ) . The
* forkserver * sidesteps this by isolating the ` os . fork ( ) `
call in a worker that has provably never entered trio — so
the child inherits a clean , trio - free image .
* * ( 2 ) Why an in - process forkserver , vs . stdlib
` multiprocessing . forkserver ` ? * *
The stdlib design solves the same " fork from clean state "
problem by spinning up a * * separate sidecar process * * at
first use of ` mp . set_start_method ( ' forkserver ' ) ` . The parent
then IPC ' s each spawn request to that sidecar over a unix
socket ; the sidecar is the process that actually calls
` os . fork ( ) ` . This works but pays for cleanliness with three
costs :
- * * Sidecar lifecycle * * : a second long - lived process per
parent , with its own start / stop / health - check semantics .
- * * IPC overhead per spawn * * : every actor - spawn round - trips
an ` mp ` request message through a unix socket before any
child code runs .
- * * State isolation by process boundary * * : the sidecar can ' t
share parent state at all — every spawn is a " cold " child
re - importing modules from disk .
Once the variant - 2 ( subint - isolated child runtime ) lands the
in - process forkserver collapses all three costs :
- no sidecar — the forkserver is just another thread ,
- spawn signal is a thread - local event / condition , not IPC ,
- child inherits the warm parent state ( loaded modules ,
populated caches , etc . ) for free .
For the full variant - 2 picture see
` tractor . spawn . _subint_forkserver ` ' s docstring. Today (variant
1 ) we already get costs 1 + 2 collapsed ; cost 3 will land
when msgspec #1026 unblocks isolated-mode subints.
What survives the fork ? — POSIX semantics
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
A natural worry when forking from a parent that ' s running
` trio . run ( ) ` on another thread : does that trio thread ( and
any other threads in the parent ) keep running in the child ?
* * No . * * POSIX ` fork ( ) ` only preserves the * calling * thread
in the child . Every other thread in the parent — trio ' s
runner thread , any ` to_thread ` cache threads , anything else
— is gone the instant ` fork ( ) ` returns in the child .
Concretely , after the forkserver worker calls ` os . fork ( ) ` :
| thread | parent | child |
| - - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - | - - - - - - - - - - - - - - - |
| forkserver worker | continues | sole survivor |
| ` trio . run ( ) ` thread | continues | gone |
| any other thread | continues | gone |
The forkserver worker becomes the new " main " execution
context in the child ; ` trio . run ( ) ` and every other parent
thread never executes a single instruction post - fork in the
child .
This is exactly * why * ` os . fork ( ) ` is delegated to a
dedicated worker thread that has provably never entered
trio : we want that trio - free thread to be the surviving
one in the child .
That said , dead - thread * artifacts * still cross the fork
boundary ( canonical " fork in a multithreaded program is
dangerous " — see `man pthread_atfork`). What persists, and
how we handle each :
- * * Inherited file descriptors * * — the dead trio thread ' s
epoll fd , signal - wakeup - fd , eventfds , sockets , IPC
pipes , pytest ' s capture-fds, etc. are all still in the
child ' s fd table (kernel-level inheritance). Handled by
` _close_inherited_fds ( ) ` in the child prelude — walks
` / proc / self / fd ` and closes everything except stdio +
the channel pipe to the forkserver .
- * * Memory image * * — trio ' s internal data structures
( scheduler , task queues , runner state ) sit in COW
memory but nobody ' s executing them. Get GC ' d /
overwritten when the child ' s fresh `trio.run()` boots.
- * * Python thread state * * — handled automatically by
CPython . ` PyOS_AfterFork_Child ( ) ` calls
` _PyThreadState_DeleteExceptCurrent ( ) ` , so dead
` PyThreadState ` objects are cleaned and
` threading . enumerate ( ) ` returns just the surviving
thread .
- * * User - level locks ( ` threading . Lock ` ) * * —
held - by - dead - thread state is the canonical fork hazard .
Not an issue in practice for tractor : trio doesn ' t hold
cross - thread locks across fork ( its synchronization is
within the trio task system , which doesn ' t survive in
either direction ) . CPython ' s GIL is auto-reset by the
fork callback .
FYI : how this dodges the ` trio . run ( ) ` × ` fork ( ) ` hazards
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
` os . fork ( ) ` is famously hostile to ` trio ` ( see
python - trio / trio #1614 et al.) because trio owns several
classes of process - global state that all break across the
fork boundary in different ways . The forkserver - thread
design dodges each class explicitly :
- * * Signal - wakeup - fd * * : trio installs a wakeup - fd via
` signal . set_wakeup_fd ( ) ` on ` trio . run ( ) ` startup so
signals can interrupt ` epoll_wait ` . The child inherits
this fd , but trio ' s runner that owns it is gone — so
any signal delivery in the child writes to a dead
reader . * Dodge * : the inherited wakeup - fd is closed by
` _close_inherited_fds ( ) ` , then the child ' s own
` trio . run ( ) ` installs a fresh one .
- * * ` epoll ` / ` kqueue ` instance * * : trio ' s I/O backend holds
one . Inherited as a dead fd ; same fix as above .
- * * Threadpool cache threads * * ( ` trio . to_thread ` ) : worker
threads with cached tstate . Don ' t exist in the child
( POSIX ) ; cache state is meaningless garbage that gets
reset when the child ' s trio.run() initializes its own
thread cache .
- * * Cancel scopes / nurseries / open ` trio . Process ` /
open sockets * * : these are trio - runtime objects , not
kernel objects . The runtime that owns them is gone in
the child , so the Python objects exist as zombie data
in COW memory and get overwritten as the child runs .
Inherited * kernel * fds those objects wrapped ( sockets ,
proc pipes ) are caught by ` _close_inherited_fds ( ) ` .
- * * ` atexit ` handlers * * : trio doesn ' t register any that
would mis - fire post - fork ; trio ' s lifetime-stack is
all ` with ` - block - scoped and dies with the runner .
- * * Foreign - language I / O state * * ( libcurl , OpenSSL session
caches , etc . ) : out of scope — same hazard as any
fork - without - exec ; users layering those on top of
tractor need their own pthread_atfork handlers .
Net effect : for the runtime surface tractor controls
( trio + IPC layer + msgspec ) , the forkserver - thread
isolation + ` _close_inherited_fds ( ) ` cleanup gives the
forked child a clean trio environment . Everything else
falls under the standard fork - without - exec disclaimer .
Implementation status
- - - - - - - - - - - - - - - - - - - - -
- A dedicated main - interp worker thread owns all ` os . fork ( ) `
calls ( never enters a subint ) . ✓ landed .
- Parent actor ' s `trio.run()` lives **on the main interp**
for now ( not a subint yet ) . The subint - hosted root
runtime is the variant - 2 step gated on jcrist / msgspec #1026.
- Spawn - request signal : trio task ` → to_thread . run_sync ` to
the forkserver - worker thread . ✓ landed .
- Forked child : runs ` _actor_child_main ` against a normal
trio runtime . ✓ landed .
Validated by ` tests / spawn / test_subint_forkserver . py ` ( file
will be renamed to ` test_main_thread_forkserver . py ` in a
follow - up ) including the
` test_subint_forkserver_spawn_basic ` backend - tier check .
Still - open work ( tracked on tractor #379):
- no cancellation / hard - kill stress coverage yet
( counterpart to ` tests / test_subint_cancellation . py ` for
the plain ` subint ` backend ) ,
- ` child_sigint = ' trio ' ` mode ( flag scaffolded below ; default
is ` ' ipc ' ` ) . Originally intended as a manual SIGINT →
trio - cancel bridge , but investigation showed trio ' s
handler IS already correctly installed in the fork - child
subactor — the orphan - SIGINT hang is actually a separate
bug where trio ' s event loop stays wedged in `epoll_wait`
despite delivery . See
` ai / conc - anal / subint_forkserver_orphan_sigint_hang_issue . md `
for the full trace + fix directions . Once that root cause
is fixed , this flag may end up a no - op / doc - only mode .
TODO — cleanup gated on msgspec PEP 684 support
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Both worker - thread primitives below allocate a dedicated
` threading . Thread ` rather than using
` trio . to_thread . run_sync ( ) ` . That ' s a cautious design
rooted in three distinct - but - entangled issues ( GIL
starvation from legacy - config subints , tstate - recycling
destroy race on trio cache threads , fork - from - main - tstate
invariant ) . Some of those dissolve under PEP 684
isolated - mode subints ; one requires empirical re - testing
to know .
Full analysis + audit plan in
` ai / conc - anal / subint_forkserver_thread_constraints_on_pep684_issue . md ` ,
tracked at #450; gated on jcrist/msgspec#1026.
What lives here
- - - - - - - - - - - - - - -
Truly generic primitives ( tractor - spawn - backend - agnostic ) :
- ` _close_inherited_fds ( ) ` — fd hygiene primitive
- ` _format_child_exit ( ) ` — ` waitpid ( ) ` status renderer
- ` wait_child ( ) ` — synchronous waitpid wrapper
- ` fork_from_worker_thread ( ) ` — the core fork primitive
- ` _ForkedProc ` — trio - cancellable child - wait shim
The variant - 1 spawn - backend coroutine on top :
- ` main_thread_forkserver_proc ( ) ` — SpawnSpec handshake , IPC
wiring , lifecycle . Registered as the
` ' main_thread_forkserver ' ` ( and currently the legacy
` ' subint_forkserver ' ` - aliased ) entry in
` tractor . spawn . _spawn . _methods ` .
See also
- - - - - - - -
- ` tractor . spawn . _subint_forkserver ` — variant - 2 placeholder
module ; reserved for the future subint - isolated - child
runtime once jcrist / msgspec #1026 unblocks.
- ` tractor . spawn . _subint_fork ` — the stub for the
fork - from - non - main - subint strategy that DIDN ' T work (kept
in - tree as documentation of the attempt + the CPython - level
block ) .
- ` ai / conc - anal / subint_fork_blocked_by_cpython_post_fork_issue . md `
— CPython source walkthrough of why fork - from - subint is dead .
- ` ai / conc - anal / subint_fork_from_main_thread_smoketest . py `
— standalone feasibility check ( delegates to this module
for the primitives it exercises ) .
'''
from __future__ import annotations
import os
import signal
import sys
import threading
from functools import partial
from typing import (
Any ,
Callable ,
Literal ,
TYPE_CHECKING ,
)
import trio
from trio import TaskStatus
from tractor . log import get_logger
from tractor . msg import (
types as msgtypes ,
pretty_struct ,
)
from tractor . runtime . _state import current_actor
from tractor . runtime . _portal import Portal
from . _spawn import (
cancel_on_completion ,
soft_kill ,
)
from . _subint import _has_subints
if TYPE_CHECKING :
from tractor . discovery . _addr import UnwrappedAddress
from tractor . ipc import (
_server ,
)
from tractor . runtime . _runtime import Actor
from tractor . runtime . _supervise import ActorNursery
log = get_logger ( ' tractor ' )
# Configurable child-side SIGINT handling for forkserver-spawned
# subactors. Threaded through `main_thread_forkserver_proc`'s
# `proc_kwargs` under the `'child_sigint'` key.
#
# - `'ipc'` (default, currently the only implemented mode):
# child has NO trio-level SIGINT handler — trio.run() is on
# the fork-inherited non-main thread, `signal.set_wakeup_fd()`
# is main-thread-only. Cancellation flows exclusively via
# the parent's `Portal.cancel_actor()` IPC path. Safe +
# deterministic for nursery-structured apps where the parent
# is always the cancel authority. Known gap: orphan
# (post-parent-SIGKILL) children don't respond to SIGINT
# — see `test_orphaned_subactor_sigint_cleanup_DRAFT`.
#
# - `'trio'` (**not yet implemented**): install a manual
# SIGINT → trio-cancel bridge in the child's fork prelude
# (pre-`trio.run()`) so external Ctrl-C reaches stuck
# grandchildren even with a dead parent. Adds signal-
# handling surface the `'ipc'` default cleanly avoids; only
# pay for it when externally-interruptible children actually
# matter (e.g. CLI tool grandchildren).
ChildSigintMode = Literal [ ' ipc ' , ' trio ' ]
_DEFAULT_CHILD_SIGINT : ChildSigintMode = ' ipc '
def _close_inherited_fds (
keep : frozenset [ int ] = frozenset ( { 0 , 1 , 2 } ) ,
) - > int :
'''
Close every open file descriptor in the current process
EXCEPT those in ` keep ` ( default : stdio only ) .
Intended as the first thing a post - ` os . fork ( ) ` child runs
after closing any communication pipes it knows about . This
is the fork - child FD hygiene discipline that
` subprocess . Popen ( close_fds = True ) ` applies by default for
its exec - based children , but which we have to implement
ourselves because our ` fork_from_worker_thread ( ) ` primitive
deliberately does NOT exec .
Why it matters
- - - - - - - - - - - - - -
Without this , a forkserver - spawned subactor inherits the
parent actor ' s IPC listener sockets, trio-epoll fd, trio
wakeup - pipe , peer - channel sockets , etc . If that subactor
then itself forkserver - spawns a grandchild , the grandchild
inherits the FDs transitively from * both * its direct
parent AND the root actor — IPC message routing becomes
ambiguous and the cancel cascade deadlocks . See
` ai / conc - anal / subint_forkserver_test_cancellation_leak_issue . md `
for the full diagnosis + the empirical repro .
Fresh children will open their own IPC sockets via
` _actor_child_main ( ) ` , so they don ' t need any of the
parent ' s FDs.
Returns the count of fds that were successfully closed —
useful for sanity - check logging at callsites .
'''
# Enumerate open fds via `/proc/self/fd` on Linux (the fast +
# precise path); fall back to `RLIMIT_NOFILE` range close on
# other platforms. Matches stdlib
# `subprocess._posixsubprocess.close_fds` strategy.
try :
fd_names : list [ str ] = os . listdir ( ' /proc/self/fd ' )
candidates : list [ int ] = [
int ( n ) for n in fd_names if n . isdigit ( )
]
except (
FileNotFoundError ,
PermissionError ,
) :
import resource
soft , _ = resource . getrlimit ( resource . RLIMIT_NOFILE )
candidates = list ( range ( 3 , soft ) )
closed : int = 0
for fd in candidates :
if fd in keep :
continue
try :
os . close ( fd )
closed + = 1
except OSError :
# fd was already closed (race with listdir) or otherwise
# unclosable — either is fine.
log . exception (
f ' Failed to close inherited fd in child ?? \n '
f ' { fd !r} \n '
)
return closed
def _format_child_exit (
status : int ,
) - > str :
'''
Render ` os . waitpid ( ) ` - returned status as a short human
string ( ` ' rc=0 ' ` / ` ' signal=SIGABRT ' ` / etc . ) for log
output .
'''
if os . WIFEXITED ( status ) :
return f ' rc= { os . WEXITSTATUS ( status ) } '
elif os . WIFSIGNALED ( status ) :
sig : int = os . WTERMSIG ( status )
return f ' signal= { signal . Signals ( sig ) . name } '
else :
return f ' raw_status= { status } '
def wait_child (
pid : int ,
* ,
expect_exit_ok : bool = True ,
) - > tuple [ bool , str ] :
'''
` os . waitpid ( ) ` + classify the child ' s exit as
expected - or - not .
` expect_exit_ok = True ` → expect clean ` rc = 0 ` . ` False ` →
expect abnormal death ( any signal or nonzero rc ) . Used
by the control - case smoke - test scenario where CPython
is meant to abort the child .
Returns ` ( ok , status_str ) ` — ` ok ` reflects whether the
observed outcome matches ` expect_exit_ok ` , ` status_str `
is a short render of the actual status .
'''
_ , status = os . waitpid ( pid , 0 )
exited_normally : bool = (
os . WIFEXITED ( status )
and
os . WEXITSTATUS ( status ) == 0
)
ok : bool = (
exited_normally
if expect_exit_ok
else not exited_normally
)
return ok , _format_child_exit ( status )
def fork_from_worker_thread (
child_target : Callable [ [ ] , int ] | None = None ,
* ,
thread_name : str = ' main-thread-fork ' ,
join_timeout : float = 10.0 ,
) - > int :
'''
` os . fork ( ) ` from a main - interp worker thread ; return the
forked child ' s pid.
The calling context * * must * * be the main interpreter
( not a subinterpreter ) — that ' s the whole point of this
primitive . A regular ` threading . Thread ( target = . . . ) `
spawned from main - interp code satisfies this
automatically because Python attaches the thread ' s
tstate to the * calling * interpreter , and our main
thread ' s calling interp is always main.
If ` child_target ` is provided , it runs IN the forked
child process before ` os . _exit ` is called . The callable
should return an int used as the child ' s exit rc. If
` child_target ` is None , the child ` _exit ( 0 ) ` s immediately
( useful for the baseline sanity case ) .
On the PARENT side , this function drives the worker
thread to completion ( ` fork ( ) ` returns near - instantly ;
the thread is expected to exit promptly ) and then
returns the forked child ' s pid. Raises `RuntimeError`
if the worker thread fails to return within
` join_timeout ` seconds — that ' d be an unexpected CPython
pathology .
'''
# Use a pipe to shuttle the forked child's pid from the
# worker thread back to the caller.
rfd , wfd = os . pipe ( )
def _worker ( ) - > None :
'''
Runs on the forkserver worker thread . Forks ; child
runs ` child_target ` ( if any ) and exits ; parent side
writes the child pid to the pipe so the main - thread
caller can retrieve it .
'''
pid : int = os . fork ( )
if pid == 0 :
# CHILD: close the pid-pipe ends (we don't use
# them here), then scrub ALL other inherited FDs
# so the child starts with a clean slate
# (stdio-only). Critical for multi-level spawn
# trees — see `_close_inherited_fds()` docstring.
os . close ( rfd )
os . close ( wfd )
_close_inherited_fds ( )
rc : int = 0
if child_target is not None :
try :
rc = child_target ( ) or 0
except BaseException as err :
log . error (
f ' main-thread-fork child_target '
f ' raised: \n '
f ' |_ { type ( err ) . __name__ } : { err } '
)
rc = 2
os . _exit ( rc )
else :
# PARENT (still inside the worker thread):
# hand the child pid back to main via pipe.
os . write ( wfd , pid . to_bytes ( 8 , ' little ' ) )
worker : threading . Thread = threading . Thread (
target = _worker ,
name = thread_name ,
daemon = False ,
)
worker . start ( )
worker . join ( timeout = join_timeout )
if worker . is_alive ( ) :
# Pipe cleanup best-effort before bail.
try :
os . close ( rfd )
except OSError :
log . exception (
f ' Failed to close PID-pipe read-fd in parent ?? \n '
f ' { rfd !r} \n '
)
try :
os . close ( wfd )
except OSError :
log . exception (
f ' Failed to close PID-pipe write-fd in parent ?? \n '
f ' { wfd !r} \n '
)
raise RuntimeError (
f ' main-thread-fork worker thread '
f ' { thread_name !r} did not return within '
f ' { join_timeout } s — this is unexpected since '
f ' `os.fork()` should return near-instantly on '
f ' the parent side. '
)
pid_bytes : bytes = os . read ( rfd , 8 )
os . close ( rfd )
os . close ( wfd )
pid : int = int . from_bytes ( pid_bytes , ' little ' )
log . runtime (
f ' main-thread-fork forked child \n '
f ' (> \n '
f ' |_pid= { pid } \n '
)
return pid
class _ForkedProc :
'''
Thin ` trio . Process ` - compatible shim around a raw OS pid
returned by ` fork_from_worker_thread ( ) ` , exposing just
enough surface for the ` soft_kill ( ) ` / hard - reap pattern
borrowed from ` trio_proc ( ) ` .
Unlike ` trio . Process ` , we have no direct handles on the
child ' s std-streams (fork-without-exec inherits the
parent ' s FDs, but we don ' t marshal them into this
wrapper ) — ` . stdin ` / ` . stdout ` / ` . stderr ` are all ` None ` ,
which matches what ` soft_kill ( ) ` handles via its
` is not None ` guards .
'''
def __init__ ( self , pid : int ) :
self . pid : int = pid
self . _returncode : int | None = None
# `soft_kill`/`hard_kill` check these for pipe
# teardown — all None since we didn't wire up pipes
# on the fork-without-exec path.
self . stdin = None
self . stdout = None
self . stderr = None
# pidfd (Linux 5.3+, Python 3.9+) — a file descriptor
# referencing this child process which becomes readable
# once the child exits. Enables a fully trio-cancellable
# wait via `trio.lowlevel.wait_readable()` — same
# pattern `trio.Process.wait()` uses under the hood, and
# the same pattern `multiprocessing.Process.sentinel`
# uses for `tractor.spawn._spawn.proc_waiter()`. Without
# this, waiting via `trio.to_thread.run_sync(os.waitpid,
# ...)` blocks a cache thread on a sync syscall that is
# NOT trio-cancellable, which prevents outer cancel
# scopes from unwedging a stuck-child cancel cascade.
self . _pidfd : int = os . pidfd_open ( pid )
def poll ( self ) - > int | None :
'''
Non - blocking liveness probe . Returns ` None ` if the
child is still running , else its exit code ( negative
for signal - death , matching ` subprocess . Popen `
convention ) .
'''
if self . _returncode is not None :
return self . _returncode
try :
waited_pid , status = os . waitpid ( self . pid , os . WNOHANG )
except ChildProcessError :
# already reaped (or never existed) — treat as
# clean exit for polling purposes.
self . _returncode = 0
return 0
if waited_pid == 0 :
return None
self . _returncode = self . _parse_status ( status )
return self . _returncode
@property
def returncode ( self ) - > int | None :
return self . _returncode
async def wait ( self ) - > int :
'''
Async , fully - trio - cancellable wait for the child ' s
exit . Uses ` trio . lowlevel . wait_readable ( ) ` on the
` pidfd ` sentinel — same pattern as ` trio . Process . wait `
and ` tractor . spawn . _spawn . proc_waiter ` ( mp backend ) .
Safe to call multiple times ; subsequent calls return
the cached rc without re - issuing the syscall .
'''
if self . _returncode is not None :
return self . _returncode
# Park until the pidfd becomes readable — the OS
# signals this exactly once on child exit. Cancellable
# via any outer trio cancel scope (this was the key
# fix vs. the prior `to_thread.run_sync(os.waitpid,
# abandon_on_cancel=False)` which blocked a thread on
# a sync syscall and swallowed cancels).
await trio . lowlevel . wait_readable ( self . _pidfd )
# pidfd signaled → reap non-blocking to collect the
# exit status. `WNOHANG` here is correct: by the time
# the pidfd is readable, `waitpid()` won't block.
try :
_ , status = os . waitpid ( self . pid , os . WNOHANG )
except ChildProcessError :
# already reaped by something else
status = 0
self . _returncode = self . _parse_status ( status )
# pidfd is one-shot; close it so we don't leak fds
# across many spawns.
try :
os . close ( self . _pidfd )
except OSError :
pass
self . _pidfd = - 1
return self . _returncode
def kill ( self ) - > None :
'''
OS - level ` SIGKILL ` to the child . Swallows
` ProcessLookupError ` ( already dead ) .
'''
try :
os . kill ( self . pid , signal . SIGKILL )
except ProcessLookupError :
pass
def __del__ ( self ) - > None :
# belt-and-braces: close the pidfd if `wait()` wasn't
# called (e.g. unexpected teardown path).
fd : int = getattr ( self , ' _pidfd ' , - 1 )
if fd > = 0 :
try :
os . close ( fd )
except OSError :
pass
def _parse_status ( self , status : int ) - > int :
if os . WIFEXITED ( status ) :
return os . WEXITSTATUS ( status )
elif os . WIFSIGNALED ( status ) :
# negative rc by `subprocess.Popen` convention
return - os . WTERMSIG ( status )
return 0
def __repr__ ( self ) - > str :
return (
f ' <_ForkedProc pid= { self . pid } '
f ' returncode= { self . _returncode } > '
)
async def main_thread_forkserver_proc (
name : str ,
actor_nursery : ActorNursery ,
subactor : Actor ,
errors : dict [ tuple [ str , str ] , Exception ] ,
# passed through to actor main
bind_addrs : list [ UnwrappedAddress ] ,
parent_addr : UnwrappedAddress ,
_runtime_vars : dict [ str , Any ] ,
* ,
infect_asyncio : bool = False ,
task_status : TaskStatus [ Portal ] = trio . TASK_STATUS_IGNORED ,
proc_kwargs : dict [ str , any ] = { } ,
) - > None :
'''
Spawn a subactor via ` os . fork ( ) ` from a non - trio worker
thread ( see ` fork_from_worker_thread ( ) ` ) , with the forked
child running ` tractor . _child . _actor_child_main ( ) ` and
connecting back via tractor ' s normal IPC handshake.
Supervision model mirrors ` trio_proc ( ) ` — we manage a
real OS subprocess , so ` Portal . cancel_actor ( ) ` +
` soft_kill ( ) ` on graceful teardown and ` os . kill ( SIGKILL ) `
on hard - reap both apply directly ( no
` _interpreters . destroy ( ) ` voodoo needed since the child
is in its own process ) .
The only real difference from ` trio_proc ` is the spawn
mechanism : fork from a known - clean main - interp worker
thread instead of ` trio . lowlevel . open_process ( ) ` .
'''
if not _has_subints :
raise RuntimeError (
f ' The { " main_thread_forkserver " !r} spawn backend '
f ' requires Python 3.14+. \n '
f ' Current runtime: { sys . version } '
)
# Backend-scoped config pulled from `proc_kwargs`. Using
# `proc_kwargs` (vs a first-class kwarg on this function)
# matches how other backends expose per-spawn tuning
# (`trio_proc` threads it to `trio.lowlevel.open_process`,
# etc.) and keeps `ActorNursery.start_actor(proc_kwargs=...)`
# as the single ergonomic entry point.
child_sigint : ChildSigintMode = proc_kwargs . get (
' child_sigint ' ,
_DEFAULT_CHILD_SIGINT ,
)
if child_sigint not in ( ' ipc ' , ' trio ' ) :
raise ValueError (
f ' Invalid `child_sigint= { child_sigint !r} ` for '
f ' `main_thread_forkserver` backend. \n '
f ' Expected one of: { ChildSigintMode } . '
)
if child_sigint == ' trio ' :
raise NotImplementedError (
" `child_sigint= ' trio ' ` mode — trio-native SIGINT "
" plumbing in the fork-child — is scaffolded but "
" not yet implemented. See the xfail ' d "
" `test_orphaned_subactor_sigint_cleanup_DRAFT` "
" and the TODO in this module ' s docstring. "
)
uid : tuple [ str , str ] = subactor . aid . uid
loglevel : str | None = subactor . loglevel
# Closure captured into the fork-child's memory image.
# In the child this is the first post-fork Python code to
# run, on what was the fork-worker thread in the parent.
# `child_sigint` is captured here so the impl lands inside
# this function once the `'trio'` mode is wired up —
# nothing above this comment needs to change.
def _child_target ( ) - > int :
# Dispatch on the captured SIGINT-mode closure var.
# Today only `'ipc'` is reachable (the `'trio'` branch
# is fenced off at the backend-entry guard above); the
# match is in place so the future `'trio'` impl slots
# in as a plain case arm without restructuring.
match child_sigint :
case ' ipc ' :
pass # <- current behavior: no child-side
# SIGINT plumbing; rely on parent
# `Portal.cancel_actor()` IPC path.
case ' trio ' :
# Unreachable today (see entry-guard above);
# this stub exists so that lifting the guard
# is the only change required to enable
# `'trio'` mode once the SIGINT wakeup-fd
# bridge is implemented.
raise NotImplementedError (
" `child_sigint= ' trio ' ` fork-prelude "
" plumbing not yet wired. "
)
# Lazy import so the parent doesn't pay for it on
# every spawn — it's module-level in `_child` but
# cheap enough to re-resolve here.
from tractor . _child import _actor_child_main
# XXX, `os.fork()` inherits the parent's entire memory
# image, including `tractor.runtime._state._runtime_vars`
# (which in the parent encodes "this process IS the root
# actor"). A fresh `exec`-based child starts cold; we
# replicate that here by explicitly resetting runtime
# vars to their fresh-process defaults — otherwise
# `Actor.__init__` takes the `is_root_process() == True`
# branch, pre-populates `self.enable_modules`, and trips
# the `assert not self.enable_modules` gate at the top
# of `Actor._from_parent()` on the subsequent parent→
# child `SpawnSpec` handshake. (`_state._current_actor`
# is unconditionally overwritten by `_trio_main` → no
# reset needed for it.)
from tractor . runtime . _state import (
get_runtime_vars ,
set_runtime_vars ,
)
set_runtime_vars ( get_runtime_vars ( clear_values = True ) )
_actor_child_main (
uid = uid ,
loglevel = loglevel ,
parent_addr = parent_addr ,
infect_asyncio = infect_asyncio ,
# The child's runtime is trio-native (uses
# `_trio_main` + receives `SpawnSpec` over IPC),
# but label it with the actual parent-side spawn
# mechanism so `Actor.pformat()` / log lines
# reflect reality. Downstream runtime gates that
# key on `_spawn_method` group `main_thread_forkserver`
# alongside `trio`/`subint` where the SpawnSpec
# IPC handshake is concerned — see
# `runtime._runtime.Actor._from_parent()`.
spawn_method = ' main_thread_forkserver ' ,
)
return 0
cancelled_during_spawn : bool = False
proc : _ForkedProc | None = None
ipc_server : _server . Server = actor_nursery . _actor . ipc_server
try :
try :
pid : int = await trio . to_thread . run_sync (
partial (
fork_from_worker_thread ,
_child_target ,
thread_name = (
f ' main-thread-forkserver[ { name } ] '
) ,
) ,
abandon_on_cancel = False ,
)
proc = _ForkedProc ( pid )
log . runtime (
f ' Forked subactor via main-thread-forkserver \n '
f ' (> \n '
f ' |_ { proc } \n '
)
event , chan = await ipc_server . wait_for_peer ( uid )
except trio . Cancelled :
cancelled_during_spawn = True
raise
assert proc is not None
portal = Portal ( chan )
actor_nursery . _children [ uid ] = (
subactor ,
proc ,
portal ,
)
sspec = msgtypes . SpawnSpec (
_parent_main_data = subactor . _parent_main_data ,
enable_modules = subactor . enable_modules ,
reg_addrs = subactor . reg_addrs ,
bind_addrs = bind_addrs ,
_runtime_vars = _runtime_vars ,
)
log . runtime (
f ' Sending spawn spec to forkserver child \n '
f ' {{ }} => { chan . aid . reprol ( ) !r} \n '
f ' \n '
f ' { pretty_struct . pformat ( sspec ) } \n '
)
await chan . send ( sspec )
curr_actor : Actor = current_actor ( )
curr_actor . _actoruid2nursery [ uid ] = actor_nursery
task_status . started ( portal )
with trio . CancelScope ( shield = True ) :
await actor_nursery . _join_procs . wait ( )
async with trio . open_nursery ( ) as nursery :
if portal in actor_nursery . _cancel_after_result_on_exit :
nursery . start_soon (
cancel_on_completion ,
portal ,
subactor ,
errors ,
)
# reuse `trio_proc`'s soft-kill dance — `proc`
# is our `_ForkedProc` shim which implements the
# same `.poll()` / `.wait()` / `.kill()` surface
# `soft_kill` expects.
await soft_kill (
proc ,
_ForkedProc . wait ,
portal ,
)
nursery . cancel_scope . cancel ( )
finally :
# Hard reap: SIGKILL + waitpid. Cheap since we have
# the real OS pid, unlike `subint_proc` which has to
# fuss with `_interpreters.destroy()` races.
if proc is not None and proc . poll ( ) is None :
log . cancel (
f ' Hard killing main-thread-forkserver subactor \n '
f ' >x) \n '
f ' |_ { proc } \n '
)
with trio . CancelScope ( shield = True ) :
proc . kill ( )
await proc . wait ( )
if not cancelled_during_spawn :
actor_nursery . _children . pop ( uid , None )