Compare commits

No commits in common. "4b5176e2c3040491c2b60cc9f321081a4a60e9c9" and "54561959e6cd69876f31a573642777b5df9d4ff7" have entirely different histories.

1 changed files with 12 additions and 271 deletions

@@ -36,279 +36,20 @@ across four scenarios by
py3.14.
This submodule lifts the validated primitives out of the
smoke-test and into tractor proper as the
`subint_forkserver` spawn backend.
Design rationale: why a forkserver, and why in-process
-------------------------------------------------------
There are two design questions worth pinning down up front,
since the name "subint_forkserver" intentionally evokes the
stdlib `multiprocessing.forkserver` for comparison:
**(1) Why a forkserver pattern at all, vs. forking directly
from the trio task?**
`os.fork()` is fundamentally hostile to trio: trio owns
file descriptors, signal-wakeup-fds, threadpools, and an
event loop with non-trivial post-fork lifecycle invariants
(see python-trio/trio#1614 et al.). Forking a trio-running
thread duplicates all that state into the child, which then
either needs surgical reset (fragile) or has to immediately
`exec()` (defeats the point of fork-without-exec). The
*forkserver* sidesteps this by isolating the `os.fork()`
call in a worker that has provably never entered trio, so
the child inherits a clean, trio-free image.
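The pattern described above can be sketched with nothing but the stdlib. This is a minimal illustration, not tractor's implementation: the names `forkserver_worker` and `spawn`, the queue-based request channel, and the lambda child entrypoint are all hypothetical stand-ins. It assumes a POSIX platform and Python 3.9+ (for `os.waitstatus_to_exitcode`).

```python
import os
import queue
import threading


def forkserver_worker(requests: queue.Queue) -> None:
    """Serve fork requests on a plain thread that never runs an event loop."""
    while True:
        req = requests.get()
        if req is None:  # shutdown sentinel
            return
        child_target, reply = req
        pid = os.fork()
        if pid == 0:
            # Child: this worker thread is the only thread that survives.
            code = 1
            try:
                code = int(child_target() or 0)
            finally:
                os._exit(code)  # never fall back into the parent's code path
        reply.put(pid)  # parent: hand the child pid back to the requester


def spawn(requests: queue.Queue, child_target) -> int:
    """Ask the forkserver thread to fork, then block until the pid arrives."""
    reply: queue.Queue = queue.Queue(maxsize=1)
    requests.put((child_target, reply))
    return reply.get()


requests: queue.Queue = queue.Queue()
worker = threading.Thread(target=forkserver_worker, args=(requests,), daemon=True)
worker.start()

pid = spawn(requests, lambda: 7)  # hypothetical child entrypoint
_, status = os.waitpid(pid, 0)
exit_code = os.waitstatus_to_exitcode(status)

requests.put(None)  # stop the worker
worker.join()
```

The requester only exchanges in-process queue messages with the worker, so the thread that calls `os.fork()` never touches whatever event loop the requester runs.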
**(2) Why an in-process forkserver, vs. stdlib
`multiprocessing.forkserver`?**
The stdlib design solves the same "fork from clean state"
problem by spinning up a **separate sidecar process** at
first use of `mp.set_start_method('forkserver')`. The parent
then IPC's each spawn request to that sidecar over a unix
socket; the sidecar is the process that actually calls
`os.fork()`. This works but pays for cleanliness with three
costs:
- **Sidecar lifecycle**: a second long-lived process per
parent, with its own start/stop/health-check semantics.
- **IPC overhead per spawn**: every actor-spawn round-trips
an `mp` request message through a unix socket before any
child code runs.
- **State isolation by process boundary**: the sidecar can't
share parent state at all; every spawn is a "cold" child
re-importing modules from disk.
The subint architecture lets us keep the forkserver
**in-process** because subints already provide the
state-isolation guarantee that `mp.forkserver`'s sidecar
buys via the process boundary. Concretely: in the envisioned
arch (currently partially landed; see "Status" below),
- the **main interpreter** stays trio-free and hosts the
forkserver worker thread that owns `os.fork()`,
- the parent actor's **`trio.run()`** lives in a separate
*sub-interpreter* (a different worker thread) with fully
isolated `sys.modules` / `__main__` / globals from main,
- when a spawn is requested, the trio task signals the
forkserver thread (intra-process, ~free) and the
forkserver forks; the child inherits the parent's full
in-memory state cheaply.
That collapses the three costs above:
- no sidecar: the forkserver is just another thread,
- spawn signal is a cross-thread event/condition within one
process, not IPC,
- child inherits the warm parent state (loaded modules,
populated caches, etc.) for free.
The tradeoff we accept in exchange: this design is
3.14-only (legacy-config subints still share the GIL, so
the parent's trio loop and the forkserver worker contend
on it; once PEP 684 isolated-mode + msgspec
[jcrist/msgspec#1026](https://github.com/jcrist/msgspec/issues/1026)
land, this constraint relaxes). And the dedicated worker
threads here are heavier than `trio.to_thread.run_sync`
calls; see the "TODO" section further down for the audit
plan once those upstream pieces land.
Future arch: what subints would buy us
---------------------------------------
The `subint` in this module's name is **family-naming
today**: currently the implementation only uses a regular
worker thread on the main interp; no subinterpreter is
created anywhere in the parent or child. The naming becomes
*literal* once jcrist/msgspec#1026 unblocks isolated-mode
subints (PEP 684 per-interp GIL). Three concrete wins land
at that point:
**(1) Cheaper forks (smaller main-interp COW image)**
Today the parent's main interp carries the full tractor
stack: trio runtime, msgspec codecs, IPC layer, every
user module the actor imported. When the forkserver
worker calls `os.fork()` the child inherits ALL of that
as COW memory, even though most gets overwritten when
the child boots its own `trio.run()`.
Move the parent's `trio.run()` into a subint (its own
`sys.modules` / `__main__` / globals) and the main
interp **stays minimal**: just the forkserver-thread
plumbing + bare CPython. The main interp becomes the
*literal* forkserver: an intentionally-empty execution
context whose only job is to call `os.fork()` cleanly.
Inherited COW image shrinks proportionally.
**(2) True parallelism between forkserver and trio
(per-interp GIL)**
Today the forkserver worker and the trio.run() thread
share the main GIL: when one runs the other waits.
Spawn requests briefly stall trio while the worker
takes the GIL to call `os.fork()`. PEP 684 isolated-
mode gives each subint its own GIL: forkserver thread
on main + trio on subint actually run in parallel.
Spawn latency drops, trio loop doesn't notice the
fork happening.
**(3) Multi-actor-per-process (the architectural prize)**
The bigger payoff and the reason `_subint.py` (the
in-thread `subint` backend) exists in parallel with
this module. With per-interp-GIL subints, one process
can host:
- main interp: forkserver thread + bookkeeping
- subint A: actor 1's `trio.run()`
- subint B: actor 2's `trio.run()`
- subint C: ...
`os.fork()` becomes the **last-resort** spawn, used
only when a new OS process is actually required
(cgroups, namespaces, security boundary, multi-host
distribution). Within a single process, subint-per-
actor is radically cheaper: no fork, no COW, no
inherited-fd cleanup, just `_interpreters.create()`
+ `_interpreters.exec()`.
The two backends converge on a coherent story:
`subint` for in-process spawn (cheap, GIL-isolated),
`subint_forkserver` for cross-process spawn (when you
truly need OS-level isolation). The forkserver isn't
the default mechanism; it's the bridge to a new
process when subint isolation isn't enough.
Implementation status: what's wired today
-----------------------------------------
The "envisioned arch" above is the eventual target; the
**currently-landed** flow is a partial step toward it:
smoke-test and into tractor proper, so they can eventually be
wired into a real "subint forkserver" spawn backend where:
- A dedicated main-interp worker thread owns all `os.fork()`
calls (never enters a subint). Landed.
- Parent actor's `trio.run()` lives **on the main interp**
for now (not a subint yet). The subint-hosted root
runtime is gated on jcrist/msgspec#1026 (see
`_subint.py` docstring).
- Spawn-request signal: trio task -> `to_thread.run_sync`
to the forkserver-worker thread. Landed.
- Forked child: runs `_actor_child_main` against a normal
trio runtime. Landed.
calls (never enters a subint).
- The tractor parent-actor's `trio.run()` lives in a
sub-interpreter on a different worker thread.
- When a spawn is requested, the trio-task signals the
forkserver thread; the forkserver forks; child re-enters
the same pattern (trio in a subint + forkserver on main).
The "subint" in the backend name refers to the *family*:
this backend ships in the same PR series as `_subint.py`
(in-thread subint backend) and `_subint_fork.py` (the RFC
stub for fork-from-non-main-subint, blocked upstream).
Once the parent's trio also lives in a subint we'll have
the full envisioned arch; until then the forkserver
half is independently useful and ship-able.
What survives the fork? POSIX semantics
-----------------------------------------
A natural worry when forking from a parent that's running
`trio.run()` on another thread: does that trio thread (and
any other threads in the parent) keep running in the child?
**No.** POSIX `fork()` only preserves the *calling* thread
in the child. Every other thread in the parent (trio's
runner thread, any `to_thread` cache threads, anything else)
is gone the instant `fork()` returns in the child.
Concretely, after the forkserver worker calls `os.fork()`:
| thread | parent | child |
|-----------------------|-----------|---------------|
| forkserver worker | continues | sole survivor |
| `trio.run()` thread | continues | gone |
| any other thread | continues | gone |
The forkserver worker becomes the new "main" execution
context in the child; `trio.run()` and every other
parent thread never executes a single instruction
post-fork in the child.
This is exactly *why* `os.fork()` is delegated to a
dedicated worker thread that has provably never entered
trio: we want that trio-free thread to be the surviving
one in the child.
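The table above can be checked empirically. The following is a small stdlib-only sketch, assuming a POSIX platform; `fork_and_count_threads` and the `bystander` thread (a stand-in for the parent's `trio.run()` thread) are illustrative names, not tractor APIs. The child reports how many threads Python still sees over a pipe.

```python
import os
import threading


def fork_and_count_threads() -> tuple[int, int]:
    """Fork from a worker thread; return (parent_count, child_count)."""
    result: dict[str, int] = {}
    stop = threading.Event()

    # Stand-in for the parent's `trio.run()` thread: alive at fork time.
    bystander = threading.Thread(target=stop.wait, daemon=True)
    bystander.start()

    def worker() -> None:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: report the visible thread count as one byte, then exit.
            try:
                os.write(write_fd, bytes([len(threading.enumerate())]))
            finally:
                os._exit(0)
        os.close(write_fd)
        result["child"] = os.read(read_fd, 1)[0]
        os.close(read_fd)
        os.waitpid(pid, 0)
        result["parent"] = len(threading.enumerate())

    forker = threading.Thread(target=worker)
    forker.start()
    forker.join()
    stop.set()
    bystander.join()
    return result["parent"], result["child"]


parent_count, child_count = fork_and_count_threads()
```

In the parent the count includes at least the main thread, the bystander, and the forking worker; in the child only the forking worker remains.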
That said, dead-thread *artifacts* still cross the fork
boundary (canonical "fork in a multithreaded program is
dangerous" — see `man pthread_atfork`). What persists, and
how we handle each:
- **Inherited file descriptors**: the dead trio thread's
epoll fd, signal-wakeup-fd, eventfds, sockets, IPC
pipes, pytest's capture-fds, etc. are all still in the
child's fd table (kernel-level inheritance). Handled by
`_close_inherited_fds()` in the child prelude, which walks
`/proc/self/fd` and closes everything except stdio +
the channel pipe to the forkserver.
- **Memory image**: trio's internal data structures
(scheduler, task queues, runner state) sit in COW
memory but nobody's executing them. They get GC'd or
overwritten when the child's fresh `trio.run()` boots.
- **Python thread state**: handled automatically by
CPython. `PyOS_AfterFork_Child()` calls
`_PyThreadState_DeleteExceptCurrent()`, so dead
`PyThreadState` objects are cleaned and
`threading.enumerate()` returns just the surviving
thread.
- **User-level locks (`threading.Lock`)**:
held-by-dead-thread state is the canonical fork hazard.
Not an issue in practice for tractor: trio doesn't hold
cross-thread locks across fork (its synchronization is
within the trio task system, which doesn't survive in
either direction). CPython's GIL is auto-reset by the
fork callback.
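The fd cleanup in the first bullet can be sketched as follows. This is a hypothetical reimplementation for illustration only, not the body of tractor's `_close_inherited_fds()`: the `keep` parameter, the return value, and the `demo` helper are assumptions. It is Linux-only because it relies on `/proc/self/fd`, and it runs the cleanup in a forked child so the parent's descriptors stay untouched.

```python
import os


def close_inherited_fds(keep: frozenset = frozenset({0, 1, 2})) -> list:
    """Close every open fd not in `keep`; return the fds that were closed."""
    closed = []
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        if fd in keep:
            continue
        try:
            os.close(fd)
        except OSError:
            # Already gone, e.g. the fd `os.listdir` used for the scan itself.
            continue
        closed.append(fd)
    return sorted(closed)


def demo() -> bool:
    """Fork a child, clean its fd table, and report whether it worked."""
    leaked_r, leaked_w = os.pipe()  # stand-in for fds inherited from trio
    chan_r, chan_w = os.pipe()      # stand-in for the forkserver channel pipe
    pid = os.fork()
    if pid == 0:
        try:
            closed = close_inherited_fds(keep=frozenset({0, 1, 2, chan_w}))
            ok = (
                leaked_r in closed
                and leaked_w in closed
                and chan_w not in closed
            )
            os.write(chan_w, b"1" if ok else b"0")
        finally:
            os._exit(0)
    os.close(chan_w)
    report = os.read(chan_r, 1)
    os.waitpid(pid, 0)
    for fd in (leaked_r, leaked_w, chan_r):
        os.close(fd)
    return report == b"1"


child_was_clean = demo()
```

Keeping the allow-list explicit (stdio plus the one channel fd) means any fd the parent opened later is closed by default, which matches the "close everything except" policy described above.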
FYI: how this dodges the `trio.run()` × `fork()` hazards
--------------------------------------------------------
`os.fork()` is famously hostile to `trio` (see
python-trio/trio#1614 et al.) because trio owns several
classes of process-global state that all break across the
fork boundary in different ways. The forkserver-thread
design dodges each class explicitly:
- **Signal-wakeup-fd**: trio installs a wakeup-fd via
`signal.set_wakeup_fd()` on `trio.run()` startup so
signals can interrupt `epoll_wait`. The child inherits
this fd, but trio's runner that owns it is gone — so
any signal delivery in the child writes to a dead
reader. *Dodge*: the inherited wakeup-fd is closed by
`_close_inherited_fds()`, then the child's own
`trio.run()` installs a fresh one.
- **`epoll`/`kqueue` instance**: trio's I/O backend holds
one. Inherited as a dead fd; same fix as above.
- **Threadpool cache threads** (`trio.to_thread`): worker
threads with cached tstate. Don't exist in the child
(POSIX); cache state is meaningless garbage that gets
reset when the child's trio.run() initializes its own
thread cache.
- **Cancel scopes / nurseries / open `trio.Process` /
open sockets**: these are trio-runtime objects, not
kernel objects. The runtime that owns them is gone in
the child, so the Python objects exist as zombie data
in COW memory and get overwritten as the child runs.
Inherited *kernel* fds those objects wrapped (sockets,
proc pipes) are caught by `_close_inherited_fds()`.
- **`atexit` handlers**: trio doesn't register any that
would mis-fire post-fork; trio's lifetime-stack is
all `with`-block-scoped and dies with the runner.
- **Foreign-language I/O state** (libcurl, OpenSSL session
caches, etc.): out of scope; same hazard as any
fork-without-exec; users layering those on top of
tractor need their own pthread_atfork handlers.
Net effect: for the runtime surface tractor controls
(trio + IPC layer + msgspec), the forkserver-thread
isolation + `_close_inherited_fds()` cleanup gives the
forked child a clean trio environment. Everything else
falls under the standard fork-without-exec disclaimer.
This mirrors the stdlib `multiprocessing.forkserver` design
but keeps the forkserver in-process for faster spawn latency
and inherited parent state.
Status
------
@@ -359,7 +100,7 @@ to know.
Full analysis + audit plan for when we can revisit is in
`ai/conc-anal/subint_forkserver_thread_constraints_on_pep684_issue.md`.
Intent: file a follow-up GH issue linked to #379 once
[jcrist/msgspec#1026](https://github.com/jcrist/msgspec/issues/1026)
[jcrist/msgspec#563](https://github.com/jcrist/msgspec/issues/563)
unblocks isolated-mode subints in tractor.
See also