macOS Compatibility Fixes for Piker/Tractor
This guide documents macOS-specific issues encountered when running piker and their solutions. These fixes address platform differences between Linux and macOS in areas like socket credentials, shared memory naming, and async runtime coordination.
Table of Contents
- Socket Credential Passing
- Shared Memory Name Length Limits
- Shared Memory Cleanup Race Conditions
- Async Runtime (Trio/AsyncIO) Coordination
1. Socket Credential Passing
Problem
On Linux, tractor uses the SO_PASSCRED and SO_PEERCRED socket options for Unix domain socket credential passing. macOS doesn’t support these constants, causing an AttributeError when importing.
# Linux code that fails on macOS
from socket import SO_PASSCRED, SO_PEERCRED # AttributeError on macOS
Error Message
AttributeError: module 'socket' has no attribute 'SO_PASSCRED'
Root Cause
- Linux: Uses SO_PASSCRED (to enable credential passing) and SO_PEERCRED (to retrieve peer credentials)
- macOS: Uses LOCAL_PEERCRED (value 0x0001) instead, and doesn’t require enabling credential passing
Solution
Make the socket credential imports platform-conditional:
File: tractor/ipc/_uds.py (or equivalent in piker if duplicated)
import struct
import sys
from socket import (
    socket,
    AF_UNIX,
    SOCK_STREAM,
    SOL_SOCKET,
)

# Platform-specific credential passing constants
if sys.platform == 'linux':
    from socket import SO_PASSCRED, SO_PEERCRED

elif sys.platform == 'darwin':  # macOS
    # macOS uses LOCAL_PEERCRED instead of SO_PEERCRED
    # and doesn't need SO_PASSCRED
    LOCAL_PEERCRED = 0x0001
    SO_PEERCRED = LOCAL_PEERCRED  # Alias for compatibility
    SO_PASSCRED = None  # Not needed on macOS

else:
    # Other platforms - may need additional handling
    SO_PASSCRED = None
    SO_PEERCRED = None


# When creating a socket
if SO_PASSCRED is not None:
    sock.setsockopt(SOL_SOCKET, SO_PASSCRED, 1)

# When getting peer credentials
if SO_PEERCRED is not None:
    creds = sock.getsockopt(SOL_SOCKET, SO_PEERCRED, struct.calcsize('3i'))
Implementation Notes
- The LOCAL_PEERCRED value 0x0001 is specific to macOS (from <sys/un.h>)
- macOS doesn’t require explicitly enabling credential passing like Linux does
- Consider using ctypes or cffi for a more robust solution if available (see the sketch below)
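For retrieving the peer's credentials on macOS itself, a plain struct-based read of the LOCAL_PEERCRED option is often enough. The following is a minimal sketch, assuming the SOL_LOCAL option level (0) and the xucred layout (cr_version, cr_uid, ...) from the macOS headers; get_peer_uid is a hypothetical helper, not an existing tractor API:
import socket
import struct

# Assumed macOS constants (from <sys/un.h>): LOCAL_* options are read at the
# SOL_LOCAL level rather than SOL_SOCKET.
SOL_LOCAL = 0
LOCAL_PEERCRED = 0x0001


def get_peer_uid(sock: socket.socket) -> int:
    # The kernel fills in a `struct xucred`; its first two unsigned ints are
    # cr_version and cr_uid. Ask for a generous buffer - getsockopt() returns
    # only as many bytes as the struct actually occupies.
    raw = sock.getsockopt(SOL_LOCAL, LOCAL_PEERCRED, 128)
    _version, uid = struct.unpack_from('@II', raw)
    return uid
A ctypes binding of getpeereid(2) is an alternative if you also need the peer's gid.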
2. Shared Memory Name Length Limits
Problem
macOS limits POSIX shared memory names to 31 characters (defined as PSHMNAMLEN in <sys/posix_shm_internal.h>). Piker generates long descriptive names that exceed this limit, causing an OSError.
# Long name that works on Linux but fails on macOS
shm_name = "piker_quoter_tsla.nasdaq.ib_hist_1m"  # 35 chars - too long!
Error Message
OSError: [Errno 63] File name too long: '/piker_quoter_tsla.nasdaq.ib_hist_1m'
Root Cause
- Linux: Supports shared memory names up to 255 characters
- macOS: Limits names to 31 characters (including the leading /)
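To see the limit outside of piker, the stdlib multiprocessing.shared_memory module (which wraps the same shm_open() call) reproduces the error; this is just an illustrative sketch to run on macOS:
# Reproduction sketch (macOS): the stdlib prepends the leading '/' and
# shm_open() rejects any resulting name longer than PSHMNAMLEN.
from multiprocessing.shared_memory import SharedMemory

SharedMemory(
    name='piker_quoter_tsla.nasdaq.ib_hist_1m',  # 35 chars
    create=True,
    size=64,
)
# -> OSError: [Errno 63] File name too long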
Solution
Implement automatic name shortening for macOS while preserving the original key for lookups:
File: piker/data/_sharedmem.py
import hashlib
import sys


def _shorten_key_for_macos(key: str) -> str:
    '''
    macOS has a 31 character limit for POSIX shared memory names.
    Hash long keys to fit within this limit while maintaining uniqueness.
    '''
    # macOS shm_open() has a 31 char limit (PSHMNAMLEN)
    # Use format: /p_<hash16> where hash is first 16 hex chars of sha256
    # This gives us: / + p_ + 16 hex chars = 19 chars, well under limit
    # We keep the 'p' prefix to indicate it's from piker
    if len(key) <= 31:
        return key

    # Create a hash of the full key
    key_hash = hashlib.sha256(key.encode()).hexdigest()[:16]
    short_key = f'p_{key_hash}'
    return short_key
class _Token(Struct, frozen=True):
    '''
    Internal representation of a shared memory "token"
    which can be used to key a system wide post shm entry.
    '''
    shm_name: str  # actual OS-level name (may be shortened on macOS)
    shm_first_index_name: str
    shm_last_index_name: str
    dtype_descr: tuple
    size: int  # in struct-array index / row terms
    key: str | None = None  # original descriptive key (for lookup)

    def __eq__(self, other) -> bool:
        '''
        Compare tokens based on shm names and dtype, ignoring the key field.
        The key field is only used for lookups, not for token identity.
        '''
        if not isinstance(other, _Token):
            return False

        return (
            self.shm_name == other.shm_name
            and self.shm_first_index_name == other.shm_first_index_name
            and self.shm_last_index_name == other.shm_last_index_name
            and self.dtype_descr == other.dtype_descr
            and self.size == other.size
        )

    def __hash__(self) -> int:
        '''Hash based on the same fields used in __eq__'''
        return hash((
            self.shm_name,
            self.shm_first_index_name,
            self.shm_last_index_name,
            self.dtype_descr,
            self.size,
        ))
def _make_token(
    key: str,
    size: int,
    dtype: np.dtype | None = None,
) -> _Token:
    '''
    Create a serializable token that uniquely identifies a shared memory segment.
    '''
    if dtype is None:
        dtype = def_iohlcv_fields

    # On macOS, shorten long keys to fit the 31-char limit
    if sys.platform == 'darwin':
        shm_name = _shorten_key_for_macos(key)
        shm_first = _shorten_key_for_macos(key + "_first")
        shm_last = _shorten_key_for_macos(key + "_last")
    else:
        shm_name = key
        shm_first = key + "_first"
        shm_last = key + "_last"

    return _Token(
        shm_name=shm_name,
        shm_first_index_name=shm_first,
        shm_last_index_name=shm_last,
        dtype_descr=tuple(np.dtype(dtype).descr),
        size=size,
        key=key,  # Store original key for lookup
    )
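For illustration, this is roughly how the token behaves on macOS given the definitions above (the key string is just an example):
# Sketch (darwin only): long keys are hashed down to fit PSHMNAMLEN while the
# human-readable key survives on the token for lookups and debugging.
token = _make_token(
    key='piker_quoter_tsla.nasdaq.ib_hist_1m',  # > 31 chars
    size=1000,
)
assert len(token.shm_name) <= 31
assert token.key == 'piker_quoter_tsla.nasdaq.ib_hist_1m'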
Key Design Decisions
- Hash-based shortening: Uses SHA256 to ensure uniqueness and avoid collisions
- Preserve original key: Store the original descriptive key in the _Token for debugging and lookups
- Custom equality: The __eq__ and __hash__ methods ignore the key field to ensure tokens are compared by their actual shm properties
- Platform detection: Only applies shortening on macOS (sys.platform == 'darwin')
Edge Cases to Consider
- Token serialization across processes (the key field must survive IPC)
- Token lookup in dictionaries and caches
- Debugging output (use the key field for human-readable names)
3. Shared Memory Cleanup Race Conditions
Problem
During teardown, shared memory segments may be unlinked by one process while another is still trying to clean them up, causing a FileNotFoundError that crashes the application.
Error Message
FileNotFoundError: [Errno 2] No such file or directory: '/p_74c86c7228dd773b'
Root Cause
In multi-process architectures like tractor, multiple processes may attempt to clean up shared resources simultaneously. Race conditions during shutdown can cause:
- Process A unlinks the shared memory
- Process B tries to unlink the same memory → FileNotFoundError
- The uncaught exception crashes Process B
Solution
Add defensive error handling to catch and log cleanup races:
File: piker/data/_sharedmem.py
class ShmArray:
    # ... existing code ...

    def destroy(self) -> None:
        '''
        Destroy the shared memory segment and cleanup OS resources.
        '''
        if _USE_POSIX:
            # We manually unlink to bypass all the "resource tracker"
            # nonsense meant for non-SC systems.
            shm = self._shm
            name = shm.name
            try:
                shm_unlink(name)
            except FileNotFoundError:
                # Might be a teardown race where another process
                # already unlinked it - this is fine, just log it
                log.warning(f'Shm for {name} already unlinked?')

        # Also cleanup the index counters
        if hasattr(self, '_first'):
            try:
                self._first.destroy()
            except FileNotFoundError:
                log.warning('First index shm already unlinked?')

        if hasattr(self, '_last'):
            try:
                self._last.destroy()
            except FileNotFoundError:
                log.warning('Last index shm already unlinked?')


class SharedInt:
    # ... existing code ...

    def destroy(self) -> None:
        if _USE_POSIX:
            # We manually unlink to bypass all the "resource tracker"
            # nonsense meant for non-SC systems.
            name = self._shm.name
            try:
                shm_unlink(name)
            except FileNotFoundError:
                # might be a teardown race here?
                log.warning(f'Shm for {name} already unlinked?')
Implementation Notes
- This fix is platform-agnostic but particularly important on macOS where the shortened names make debugging harder
- The warnings help identify cleanup races during development
- Consider adding metrics/counters if cleanup races become frequent
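The race itself is easy to emulate with the stdlib multiprocessing.shared_memory wrapper (a standalone sketch, not piker's actual teardown path): a second unlink of the same segment raises exactly the FileNotFoundError that destroy() now downgrades to a warning.
from multiprocessing.shared_memory import SharedMemory

# 'p_demo_cleanup' is an arbitrary example name
shm = SharedMemory(name='p_demo_cleanup', create=True, size=64)
shm.close()
shm.unlink()           # "process A" wins the race
try:
    shm.unlink()       # "process B" loses the race
except FileNotFoundError:
    print('already unlinked - log a warning and move on')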
4. Async Runtime (Trio/AsyncIO) Coordination
Problem
The TrioTaskExited error occurs when trio tasks are cancelled while asyncio tasks are still running, indicating improper coordination between the two async runtimes.
Error Message
tractor._exceptions.TrioTaskExited: but the child `asyncio` task is still running?
>>
|_<Task pending name='Task-2' coro=<wait_on_coro_final_result()> ...>
Root Cause
tractor uses “guest mode” to run trio as a guest in asyncio’s event loop (or vice versa). The error occurs when:
- A trio task is cancelled (e.g., user closes the UI)
- The cancellation propagates to cleanup handlers
- Cleanup tries to exit while asyncio tasks are still running
- The translate_aio_errors context manager detects this inconsistent state
Current State
This issue is partially resolved by the other fixes (socket credentials and shared memory), which eliminate the underlying errors that trigger premature cancellation. However, it may still occur in edge cases.
Potential Solutions
Option 1: Improve Cancellation Propagation (Tractor-level)
File: tractor/to_asyncio.py
from contextlib import asynccontextmanager as acm


@acm
async def translate_aio_errors(
    chan,
    wait_on_aio_task: bool = False,
    suppress_graceful_exits: bool = False,
):
    '''
    Context manager to translate asyncio errors to trio equivalents.
    '''
    try:
        yield
    except trio.Cancelled:
        # When trio is cancelled, ensure asyncio tasks are also cancelled.
        # (`aio_task` and `wait_for_aio_task_completion` stand in for however
        # the channel tracks and drains its asyncio-side task.)
        if wait_on_aio_task:
            # Give asyncio tasks a chance to cleanup
            await trio.lowlevel.checkpoint()

            # Check if asyncio task is still running
            if aio_task and not aio_task.done():
                # Cancel it gracefully
                aio_task.cancel()

                # Wait briefly for cancellation
                with trio.move_on_after(0.5):  # 500ms timeout
                    await wait_for_aio_task_completion(aio_task)

        raise  # Re-raise the cancellation
Option 2: Proper Shutdown Sequence (Application-level)
File: piker/brokers/ib/api.py (or similar broker modules)
async def load_clients_for_trio(
    client: Client,
    ...
) -> None:
    '''
    Load asyncio client and keep it running for trio.
    '''
    try:
        # Setup client
        await client.connect()

        # Keep alive - but make it cancellable
        await trio.sleep_forever()

    except trio.Cancelled:
        # Explicit cleanup before propagating cancellation
        log.info("Shutting down asyncio client gracefully")

        # Disconnect client
        if client.isConnected():
            await client.disconnect()

        # Small delay to let asyncio cleanup
        await trio.sleep(0.1)

        raise  # Now safe to propagate
Option 3: Detection and Warning (Current Approach)
The current code detects the issue and raises a clear error. This is acceptable if:
1. The error is rare (only during abnormal shutdown)
2. It doesn’t cause data loss
3. Logs provide enough info for debugging
Recommended Approach
For piker: Implement Option 2 (proper shutdown sequence) in broker modules where asyncio is used.
For tractor: Consider Option 1 (improved cancellation propagation) as a library-level enhancement.
Testing
Test the fix by:
# Test graceful shutdown
async def test_asyncio_trio_shutdown():
    async with open_channel_from(...) as (first, chan):
        # Do some work
        await chan.send(msg)

        # Trigger cancellation
        raise KeyboardInterrupt

    # Should cleanup without TrioTaskExited error
Summary of Changes
Files Modified in Piker
piker/data/_sharedmem.py
- Added _shorten_key_for_macos() function
- Modified _Token class to store the original key
- Modified _make_token() to use shortened names on macOS
- Added FileNotFoundError handling in destroy() methods
piker/ui/_display.py
- Removed assertion that checked for ‘hist’ in shm name (incompatible with shortened names)
Files to Modify in Tractor (Recommended)
tractor/ipc/_uds.py
- Make socket credential imports platform-conditional
- Handle macOS-specific LOCAL_PEERCRED
tractor/to_asyncio.py (Optional)
- Improve cancellation propagation between trio and asyncio
- Add graceful shutdown timeout for asyncio tasks
Platform Detection Pattern
Use this pattern consistently:
import sys

if sys.platform == 'darwin':  # macOS
    # macOS-specific code
    pass
elif sys.platform == 'linux':  # Linux
    # Linux-specific code
    pass
else:
    # Other platforms / fallback
    pass
Testing Checklist
- Test on macOS (Darwin)
- Test on Linux
- Test shared memory with names > 31 chars
- Test multi-process cleanup race conditions
- Test graceful shutdown (Ctrl+C)
- Test abnormal shutdown (kill signal)
- Verify no memory leaks (check /dev/shm on Linux, ipcs -m on macOS; see the sketch below)
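For the leak check on Linux, globbing /dev/shm is usually sufficient (a sketch assuming piker's descriptive name prefixes; adjust the patterns to your setup):
# Sketch: list leftover POSIX shm segments on Linux that look piker-owned.
import glob

leftovers = glob.glob('/dev/shm/*piker*') + glob.glob('/dev/shm/p_*')
print(leftovers or 'no leaked segments found')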
Additional Resources
- macOS System Headers:
  - /usr/include/sys/un.h - Unix domain socket constants
  - /usr/include/sys/posix_shm_internal.h - Shared memory limits
- Python Documentation:
- Trio/AsyncIO:
Contributing
When implementing these fixes in your own project:
- Test thoroughly on both macOS and Linux
- Add platform guards to prevent cross-platform breakage
- Document platform-specific behavior in code comments
- Consider CI/CD testing on multiple platforms
- Handle edge cases gracefully with proper logging
If you find additional macOS-specific issues, please contribute to this guide!