# Piker Profiling Subsystem Skill

Skill for using `piker.toolz.profile.Profiler` to measure performance across distributed actor systems.

## Core Profiler API

### Basic Usage

```python
from piker.toolz.profile import (
    Profiler,
    pg_profile_enabled,
    ms_slower_then,
)

profiler = Profiler(
    msg='<description of profiled section>',
    disabled=False,    # IMPORTANT: enable explicitly!
    ms_threshold=0.0,  # show all timings, not just slow
)

# do work
some_operation()
profiler('step 1 complete')

# more work
another_operation()
profiler('step 2 complete')

# prints on exit:
# > Entering <description of profiled section>
# step 1 complete: 12.34, tot:12.34
# step 2 complete: 56.78, tot:69.12
# < Exiting <description of profiled section>, total: 69.12 ms
```

### Default Behavior Gotcha

**CRITICAL:** Profiler is disabled by default in many contexts!

```python
# BAD: might not print anything!
profiler = Profiler(msg='my operation')

# GOOD: explicit enable
profiler = Profiler(
    msg='my operation',
    disabled=False,    # force enable!
    ms_threshold=0.0,  # show all steps
)
```

### Profiler Output Format

```
> Entering <msg>
<label 1>: <delta_ms>, tot:<cumulative_ms>
<label 2>: <delta_ms>, tot:<cumulative_ms>
...
< Exiting <msg>, total time: <total_ms> ms
```

**Reading the output:**

- `delta_ms` = time since previous checkpoint
- `cumulative_ms` = time since profiler creation
- Final total = end-to-end time for entire profiled section
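
To make the relationship concrete, here is the arithmetic for the Basic Usage output above (all numbers taken from that example):

```
step 2 delta = tot(step 2) - tot(step 1)
             = 69.12 - 12.34
             = 56.78 ms
```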

## Profiling Distributed Systems

Piker runs across multiple processes (actors). Each actor has its own log output. To profile distributed operations:

### 1. Identify Actor Boundaries

**Common piker actors:**

- `pikerd` - main daemon process
- `brokerd` - broker connection actor
- `chart` - UI/graphics actor
- Client scripts - analysis/annotation clients
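
Because each actor writes to its own console, it helps to stamp the originating actor's name into every profiler message. Below is a small sketch (the `actor_profiler()` helper is hypothetical, not part of piker) that does this via `tractor.current_actor()`:

```python
import tractor

from piker.toolz.profile import Profiler


def actor_profiler(msg: str) -> Profiler:
    # prefix the profiled-section description with the current actor's
    # name so output from different consoles can be correlated later
    actor_name: str = tractor.current_actor().name
    return Profiler(
        msg=f'[{actor_name}] {msg}',
        disabled=False,
        ms_threshold=0.0,
    )
```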

### 2. Add Profilers on Both Sides

**Server-side (chart actor):**

```python
# piker/ui/_remote_ctl.py
@tractor.context
async def remote_annotate(ctx):
    async with ctx.open_stream() as stream:
        async for msg in stream:
            profiler = Profiler(
                msg=f'Batch annotate {n} gaps',  # `n`: gap count from the request
                disabled=False,
                ms_threshold=0.0,
            )

            # handle request
            result = await handle_request(msg)
            profiler('request handled')

            await stream.send(result)
            profiler('result sent')
```

**Client-side (analysis script):**

```python
# piker/tsp/_annotate.py
async def markup_gaps(...):
    profiler = Profiler(
        msg=f'markup_gaps() for {n} gaps',  # `n`: number of gaps to annotate
        disabled=False,
        ms_threshold=0.0,
    )

    await actl.redraw()
    profiler('initial redraw')

    # build specs
    specs = build_specs(gaps)
    profiler('built annotation specs')

    # IPC round-trip!
    result = await actl.add_batch(specs)
    profiler('batch IPC call complete')

    await actl.redraw()
    profiler('final redraw')
```

### 3. Correlate Timing Across Actors

**Example output correlation:**

**Client console:**

```
> Entering markup_gaps() for 1285 gaps
initial redraw: 0.20ms, tot:0.20
built annotation specs: 256.48ms, tot:256.68
batch IPC call complete: 119.26ms, tot:375.94
final redraw: 0.07ms, tot:376.02
< Exiting markup_gaps(), total: 376.04ms
```

**Server console (chart actor):**

```
> Entering Batch annotate 1285 gaps
`np.searchsorted()` complete!: 0.81ms, tot:0.81
`time_to_row` creation complete!: 98.45ms, tot:99.28
created GapAnnotations item: 2.98ms, tot:102.26
< Exiting Batch annotate, total: 104.15ms
```

**Analysis:**

- Total client time: 376ms
- Server processing: 104ms
- IPC overhead + client spec building: 272ms
- Bottleneck: client-side spec building (256ms)

## Profiling Patterns

### Pattern: Function Entry/Exit

```python
async def my_function():
    profiler = Profiler(
        msg='my_function()',
        disabled=False,
        ms_threshold=0.0,
    )

    step1()
    profiler('step1')

    step2()
    profiler('step2')

    # auto-prints on exit
```

### Pattern: Loop Iterations

```python
# DON'T profile inside tight loops (overhead!)
for i in range(1000):
    profiler(f'iteration {i}')  # NO!

# DO profile around loops
profiler = Profiler(msg='processing 1000 items')
for i in range(1000):
    process(items[i])
profiler('processed all items')
```

### Pattern: Conditional Profiling

```python
# only profile when investigating a specific issue
DEBUG_REPOSITION = True


def reposition(self, array):
    if DEBUG_REPOSITION:
        profiler = Profiler(
            msg='GapAnnotations.reposition()',
            disabled=False,
        )

    # ... do work

    if DEBUG_REPOSITION:
        profiler('completed reposition')
```

### Pattern: Teardown/Cleanup Profiling

```python
try:
    # ... main work
    pass

finally:
    profiler = Profiler(
        msg='Annotation teardown',
        disabled=False,
        ms_threshold=0.0,
    )

    cleanup_resources()
    profiler('resources cleaned')

    close_connections()
    profiler('connections closed')
```

## Integration with PyQtGraph

Some piker modules integrate with `pyqtgraph`'s profiling:

```python
from piker.toolz.profile import (
    Profiler,
    pg_profile_enabled,  # checks pyqtgraph config
    ms_slower_then,      # threshold from config
)

profiler = Profiler(
    msg='Curve.paint()',
    disabled=not pg_profile_enabled(),
    ms_threshold=ms_slower_then,
)
```

## Common Use Cases

### 1. IPC Request/Response Timing

```python
# Client side
profiler = Profiler(msg='Remote request')
result = await remote_call()
profiler('got response')

# Server side (in handler)
profiler = Profiler(msg='Handle request')
process_request()
profiler('request processed')
```

### 2. Batch Operation Optimization

```python
profiler = Profiler(msg='Batch processing')

# collect items
items = collect_all()
profiler(f'collected {len(items)} items')

# vectorized operation
results = numpy_batch_op(items)
profiler('numpy op complete')

# build result dict
output = {k: v for k, v in zip(keys, results)}
profiler('dict built')
```

### 3. Startup/Initialization Timing

```python
async def __aenter__(self):
    profiler = Profiler(msg='Service startup')

    await connect_to_broker()
    profiler('broker connected')

    await load_config()
    profiler('config loaded')

    await start_feeds()
    profiler('feeds started')

    return self
```

## Debugging Performance Regressions

When the profiler shows unexpected slowness:

1. **Add finer-grained checkpoints**

```python
# was:
result = big_function()
profiler('big_function done')

# now:
profiler = Profiler(msg='big_function internals')
step1 = part_a()
profiler('part_a')
step2 = part_b()
profiler('part_b')
step3 = part_c()
profiler('part_c')
```

2. **Check for hidden iterations**

```python
# looks simple but might be slow!
result = array[array['time'] == timestamp]
profiler('array lookup')

# reveals O(n) scan per call
for ts in timestamps:  # outer loop
    row = array[array['time'] == ts]  # O(n) scan!
```
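
A common fix for this pattern is one vectorized lookup instead of a scan per timestamp. The sketch below assumes `array['time']` is sorted ascending and contains every requested timestamp (the `np.searchsorted()` checkpoint in the server console output above reflects the same idea):

```python
import numpy as np

# one binary search per timestamp, issued as a single vectorized call;
# assumes array['time'] is sorted and every timestamp is present
indices = np.searchsorted(array['time'], timestamps)
rows = array[indices]
profiler('`np.searchsorted()` lookup complete')
```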

3. **Isolate IPC from computation**

```python
# was: can't tell where time is spent
result = await remote_call(data)
profiler('remote call done')

# now: separate phases
payload = prepare_payload(data)
profiler('payload prepared')

result = await remote_call(payload)
profiler('IPC complete')

parsed = parse_result(result)
profiler('result parsed')
```

## Performance Expectations

**Typical timings to expect:**

- IPC round-trip (local actors): 1-10ms
- NumPy binary search (10k array): <1ms
- Dict building (1k items, simple): 1-5ms
- Qt redraw trigger: 0.1-1ms
- Scene item removal (100s of items): 10-50ms

**Red flags:**

- Linear array scan per item: 50-100ms+ for 1k items
- Dict comprehension over a struct array: 50-100ms for 1k items
- Individual Qt item creation: 5ms per item
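
For the struct-array red flag, the cost comes from iterating a structured array row-by-row at the Python level; reading the field out as a plain column first is usually much cheaper. A sketch (here `array` is assumed to be a NumPy structured array with a `'time'` field; `time_to_row` matches the checkpoint name in the server console example above):

```python
# slow: per-row Python-level access into a structured array
time_to_row = {row['time']: i for i, row in enumerate(array)}

# faster: one vectorized field read, then build the dict from plain floats
time_to_row = dict(zip(array['time'].tolist(), range(len(array))))
```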

## References

- `piker/toolz/profile.py` - Profiler implementation
- `piker/ui/_curve.py` - FlowGraphic paint profiling
- `piker/ui/_remote_ctl.py` - IPC handler profiling
- `piker/tsp/_annotate.py` - Client-side profiling

## Skill Maintenance

Update when:

- New profiling patterns emerge
- Performance expectations change
- New distributed profiling techniques are discovered
- The Profiler API changes

---

*Last updated: 2026-01-31*
*Session: Batch gap annotation optimization*