# Piker Profiling Subsystem Skill

Skill for using `piker.toolz.profile.Profiler` to measure performance across distributed actor systems.

## Core Profiler API

### Basic Usage

```python
from piker.toolz.profile import (
    Profiler,
    pg_profile_enabled,
    ms_slower_then,
)

profiler = Profiler(
    msg='<description of profiled section>',
    disabled=False,    # IMPORTANT: enable explicitly!
    ms_threshold=0.0,  # show all timings, not just slow
)

# do work
some_operation()
profiler('step 1 complete')

# more work
another_operation()
profiler('step 2 complete')

# prints on exit:
# > Entering <description of profiled section>
# step 1 complete: 12.34, tot:12.34
# step 2 complete: 56.78, tot:69.12
# < Exiting <description of profiled section>, total: 69.12 ms
```

### Default Behavior Gotcha

**CRITICAL:** Profiler is disabled by default in many contexts!

```python
# BAD: might not print anything!
profiler = Profiler(msg='my operation')

# GOOD: explicit enable
profiler = Profiler(
    msg='my operation',
    disabled=False,    # force enable!
    ms_threshold=0.0,  # show all steps
)
```

### Profiler Output Format

```
> Entering <msg>
<label 1>: <delta_ms>, tot:<cumulative_ms>
<label 2>: <delta_ms>, tot:<cumulative_ms>
...
< Exiting <msg>, total time: <total_ms> ms
```

**Reading the output:**

- `delta_ms` = time since previous checkpoint
- `cumulative_ms` = time since profiler creation
- Final total = end-to-end time for entire profiled section
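
To make the relationship concrete, here is the arithmetic for the Basic Usage output above (all numbers taken from that example):

```
step 2 delta = tot(step 2) - tot(step 1)
             = 69.12 - 12.34
             = 56.78 ms
```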

## Profiling Distributed Systems

Piker runs across multiple processes (actors). Each actor has its own log output. To profile distributed operations:

### 1. Identify Actor Boundaries

**Common piker actors:**

- `pikerd` - main daemon process
- `brokerd` - broker connection actor
- `chart` - UI/graphics actor
- Client scripts - analysis/annotation clients
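
Because each actor writes to its own console, it helps to stamp the originating actor's name into every profiler message. Below is a small sketch (the `actor_profiler()` helper is hypothetical, not part of piker) that does this via `tractor.current_actor()`:

```python
import tractor

from piker.toolz.profile import Profiler


def actor_profiler(msg: str) -> Profiler:
    # prefix the profiled-section description with the current actor's
    # name so output from different consoles can be correlated later
    actor_name: str = tractor.current_actor().name
    return Profiler(
        msg=f'[{actor_name}] {msg}',
        disabled=False,
        ms_threshold=0.0,
    )
```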

### 2. Add Profilers on Both Sides

**Server-side (chart actor):**

```python
# piker/ui/_remote_ctl.py
@tractor.context
async def remote_annotate(ctx):
    async with ctx.open_stream() as stream:
        async for msg in stream:
            profiler = Profiler(
                msg=f'Batch annotate {n} gaps',  # `n`: gap count from the request
                disabled=False,
                ms_threshold=0.0,
            )

            # handle request
            result = await handle_request(msg)
            profiler('request handled')

            await stream.send(result)
            profiler('result sent')
```

**Client-side (analysis script):**

```python
# piker/tsp/_annotate.py
async def markup_gaps(...):
    profiler = Profiler(
        msg=f'markup_gaps() for {n} gaps',  # `n`: number of gaps to annotate
        disabled=False,
        ms_threshold=0.0,
    )

    await actl.redraw()
    profiler('initial redraw')

    # build specs
    specs = build_specs(gaps)
    profiler('built annotation specs')

    # IPC round-trip!
    result = await actl.add_batch(specs)
    profiler('batch IPC call complete')

    await actl.redraw()
    profiler('final redraw')
```

### 3. Correlate Timing Across Actors

**Example output correlation:**

**Client console:**

```
> Entering markup_gaps() for 1285 gaps
initial redraw: 0.20ms, tot:0.20
built annotation specs: 256.48ms, tot:256.68
batch IPC call complete: 119.26ms, tot:375.94
final redraw: 0.07ms, tot:376.02
< Exiting markup_gaps(), total: 376.04ms
```

**Server console (chart actor):**

```
> Entering Batch annotate 1285 gaps
`np.searchsorted()` complete!: 0.81ms, tot:0.81
`time_to_row` creation complete!: 98.45ms, tot:99.28
created GapAnnotations item: 2.98ms, tot:102.26
< Exiting Batch annotate, total: 104.15ms
```

**Analysis:**

- Total client time: 376ms
- Server processing: 104ms
- IPC overhead + client spec building: 272ms
- Bottleneck: client-side spec building (256ms)

## Profiling Patterns

### Pattern: Function Entry/Exit

```python
async def my_function():
    profiler = Profiler(
        msg='my_function()',
        disabled=False,
        ms_threshold=0.0,
    )

    step1()
    profiler('step1')

    step2()
    profiler('step2')

    # auto-prints on exit
```

### Pattern: Loop Iterations

```python
# DON'T profile inside tight loops (overhead!)
for i in range(1000):
    profiler(f'iteration {i}')  # NO!

# DO profile around loops
profiler = Profiler(msg='processing 1000 items')
for i in range(1000):
    process(items[i])
profiler('processed all items')
```

### Pattern: Conditional Profiling

```python
# only profile when investigating a specific issue
DEBUG_REPOSITION = True


def reposition(self, array):
    if DEBUG_REPOSITION:
        profiler = Profiler(
            msg='GapAnnotations.reposition()',
            disabled=False,
        )

    # ... do work

    if DEBUG_REPOSITION:
        profiler('completed reposition')
```

### Pattern: Teardown/Cleanup Profiling

```python
try:
    # ... main work
    pass

finally:
    profiler = Profiler(
        msg='Annotation teardown',
        disabled=False,
        ms_threshold=0.0,
    )

    cleanup_resources()
    profiler('resources cleaned')

    close_connections()
    profiler('connections closed')
```

## Integration with PyQtGraph

Some piker modules integrate with `pyqtgraph`'s profiling:

```python
from piker.toolz.profile import (
    Profiler,
    pg_profile_enabled,  # checks pyqtgraph config
    ms_slower_then,      # threshold from config
)

profiler = Profiler(
    msg='Curve.paint()',
    disabled=not pg_profile_enabled(),
    ms_threshold=ms_slower_then,
)
```

## Common Use Cases

### 1. IPC Request/Response Timing

```python
# Client side
profiler = Profiler(msg='Remote request')
result = await remote_call()
profiler('got response')

# Server side (in handler)
profiler = Profiler(msg='Handle request')
process_request()
profiler('request processed')
```

### 2. Batch Operation Optimization

```python
profiler = Profiler(msg='Batch processing')

# collect items
items = collect_all()
profiler(f'collected {len(items)} items')

# vectorized operation
results = numpy_batch_op(items)
profiler('numpy op complete')

# build result dict
output = {k: v for k, v in zip(keys, results)}
profiler('dict built')
```

### 3. Startup/Initialization Timing

```python
async def __aenter__(self):
    profiler = Profiler(msg='Service startup')

    await connect_to_broker()
    profiler('broker connected')

    await load_config()
    profiler('config loaded')

    await start_feeds()
    profiler('feeds started')

    return self
```

## Debugging Performance Regressions

When the profiler shows unexpected slowness:

1. **Add finer-grained checkpoints**

```python
# was:
result = big_function()
profiler('big_function done')

# now:
profiler = Profiler(msg='big_function internals')
step1 = part_a()
profiler('part_a')
step2 = part_b()
profiler('part_b')
step3 = part_c()
profiler('part_c')
```

2. **Check for hidden iterations**

```python
# looks simple but might be slow!
result = array[array['time'] == timestamp]
profiler('array lookup')

# reveals O(n) scan per call
for ts in timestamps:  # outer loop
    row = array[array['time'] == ts]  # O(n) scan!
```
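
A common fix for this pattern is one vectorized lookup instead of a scan per timestamp. The sketch below assumes `array['time']` is sorted ascending and contains every requested timestamp (the `np.searchsorted()` checkpoint in the server console output above reflects the same idea):

```python
import numpy as np

# one binary search per timestamp, issued as a single vectorized call;
# assumes array['time'] is sorted and every timestamp is present
indices = np.searchsorted(array['time'], timestamps)
rows = array[indices]
profiler('`np.searchsorted()` lookup complete')
```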

3. **Isolate IPC from computation**

```python
# was: can't tell where time is spent
result = await remote_call(data)
profiler('remote call done')

# now: separate phases
payload = prepare_payload(data)
profiler('payload prepared')

result = await remote_call(payload)
profiler('IPC complete')

parsed = parse_result(result)
profiler('result parsed')
```

## Performance Expectations

**Typical timings to expect:**

- IPC round-trip (local actors): 1-10ms
- NumPy binary search (10k array): <1ms
- Dict building (1k items, simple): 1-5ms
- Qt redraw trigger: 0.1-1ms
- Scene item removal (100s of items): 10-50ms

**Red flags:**

- Linear array scan per item: 50-100ms+ for 1k items
- Dict comprehension over a struct array: 50-100ms for 1k items
- Individual Qt item creation: 5ms per item
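
For the struct-array red flag, the cost comes from iterating a structured array row-by-row at the Python level; reading the field out as a plain column first is usually much cheaper. A sketch (here `array` is assumed to be a NumPy structured array with a `'time'` field; `time_to_row` matches the checkpoint name in the server console example above):

```python
# slow: per-row Python-level access into a structured array
time_to_row = {row['time']: i for i, row in enumerate(array)}

# faster: one vectorized field read, then build the dict from plain floats
time_to_row = dict(zip(array['time'].tolist(), range(len(array))))
```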

## References

- `piker/toolz/profile.py` - Profiler implementation
- `piker/ui/_curve.py` - FlowGraphic paint profiling
- `piker/ui/_remote_ctl.py` - IPC handler profiling
- `piker/tsp/_annotate.py` - Client-side profiling

## Skill Maintenance

Update when:

- New profiling patterns emerge
- Performance expectations change
- New distributed profiling techniques are discovered
- The Profiler API changes

---

*Last updated: 2026-01-31*
*Session: Batch gap annotation optimization*