# Timeseries Optimization: NumPy & Polars

Skill for high-performance timeseries processing using NumPy and Polars, with a focus on patterns common in financial/trading applications.

## Core Principle: Vectorization Over Iteration

**Never write Python loops over large arrays.** Always look for vectorized alternatives.

```python
# BAD: Python loop (slow!)
results = []
for i in range(len(array)):
    if array['time'][i] == target_time:
        results.append(array[i])

# GOOD: vectorized boolean indexing (fast!)
results = array[array['time'] == target_time]
```

## NumPy Structured Arrays

Piker uses structured arrays for OHLCV data:

```python
import numpy as np

# typical piker array dtype
dtype = [
    ('index', 'i8'),   # absolute sequence index
    ('time', 'f8'),    # unix epoch timestamp
    ('open', 'f8'),
    ('high', 'f8'),
    ('low', 'f8'),
    ('close', 'f8'),
    ('volume', 'f8'),
]
arr = np.array([(0, 1234.0, 100, 101, 99, 100.5, 1000)], dtype=dtype)

# field access
times = arr['time']    # returns view, not copy
closes = arr['close']
```

### Structured Array Performance Gotchas

**1. Field access in loops is slow**

```python
# BAD: repeated struct field access per iteration
for i, row in enumerate(arr):
    x = row['index']   # struct access per iteration!
    y = row['close']
    process(x, y)

# GOOD: extract fields once, iterate plain arrays
indices = arr['index']   # extract once
closes = arr['close']
for i in range(len(arr)):
    x = indices[i]   # plain array indexing
    y = closes[i]
    process(x, y)
```

**2. Dict comprehensions with struct arrays**

```python
# SLOW: field access per row in Python loop
time_to_row = {
    float(row['time']): {
        'index': float(row['index']),
        'close': float(row['close']),
    }
    for row in matched_rows  # struct field access!
}

# FAST: extract to plain arrays first
times = matched_rows['time'].astype(float)
indices = matched_rows['index'].astype(float)
closes = matched_rows['close'].astype(float)

time_to_row = {
    t: {'index': idx, 'close': cls}
    for t, idx, cls in zip(times, indices, closes)
}
```

## Timestamp Lookup Patterns

### Linear Scan (O(n)) - Avoid!

```python
# BAD: O(n) scan through entire array
for target_ts in timestamps:                     # m iterations
    matches = array[array['time'] == target_ts]  # O(n) scan
# Total: O(m * n) - catastrophic for large datasets!
```

**Performance:**
- 1000 lookups × 10k array = 10M comparisons
- Timing: ~50-100ms for 1k lookups

### Binary Search (O(log n)) - Good!

```python
# GOOD: O(m log n) using searchsorted
import numpy as np

time_arr = array['time']        # extract once
ts_array = np.array(timestamps)

# binary search for all timestamps at once
indices = np.searchsorted(time_arr, ts_array)

# bounds check and exact match verification
# (clamp first: searchsorted returns len(array) for values past the end,
#  which would otherwise raise IndexError when used as an index)
safe_idx = np.minimum(indices, len(time_arr) - 1)
valid_mask = (
    (indices < len(time_arr))
    & (time_arr[safe_idx] == ts_array)
)
valid_indices = indices[valid_mask]
matched_rows = array[valid_indices]
```

**Requirements for `searchsorted()`:**
- Input array MUST be sorted (ascending by default); for unsorted data a `sorter` index can be passed instead (see the sketch below)
- Works on any sortable dtype (floats, ints, etc)
- Returns insertion indices (not found = len(array))

**Performance:**
- 1000 lookups × 10k array ≈ 13k comparisons (log₂ 10k ≈ 13 per lookup)
- Timing: <1ms for 1k lookups
- **~100-1000x faster than linear scan**
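If the time column is not already sorted (e.g. an out-of-order backfill buffer), the rows don't have to be physically re-sorted: `np.searchsorted` accepts a `sorter` argument built from `np.argsort`. A minimal sketch reusing the `array`/`timestamps` names from above, with the same bounds/exact-match handling as the previous block:

```python
import numpy as np

time_arr = array['time']
ts_arr = np.asarray(timestamps)

# indices that would sort the time column (the rows themselves stay put)
sorter = np.argsort(time_arr)

# positions are relative to the *sorted* order...
pos = np.searchsorted(time_arr, ts_arr, sorter=sorter)

# ...so map back to original row indices, then bounds/exact-match check
safe = np.minimum(pos, len(time_arr) - 1)
row_idx = sorter[safe]
valid = (pos < len(time_arr)) & (time_arr[row_idx] == ts_arr)
matched_rows = array[row_idx[valid]]
```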
### Hash Table (O(1)) - Best for Multiple Lookups!

If you'll do many lookups on the same array, build the dict once:

```python
# build lookup once
time_to_idx = {
    float(array['time'][i]): i
    for i in range(len(array))
}

# O(1) lookups
for target_ts in timestamps:
    idx = time_to_idx.get(target_ts)
    if idx is not None:
        row = array[idx]
```

**When to use:**
- Many repeated lookups on same array
- Array doesn't change between lookups
- Can afford upfront dict building cost

## Vectorized Boolean Operations

### Basic Filtering

```python
# single condition
recent = array[array['time'] > cutoff_time]

# multiple conditions with &, |
filtered = array[
    (array['time'] > start_time)
    & (array['time'] < end_time)
    & (array['volume'] > min_volume)
]

# IMPORTANT: parentheses required around each condition!
# (operator precedence: & binds tighter than >)
```

### Fancy Indexing

```python
# boolean mask
mask = array['close'] > array['open']   # up bars
up_bars = array[mask]

# integer indices
indices = np.array([0, 5, 10, 15])
selected = array[indices]

# combine boolean + fancy indexing
mask = array['volume'] > threshold
high_vol_indices = np.where(mask)[0]
subset = array[high_vol_indices[::2]]   # every other
```

## Common Financial Patterns

### Gap Detection

```python
# assume sorted by time
time_diffs = np.diff(array['time'])
expected_step = 60.0   # 1-minute bars

# find gaps larger than expected
gap_mask = time_diffs > (expected_step * 1.5)
gap_indices = np.where(gap_mask)[0]

# get gap start/end times
gap_starts = array['time'][gap_indices]
gap_ends = array['time'][gap_indices + 1]
```

### Rolling Window Operations

```python
# simple moving average (close)
window = 20
sma = np.convolve(
    array['close'],
    np.ones(window) / window,
    mode='valid',
)

# alternatively, use stride tricks for efficiency
from numpy.lib.stride_tricks import sliding_window_view

windows = sliding_window_view(array['close'], window)
sma = windows.mean(axis=1)
```

### OHLC Resampling (NumPy)

```python
# resample 1m bars to 5m bars
def resample_ohlc(arr, old_step, new_step):
    n_bars = len(arr)
    factor = int(new_step / old_step)

    # truncate to multiple of factor
    n_complete = (n_bars // factor) * factor
    arr = arr[:n_complete]

    # reshape into chunks
    reshaped = arr.reshape(-1, factor)

    # aggregate OHLC
    opens = reshaped[:, 0]['open']
    highs = reshaped['high'].max(axis=1)
    lows = reshaped['low'].min(axis=1)
    closes = reshaped[:, -1]['close']
    volumes = reshaped['volume'].sum(axis=1)

    return np.rec.fromarrays(
        [opens, highs, lows, closes, volumes],
        names=['open', 'high', 'low', 'close', 'volume'],
    )
```
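A quick usage sketch of the helper above, using the OHLCV `dtype` from earlier with made-up 1-minute bars (the sample values and epoch base are illustrative only); note the helper returns only OHLCV fields and drops `time`/`index`:

```python
import numpy as np

dtype = [
    ('index', 'i8'), ('time', 'f8'), ('open', 'f8'), ('high', 'f8'),
    ('low', 'f8'), ('close', 'f8'), ('volume', 'f8'),
]

# ten fabricated 1-minute bars
bars_1m = np.array(
    [
        (i, 1700000000 + 60 * i, 100 + i, 101 + i, 99 + i, 100.5 + i, 1000)
        for i in range(10)
    ],
    dtype=dtype,
)

# aggregate into 5-minute bars -> two output rows
bars_5m = resample_ohlc(bars_1m, old_step=60, new_step=300)
print(bars_5m['open'])    # [100. 105.]
print(bars_5m['close'])   # [104.5 109.5]
```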
## Polars Integration

Piker is transitioning to Polars for some operations.

### NumPy ↔ Polars Conversion

```python
import polars as pl

# numpy to polars
df = pl.from_numpy(
    arr,
    schema=['index', 'time', 'open', 'high', 'low', 'close', 'volume'],
)

# polars to numpy (via arrow)
arr = df.to_numpy()

# piker convenience
from piker.tsp import np2pl, pl2np

df = np2pl(arr)
arr = pl2np(df)
```

### Polars Performance Patterns

**Lazy evaluation:**

```python
# build query lazily
lazy_df = (
    df.lazy()
    .filter(pl.col('volume') > 1000)
    .with_columns([
        (pl.col('close') - pl.col('open')).alias('change')
    ])
    .sort('time')
)

# execute once
result = lazy_df.collect()
```

**Groupby aggregations:**

```python
# resample to 5-minute bars
# (requires a datetime-typed, ascending-sorted 'time' column)
resampled = df.group_by_dynamic(
    index_column='time',
    every='5m',
).agg([
    pl.col('open').first(),
    pl.col('high').max(),
    pl.col('low').min(),
    pl.col('close').last(),
    pl.col('volume').sum(),
])
```

### When to Use Polars vs NumPy

**Use Polars when:**
- Complex queries with multiple filters/joins
- Need SQL-like operations (groupby, window functions)
- Working with heterogeneous column types
- Want lazy evaluation optimization

**Use NumPy when:**
- Simple array operations (indexing, slicing)
- Direct memory access needed (e.g., SHM arrays)
- Compatibility with Qt/pyqtgraph (expects NumPy)
- Maximum performance for numerical computation
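In practice the two sides often show up in the same code path: keep hot-loop array work in NumPy, hop into Polars for the query-shaped work, then hand a plain array back to the chart layer. A minimal round-trip sketch under those assumptions (the fabricated data, the `volume > 0` filter, and the derived `change` column are illustrative only):

```python
import numpy as np
import polars as pl

dtype = [
    ('index', 'i8'), ('time', 'f8'), ('open', 'f8'), ('high', 'f8'),
    ('low', 'f8'), ('close', 'f8'), ('volume', 'f8'),
]
arr = np.array(
    [
        (i, 1700000000 + 60 * i, 100.0, 101.0, 99.0, 100.5, 1000 * (i % 3))
        for i in range(6)
    ],
    dtype=dtype,
)

# NumPy side: view-based slicing, no copies
recent = arr[-4:]

# Polars side: build the query lazily, collect once
# (struct-array fields are strided views, so make them contiguous first)
df = pl.DataFrame({
    name: np.ascontiguousarray(recent[name])
    for name in recent.dtype.names
})
result = (
    df.lazy()
    .filter(pl.col('volume') > 0)
    .with_columns((pl.col('close') - pl.col('open')).alias('change'))
    .collect()
)

# back to a plain ndarray for Qt/pyqtgraph consumers
out = result.to_numpy()
```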
## Memory Considerations

### Views vs Copies

```python
# VIEW: shares memory (fast, no copy)
times = array['time']               # field access
subset = array[10:20]               # slicing
reshaped = array.reshape(-1, 2)

# COPY: new memory allocation
filtered = array[array['time'] > cutoff]   # boolean indexing
sorted_arr = np.sort(array)                # sorting
casted = array.astype(np.float32)          # type conversion

# force copy when needed
explicit_copy = array.copy()
```

### In-Place Operations

```python
# modify in-place (no new allocation)
array['close'] *= 1.01       # scale prices
array['volume'][mask] = 0    # zero out specific rows

# careful: compound operations may create temporaries
array['close'] = array['close'] * 1.01   # creates temp!
array['close'] *= 1.01                   # true in-place
```

## Performance Checklist

When optimizing timeseries operations:

- [ ] Is the array sorted? (enables binary search)
- [ ] Are you doing repeated lookups? (build hash table)
- [ ] Are struct fields accessed in loops? (extract to plain arrays)
- [ ] Are you using boolean indexing? (vectorized vs loop)
- [ ] Can operations be batched? (minimize round-trips)
- [ ] Is memory being copied unnecessarily? (use views)
- [ ] Are you using the right tool? (NumPy vs Polars)

## Common Bottlenecks and Fixes

### Bottleneck: Timestamp Lookups

```python
# BEFORE: O(n*m) - 100ms for 1k lookups
for ts in timestamps:
    matches = array[array['time'] == ts]

# AFTER: O(m log n) - <1ms for 1k lookups
indices = np.searchsorted(array['time'], timestamps)
```

### Bottleneck: Dict Building from Struct Array

```python
# BEFORE: 100ms for 3k rows
result = {
    float(row['time']): {
        'index': float(row['index']),
        'close': float(row['close']),
    }
    for row in matched_rows
}

# AFTER: <5ms for 3k rows
times = matched_rows['time'].astype(float)
indices = matched_rows['index'].astype(float)
closes = matched_rows['close'].astype(float)

result = {
    t: {'index': idx, 'close': cls}
    for t, idx, cls in zip(times, indices, closes)
}
```

### Bottleneck: Repeated Field Access

```python
# BEFORE: 50ms for 1k iterations
for i, spec in enumerate(specs):
    start_row = array[array['time'] == spec['start_time']][0]
    end_row = array[array['time'] == spec['end_time']][0]
    process(start_row['index'], end_row['close'])

# AFTER: <5ms for 1k iterations
# 1. Build lookup once: time -> {'array_idx': ...}, via searchsorted (see above)
time_to_row = {...}

# 2. Extract fields to plain arrays beforehand
indices_arr = array['index']
closes_arr = array['close']

# 3. Use lookup + plain array indexing
for spec in specs:
    start_idx = time_to_row[spec['start_time']]['array_idx']
    end_idx = time_to_row[spec['end_time']]['array_idx']
    process(indices_arr[start_idx], closes_arr[end_idx])
```

## References

- NumPy structured arrays: https://numpy.org/doc/stable/user/basics.rec.html
- `np.searchsorted`: https://numpy.org/doc/stable/reference/generated/numpy.searchsorted.html
- Polars: https://pola-rs.github.io/polars/
- `piker.tsp` - timeseries processing utilities
- `piker.data._formatters` - OHLC array handling

## Skill Maintenance

Update when:

- New vectorization patterns discovered
- Performance bottlenecks identified
- Polars migration patterns emerge
- NumPy best practices evolve

---

*Last updated: 2026-01-31*
*Session: Batch gap annotation optimization*
*Key win: 100ms → 5ms dict building via field extraction*