# Polars Integration Patterns

Polars usage patterns for piker's timeseries processing, including NumPy interop.

## NumPy <-> Polars Conversion

```python
import polars as pl

# numpy (OHLCV array) to polars
df = pl.from_numpy(
    arr,
    schema=[
        'index',
        'time',
        'open',
        'high',
        'low',
        'close',
        'volume',
    ],
)

# polars to numpy (via arrow)
arr = df.to_numpy()

# piker convenience wrappers
from piker.tsp import np2pl, pl2np

df = np2pl(arr)
arr = pl2np(df)
```

## Polars Performance Patterns

### Lazy Evaluation

```python
# build the query lazily; nothing executes until .collect()
lazy_df = (
    df.lazy()
    .filter(pl.col('volume') > 1000)
    .with_columns([
        (
            pl.col('close') - pl.col('open')
        ).alias('change')
    ])
    .sort('time')
)

# execute once, with the full plan optimized end-to-end
result = lazy_df.collect()
```

### Groupby Aggregations

```python
# resample to 5-minute bars; 'time' must be a datetime (or integer)
# column sorted ascending. note: polars < 0.19 spells this
# `groupby_dynamic()`.
resampled = df.group_by_dynamic(
    index_column='time',
    every='5m',
).agg([
    pl.col('open').first(),
    pl.col('high').max(),
    pl.col('low').min(),
    pl.col('close').last(),
    pl.col('volume').sum(),
])
```

## When to Use Polars vs NumPy

### Use Polars when:
- Complex queries with multiple filters/joins
- You need SQL-like operations (groupby, window fns)
- Working with heterogeneous column types
- You want lazy-evaluation query optimization

### Use NumPy when:
- Simple array operations (indexing, slicing)
- Direct memory access is needed (e.g., SHM arrays)
- Compatibility with Qt/pyqtgraph (which expects NumPy)
- Maximum performance for flat numerical computation