piker

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	58c096bfad	Bleh go back to using pdbp for REPL in anal	2023-06-27 13:41:47 -04:00
Tyler Goodlet	9eeea51165	Define shm buffer sizing in `.data.history` Also adjust sizing such that the history buffer will backfill the last six years by default (in 1m OHLC) and the hft buffer will do only 3 days worth. Also ensure the fsp layer passes the src shm's buffer size when allocating since the size is now required by allocators in the shm apis.	2023-06-27 13:41:47 -04:00
Tyler Goodlet	33ec27715b	Sync shm mod with dev version in `tractor`, drop buffer sizing vars, require `size: int` to all allocators	2023-06-27 13:41:47 -04:00
Tyler Goodlet	e1be098406	Only hard re-render `Viz`s matching backfill deats Avoid unnecessarily re-rendering the wrong (1min OHLC history) chart and/or other such charts with update tasks listening to the sampler stream. Instead only redraw in tasks which are updating vizs which match the actual details of the backfill event. We can probably also eventually match against a range tuple (emitted in the msg) and then have the task further only update the formatter layer unless the range is actually in view?	2023-06-27 13:41:47 -04:00
Tyler Goodlet	dd3e4b5a1f	Emit backfill details in broadcasts Send both the `Viz.name` and `timeframe: int` so that the UI side can match against them and only update a lone curve in a single plot.	2023-06-27 13:41:47 -04:00
Tyler Goodlet	2a1835843f	Drop `wap_in_history` stuff from display loop It's no longer part of the default OHLCV array-buffer schema and just generally we should be processing and managing any non source data in the FSP subsystem(s) despite it maybe being provided as a default by some backends.	2023-06-27 13:41:47 -04:00
Tyler Goodlet	8947932289	Use last 16 steps in period detection, not first 16..	2023-06-27 13:41:47 -04:00
Tyler Goodlet	0484e97382	Try to not overrun shm during gap backfilling..	2023-06-27 13:41:47 -04:00
Tyler Goodlet	5251561e20	TOCHERRY: into #486 , add polars/apache deps for nix	2023-06-27 13:41:47 -04:00
Tyler Goodlet	937d8c410d	binance: add futes API link, freeze the agg tradez struct	2023-06-27 13:41:47 -04:00
Tyler Goodlet	75ff3921b6	ib: fix mega borked hist queries on gappy assets Explains why stuff always seemed wrong before XD Previously whenever a time-gappy asset (like a stock due to its venue operating hours) was being loaded, we weren't querying for a "duration's worth" of bars and this was causing all sorts of actual gaps in our data set that shouldn't exist.. Fix that by always attempting to retrieve a min aggregate-time's worth/duration of bars/datums in the history manager. Actually, i implemented this in both the feed and api layers for this backend since it doesn't seem to strictly work just implementing it at the `Client.bars()` level, not sure why but.. Also, buncha `ruff` linting cleanups and fix the logger nameeee, lel.	2023-06-27 13:41:47 -04:00
Tyler Goodlet	c8f8724887	Mask out all the duplicate frame detection	2023-06-27 13:41:47 -04:00
Tyler Goodlet	c1546eb043	Add note about appending parquet files on write	2023-06-27 13:41:47 -04:00
Tyler Goodlet	f8ab3bde35	Allow sampler step events to overrun; only 1s period	2023-06-27 13:41:47 -04:00
Tyler Goodlet	c1201c164c	Parametrize index margin around gap detection segment	2023-06-27 13:41:47 -04:00
Tyler Goodlet	a575e67fab	Go back to just opening sampler stream inside history update task?	2023-06-27 13:41:47 -04:00
Tyler Goodlet	34dd6ffc22	Add a configurable timeout around backend live feed startup For now make it a larger value but ideally in the long run we can tune it to specific backends and expose it in the config(s).	2023-06-27 13:41:47 -04:00
Tyler Goodlet	fda7111305	Import from new `.data._timeseries` mod for anal	2023-06-27 13:41:47 -04:00
Tyler Goodlet	8233d12afb	Detect and fill time gaps in tsdb history For now, just detect and fill in gaps (via fresh backend queries) in the shm buffer but eventually i'm pretty sure we can just write these direct to the parquet file as well. Use the new `.data._timeseries.detect_null_time_gap()` to find and fill in the `ShmArray` index range, re-check it and enter a prompt if it didn't totally fill. Also, - do a massive cleanup and removal of all unused/commented code. - drop the duplicate frames tracking, don't think we need it after removing multi-frame concurrent queries. - change backfill loop variable `end_dt` -> `last_start_dt` which is more semantically correct. - fix logic to backfill any missing sub-sequence portion for any frame query that overruns the shm buffer prependable space by detecting the available rows left to insert and only push those. - add a new `shm_push_in_between()` helper to match.	2023-06-27 13:41:47 -04:00
Tyler Goodlet	f25248c871	Add `.data._timeseries` utility mod Org all the new (time) gap detection routines here and also move in the `slice_from_time()` epoch -> index converter routine from `._pathops` B)	2023-06-27 13:41:47 -04:00
Tyler Goodlet	54f8a615fc	Use `code.interact()` in anal subcmd for now	2023-06-27 13:41:47 -04:00
Tyler Goodlet	2dbcecdac7	Generalize time-gap detector to accept unit and threshold	2023-06-27 13:41:47 -04:00
Tyler Goodlet	0dcfcea6ee	Finally get partial backfills after tsdb load workinnn It took a little while (and a lot of commenting out of old no longer needed code) but, this gets tsdb (from parquet file) loading before final backfilling from the most recent history frame until the most recent tsdb time stamp! More or less all the convoluted concurrency shit we had for coping with `marketstore` IPC junk is no longer needed, particularly all the query size limits and accompanying load loops.. The recent frame loading technique/order has now changed though since we'd like to show charts asap once tsdb history loads. The new load sequence is as follows: - load mr (most recent) frame from backend. - load existing history (one shot) from the "tsdb" aka parquet files with `polars`. - backfill the gap part from the mr frame back to the tsdb start incrementally by making (hacky) `ShmArray.push(start=<blah>)` calls and not updating the `._first.value` while doing it XD Dirtier deatz: - make `tsdb_backfill()` run per timeframe in a separate task. - drop all the loop through timeframes and insert `dts_per_tf` crap. - only spawn a subtask for the `start_backfill()` call which in turn only does the gap backfilling as mentioned above. - mask out all the code related to being limited to certain query sizes (over gRPC) as was restricted by marketstore.. not gonna go through what all of that was since it's probably getting deleted in a follow up commit. - buncha off-by-one tweaks to do with backfilling the gap from mr frame to tsdb start.. mostly tinkered it to get it all right but seems to be working correctly B) - still use the `broadcast_all()` msg stuff when doing the gap backfill though don't have it really working yet on the UI side (since previously we were relying on the shm first/last values.. so this will be "coming soon" :)	2023-06-27 13:41:47 -04:00
Tyler Goodlet	7a5c43d01a	Support injecting a `info: dict` to `Sampler.broadcast_all()` calls	2023-06-27 13:41:47 -04:00
Tyler Goodlet	f1252983e4	kucoin: support start and end dt based bars queries	2023-06-27 13:41:47 -04:00
Tyler Goodlet	6dc3ed8d6a	Expose a `force_reformat: bool` up through graphics stack	2023-06-27 13:41:47 -04:00
Tyler Goodlet	4f4860cfb0	Update shm.push() type sig style	2023-06-27 13:41:47 -04:00
Tyler Goodlet	1e683a4b91	Another guard around sampling subscriber popped race..	2023-06-27 13:41:47 -04:00
Tyler Goodlet	9fd412f631	Add basic time-sampling gap detection via `polars` For OHLCV time series we normally presume a uniform sampling period (1s or 60s by default) and it's handy to have tools to ensure a series is gapless or contains expected gaps based on (legacy) market hours. For this we leverage `polars`: - add `.nativedb.with_dts()` a datetime-from-epoch-time-column frame "column-expander" which inserts datetime-casted, epoch-diff and dt-diff columns. - add `.nativedb.detect_time_gaps()` which filters to any larger than expected sampling period rows. - wrap the above (for now) in a `piker store anal` (analysis) cmd which atm always enters a breakpoint for tinkering. Supporting storage client additions: - add a `detect_period()` helper for extracting expected OHLC time step. - add new `NativedbStorageClient` methods and attrs to provide for the above: - `.mk_path()` to only deliver a parquet-file path for use in other methods. - `._dfs` to house cached `pl.DataFrame`s loaded from `.parquet` files. - `.as_df()` which loads cached frames or loads them from disk and then caches (for next use). - `_write_ohlcv()` a private-sync version of the public equivalent meth since we don't currently have any actual async file IO underneath; add a flag for whether to return as a `numpy.ndarray`.	2023-06-27 13:41:47 -04:00
Tyler Goodlet	d027ad5a4f	Whenever there is overlays, set a title on main chart price-y axis!	2023-06-27 13:41:47 -04:00
Tyler Goodlet	106ebe94bf	Drop marketstore and tina install from readme, add polars and apache!	2023-06-27 13:41:47 -04:00
Tyler Goodlet	d2accdac9b	Drop remaining mkts nonsense from `store delete`	2023-06-27 13:41:47 -04:00
Tyler Goodlet	c020ab76be	Clean out marketstore specifics - drop buncha cruft from `store ls` cmd and make it work for multi-backend fqme listing. - including adding an `.address` to the mkts client which shows the grpc socketaddr details. - change defaults to new `'nativedb'`. - drop 'marketstore' from built-in backend list (for now)	2023-06-27 13:41:47 -04:00
Tyler Goodlet	c52e889fe5	First draft history loading rework It was a concurrency-hack mess somewhat due to all sorts of limitations imposed by marketstore (query size limits, strange datetime/timestamp errors, slow table loads for large queries..) and we can drastically simplify. There's still some issues with getting new backfills (not yet in storage) correctly prepended: there's sometimes little gaps due to shm races when reading history indexing vs. when the live-feed startup finishes. We generally need tests for all this and likely a better rework of the feed layer's init such that we're showing history in chart afap instead of waiting on backfills or the live feed to come up. Much more to come B)	2023-06-27 13:41:47 -04:00
Tyler Goodlet	0ba3c798d7	Drop `bar_wap` from default ohlc field set Turns out no backend (including kraken) requires it and really this kind of measure should be implemented and recorded from our fsp layer instead of (hackily) sometimes expecting it to be in "source data".	2023-06-27 13:41:47 -04:00
Tyler Goodlet	7b4f4bf804	First draft `.storage.nativedb.` using parquet files After much frustration with a particular tsdb (cough) this instead implements a new native-file (and apache tech based) backend which stores time series in parquet files (for now) using the `polars` apis (since we plan to use that lib as well for processing). Note this code is currently very rough and in draft mode. Details: - add conversion routines for going from `polars.DataFrame` to `numpy.ndarray` and back. - lay out a simple file-name as series key symbology: `fqme.<datadescriptions>.parquet`, though probably it will evolve. - implement the entire `StorageClient` interface as it stands. - adjust `storage.cli` cmds to instead expect to use this new backend, which means it's a complete mess XD Main benefits/motivation: - wayy faster load times with no "datums to load limit" required. - smaller space footprint and we haven't even touched compression settings yet! - wayyy more compatible with other systems which can lever the apache ecosystem. - gives us finer grained control over the filesystem usage so we can choose to swap out stuff like the replication system or networking access.	2023-06-27 13:41:47 -04:00
Tyler Goodlet	8de92179da	kucoin: fix missing default fields def import	2023-06-27 13:41:47 -04:00
Tyler Goodlet	94733c4a0b	A PoC tsdb prototype: `parqdb` using `polars` Turns out just (over)writing `.parquet` files with >= 1M datums is like less than a second, and we can likely speed up appends using `fastparquet` (usage coming soon). Includes: - a new `clone` CLI subcmd to test this all out by ad-hoc copy of (literally hardcoded to a daemon-actor specific shm allocation X) an existing `/dev/shm/<ShmArray>` and push to `.parquet` file. - code to convert from our `ShmArray.array: np.ndarray` -> `polars.DataFrame` (thanks SO). - timing checks around the file IO and np -> polars conversion. - a `read` subcmd which i was using to test the sync `pymarketstore` client against our async one to see if the issues from https://github.com/pikers/piker/issues/443 were resolved, but nope!	2023-06-27 13:41:47 -04:00
Tyler Goodlet	7d1cc47db9	ROFL, even using `pymarketstore`'s json-RPC it's borked.. Turns out trying to switch to the old sync client and going back to using the old json-RPC API (after having had to patch the upstream repo to not import gRPC machinery to avoid crashes..) I'm basically getting the exact same issues. New tinkering results do possibly tell some new stuff: - the EOF error seems to indeed be due to trying to fetch records which haven't been written (properly) - like asking for an `end=<epoch_int>` that is earlier than the earliest record. - the "snappy input corrupt" error seems to have something to do with the `Params.end` field not being an `int` and/or the int precision not being chosen correctly? - toying with this a bunch manually shows that the internals of the client (particularly `.build_query()` stuff) is parsing/calcing the `Epoch` and `Nanoseconds` values out incorrectly.. which is likely part of the problem. - we also changed `anyio_marketstore.MarketStoreclient.build_query()` logic when removing `pandas` a while back, which also seems to be part of the problem on the async side, however reverting those changes also didn't fix the issue entirely; likely something else more subtle going on (maybe with the write vs. read `Epoch` field type we pass?). Despite all this malarky, we're already underway more or less obsoleting this whole thing with a much less complex approach of using apache parquet files and modern filesystem tools to get a more flexible and numerics-native dataframe-oriented tsdb B)	2023-06-27 13:41:47 -04:00
Tyler Goodlet	9859f601ca	Invert data provider's OHLCV field defs Turns out the reason we were originally making the `time: float` column in our ohlcv arrays was bc that's what only ib uses XD (and/or 🤦) Instead we changed the default field type to be an `int` (which is also more correct to avoid `float` rounding/precision discrepancies) and thus do not need to override it in all other (crypto) backends (except `ib`). Now we only do the customization (via `._ohlc_dtype`) to `float` only for `ib` for now (though pretty sure we can also not do that eventually as well..)!	2023-06-27 13:41:47 -04:00
Tyler Goodlet	af64152640	.data.history: update to new naming -> `._source.def_iohlcv_fields` -> `.storage.StorageClient`	2023-06-27 13:41:47 -04:00
Tyler Goodlet	bf21d2e329	Rename default OHLCV `np.dtype` descriptions Use `def_iohlcv_fields` for a name and instead of copying and inserting the index field pop it for the non-index version. Drop creating `np.dtype()` instances since `numpy`'s apis accept both input forms so this is simpler on our end.	2023-06-27 13:41:47 -04:00
Tyler Goodlet	848577488e	Add public config dir getter	2023-06-27 13:41:47 -04:00
Tyler Goodlet	e82538eded	.data: export ohlc dtypes at top level	2023-06-27 13:41:47 -04:00
Tyler Goodlet	8ccb8b0744	kucoin: drop shm-array `numpy` dtype def, our default is the same	2023-06-27 13:41:47 -04:00
Tyler Goodlet	e83de2906f	Relegate old marketstore cli eps to masked module	2023-06-27 13:41:47 -04:00
Tyler Goodlet	33c464524b	Lower the paper engine order-cancel latency	2023-06-27 13:41:47 -04:00
Tyler Goodlet	cb774e5a5d	Re-implement `piker store` CLI with `typer` Turns out you can mix and match `click` with `typer` so this moves what was the `.data.cli` stuff into `storage.cli` and uses the integration api to make it all work B) New subcmd: `piker store` - add `piker store ls` which lists all fqme keyed time-series from backend. - add `store delete` to remove any such key->time-series. - now uses a nursery for multi-timeframe concurrency B) Mask out all the old `marketstore` specific subcmds for now (streaming, ingest, storesh, etc..) in anticipation of moving them into a subpkg-module and make sure to import the sub-cmd module in our top level cli package. Other `.storage` api tweaks: - drop the reraising with custom error (for now). - rename `Storage` -> `StorageClient` (or should it be API?).	2023-06-27 13:41:47 -04:00
Tyler Goodlet	1ec9b0565f	Move `.data.cli` to `.storage.cli`	2023-06-27 13:41:47 -04:00
Tyler Goodlet	7ab97fb21d	Add marketstore client as storage-backend module To kick off our (tsdb) storage backends this adds our first implementing a new `Storage(Protocol)` client interface. Going forward, the top level `.storage` pkg-module will now expose backend agnostic APIs and helpers whilst specific backend implementations will adhere to that middle-ware layer. Deats: - add `.storage.marketstore.Storage` as the first client implementation, moving all needed (import) dependencies out from `.service.marketstore` as well as `.ohlc_key_map` and `get_client()`. - move root `conf.toml` loading from `.data.history` into `.storage.__init__.open_storage_client()` which now takes in a `name: str` and does all the work of loading the correct backend module, its config, and determining if a service-instance can be contacted and a client loaded; in the case where this fails we raise a new `StorageConnectionError`. - add a new `.storage.get_storagemod()` just like we have for brokers. - make `open_storage_client()` also return the backend module such that the history-data layer can make backend specific calls as needed (eg. ohlc_key_map). - fall back to a basic non-tsdb backfill when `open_storage_client()` raises the new connection error.	2023-06-27 13:41:47 -04:00
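
Commits 9fd412f631, 2dbcecdac7 and 8947932289 describe detecting the expected sampling period from the last 16 steps of a series and flagging any step larger than that period. The following is a minimal pure-Python sketch of that idea, not piker's `polars`-based implementation: the names `guess_period()` and `find_time_gaps()`, their parameters, and the "most common step" rule are all assumptions made for illustration.

```python
from collections import Counter


def guess_period(times: list[int], window: int = 16) -> int:
    """Guess the sampling period (in seconds) from the LAST `window` steps.

    Hypothetical helper: the most common step is used so that an
    isolated gap inside the window does not skew the result.
    """
    tail = times[-(window + 1):]
    steps = [later - earlier for earlier, later in zip(tail, tail[1:])]
    if not steps:
        raise ValueError("need at least two timestamps")
    return Counter(steps).most_common(1)[0][0]


def find_time_gaps(
    times: list[int],
    period: int,
    threshold: float = 1.0,
) -> list[tuple[int, int]]:
    """Return (prev_time, next_time) pairs spaced further apart
    than `period * threshold` seconds (hypothetical helper)."""
    limit = period * threshold
    return [
        (earlier, later)
        for earlier, later in zip(times, times[1:])
        if later - earlier > limit
    ]


# 1m OHLC epoch stamps with a five-minute hole between 180 and 480
stamps = [0, 60, 120, 180, 480, 540]
period = guess_period(stamps)
gaps = find_time_gaps(stamps, period)
```

Here `period` comes out as 60 and `gaps` holds the single pair `(180, 480)`, which is the index range a backfill would then need to query for.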

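Commit 9eeea51165 sizes the history buffer for six years of 1m OHLC and the hft buffer for 3 days. A rough way to see what that means in rows is shown below. This is only a sketch: it assumes 365-day years, round-the-clock sampling, and a 1s period for the hft buffer (inferred from the "1s or 60s by default" note in 9fd412f631), and `rows_for()` is a hypothetical helper, not a function from the `piker` code base.

```python
from datetime import timedelta


def rows_for(duration: timedelta, period_s: int) -> int:
    """Number of samples needed to cover `duration` at one row per `period_s` seconds."""
    return int(duration.total_seconds() // period_s)


# assumed: 365-day years and continuous (24/7) sampling
history_rows = rows_for(timedelta(days=6 * 365), period_s=60)

# assumed: the hft buffer samples at a 1s period
hft_rows = rows_for(timedelta(days=3), period_s=1)
```

Under those assumptions the history buffer needs 3,153,600 rows and the hft buffer 259,200 rows; assets with venue hours would fill far fewer rows over the same calendar span.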

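Commit 8233d12afb mentions handling a frame query that overruns the shm buffer's prependable space by detecting the rows left to insert and pushing only those. One way to picture that, as a sketch only: `clip_to_prependable()` is a hypothetical helper operating on a plain list, and keeping the newest rows (so the pushed data stays contiguous with what is already in the buffer) is an assumption, not something the commit message states.

```python
def clip_to_prependable(frame: list, free_rows: int) -> list:
    """Keep only as many rows of an oldest-first `frame` as fit in `free_rows`.

    Assumption: the newest rows are kept so the prepended data stays
    adjacent to the existing start of the buffer.
    """
    if free_rows <= 0:
        return []
    return frame[-free_rows:]


frame = [1, 2, 3, 4, 5]  # oldest first
fits = clip_to_prependable(frame, free_rows=3)
```

With three free rows only `[3, 4, 5]` would be pushed; a frame that already fits is returned whole.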
3960 Commits (58c096bfad2c299fba45b80c77973e85dc8a0082), All Branches