piker

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	659649ec48	Bah, fix nursery indents for maybe tsdb backloading Can't ref `dt_eps` and `tsdb_entry` if they don't exist.. like for 1s sampling from `binance` (which dne). So make sure to add better logic guard and only open the final backload nursery if we actually need to fill the gap between latest history and where tsdb history ends. TO CHERRY #486	2023-12-18 19:46:59 -05:00
Tyler Goodlet	f7cc43ee0b	Add pauses to `store anal/ldshm` only on bad segs Particularly halting before maybe writing the repaired timeseries history in `store anal` to optionally allow user to avoid writing to storage.	2023-12-18 11:56:57 -05:00
Tyler Goodlet	f5dc21d3f4	Adjust all `.tsp` imports to use new sub-pkg Also toss in a poll loop around the `hist_shm: ShmArray` backfill read-check in the `.data.allocate_persistent_feed()` init to cope with possible racy-ness from the increased tsdb history loading concurrency now implemented.	2023-12-18 11:54:28 -05:00
Tyler Goodlet	4568c55f17	Create `piker.tsp` "time series processing" subpkg Move `.data.history` -> `.tsp.__init__.py` for now as main pkg-mod and `.data.tsp` -> `.tsp._anal` (for analysis). Obviously follow commits will change surrounding codebase (imports) to match..	2023-12-18 11:53:27 -05:00
Tyler Goodlet	d5d68f75ea	ib: only raise first quote timeout err after tries Previously we were actually failing silently too fast instead of actually trying multiple times (now we do for 100) before finally raising any timeout in the final loop `else:` block.	2023-12-18 11:45:19 -05:00
Tyler Goodlet	1f9a497637	Fixup symcache annot for kucoin as well	2023-12-15 16:01:31 -05:00
Tyler Goodlet	40c5d88a9b	Fixup symcache type annots; no more `Pair` type	2023-12-15 16:00:51 -05:00
Tyler Goodlet	8989c73a93	Move `iter_dfs_from_shms` into `.data.history` Thinking about just moving all of that module (after a content breakup) to a new `.piker.tsp` which will mostly depend on the `.data` and `.storage` sub-pkgs; the idea is to move biz-logic for tsdb IO/mgmt and orchestration with real-time (shm) buffers and the graphics layer into a common spot for both manual analysis/research work and better separation of low level data structure primitives from their higher level usage. Add a better `data.history` mod doc string in prep for this move as well as clean out a bunch of legacy commented cruft from the `trimeter` and `marketstore` days. TO CHERRY #486 (if we can)	2023-12-15 15:53:02 -05:00
Tyler Goodlet	3639f360c3	Reactivate forced viz updates from sampler broadcasts in hist display loop	2023-12-15 13:59:19 -05:00
Tyler Goodlet	afd0781b62	Add (shm) abs index to `ContextLabel`	2023-12-15 13:57:10 -05:00
Tyler Goodlet	ba154ef413	ib: don't bother with recursive not-enough-bars queries for now, causes more problems than it solves..	2023-12-15 13:56:42 -05:00
Tyler Goodlet	97e2403fb1	Rework backfiller and null-segment task conc For each timeframe open a sub-nursery to do the backfilling + tsdb load + null-segment scanning in an effort to both speed up load time (though we need to reverse the current order to really make it faster rn since moving to the much faster parquet file backend) and do concurrent time-gap/null-segment checking of tsdb history while mrf (most recent frame) history is backfilling. The details are more or less just `trio` related task-func composition tricks and a reordering of said funcs for optimal startup latency. Also commented the `back_load_from_tsdb()` task for now since it's unused.	2023-12-15 13:11:00 -05:00
Tyler Goodlet	a4084d6a0b	Bleh, fix another off-by-one issue in `np.argwhere()` Apparently it returns the index of the prior zero-row (prolly since we do the backward difference) so ensure `fi_zgaps += 1`.. Also fix remaining edge case handling when there's only 2 zero-segs which was borked after a refactor to the special case blocks (like a single zero row) prior to the `absi_zsegs` building loop AND make sure to always return abs indices OUTSIDE the zero seg, i.e. the indices of the non-zero row just before and just after so that the history backfiller can use non-zero timestamps to generate range datetimes for backend frame queries. Add much more detailed doc-comments with a small ascii diagram to explain how all these somewhat subtle vec ops work. Also toss in some sanity checks on the output indices to ensure they don't point to zero (time) valued rows when used to read the frame.	2023-12-15 12:48:50 -05:00
Tyler Goodlet	83bdca46a2	Wrap null-gap detect and fill in async gen Call it `iter_null_segs()` (for now?) and use in the final (sequential) stage of the `.history.start_backfill()` task-func. Delivers abs, frame-relative, and equiv time stamps on each iteration pertaining to each detected null-segment to make it easy to do piece-wise history queries for each. Further, - handle edge case in `get_null_segs()` where there is only 1 zeroed row value, in which case we deliver `absi_zsegs` as a single pair of the same index value and, - when this occurs `iter_null_segs()` delivers `None` for all the `start_` related indices/timestamps since all `get_hist()` routines (delivered by `open_history_client()`) should handle it as being a "get max history from this end_dt" type query. - add note about needing to do time gap handling where there's a gap in the timeseries-history that isn't actually IN the data-history.	2023-12-13 18:29:06 -05:00
Tyler Goodlet	c129f5bb4a	Finally write a general purpose null-gap detector! Using a bunch of fancy `numpy` vec ops (and ideally eventually extending the same to `polars`) this is a first draft of `get_null_segs()` a `col: str` field-value-is-zero detector which filters to all zero-valued input frame segments and returns the corresponding useful slice-indexes: - gap absolute (in shm buffer terms) index-endpoints as `absi_zsegs` for slicing to each null-segment in the src frame. - ALL abs indices of rows with zeroed `col` values as `absi_zeros`. - the full set of the input frame's row-entries (view) which are null valued for the chosen `col` as `zero_t`. Use this new null-segment-detector in the `.data.history.start_backfill()` task to attempt to fill null gaps that might be extant from some prior backfill attempt. Since `get_null_segs()` should now deliver a sequence of slices for each gap we don't really need to have the `while gap_indices:` loop any more, so just move that to the end-of-func and warn log (for now) if all gaps aren't eventually filled. TODO: -[ ] do the null-seg detection and filling concurrently from most-recent-frame backfilling. -[ ] offer the same detection in `.storage.cli` cmds for manual tsp anal. -[ ] make the graphics layer actually update correctly when null-segs are filled (currently still broken somehow in the `Viz` caching layer?) CHERRY INTO #486	2023-12-13 15:26:33 -05:00
Tyler Goodlet	c4853a3fee	Drop inter-method NL	2023-12-13 09:27:23 -05:00
Tyler Goodlet	f274c3db3b	Import `np2pl()` from `.data.tsp` Also toss in todo for a timeseries search CLI cmd which can be handy when doing offline store mgmt.	2023-12-13 09:25:44 -05:00
Tyler Goodlet	b95932ea09	`.data.history`: run `.tsp.dedupe()` in backloader In an effort to catch out-of-order and/or partial-frame-duplicated segments, add some `.tsp` calls throughout the backloader tasks including a call to the new `.sort_diff()` to catch the out-of-order history cases.	2023-12-12 19:57:46 -05:00
Tyler Goodlet	e8bf4c6e04	Return the `.len()` diff from `dedupe()` instead Since the `diff: int` serves as a predicate anyway (when `0` nothing duplicate was detected) might as well just return it directly since it's likely also useful for the caller when doing deeper anal. Also, handle the zero-diff case by just returning early with a copy of the input frame and a `diff=0`. CHERRY INTO #486	2023-12-12 16:48:56 -05:00
Tyler Goodlet	8e4d1a48ed	Bleh, fix ib's `Client.bars()` recursion.. Turns out this was the main source of all sorts of gaps and overlaps in history frame backfilling. The original idea was that when a gap causes not enough (1m) bars to be delivered (like over a weekend or holiday) we just implicitly do another frame query to try and at least fill out the default duration (normally 1-2 days). Doing the recursion sloppily was causing all sorts of stupid problems.. It's kinda obvious now what was wrong in hindsight: - always pass the sampling period (timeframe) when recursing - adjust the logic to not be mutex with the no-data case (since it already is mutex..) - pack to the `numpy` array BEFORE the recursive call to ensure the `end_dt: DateTime` is selected and passed correctly! Toss in some other helpfuls: - more explicit `pendulum` typing imports - some masked out sorted-diffing checks (that can be enabled when debugging out-of-order frame issues) - always error log about less-than time step mismatches since we should never have time-diff steps smaller than specified in the `sample_period_s`!	2023-12-12 16:19:21 -05:00
Tyler Goodlet	b03eceebef	data.tsp: drop masked `return` one liner	2023-12-11 20:11:42 -05:00
Tyler Goodlet	f7a8d79b7b	Add `NativeStorageClient._cache_df()` use it in `.write_ohlcv()` for caching on writes as well	2023-12-11 20:10:53 -05:00
Tyler Goodlet	49c458710e	Move `numpy` <-> `polars` converters into `.data.tsp` Yet again these are (going to be) generally useful in the data proc layer as well as going forward with (possibly) moving the history and shm rt-processing layer to apache (arrow or other) shared-ds equivalents.	2023-12-11 17:53:31 -05:00
Tyler Goodlet	b94582cb35	Move `dedupe()` to `.data.tsp` (so it has pals) Includes a rename of `.data._timeseries` -> `.data.tsp` for "time series processing", making it a public sub-mod; it contains a highly useful set of data-frame and `numpy.ndarray` ops routines in various subsystems Bo	2023-12-11 16:24:27 -05:00
Tyler Goodlet	7311000846	Facepalm, set `was_deduped` as bool not the deduped frame..	2023-12-11 13:18:10 -05:00
Tyler Goodlet	e719733f97	Comment out overlap case block for now too?	2023-12-08 19:08:10 -05:00
Tyler Goodlet	cb941a5554	BABOSO.. fix last history frame overlap slicing! I guess since i started supporting the whole "allow a gap between the latest tsdb sample and the latest retrieved history frame" the overlap slicing has been completely borked XD where we've been sticking in duplicate history samples and this has caused all sorts of down stream time-series processing issues.. So fix that but ensuring whenever there IS an overlap between history in the latest frame and the tsdb that we always prefer the latest frame's data and slice OUT the tsdb's duplicate indices.. CHERRY TO #486	2023-12-08 18:56:38 -05:00
Tyler Goodlet	2d72a052aa	Woops, make sure non-disti mode still works when maybe getting `pikerd` XD	2023-12-08 17:43:52 -05:00
Tyler Goodlet	2eeef2a123	Add `dedupe()` to help with gap detection/resolution Think i finally figured out the weird issue with out-of-order OHLC history getting jammed in the wrong place: - gap is detected in parquet/offline ts (likely due to a zero dt or other gap), - query for history in the gap is made BUT that frame is then inserted in the shm buffer at the end (likely using array int-entry indexing) which inserts it at the wrong location, - later this out-of-order frame is written to the storage layer (parquet) and then is repeated on further reboots with the original gap causing further queries for the same frame on every history backfill. A set of tools useful for detecting these issues and annotating them nicely on chart is part of this patch's intent: - `dedupe()` will detect any dt gaps, deduplicate datetime rows and return the de-duplicated df along with gaps table. - use this in `piker store anal` such that we potentially resolve and backfill the gaps correctly if some rows were removed. - possibly also use this to detect the backfilling error in logic at the time of backfilling the frame instead of after the fact (which would require re-writing the shm array from something like `store ldshm` and would be a manual post-hoc solution, not a fix to the original issue..)	2023-12-08 15:11:34 -05:00
Tyler Goodlet	b6d2550f33	Add datetime col de-duplicator	2023-12-08 14:38:27 -05:00
Tyler Goodlet	b9af6176c5	Factor `TimeseriesNotFound` to top level TO CHERRY into #486	2023-12-07 12:31:14 -05:00
Tyler Goodlet	dd0167b9a5	Make `fsp.cascade()` expect src/dst `Flume`s Been meaning to do this for a while, and there's still a few design / interface kinks (like `.mkt: MktPair` which should be better generalized?) but this flips over all of the fsp chaining engine to operate on the higher level `Flume` APIs via the newly cobbled `Cascade` thinger..	2023-12-06 17:53:35 -05:00
Tyler Goodlet	9e71e0768f	Define and pass a default `Flume._readonly: bool` Allows opening with `.from_msg(readonly=False)` for write permissions making underlying shm arrays readonly. Also, make sure to pop the `ShmArray` field entries prior to msg-ization, not sure how that worked with the `Feed.flumes` equivalent..but?	2023-12-06 17:25:49 -05:00
Tyler Goodlet	6029f39a3f	Allow `MktPair.from/to_msg()` to still do `.dst: str` for fsp flumes	2023-12-06 17:09:52 -05:00
Tyler Goodlet	656e2c6a88	fsp: intro a `Cascade` type that connects `Flume`s of streams	2023-12-05 16:59:07 -05:00
Tyler Goodlet	9245d24b47	ib: add `.pause()` on symbol query overruns to aid in fixing the issue	2023-12-04 13:10:15 -05:00
Tyler Goodlet	22bd83943b	.storage: support `store anal --pdb` flag	2023-12-04 13:00:33 -05:00
Tyler Goodlet	b94931bbdd	Fix `Portal.channel: Channel` attr name error	2023-12-04 13:00:04 -05:00
Tyler Goodlet	239c1c457e	Sort fqme suggestions pre-print	2023-12-04 11:34:39 -05:00
Tyler Goodlet	24a54a7085	Add `TimeseriesNotFound` for fqme lookup failures A common usage error is to run `piker anal mnq.cme.ib` where the CLI passed fqme is not actually fully-qualified (in this case missing an expiry token) and we get an underlying `FileNotFoundError` from the `StorageClient.read_ohlcv()` call. In such key misses, scan the existing `StorageClient._index` for possible matches and report in a `raise from` the new error. CHERRY into #486	2023-12-04 11:22:55 -05:00
Tyler Goodlet	ebd1eb114e	Port runtime init to new `tractor.Actor.reg_addrs` related changes	2023-11-21 15:18:52 -05:00
Tyler Goodlet	d3dab17939	order_mode: fix to avoid `Dialog.uuid` on null dialog..	2023-10-20 13:57:52 -04:00
Tyler Goodlet	cadc200818	Always ignore untracked-order error msgs from `brokerd`	2023-10-16 13:15:12 -04:00
Tyler Goodlet	363c8dfdb1	Default spec registrar set as empty addr list Since it probably IS sane to just assume a root-actor-as-registrar listening on the localhost as a default, AND allows NOT expecting every caller of `open_piker_runtime()` to not have to pass an addr set XD This makes a bucha CLI shit work again after breakage due to no default..	2023-10-03 13:36:22 -04:00
Tyler Goodlet	00c046c280	Factor transport-ep parser/loader into helper For now def it `.cli.load_trans_eps()` just inside the pkg mod; only loads the ep for `pikerd` which currently acts as the main service-actor registrar per host. Delegate to this new `.load_trans_eps()` as-it-was-used from the `pikerd` cmd body and add fresh support for `piker chart --maddr <addr: str>` using the routine in the body of the `piker.cli.cli` cmd group after loading the `conf.toml::network` section B) Also, toss in runtime debug mode wrapping around `piker chart` using the new `tractor.devx.maybe_open_crash_handler()` and pull the switch from a `--pdb` flag now factored into the `.cli.cli` click group.	2023-10-03 10:00:01 -04:00
Tyler Goodlet	9165515811	ib: more detailed comments on wait-for-quote-task todo	2023-10-02 17:57:47 -04:00
Tyler Goodlet	543c11f377	ib: only normalize and log first quote if it arrives	2023-10-01 19:14:08 -04:00
Tyler Goodlet	637d33d7cc	Make `.config.load_accounts()` load `brokers.toml`..	2023-10-01 19:09:15 -04:00
Tyler Goodlet	e5fdb33e31	Port cache-`dict` search to new `rapidfuzz` api	2023-10-01 17:46:46 -04:00
Tyler Goodlet	81a8cd1685	binance: always load the `brokers.toml` file since default is `conf.toml` now	2023-10-01 17:37:09 -04:00

3789 Commits (659649ec48f81805142fb047bf16f0b004a66a7d)