piker

Commit Graph

Author	SHA1	Message	Date
Tyler Goodlet	a4084d6a0b	Bleh, fix another off-by-one issue in `np.argwhere()` Apparently it returns the index of the prior zero-row (prolly since we do the backward difference) so ensure `fi_zgaps += 1`.. Also fix remaining edge case handling when there's only 2 zero-segs which was borked after a refactor to the special case blocks (like a single zero row) prior to the `absi_zsegs` building loop AND make sure to always return abs indices OUTSIDE the zero seg, i.e. the indices of the non-zero row just before and just after so that the history backfiller can use non-zero timestamps to generate range datetimes for backend frame queries. Add much more detailed doc-comments with a small ascii diagram to explain how all these somewhat subtle vec ops work. Also toss in some sanity checks on the output indices to ensure they don't point to zero (time) valued rows when used to read the frame.	2023-12-15 12:48:50 -05:00
Tyler Goodlet	83bdca46a2	Wrap null-gap detect and fill in async gen Call it `iter_null_segs()` (for now?) and use in the final (sequential) stage of the `.history.start_backfill()` task-func. Delivers abs, frame-relative, and equiv time stamps on each iteration pertaining to each detected null-segment to make it easy to do piece-wise history queries for each. Further, - handle edge case in `get_null_segs()` where there is only 1 zeroed row value, in which case we deliver `absi_zsegs` as a single pair of the same index value and, - when this occurs `iter_null_segs()` delivers `None` for all the `start_` related indices/timestamps since all `get_hist()` routines (delivered by `open_history_client()`) should handle it as being a "get max history from this end_dt" type query. - add note about needing to do time gap handling where there's a gap in the timeseries-history that isn't actually IN the data-history.	2023-12-13 18:29:06 -05:00
Tyler Goodlet	c129f5bb4a	Finally write a general purpose null-gap detector! Using a bunch of fancy `numpy` vec ops (and ideally eventually extending the same to `polars`) this is a first draft of `get_null_segs()` a `col: str` field-value-is-zero detector which filters to all zero-valued input frame segments and returns the corresponding useful slice-indexes: - gap absolute (in shm buffer terms) index-endpoints as `absi_zsegs` for slicing to each null-segment in the src frame. - ALL abs indices of rows with zeroed `col` values as `absi_zeros`. - the full set of the input frame's row-entries (view) which are null valued for the chosen `col` as `zero_t`. Use this new null-segment-detector in the `.data.history.start_backfill()` task to attempt to fill null gaps that might be extant from some prior backfill attempt. Since `get_null_segs()` should now deliver a sequence of slices for each gap we don't really need to have the `while gap_indices:` loop any more, so just move that to the end-of-func and warn log (for now) if all gaps aren't eventually filled. TODO: -[ ] do the null-seg detection and filling concurrently from most-recent-frame backfilling. -[ ] offer the same detection in `.storage.cli` cmds for manual tsp anal. -[ ] make the graphics layer actually update correctly when null-segs are filled (currently still broken somehow in the `Viz` caching layer?) CHERRY INTO #486	2023-12-13 15:26:33 -05:00
Tyler Goodlet	c4853a3fee	Drop inter-method NL	2023-12-13 09:27:23 -05:00
Tyler Goodlet	f274c3db3b	Import `np2pl()` from `.data.tsp` Also toss in todo for a timeseries search CLI cmd which can be handy when doing offline store mgmt.	2023-12-13 09:25:44 -05:00
Tyler Goodlet	b95932ea09	`.data.history`: run `.tsp.dedupe()` in backloader In an effort to catch out-of-order and/or partial-frame-duplicated segments, add some `.tsp` calls throughout the backloader tasks including a call to the new `.sort_diff()` to catch the out-of-order history cases.	2023-12-12 19:57:46 -05:00
Tyler Goodlet	e8bf4c6e04	Return the `.len()` diff from `dedupe()` instead Since the `diff: int` serves as a predicate anyway (when `0` nothing duplicate was detected) might as well just return it directly since it's likely also useful for the caller when doing deeper anal. Also, handle the zero-diff case by just returning early with a copy of the input frame and a `diff=0`. CHERRY INTO #486	2023-12-12 16:48:56 -05:00
Tyler Goodlet	8e4d1a48ed	Bleh, fix ib's `Client.bars()` recursion.. Turns out this was the main source of all sorts of gaps and overlaps in history frame backfilling. The original idea was that when a gap causes not enough (1m) bars to be delivered (like over a weekend or holiday) then we just implicitly do another frame query to try and at least fill out the default duration (normally 1-2 days). Doing the recursion sloppily was causing all sorts of stupid problems.. It's kinda obvious now what was wrong in hindsight: - always pass the sampling period (timeframe) when recursing - adjust the logic to not be mutex with the no-data case (since it already is mutex..) - pack to the `numpy` array BEFORE the recursive call to ensure the `end_dt: DateTime` is selected and passed correctly! Toss in some other helpfuls: - more explicit `pendulum` typing imports - some masked out sorted-diffing checks (that can be enabled when debugging out-of-order frame issues) - always error log about less-than time step mismatches since we should never have time-diff steps smaller than specified in the `sample_period_s`!	2023-12-12 16:19:21 -05:00
Tyler Goodlet	b03eceebef	data.tsp: drop masked `return` one liner	2023-12-11 20:11:42 -05:00
Tyler Goodlet	f7a8d79b7b	Add `NativeStorageClient._cache_df()` use it in `.write_ohlcv()` for caching on writes as well	2023-12-11 20:10:53 -05:00
Tyler Goodlet	49c458710e	Move `numpy` <-> `polars` converters into `.data.tsp` Yet again these are (going to be) generally useful in the data proc layer as well as going forward with (possibly) moving the history and shm rt-processing layer to apache (arrow or other) shared-ds equivalents.	2023-12-11 17:53:31 -05:00
Tyler Goodlet	b94582cb35	Move `dedupe()` to `.data.tsp` (so it has pals) Includes a rename of `.data._timeseries` -> `.data.tsp` for "time series processing", making it a public sub-mod; it contains a highly useful set of data-frame and `numpy.ndarray` ops routines in various subsystems Bo	2023-12-11 16:24:27 -05:00
Tyler Goodlet	7311000846	Facepalm, set `was_deduped` as bool not the deduped frame..	2023-12-11 13:18:10 -05:00
Tyler Goodlet	e719733f97	Comment out overlap case block for now too?	2023-12-08 19:08:10 -05:00
Tyler Goodlet	cb941a5554	BABOSO.. fix last history frame overlap slicing! I guess since i started supporting the whole "allow a gap between the latest tsdb sample and the latest retrieved history frame" the overlap slicing has been completely borked XD where we've been sticking in duplicate history samples and this has caused all sorts of down stream time-series processing issues.. So fix that by ensuring whenever there IS an overlap between history in the latest frame and the tsdb that we always prefer the latest frame's data and slice OUT the tsdb's duplicate indices.. CHERRY TO #486	2023-12-08 18:56:38 -05:00
Tyler Goodlet	2d72a052aa	Woops, make sure non-disti mode still works when maybe getting `pikerd` XD	2023-12-08 17:43:52 -05:00
Tyler Goodlet	2eeef2a123	Add `dedupe()` to help with gap detection/resolution Think i finally figured out the weird issue with out-of-order OHLC history getting jammed in the wrong place: - gap is detected in parquet/offline ts (likely due to a zero dt or other gap), - query for history in the gap is made BUT that frame is then inserted in the shm buffer at the end (likely using array int-entry indexing) which inserts it at the wrong location, - later this out-of-order frame is written to the storage layer (parquet) and then is repeated on further reboots with the original gap causing further queries for the same frame on every history backfill. A set of tools useful for detecting these issues and annotating them nicely on chart is part of this patch's intent: - `dedupe()` will detect any dt gaps, deduplicate datetime rows and return the de-duplicated df along with gaps table. - use this in both `piker store anal` such that we potentially resolve and backfill the gaps correctly if some rows were removed. - possibly also use this to detect the backfilling error in logic at the time of backfilling the frame instead of after the fact (which would require re-writing the shm array from something like `store ldshm` and would be a manual post-hoc solution, not a fix to the original issue..)	2023-12-08 15:11:34 -05:00
Tyler Goodlet	b6d2550f33	Add datetime col de-duplicator	2023-12-08 14:38:27 -05:00
Tyler Goodlet	b9af6176c5	Factor `TimeseriesNotFound` to top level TO CHERRY into #486	2023-12-07 12:31:14 -05:00
Tyler Goodlet	dd0167b9a5	Make `fsp.cascade()` expect src/dst `Flume`s Been meaning to do this for a while, and there's still a few design / interface kinks (like `.mkt: MktPair` which should be better generalized?) but this flips over all of the fsp chaining engine to operate on the higher level `Flume` APIs via the newly cobbled `Cascade` thinger..	2023-12-06 17:53:35 -05:00
Tyler Goodlet	9e71e0768f	Define and pass a default `Flume._readonly: bool` Allows opening with `.from_msg(readonly=False)` for write permissions making underlying shm arrays readonly. Also, make sure to pop the `ShmArray` field entries prior to msg-ization, not sure how that worked with the `Feed.flumes` equivalent..but?	2023-12-06 17:25:49 -05:00
Tyler Goodlet	6029f39a3f	Allow `MktPair.from/to_msg()` to still do `.dst: str` for fsp flumes	2023-12-06 17:09:52 -05:00
Tyler Goodlet	656e2c6a88	fsp: intro a `Cascade` type that connects `Flume`s of streams	2023-12-05 16:59:07 -05:00
Tyler Goodlet	b8065a413b	ib: update ibc.ini from latest upstream template	2023-12-05 16:57:38 -05:00
Tyler Goodlet	9245d24b47	ib: add `.pause()` on symbol query overruns to aid in fixing the issue	2023-12-04 13:10:15 -05:00
Tyler Goodlet	22bd83943b	.storage: support `store anal --pdb` flag	2023-12-04 13:00:33 -05:00
Tyler Goodlet	b94931bbdd	Fix `Portal.channel: Channel` attr name error	2023-12-04 13:00:04 -05:00
Tyler Goodlet	239c1c457e	Sort fqme suggestions pre-print	2023-12-04 11:34:39 -05:00
Tyler Goodlet	24a54a7085	Add `TimeseriesNotFound` for fqme lookup failures A common usage error is to run `piker anal mnq.cme.ib` where the CLI passed fqme is not actually fully-qualified (in this case missing an expiry token) and we get an underlying `FileNotFoundError` from the `StorageClient.read_ohlcv()` call. In such key misses, scan the existing `StorageClient._index` for possible matches and report in a `raise from` the new error. CHERRY into #486	2023-12-04 11:22:55 -05:00
Tyler Goodlet	ebd1eb114e	Port runtime init to new `tractor.Actor.reg_addrs` related changes	2023-11-21 15:18:52 -05:00
Tyler Goodlet	29ce8de462	Use new container image mentioned on IBC thread	2023-10-29 13:21:32 -04:00
Tyler Goodlet	d3dab17939	order_mode: fix to avoid `Dialog.uuid` on null dialog..	2023-10-20 13:57:52 -04:00
Tyler Goodlet	cadc200818	Always ignore untracked-order error msgs from `brokerd`	2023-10-16 13:15:12 -04:00
Tyler Goodlet	363c8dfdb1	Default spec registrar set as empty addr list Since it probably IS sane to just assume a root-actor-as-registrar listening on the localhost as a default, AND allows NOT expecting every caller of `open_piker_runtime()` to not have to pass an addr set XD This makes a bucha CLI shit work again after breakage due to no default..	2023-10-03 13:36:22 -04:00
Tyler Goodlet	00c046c280	Factor transport-ep parser/loader into helper For now def it `.cli.load_trans_eps()` just inside the pkg mod; only loads the ep for `pikerd` which currently acts as the main service-actor registrar per host. Delegate to this new `.load_trans_eps()` as-it-was-used from the `pikerd` cmd body and add fresh support for `piker chart --maddr <addr: str>` using the routine in the body of the `piker.cli.cli` cmd group after loading the `conf.toml::network` section B) Also, toss in runtime debug mode wrapping around `piker chart` using the new `tractor.devx.maybe_open_crash_handler()` and pull the switch from a `--pdb` flag now factored into the `.cli.cli` click group.	2023-10-03 10:00:01 -04:00
Tyler Goodlet	9165515811	ib: more detailed comments on wait-for-quote-task todo	2023-10-02 17:57:47 -04:00
Tyler Goodlet	543c11f377	ib: only normalize and log first quote if it arrives	2023-10-01 19:14:08 -04:00
Tyler Goodlet	637d33d7cc	Make `.config.load_accounts()` load `brokers.toml`..	2023-10-01 19:09:15 -04:00
Tyler Goodlet	e5fdb33e31	Port cache-`dict` search to new `rapidfuzz` api	2023-10-01 17:46:46 -04:00
Tyler Goodlet	81a8cd1685	binance: always load the `brokers.toml` file since default is `conf.toml` now	2023-10-01 17:37:09 -04:00
Tyler Goodlet	a382f01c85	Move tsdb section to `service.tsdb.name` and get host from `.maddrs`	2023-10-01 17:23:39 -04:00
Tyler Goodlet	653348fcd8	Use `.service.find_service()` instead of `tractor.find_actor()` in pape-eng	2023-10-01 16:10:37 -04:00
Tyler Goodlet	e139d2e259	Set `registry_addrs` in CLI (click) context-config Since `tractor` and our runtime internals is now moved to multihomed semantics, do the same in the CLI / config entrypoints. Also, try using the new `tractor.devx.maybe_open_crash_handler()` around the `pikerd` CLI.	2023-10-01 15:42:31 -04:00
Tyler Goodlet	7258d57c69	Only warn on mismatched `open_registry()` input addrs When a new (actor) caller opens the registry there are 2 possible cases: 1. - some task already opened the registry during init and set the global superset of registrar addrs that are expected to be used, 2. - some task after the init task opens with a subset of addrs. 3. - some task after init opens with a disjoint set - should be an error? In the 2nd case we don't want to error since they may just not need to know about other registrar (multi-homed) addrs and thus only need specific access - so only warn about the diff in that case. If the caller is requesting some disjoint set then we still runtime raise. Adjust `find_service()` to allow a null `registry_addrs` input in which case we fail over to using whatever pre-set the `Registry.addrs` has; makes it simple for actors that don't want/need to know about the global registrar set for their actor tree. Also, always pass `tractor.find_actor(only_first=True)` (for now).	2023-10-01 15:36:17 -04:00
Tyler Goodlet	5d081a40d5	Port to new `parse_maddr()` name	2023-09-29 15:20:56 -04:00
Tyler Goodlet	fcececce19	Move multi-addr parser mod to `tractor`	2023-09-29 14:33:15 -04:00
Tyler Goodlet	b6ac6069fe	Temporarily use crash handler around search CLI ep	2023-09-29 14:02:17 -04:00
Tyler Goodlet	a98f5877bc	ui._exec: use new `get_runtime_vars()` name	2023-09-28 12:31:24 -04:00
Tyler Goodlet	50ddef0985	data.feed: dynamically load `ui._search` mod for headless installs	2023-09-28 12:30:10 -04:00
Tyler Goodlet	b1cde3df49	config: make `conf.toml` the default load target	2023-09-28 12:29:07 -04:00
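
Commits e8bf4c6e04 and cb941a5554 describe `dedupe()` returning a length diff (`0` meaning nothing was duplicated) and preferring the latest frame's data when samples overlap. The following is a minimal plain-Python sketch of that idea, not the `polars`-based code in `.data.tsp`. The function name `dedupe_by_time()`, the `"time"` key and the keep-the-last-row policy are assumptions made for illustration.

```python
def dedupe_by_time(rows: list[dict]) -> tuple[list[dict], int]:
    """
    Drop rows that repeat a 'time' value and report how many were dropped.

    Hypothetical helper: rows are plain dicts with an integer 'time' key,
    and the LAST row seen for a given time wins (an assumption).
    """
    latest: dict[int, dict] = {}
    for row in rows:
        # A later row with the same timestamp overwrites the earlier one.
        latest[row["time"]] = row

    diff = len(rows) - len(latest)
    if diff == 0:
        # Nothing duplicated: return early with a copy of the input.
        return list(rows), 0

    # Rebuild in ascending time order from the surviving rows.
    deduped = [latest[t] for t in sorted(latest)]
    return deduped, diff


rows = [
    {"time": 1, "close": 10.0},
    {"time": 2, "close": 11.0},
    {"time": 2, "close": 12.0},  # duplicate timestamp, newer sample
    {"time": 3, "close": 13.0},
]
deduped, diff = dedupe_by_time(rows)
```

Returning the count rather than a boolean lets the caller both test for duplicates (`diff > 0`) and log how many rows were removed.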

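Commits c129f5bb4a and a4084d6a0b describe detecting zero-valued segments with `numpy` vector ops and returning the indices of the non-zero rows just before and just after each segment. The sketch below shows one way to do that with a difference over a zero mask. It is an illustration under stated assumptions, not the `get_null_segs()` implementation: the name `find_zero_segments()` is hypothetical, the indices are relative to the input array rather than absolute shm-buffer indices, and `None` marks a segment that touches either end of the array.

```python
from typing import Optional

import numpy as np


def find_zero_segments(
    col: np.ndarray,
) -> list[tuple[Optional[int], Optional[int]]]:
    """
    Return one (before, after) pair per run of zeros in `col`, where
    `before`/`after` index the non-zero rows bracketing the run.
    """
    is_zero = (col == 0).astype(np.int8)

    # Pad with a non-zero marker (0 in mask terms) on both sides so that
    # runs touching either end of the array still produce an edge.
    padded = np.concatenate(([0], is_zero, [0]))
    edges = np.diff(padded)

    starts = np.flatnonzero(edges == 1)   # index of first zero in each run
    stops = np.flatnonzero(edges == -1)   # index one past the last zero

    segs = []
    for start, stop in zip(starts, stops):
        before = int(start) - 1 if start > 0 else None
        after = int(stop) if stop < col.size else None
        segs.append((before, after))
    return segs


times = np.array([10, 20, 0, 0, 50, 0, 70])
segs = find_zero_segments(times)
for before, after in segs:
    # Every returned index points at a non-zero (time) valued row.
    assert before is None or times[before] != 0
    assert after is None or times[after] != 0
```

The closing loop mirrors the sanity check mentioned in a4084d6a0b: output indices should never point at zero-valued rows, so the bracketing timestamps can be used to build a query range.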

4392 Commits (6fa0d4bcf3bb5abd06b8faecd5da0f6bb9b45cb6)
