In an effort to catch out-of-order and/or partial-frame-duplicated
segments, add some `.tsp` calls throughout the backloader tasks
including a call to the new `.sort_diff()` to catch the out-of-order
history cases.
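A rough sketch of what a sort-diff style check can look like, using a plain list of epoch timestamps instead of the real frame type. The function name and return shape here are illustrative assumptions, not the actual `.tsp` API:

```python
def sort_diff(times: list[float]) -> tuple[list[float], list[int]]:
    """Return a time-sorted copy of `times` plus the indices at
    which the input differs from that sorted order.

    An empty index list means the input was already in order.
    """
    sorted_times = sorted(times)
    diff_idx = [
        i
        for i, (orig, srt) in enumerate(zip(times, sorted_times))
        if orig != srt
    ]
    return sorted_times, diff_idx


# a history where one 60s sample landed in the wrong slot
in_order = [0, 60, 120, 180]
jumbled = [0, 120, 60, 180]
```

A non-empty index list is the signal worth logging (or breaking on) from inside a backloader task.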
Since the `diff: int` serves as a predicate anyway (when `0`, no
duplicates were detected) we might as well return it directly; it's
likely also useful to the caller when doing deeper analysis.
Also, handle the zero-diff case by just returning early with a copy of
the input frame and a `diff=0`.
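As a minimal sketch of that return contract, assuming rows are simple `(time, price)` tuples rather than the real data-frame (the `dedupe()` signature below is a stand-in for illustration only):

```python
Row = tuple[int, float]  # (epoch time, price), a simplified stand-in


def dedupe(rows: list[Row]) -> tuple[list[Row], int]:
    """Drop rows with a repeated timestamp, keeping the first seen,
    and return the result along with the number of rows dropped."""
    n_unique = len({ts for ts, _ in rows})
    diff = len(rows) - n_unique
    if diff == 0:
        # zero-diff case: return early with a copy of the input
        return list(rows), 0

    seen: set[int] = set()
    deduped: list[Row] = []
    for row in rows:
        if row[0] not in seen:
            seen.add(row[0])
            deduped.append(row)

    return deduped, diff


clean = [(0, 1.0), (60, 1.1)]
dirty = [(0, 1.0), (60, 1.1), (60, 1.1), (120, 1.2)]
```

The caller can then use the `diff` directly as a truthy check, e.g. `if diff: log.warning(...)`.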
CHERRY INTO #486
Turns out this was the main source of all sorts of gaps and overlaps
in history frame backfilling. The original idea was that when a gap
causes not enough (1m) bars to be delivered (like over a weekend or
holiday) we just implicitly do another frame query to try and at
least fill out the default duration (normally 1-2 days). Doing the
recursion sloppily was causing all sorts of stupid problems..
It's kinda obvious now what was wrong in hindsight:
- always pass the sampling period (timeframe) when recursing
- adjust the logic to not be mutex with the no-data case (since it
already is mutex..)
- pack to the `numpy` array BEFORE the recursive call to ensure the
`end_dt: DateTime` is selected and passed correctly!
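The three points above can be sketched roughly as follows. This is a toy model under stated assumptions: `get_frame()`, `fake_query()` and the integer timestamps are all hypothetical, and a sorted list stands in for the packed `numpy` array:

```python
from typing import Callable

Bar = tuple[int, float]  # (epoch time, close price)


def get_frame(
    query: Callable[[int, int], list[Bar]],
    end_dt: int,
    timeframe: int,
    min_bars: int,
    max_depth: int = 3,
) -> list[Bar]:
    """Fetch bars ending before `end_dt`; when a gap leaves us short
    of `min_bars`, query again for the span just before what we got."""
    bars = query(end_dt, timeframe)
    if not bars:
        # no-data case: its own branch, separate from "too few bars"
        return []

    # pack (here: sort) BEFORE recursing so that the earliest
    # timestamp is known when picking the next query's end.
    frame = sorted(bars)

    if len(frame) < min_bars and max_depth > 0:
        older = get_frame(
            query,
            end_dt=frame[0][0],
            timeframe=timeframe,  # always forwarded when recursing
            min_bars=min_bars - len(frame),
            max_depth=max_depth - 1,
        )
        frame = older + frame

    return frame


# toy backend: 1m bars with a hole between t=240 and t=480
STORE: list[Bar] = [(t, 1.0) for t in (0, 60, 120, 180, 240, 480, 540)]


def fake_query(end_dt: int, timeframe: int, limit: int = 4) -> list[Bar]:
    """Bars inside the `limit * timeframe` window ending before `end_dt`."""
    start = end_dt - limit * timeframe
    return [bar for bar in STORE if start <= bar[0] < end_dt]


frame = get_frame(fake_query, end_dt=600, timeframe=60, min_bars=4)
```

Because each recursive query ends strictly before the earliest bar already held, the stitched result has no overlaps and stays time-ordered.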
Toss in some other helpfuls:
- more explicit `pendulum` typing imports
- some masked out sorted-diffing checks (that can be enabled when
debugging out-of-order frame issues)
- always error log about less-than time step mismatches since we should never
have time-diff steps **smaller** than specified in the
`sample_period_s`!
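For illustration, the less-than step check might look something like this on a plain list of epoch timestamps (the helper name is made up; only `sample_period_s` comes from the text above):

```python
import logging

log = logging.getLogger(__name__)


def find_short_steps(
    times: list[float],
    sample_period_s: float,
) -> list[tuple[int, float]]:
    """Return `(index, step)` pairs for every sample whose step from
    the previous sample is smaller than `sample_period_s`."""
    short: list[tuple[int, float]] = []
    for i in range(1, len(times)):
        step = times[i] - times[i - 1]
        if step < sample_period_s:
            # should never happen for correctly sampled history
            log.error(
                f'Time step {step}s at index {i} is smaller than '
                f'the sample period {sample_period_s}s!'
            )
            short.append((i, step))
    return short
```

Note that steps larger than the period (gaps) are deliberately not flagged here; zero or negative steps (duplicates, out-of-order rows) are.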
Yet again these are (going to be) generally useful in the data proc
layer as well as going forward with (possibly) moving the history and
shm rt-processing layer to apache (arrow or other) shared-ds
equivalents.
Includes a rename of `.data._timeseries` -> `.data.tsp` for "time series
processing", making it a public sub-mod; it contains a highly useful set
of data-frame and `numpy.ndarray` ops routines used in various subsystems Bo
I guess since i started supporting the whole "allow a gap between
the latest tsdb sample and the latest retrieved history frame" the
overlap slicing has been completely borked XD where we've been sticking
in duplicate history samples and this has caused all sorts of downstream
time-series processing issues..
So fix that by ensuring whenever there IS an overlap between history in
the latest frame and the tsdb that we always prefer the latest frame's
data and slice OUT the tsdb's duplicate indices..
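A simplified sketch of the "prefer the latest frame" rule, assuming both inputs are time-sorted lists of `(time, value)` rows (the function name and row layout are hypothetical, the real code slices arrays):

```python
Row = tuple[int, str]  # (epoch time, payload)


def merge_history(tsdb_rows: list[Row], frame_rows: list[Row]) -> list[Row]:
    """Prepend tsdb history to the latest frame, slicing out any
    tsdb rows which overlap the frame's time range."""
    if not frame_rows:
        return list(tsdb_rows)

    frame_start = frame_rows[0][0]
    # keep only tsdb rows strictly older than the frame's first sample
    kept = [row for row in tsdb_rows if row[0] < frame_start]
    return kept + list(frame_rows)


tsdb = [(0, 'db'), (60, 'db'), (120, 'db')]
latest = [(60, 'new'), (120, 'new'), (180, 'new')]
merged = merge_history(tsdb, latest)
```

Every overlapping timestamp ends up with exactly one row, and that row always comes from the latest frame.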
CHERRY TO #486
Think i finally figured out the weird issue with out-of-order OHLC
history getting jammed in the wrong place:
- gap is detected in parquet/offline ts (likely due to a zero dt or
other gap),
- query for history in the gap is made BUT that frame is then inserted
in the shm buffer **at the end** (likely using array int-entry
indexing) which inserts it at the wrong location,
- later this out-of-order frame is written to the storage layer
(parquet) and then is repeated on further reboots with the original
gap causing further queries for the same frame on every history
backfill.
A set of tools useful for detecting these issues and annotating them
nicely on chart is part of this patch's intent:
- `dedupe()` will detect any dt gaps, deduplicate datetime rows and
return the de-duplicated df along with gaps table.
- use this in `piker store anal` such that we potentially
resolve and backfill the gaps correctly if some rows were removed.
- possibly also use this to detect the backfilling error in logic at
the time of backfilling the frame instead of after the fact (which
would require re-writing the shm array from something like `store
ldshm` and would be a manual post-hoc solution, not a fix to the
original issue..)
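The gap-detection half of this can be sketched in a few lines. This is only an approximation on plain timestamp lists, with a list of tuples standing in for the "gaps table"; the name `detect_gaps()` is an assumption:

```python
Gap = tuple[int, int, int]  # (prev time, next time, missing samples)


def detect_gaps(times: list[int], period_s: int) -> list[Gap]:
    """Return one row per step that is larger than `period_s`,
    including how many samples are missing inside it."""
    gaps: list[Gap] = []
    for prev, nxt in zip(times, times[1:]):
        step = nxt - prev
        if step > period_s:
            missing = step // period_s - 1
            gaps.append((prev, nxt, missing))
    return gaps
```

Running such a check at backfill time, before the frame is pushed to shm, is one way the out-of-order insert could be caught before it ever reaches storage.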
Been meaning to do this for a while, and there's still a few design
/ interface kinks (like `.mkt: MktPair` which should be better
generalized?) but this flips over all of the fsp chaining engine
to operate on the higher level `Flume` APIs via the newly cobbled
`Cascade` thinger..
Allows opening with `.from_msg(readonly=False)` for write permissions
instead of making the underlying shm arrays readonly. Also, make sure to pop the
`ShmArray` field entries prior to msg-ization, not sure how that worked
with the `Feed.flumes` equivalent..but?
A common usage error is to run `piker anal mnq.cme.ib` where the CLI
passed fqme is not actually fully-qualified (in this case missing an
expiry token) and we get an underlying `FileNotFoundError` from the
`StorageClient.read_ohlcv()` call. In such key misses, scan the existing
`StorageClient._index` for possible matches and report in a `raise from`
the new error.
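Roughly, the key-miss handling follows this pattern. Everything here is a stand-in: the `INDEX` dict, the `StorageKeyMiss` error type, the token matching rule and the example keys are assumptions for illustration, not the real `StorageClient` internals:

```python
class StorageKeyMiss(Exception):
    """Hypothetical error for an fqme with no stored history."""


# stand-in for the storage index: fqme -> file path
INDEX: dict[str, str] = {
    'mnq.cme.20231215.ib': '/tmp/mnq.parquet',
    'btcusdt.spot.binance': '/tmp/btc.parquet',
}


def _load(index: dict[str, str], fqme: str) -> str:
    # mimics the underlying file read failing on a key miss
    if fqme not in index:
        raise FileNotFoundError(fqme)
    return index[fqme]


def read_ohlcv(index: dict[str, str], fqme: str) -> str:
    try:
        return _load(index, fqme)
    except FileNotFoundError as fnfe:
        # scan for stored keys containing every token of the input
        tokens = fqme.split('.')
        matches = [
            key for key in index
            if all(tok in key.split('.') for tok in tokens)
        ]
        raise StorageKeyMiss(
            f'No history found for {fqme!r}; '
            f'did you mean one of {matches}?'
        ) from fnfe
```

Using `raise ... from` keeps the original `FileNotFoundError` attached as `__cause__` so the full traceback is still available when debugging.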
CHERRY into #486
Since it probably IS sane to just assume a root-actor-as-registrar
listening on the localhost as a default, AND it means every
caller of `open_piker_runtime()` does NOT have to pass an addr set XD
This makes a bucha CLI shit work again after breakage due to no
default..
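The defaulting itself is about as simple as it sounds; something like the following, where the helper name and the port number are made up for the example:

```python
from typing import Optional

Addr = tuple[str, int]

# assumed default: a registrar on localhost (port is illustrative)
DEFAULT_REGISTRY_ADDRS: list[Addr] = [('127.0.0.1', 6116)]


def resolve_registry_addrs(
    registry_addrs: Optional[list[Addr]] = None,
) -> list[Addr]:
    """Fall back to the localhost registrar when no addrs are passed."""
    if not registry_addrs:
        return list(DEFAULT_REGISTRY_ADDRS)
    return list(registry_addrs)
```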
For now def it `.cli.load_trans_eps()` just inside the pkg mod; only
loads the ep for `pikerd` which currently acts as the main service-actor
registrar per host. Delegate to this new `.load_trans_eps()`
as-it-was-used from the `pikerd` cmd body and add fresh support for
`piker chart --maddr <addr: str>` using the routine in the body of the
`piker.cli.cli` cmd group after loading the `conf.toml::network` section
B)
Also, toss in runtime debug mode wrapping around `piker chart` using the
new `tractor.devx.maybe_open_crash_handler()` and pull the switch from
a `--pdb` flag now factored into the `.cli.cli` click group.
Since `tractor` and our runtime internals have now moved to multihomed semantics,
do the same in the CLI / config entrypoints.
Also, try using the new `tractor.devx.maybe_open_crash_handler()` around
the `pikerd` CLI.
When a new (actor) caller opens the registry there are 3 possible cases:
1. some task already opened the registry during init and set the global
superset of registrar addrs that are expected to be used,
2. some task after the init task opens with a subset of addrs,
3. some task after init opens with a disjoint set - should be an error?
In the 2nd case we don't want to error since the caller may just not need to
know about other registrar (multi-homed) addrs and thus only needs
specific access - so only warn about the diff in that case. If the
caller is requesting some disjoint set then we still runtime raise.
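A sketch of that warn-vs-raise logic using plain sets of `(host, port)` tuples. The function name is hypothetical, and treating a partially overlapping (non-subset) request the same as a disjoint one is an assumption since the text only covers the disjoint case:

```python
import warnings

Addr = tuple[str, int]


def check_registry_addrs(
    global_addrs: set[Addr],
    requested: set[Addr],
) -> set[Addr]:
    """Validate a caller's requested registrar addrs against the
    already-set global superset."""
    if not global_addrs:
        # first opener (the init task) defines the global set
        return set(requested)

    if requested <= global_addrs:
        unused = global_addrs - requested
        if unused:
            # subset case: fine, but mention what's being skipped
            warnings.warn(
                f'Caller is not using registrar addrs: {sorted(unused)}'
            )
        return set(requested)

    # anything outside the global set is treated as an error
    raise RuntimeError(
        f'Requested registrar addrs {sorted(requested)} are not '
        f'within the global set {sorted(global_addrs)}'
    )
```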
Adjust `find_service()` to allow a null `registry_addrs` input in which
case we fail over to using whatever pre-set the `Registry.addrs` has;
makes it simple for actors that don't want/need to know about the global
registrar set for their actor tree. Also, always pass
`tractor.find_actor(only_first=True)` (for now).
This commit requires an equivalent commit in `tractor` which adds
multi-homed transport server support to the runtime and thus the
ability to listen on multiple (embedded protocol) addrs / networks as
well as exposing registry actors similarly. Multiple bind addresses can
now be (bare bones) specified either in the `conf.toml:[network]`
section, or passed on the `pikerd` CLI.
This patch specifically requires the ability to pass a `registry_addrs:
list[tuple]` into `tractor.open_root_actor()` as well as adjusts all
internal runtime routines to do the same, mostly inside the `.service`
pkg.
Further details include:
- adding a new `.service._multiaddr` parser module (which will likely be
moved into `tractor`'s core) which supports loading libp2p style
"multiaddresses" both from the `conf.toml` and the `pikerd` CLI as
per,
- reworking the `pikerd` cmd to accept a new `--maddr`/`-m` param that
accepts multiaddresses.
- adjusting the actor-registry subsys to support multi-homing by also
accepting a list of addrs to its top level API eps.
- various internal name changes to reflect the multi-address interface
changes throughout.
- non-working CLI tweaks to `piker chart` (ui-client cmds) to begin
accepting maddrs.
- dropping all elasticsearch and marketstore flags / usage from `pikerd`
for now since we're planning to drop mkts and elasticsearch will be an
optional dep in the future.
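For a feel of what parsing such an address involves, here is a deliberately naive sketch. It is not the `.service._multiaddr` implementation: the function names are invented, and it only handles strictly paired `/<proto>/<value>` segments (real multiaddrs also allow value-less protocols):

```python
Addr = tuple[str, int]


def parse_maddr(maddr: str) -> dict[str, str]:
    """Split a '/proto/value/...' string into a `{proto: value}` dict."""
    tokens = maddr.strip('/').split('/')
    if len(tokens) % 2:
        raise ValueError(f'Expected /<proto>/<value> pairs, got: {maddr!r}')
    return dict(zip(tokens[::2], tokens[1::2]))


def maddr_to_addr(maddr: str) -> Addr:
    """Reduce an ip4 + tcp multiaddr to a `(host, port)` tuple."""
    parts = parse_maddr(maddr)
    return parts['ip4'], int(parts['tcp'])


# e.g. the kind of value that might be passed via `--maddr`
registry_addrs: list[Addr] = [
    maddr_to_addr(m)
    for m in ('/ip4/127.0.0.1/tcp/6116', '/ip4/10.0.0.2/tcp/6117')
]
```

The resulting list has the `list[tuple]` shape expected for `registry_addrs`.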