I guess since i started supporting the whole "allow a gap between
the latest tsdb sample and the latest retrieved history frame" the
overlap slicing has been completely borked XD where we've been sticking
in duplicate history samples and this has caused all sorts of down
stream time-series processing issues..
So fix that but ensuring whenever there IS an overlap between history in
the latest frame and the tsdb that we always prefer the latest frame's
data and slice OUT the tsdb's duplicate indices..
CHERRY TO #486
Think i finally figured out the weird issue without out-of-order OHLC
history getting jammed in the wrong place:
- gap is detected in parquet/offline ts (likely due to a zero dt or
other gap),
- query for history in the gap is made BUT that frame is then inserted
in the shm buffer **at the end** (likely using array int-entry
indexing) which inserts it at the wrong location,
- later this out-of-order frame is written to the storage layer
(parquet) and then is repeated on further reboots with the original
gap causing further queries for the same frame on every history
backfill.
A set of tools useful for detecting these issues and annotating them
nicely on chart part of this patch's intent:
- `dedupe()` will detect any dt gaps, deduplicate datetime rows and
return the de-duplicated df along with gaps table.
- use this in both `piker store anal` such that we potentially
resolve and backfill the gaps correctly if some rows were removed.
- possibly also use this to detect the backfilling error in logic at
the time of backfilling the frame instead of after the fact (which
would require re-writing the shm array from something like `store
ldshm` and would be a manual post-hoc solution, not a fix to the
original issue..
Been meaning to this for a while, and there's still a few design
/ interface kinks (like `.mkt: MktPair` which should be better
generalized?) but this flips over all of the fsp chaining engine
to operate on the higher level `Flume` APIs via the newly cobbled
`Cascade` thinger..
Allows opening with `.from_msg(readonly=False)` for write permissions
making underlyig shm arrays readonly. Also, make sure to pop the
`ShmArray` field entries prior to msg-ization, not sure how that worked
with the `Feed.flumes` equivalent..but?
A common usage error is to run `piker anal mnq.cme.ib` where the CLI
passed fqme is not actually fully-qualified (in this case missing an
expiry token) and we get an underlying `FileNotFoundError` from the
`StorageClient.read_ohlcv()` call. In such key misses, scan the existing
`StorageClient._index` for possible matches and report in a `raise from`
the new error.
CHERRY into #486
Since it probably IS sane to just assume a root-actor-as-registrar
listening on the localhost as a default, AND allows NOT expecting every
caller of `open_piker_runtime()` to not have to pass an addr set XD
This makes a bucha CLI shit work again after breakage due to no
default..
For now def it `.cli.load_trans_eps()` just inside the pkg mod; only
loads the ep for `pikerd` which currently acts as the main service-actor
registrar per host. Delegate to this new `.load_trans_eps()`
as-it-was-used from the `pikerd` cmd body and add fresh support for
`piker chart --maddr <addr: str>` using the routine in the body of the
`piker.cli.cli` cmd group after loading the `conf.toml::network` section
B)
Also, toss in runtime debug mode wrapping around `piker chart` using the
new `tractor.devx.maybe_open_crash_handler()` and pull the switch from
a `--pdb` flag now factored into the `.cli.cli` click group.
Since `tractor` and our runtime internals is now moved to multihomed semantics,
do the same in the CLI / config entrypoints.
Also, try using the new `tractor.devx.maybe_open_crash_handler()` around
the `pikerd` CLI.
When a new (actor) caller opens the registry there are 2 possible cases:
1. - some task already opened the registry during init and set the global
superset of registrar addrs that are expected to be used,
2. - some task after the init task opens with a subset of addrs.
3. - some task after init opens with a disjoint set - should be an error?
In the 2nd case we don't want to error since the may just not need to
know about other registrar (multi-homed) addrs and thus only needs
specific access - so only warn about the diff in that case. If the
caller is requesting some disjoint set then we still runtime raise.
Adjust `find_service()` to allow a null `registry_addrs` input in which
case we fail over to using whatever pre-set the `Registry.addrs` has;
makes it simple for actors that don't want/need to know about the global
registrar set for their actor tree. Also, always set pass
`tractor.find_actor(only_first=True)` (for now).
This commit requires an equivalent commit in `tractor` which adds
multi-homed transport server support to the runtime and thus the ability
ability to listen on multiple (embedded protocol) addrs / networks as
well as exposing registry actors similarly. Multiple bind addresses can
now be (bare bones) specified either in the `conf.toml:[network]`
section, or passed on the `pikerd` CLI.
This patch specifically requires the ability to pass a `registry_addrs:
list[tuple]` into `tractor.open_root_actor()` as well as adjusts all
internal runtime routines to do the same, mostly inside the `.service`
pkg.
Further details include:
- adding a new `.service._multiaddr` parser module (which will likely be
moved into `tractor`'s core) which supports loading lib2p2 style
"multiaddresses" both from the `conf.toml` and the `pikerd` CLI as
per,
- reworking the `pikerd` cmd to accept a new `--maddr`/`-m` param that
accepts multiaddresses.
- adjust the actor-registry subsys to support multi-homing by also
accepting a list of addrs to its top level API eps.
- various internal name changes to reflect the multi-address interface
changes throughout.
- non-working CLI tweaks to `piker chart` (ui-client cmds) to begin
accepting maddrs.
- dropping all elasticsearch and marketstore flags / usage from `pikerd`
for now since we're planning to drop mkts and elasticsearch will be an
optional dep in the future.
Apparently they're being massive cucks and changing their futes pair
schema again now adding a `NEXT_QUARTER` contract type which we weren't
handling at all. The good news is falling back to an old symcache file
would have prevented this from crashing.
Add a new `FutesPair.expiry: str` field so that `.bs_fqme` can simply
call it during the summary FQME-ification output rendering..
A helper for scanning a "pairs table" that most backends should expose
as part of their (internal) symbology set using `rapidfuzz` over
a `dict[str, Struct]` input table.
Also expose the `data.types.Struct` at the subpkg top level.
Previously we were assuming that the `Client._contracts: dict[str,
Contract]` would suffice this directly, which obviously isn't true XD
Also,
- add the `NSE` venue to skip list.
- use new `rapidfuzz.process.extract()` lib API.
- only get con deats for non null exchange names..