Namely handling backends which do not provide a default "frame
size-duration" in their init-config by making the backfiller guess the
value based on the first frame received.
Deats,
- adjust `start_backfill()` to take a more explicit
`def_frame_duration: Duration` expected to be unpacked from any
backend hist init-config by the `tsdb_backfill()` caller which now
also computes a value from the first received frame when the config
section isn't provided.
- in `start_backfill()` we now always expect the `def_frame_duration`
input and always decrement the query range by this value whenever
a `NoData` is raised by the provider-backend paired with an explicit
`log.warning()` about the handling.
- also relay any `DataUnavailable.args[0]` message from the provider
in the handler.
- repair "gap reporting" which checks for expected frame duration vs.
that received with much better humanized logging on the missing
segment using `pendulum.Interval/Duration.in_words()` output.
The `open_history_client()` provider endpoint can *optionally*
deliver a `frame_types: dict[int, pendulum.Duration]` subsection in its
`config: dict[str, dict]` (as was implemented with the `ib` backend).
This allows the `tsp` backfilling machinery to use this "recommended
frame duration" to subtract from the `last_start_dt` any time a `NoData`
gap is signalled by the `get_hist()` call allowing gaps to be ignored
safely without missing history by knowing the next earliest dt we can
query from using the `end_dt`. However, currently all crypto$ providers
haven't implemented this feat yet..
As such only try to use the `frame_types` feature if provided when
handling `NoData` conditions inside `tsp.start_backfill()` and otherwise
raise as normal.
Was orig for debugging an issue with `kucoin` i think but definitely
shouldn't be left in XD
Also add `'perpetual_future'` to `start_backfill()` input literal set.
Presuming the data provider gives us a config with a `frame_types: dict`
(indicating frame sizes per query/request) we try to be clever and
decrement our submitted `end_dt: DateTime` based on it.. hoping for the
best on the next attempt.
Thanks to oremanj in the `trio` room for this hot style tip which i much
prefer to have less LOC and places to change sub-pkg name exports!
Also drop expecting a `gaps` frame output from `dedupe()`.
Turns out we were always filtering to time gaps longer then a day smh..
Instead tweak `detect_time_gaps()` to only return venue-gaps when
a `gap_dt_unit: str` is passed and pass `'days'` (like it was by default
before) from `dedupe()` though we should really pass in an actual venue
gap duration in the future.
Got borked by the logic re-factoring to get more conc going around
tsdb vs. latest frame loads with nested nurseries. So, repair all that
such that we can still backfill symbols previously not loaded as well as
drop all the `_FeedBus` instance passing to subtasks where it's
definitely not needed.
Toss in a pause point around sampler stream `'backfilling'` msgs as well
since there's seems to be a weird ctx-cancelled propagation going on
when a feed client disconnects during backfill and this might be where
the src `tractor.ContextCancelled` is getting bubbled from?
Can't ref `dt_eps` and `tsdb_entry` if they don't exist.. like for 1s
sampling from `binance` (which dne). So make sure to add better logic
guard and only open the finaly backload nursery if we actually need to
fill the gap between latest history and where tsdb history ends.
TO CHERRY #486
Move `.data.history` -> `.tsp.__init__.py` for now as main pkg-mod
and `.data.tsp` -> `.tsp._anal` (for analysis).
Obviously follow commits will change surrounding codebase (imports) to
match..