Since it appears impossible to compute the recurrence relations for PPU
(at least sanely) without using embedded `polars.List` elements, this
instead just implements price-per-unit and break-even-price calcs
doing a plain-ol-for-loop imperative approach with logic branching.
I burned wayy too much time trying to implement this in some kinda
`polars` DF native way without luck, so hopefuly someone smarter can
come in and make it work at some point xD
Resolves a related bullet in #515
Took a little while to get right using declarative style but it's
finally workin and seems (mostly correct B)
Computes the ppu (price per unit) using the PnL since last
net-zero-cumsize (aka the pnl from open to close) and uses it to calc
the pnl-per-exit trade (using the ppu).
Next up, bep (break even price both) per position and maybe since
ledger start or an arbitrary ref point?
We don't need it in `detect_time_gaps()` since doing straight up
datetime diffs in `polars` already has a humanized `str` representation
but with higher precision like '2d 1h 24m 1s' B)
Since we need `.get_mkt_info()` to remain symmetric across calls with
different fqme inputs, and binance generally uses upper case for it's
symbology keys, we always upper the FQME related tokens for both
symcaching and general search purposes.
Also don't set `_atype` on mkt pairs since it should be fully handled
via the dst asset loading in `Client._cache_pairs()`.
In cases where a brokerd backend doesn't yet support a symcache we need
to do manual `.get_mkt_info()` queries and stash them in a table that we
pass in for the mkt failover lookup to `Account.update_from_ledger()`.
Set the `PaperBoi._mkts` to this table for use on real-time ledger
writes in `.fake_fill()`.
Since some backends are going to have the issue of supporting multiple
venues for a given "position distinguishing instrument", like IB, we
can't presume that every `Position` can be uniquely keyed by
a `MktPair.fqme` (since the venue part can change and still be the same
"pair" relationship in accounting terms) so instead presume the
"backend system's market id" is the unique key (at least for now)
instead of the fqme.
More practically we use the `bs_mktid` to groupby-partition the per
pair DFs from the trades ledger and attempt to scan-match the input
fqme (in `ledger disect` cli) against the fqme column values set.
Not sure why i ever thought it would work otherwise but, obviously if
you're replicating a `Position` from a **summary** (IPC) msg we
need to wipe any prior clearing events from the events history..
The main use for this loading mechanism is precisely if you don't have
local access to the txn ledger and need to represent a position from
a summary 🤦
Also, never bother with ledger file fqme "rewriting" if the backend has
no symcache support (yet) since obviously there's then no symbol set to
search for a better key xD
Instead of casting to `dict`s and rewriting event names in the
`push_tradesies()` handler, be transparent with event names (also
defining and piker-equivalent mapping them in a redefined `_statuses`
table) and types
passing them directly to the `deliver_trade_events()` task and generally
make event handler blocks much easier to grok with type annotations. To
deal with the causality dilemma of *when to emit a pos msg* due to
needing all of `execDetailsEvent, commissionReportEvent, positionEvent`
but having no guarantee on received order, we implement a small task
`clears: dict[Contract, tuple[Position, Fill]]` tracker table and (as
before) only emit a position event once the "cost" can be accessed for
the fill. We now ALWAYS relay any `Position` update from IB directly to
ensure (at least) the cumsize is correct (since it appears we still have
ongoing issues with computing this correctly via `.accounting.Position`
updates..).
Further related adjustments:
- load (fiat) balances and startup positions into a new `IbAcnt` struct.
- change `update_and_audit_pos_msg()` to blindly forward ib position
event updates for the **the size** since it should always be
considered the true gospel for accounting!
- drop ib-has-no-position handling since it should never occur..
- move `update_ledger_from_api_trades()` to the `.ledger` submod and do
processing of ib_insync `Fill` related objects instead of dict-casted
versions instead doing the casting in
`api_trades_to_ledger_entries()`.
- `norm_trade()`: add `symcache.mktmaps[bs_mktid] = mkt` in since it
turns out API (and sometimes FLEX) records don't contain the listing
exchange/venue thus making it impossible to map an asset pair in the
"position sense" (i.e. over multiple venues: qqq.nasdaq, qqq.arca,
qqq.directedge) to an fqme when doing offline ledger processing;
instead use frickin IB's internal int-id so there's no discrepancy.
- also much better handle futures mkt trade flex records such that
parsed `MktPair.fqme` is consistent.
Since getting a global symcache result from the API is basically
impossible, we ad-hoc fill out the needed client tables on demand per
client code queries to the mkt info EP.
Also, use `unpack_fqme()` in fqme (search) pattern parser instead of
hacky `str.partition()`.
Add new `Client` attr tables to better stash `Contract` lookup results
normally mapped from some in put FQME;
- `._contracts: dict[str, Contract]` for any input pattern (fqme).
- `._cons: dict[str, Contract] = {}` for the `.conId: int` inputs.
- `_cons2mkts: bidict[Contract, MktPair]` for mapping back and forth
between ib and piker internal pair types.
Further,
- type out as many ib_insync internal types as possible mostly for
contract related objects.
- change `Client.trades()` -> `.get_fills()` and return directly the
result from `IB.fill()`.
- start flipping over internals to `Position.cumsize`
- allow passing in a `_mktmap_table` to `Account.update_from_ledger()`
for cases where the caller wants to per-call-dyamically insert the
`MktPair` via a one-off table (cough IB).
- use `polars.from_dicts()` in `.calc.open_ledger_dfs()`. and wrap the
whole func in a new `toolz.open_crash_handler()`.
Since there's a growing list of top level mods which are more or less
utils/tools for working with the runtime; begin to move them into a new
subpkg starting with a new `.toolz.debug`.
Start with,
- a new `open_crash_handller()` for doing breakpoints around blocks that
might error.
- move in what was `piker._profile` into `.toolz.profile` and adjust all
importing appropriately.
Define and bind in the `tx_sort()` routine to be used by
`open_trade_ledger()` when datetime sorting trade records.
Further deats:
- always use the IB reported position size (since apparently our ledger
based accounting is getting rekt on occasion..).
- better ib pos msg formatting when there's mismatches with the piker
equivalent.
- never emit zero-size pos msgs (in terms of strict ib pos sizing) since
when there's piker ledger sizing errors we'll send the wrong thing to
the ems and its clients..
Since apparently rendering to dict from a sorted generator func clearly
doesn't preserve the order when using a `dict`-comprehension.. Further,
there's really no reason to strictly return a `dict`. Adjust
`.calc.ppu()` to make the return value instead a `list[tuple[str,
dict]]`; this results in the current df cumsum values matching the
original impl and the existing `binance.paper` unit tests now passing XD
Other details that fix a variety of nonsense..
- adjust all `.clearsitems()` consumers to the new list output.
- use `str(pendulum.now())` in `Position.from_msg()` since adding
multiples with an `unknown` str will obviously discard them, facepalm.
- fix `.calc.ppu()` to NOT short circuit when `accum_size` is 0; it's
been causing all sorts of incorrect size outputs in the clearing
table.. lel, this is what fixed the unit test!
Move in the obvious things XD
- all the specially defined venue tables from `.api`.
- some parser funcs: `con2fqme()` and `parse_patt2fqme()`.
- the `get_mkt_info()` and `open_symbol_search()` broker eps.
- the `_asset_type_map` table which converts to `.accounting.Asset`
compat keys for each contract/security.
Since there's no easy way to support it yet, we bypass symbology caching
in for now and instead allow the `ib.ledger` routines to fill in
`MktPair` and `Asset` entries ad-hoc for the purposes of txn ledger
processing.
Some backends like `ib` don't have an obvious (nor practical) way to
easily download the entire symbology set available from all its mkt
venues. For such backends loading might require a non-std approach (like
using the contract search from some input mkt-key set) and can't be
expected to necessarily be supported out of the box. As such, allow
annotating a broker sub-pkg module with a `_no_symcache: bool = True`
attr which will make `open_symcache()` yield early with an empty
`SymbologyCache` instance for use by the caller to fill in the mkt and
assets tables in whatever ad-hoc way desired.
The list is `open_symcache()`, `get_symcache()`, `SymbologyCache`, and
`Stuct` which seems more or less fine to make part of the public
namespace. Also, make `._timeseries.t_unit` an instance of literal to make
`ruff` happy?
This was more involved then expected but on the bright side, is going to
help drive a more general `Account` update/processing/loading API
providing for all the high-level txn update methods needed for any
backend to generically update the participant's account *state* via
an input ledger/txn set B)
Key changes to enable `SymbologyCache` compat:
- adjust `Client` pairs / assets lookup tables to include a duplicate
keying of all assets and "asset pairs" using the (chitty) default key
set that kraken ships which is NOT the `.altname` no `.wsname` keys;
the "default ReST response keys" i guess?
- `._AssetPairs` and `._Assets` are *these ^* rest-key sets delivered
verbatim from the endpoint responses,
- `._pairs` and `._assets` the equivalent value-sets keyed by piker
style FQME-looking keys (now provided via the new
`.kraken.symbols.Pair.bs_fqme: str` and the delivered `'altname'`
field (for assets) respectively.
- re-implement `.get_assets()` and `.get_mkt_pairs()` to appropriately
delegate to internal methods and these new (multi-keyed) tables to
deliver the cacheable set of symbology info.
- adjust `.feed.get_mkt_info()` to handle parsing of both fqme-style and
wtv(-the-shit-stupid) kraken key set a caller passes via
a key-matches-first-table-style-scan after pre-processing the
input `fqme: str`; also do the `Asset` lookups from the new
`Pair.bs_dst/src_asset: str` fields which should always map correctly
to an internal asset entry delivered by `Client.get_assets()`.
Dirty impl deatz:
- add new `.kraken.symbols` and move the newly refined `Pair` there.
- add `.kraken.ledger` and move in the factored out ledger processing
routines.
- also move out what was the `has_pp()` and large chung of nested-ish
looking acnt-position verification logic blocks into a new
`verify_balances()` B)
Provides for fully isolated symbology caching in a flat TOML table
without special case handling B)
Also explicitly define `.bs_mktid: str` which is now used by the
symcache to to key-index the backend specific pair set and thus provides
for round-trip marshalling without special knowledge of any backend
schema.
Previously the cum-size calc(s) was in the `disect` CLI but it's better
stuffed into the backing df converter. Also, ensure that whenever
a `dt` field is type-detected as a `str` we parse it to `DateTime`.
For testing (and probably hacking) it's handy to be able to point
somewhere other the default user-config dir for a ledger or account file
to test offline processing apis from `.accounting` subsystems. For now
it's a private optional named-arg: `_fp: Path` and it's obviously passed
down into the `load_account()` config getter.
Note that in the non-paper account case `Account.update_from_ledger()`
will use the ledger's `.symcache` and `.iter_txns()` method to acquite
actual txn-structs to compute positions.
Since each broker backend generally needs to define a specific
field-name-schema to determine the exact instantiation arguments to
`Transaction`, we generally need each backend to define an endpoint
function to conduct this transaction from an input `dict[str, Any]`
received either directly from provided ledger APIs or from previously
stored `.accounting._ledger` saved trades ledger TOML files.
To accomplish this we now require backends to declare a new routine:
```python
def norm_trade(
tid: str, # the uuid for the transaction
txdict: dict, # the input record-dict
# a table of mkt-symbols to backend
# struct objects which define the (meta-data) for the backend specific
# venue's symbology
pairs: dict[str, Struct],
) -> Transaction:
...
```
which implements that record conversion (at least for trades)
and can thus be used in `TransactionLedger.iter_txns()` which requires
"some code" to implement the loading from a serialization format (aka
the input `dict` record) to our local `Transaction` struct, normally
also using a `Pair`-struct table defined (and maybe previously cached)
by the specific backend such our (normalization layer's) `MktPair`'s
fields can be set.
For the case of our `.clearing._paper_engine` we def the routine to
simply extract the exact same fields from the TOML ledger records that
we previously had written (to it) and define it in that module.
Also, we always pass `pairs=SymbologyCache.pairs: dict[str, Struct]` on
norm trade calls such that offline ledger and accounting processing
clients can use a previously cached symbology set without having to
necessarily start the async-actor runtime to query the actual backend API
if the data has already been saved locally on the system B)
Other related:
- always passthrough kwargs in overridden `.to_dict()` method.
- only do fqme related trade record field name rewrites/names when
operating on a paper ledger; normally a backend's records don't
contain these.
- fix `pendulum.DateTime` type annots.
- just deliver `Transaction`s from `.iter_txns()`
Since some backends have multiple venues keyed by the same
symbol-pair-name, AND often the market/symbol info for those different
market-venues is entirely different (cough binance), we will have to
(sometimes) save the struct namespace-path as str for lookup when
deserializing a symcache to object form.
NOTE: this change is reliant on the following `tractor` dev commit which
improves support for constructing a path from object-instance:
bee2c36072
Add a backend(-wide) default struct path stored as a (TOML top level)
field `pair_ns_path: str` in the serialized `dict`-table as well as
allow for a per pair-`Struct` value optionally defined on each type def;
the global is only used if none was defined per struct via a `ns_path:
str`.
Further deats:
- don't write non-struct-member fields to dict for TOML file cache.
- always keep object forms, well as objects (in tables).. XD
- factor cache loading from `dict` (and thus from TOML or presumably any
other interchange form) into a `@classmethod` constructor method B)
- all choosing the subtable for `.search()` by name.
Still kinda borked since i don't think there actually is a (per venue)
"get-all-symbologies" endpoint.. so we're likely gonna have to figure
out either how to hack it or provide a bypass in ledger processing?
Deatz:
- use new `Account` type name, rename endpoint vars to match and
obviously use any new method name(s).
- mask out split ratio handling for now.
- async open the symcache prior to ledger processing (again, for now).
- drop passing `Transaction.sym`.
- fix parser set for dt-sorter since apparently 2022 and back had
a `date` field instead?
Since we don't want to be doing a `trio.run()` from async code (being
already in the `tractor` runtime and all); for now just put a top level
block wrapping async enter until we figure out to embed it (likely)
inside `open_account()` and pass the ref to `open_trade_ledger()`.
Require passing an explicit flag when entering from sync code with an
extra super duper explicit runtime error to indicate how to use in the
async case as well!
Also, do rewrites of both the fqme (from best match in the symcache
according to search - the worst case) or from the `bs_mktid` field if
it exists (should only be true for paper engine accounts) AND the
`bs_mktid` for paper accounts if it seems un-fully-qualified.
Turns in order to make things much cleaner from inside-the-runtime usage
we do probably want to just make the manager async so that we can
generate the cache on demand from async UI inits as well as daemon
actors.. So change to that and instead make `get_symcache()` the helper
that should ONLY be called from sync funcs / offline ledger processing
utils!
Instead of constructing them (previously manually) in `.get_mkt_info()` ep,
just call `.get_assets()` and do key lookups for assets to hand directly
to the `.src/dst` of `MktPair`.
Refine fqme input parsing to match:
- adjust parsing logic to only use `unpack_fqme()` on the input fqme
token.
- set `.mkt_mode: str` to the derivs venue when an expiry token is
detected in the fqme.
- pass the parsed `expiry: str` to `Client.exch_info()` to ensure
a deriv venue (table) is used for pair lookup.
- skip any "DEFI" venue or other unknown asset type cases (since binance
doesn't seem to define some assets anywhere?).
Also, just use the `Client._pairs` unified table for search input since
the first call to `.exch_info()` won't necessarily contain the most
up-to-date state whereas `._pairs` always will.
Meaning we add the `Client.get_assets()` and `.get_mkt_pairs()` methods.
Also implement `.exch_info()` to take in a `expiry: str` to detect
whether to look up a derivative venue instead of spot.
In support of all this we now explicitly key all assets (via
`._cache_pairs() during the populate of `._venue2assets` sub-tables)
with their `.bs_dst_asset: str` value to ensure, for ex., a spot
`BTCUSDT` has a distinct value from any futures contracts with the same
`Pair.symbol: str` value!
Also, ensure we always create a `brokers.toml` (from template) if DNE
and binance is the user's first used backend XD
Add `bs_src/dst_asset: str` properties which provide for unique keying
into futures vs. spot venues by offering a `.venue: str` property which,
for non-spot delivers normally an expiry suffix (eg. '.PERP') and for
spot just delivers the bair chain-token key.
This enables keying multiple venues with the same mkt pairs easily in
a global flat key->pair table needed as part of supporting a symcache.
So you can do a `Struct1` - `Struct2` and we dump a little diff `list`
of tuples for anal on the REPL B)
Prolly can be broken out into it's own micro-patch?
Drop all the old `polars` (groupby + agg related) mangling to get a df
per fqme by delegating to the new routine and add in the `.cumsum()`ing
(per frame) as a first start on computing pps using dfs instead of
python dicts + loops as in `ppu()`.
To isolate it from the ledger/account mods and bc it is actually for
doing (eventual) position calcs / anal, might as well put it in this
mod. Add in the old-masked `ensure_state()` method content in case we
want to use it later for testing. Also tighten up the parser loading
inside `dyn_parse_to_dt()`.
Rename `open_pps()` -> `open_account()` for obvious reasons as well as
expect a bit tighter integration with `SymbologyCache` and consequently
`LedgerTransaction` in order to drop `Transaction.sym: MktPair`
dependence when compiling / allocating new `Position`s from a ledger.
Also we drop a bunch of prior attrs and do some cleaning,
- `Position.first_clear_dt` we no longer sort during insert.
- `._clears` now replaces by `._events` table.
- drop the now masked `.ensure_state()` method (eventually moved to
`.calc` submod for maybe-later-use).
- drop `.sym=` from all remaining txns init calls.
- clean out the `Position.add_clear()` method and only add the provided
txn directly to the `._events` table.
Improve some `Account` docs and interface:
- fill out the main type descr.
- add the backend broker modules as `Account.mod` allowing to drop
`.brokername` as input and instead wrap as a `@property`.
- make `.update_from_trans()` now a new `.update_from_ledger()` and
expect either of a `TransactionLedger` (user-dict) or a dict of txns;
in the latter case if we have not been also passed a symcache as input
then runtime error since the symcache is necessary to allocate
positions.
- also, delegate to `TransactionLedger.iter_txns()` instead of
a manual datetime sorted iter-loop.
- drop all the clears datetime don't-insert-if-earlier-then-first
logic.
- rename `.to_toml()` -> `.prep_toml()`.
- drop old `PpTable` alias.
- rename `load_pps_from_ledger()` -> `load_account_from_ledger()` and
make it only deliver the account instance and also move out all the
`polars.DataFrame` related stuff (to `.calc`).
And tweak some account clears table formatting,
- store datetimes as TOML native equivs.
- drop `be_price` fixing.
- obvsly drop `.ensure_state()` call to pps.
Turns out we don't really need it directly for most "txn processing" AND
if we do it's usually related to some `Account`-ing related calcs; which
means we can instead just rely on the new `SymbologyCache` lookup to get
it when needed. So, basically just get rid of it and rely instead on the
`.fqme` to be the god-key to getting `MktPair` info (from the cache).
Further, extend the `TransactionLedger` to contain much more info on the
pertaining backend:
- `.mod` mapping to the (pkg) py mod.
- `.filepath` pointing to the actual ledger TOML file.
- `_symcache` for doing any needed asset or mkt lookup as mentioned
above.
- rename `.iter_trans()` -> `.iter_txns()` and allow passing in
a symcache or using the init provided one.
- rename `.to_trans()` similarly.
- delegate paper account txn processing to the `.clearing._paper_engine`
mod's `norm_trade()` (and expect this similarly from other backends!)
- use new `SymbologyCache.search()` to find the best but
un-fully-qualified fqme for a given `txdict` being processed when
writing a config (aka always try to expand to the most verbose `.fqme`
possible).
- add a `rewrite: bool` control to `open_trade_ledger()`.
Since we now fully support interchange-as-dict-msg, use the msg codec
API and drop manual `Asset` unpacking. Also, wrap `get_symcache()` in
a `pdbp` crash handler block for now B)
As part of loading the cache we can now fill the asset sub-tables:
`.mktmaps` and `.assets` with their deserialized struct instances!
In theory this might be possible for the backend defined `Pair` structs
as well but we need to figure out probably an endpoint to offer
the conversion?
Also, add a `SymbologyCache.search()` which allows sync code to scan the
existing (known via cache) symbol set just like how async code can use the
(much slower) `open_symbol_search()` ctx endpoint 💥
Previously we weren't necessarily serializing mkt pairs (for IPC msging)
entirely bc the assets `.src/.dst` were being sent just by their
str-names. This now properly supports fully serializing `Asset`s as
`dict`-msgs such that use of `MktPair.to_dict()` can be transmitted over
`tractor.MsgStream`s and deserialized entirely back to struct from on
the receiver end.
Deats:
- implement `Asset.to_dict()` and `.from_msg()`
- adjust `MktPair.to_dict()` and `.from_msg()` to use these methods.
- drop all the hacky "if .src/.dst is str" handling.
- add better `MktPair.from_fqme()` input handling for expiry and venue;
ensure that either can be extracted from passed fqme *and* if so they
are also popped from any duplicate passed in `**kwargs**`.
For starters rename the cache type to `SymbologyCache` and fill out its
interface to include an (async) `.reload()` which can be used to populate
the in-mem asset-table sets such that any tractor-runtime task can
actually directly call it. Use a symcache file name schema of
`_cache/<backend>.symcache.toml`.
Dirtier deatz:
- make `.open_symcache()` a `@cm` such that it can be used from sync code
and will actually call `trio.run()` in the case where it needs to do a
full (re)load; also don't write on exit only on reloads.
- add `.get_symcache()` a simple non-ctx-mngr reader which again can
mostly be called willy-nilly from sync code without the full runtime
being up (but likely will only work if symcache files already exist
for the backend).
New mod is `.data._symcache` and it needs backend clients to declare
`Client.get_assets()` and `.get_mkt_pairs()` to generate the cache files
which now go in the config dir under `_cache/`.
We're probably going to move to implementing all accounting using
`polars.DataFrame` and friends and thus this rejig preps for a much more
"stateless" implementation of our `Position` type and its internal
pos-accounting metrics: `ppu` and `cumsize`.
Summary:
- wrt to `._pos.Position`:
- rename `.size`/`.accum_size` to `.cumsize` to be more in line
with `polars.DataFrame.cumsum()`.
- make `Position.expiry` delegate to the underlying `.mkt: MktPair`
handling (hopefully) all edge cases..
- change over to a new `._events: dict[str, Transaction]` in prep
for #510 (and friends) and enforce a new `Transaction.etype: str`
which is by default `clear`.
- add `.iter_by_type()` which iterates, filters and sorts the
entries in `._events` from above.
- add `Position.clearsdict()` which returns the dict-ified and
datetime-sorted table which can more-or-less be stored in the
toml account file.
- add `.minimized_clears()` a new (and close) version of the old
method which always grabs at least one clear before
a position-side-polarity-change.
- mask-drop `.ensure_state()` since there is no more `.size`/`.price`
state vars (per say) as we always re-calc the ppu and cumsize from
the clears records on every read.
- `.add_clear` no longer does bisec insorting since all sorting is
done on position properties *reads*.
- move the PPU (price per unit) calculator to a new `.accounting.calcs`
as well as add in the `iter_by_dt()` clearing transaction sorted
iterator.
- also make some fixes to this to handle both lists of `Transaction`
as well as `dict`s as before.
- start rename of `PpTable` -> `Account` and make a note about adding
a `.balances` table.
- always `float()` the transaction size/price values since it seems if
they get processed as `tomlkit.Integer` there's some suuper weird
double negative on read-then-write to the clears table?
- something like `cumsize = -1` -> `cumsize = --1` !?!?
- make `load_pps_from_ledger()` work again but now includes some very
very first draft `polars` df processing from a transaction ledger.
- use this from the `accounting.cli.disect` subcmd which is also in
*super early draft* mode ;)
- obviously as mentioned in the `Position` section, add the new `.calcs`
module with a `.ppu()` calculator func B)
Also finally adds full `FeedInit` and `MktPair` support for this backend
by handling:
- all "currency" fields for each `Contract` by constructing
and `Asset` and setting the `MktPair.src` with `.atype='fiat'`.
- always render the `MktPair.src` name in the `.fqme` for fiat pairs
(aka forex) but never for other instruments.
In an effort to properly support fiat pairs (aka forex) as well as more
generally insert a fully-qualified `MktPair` in for the
`Transaction.sys`. Note that there's a bit of special handling for API
`Contract`s-as-dict records vs. flex-report-from-xml equivalents.
No point having duplicate data when we already stash the `expiry` on the
mkt info type and can just read it (and cast to `datetime` obj).
Further this fixes a regression caused by converting `._clears` to
a list by adding a `._events: dict[str, Transaction]` which prevents
double entering transactions based on checking the events table for the
existing id.. Further add a sanity check that all events are popped
(for now) after serializing the clearing table for the toml account
file.
In the longer run, ideally we don't have the separate sequences ._clears
and ._events by choosing a better data structure (sorted unique set of
mkt events) maybe a specially used `polars.DataFrame` (which we kind
need eventually anyway)?
When you look at usage we don't end up really needing clear entries to
be keyed by their `Transaction.tid`, instead it's much more important to
ensure the time sorted order of trade-clearing transactions such that
position properties such as the size and ppu are calculated correctly.
Thus, this instead simplified the `.clears` table to a list of clear
dict entries making a bunch of things simpler:
- object form `Position._clears` compared to the offline TOML schema
(saved in account files) is now data-structure-symmetrical.
- `Position.add_clear()` now uses `bisect.insort()` to
datetime-field-sort-insert into the *list* which saves having to worry
about sorting on every sequence *read*.
Further deats:
- adjust `.accounting._ledger.iter_by_dt()` to expect an input `list`.
- change `Position.iter_clears()` to iterate only the clearing entry
dicts without yielding a key/tid; no more tuples.
- drop `Position.to_dict()` since parent `Struct` already implements it.
Since it may be handy to get the latest ticks first, add a `reverse:
bool` to `iterticks()` and add some cleaner logic and a proper doc
string to `frame_ticks()`.
Allows for tracking paper engine orders despite the ems not necessarily
being opened by the current order mode instance (UI) in "paper"
execution mode; useful for tracking bots/strats running against the same
EMS daemon.
Like you'd think:
- `load_ledger()` -> ._ledger
- `load_accounrt()` -> ._pos
Also fixup the old `load_pps_from_ledger()` and expose it from a new
`.accounting.cli.disect` cli cmd for trying to figure out why pp calcs
are totally mucked on stupid ib..
Should be the final production backend to switch this over B)
Also tidy up the `update_and_audit_msgs()` validator to log vs. raise
when `validate: bool` is set; turn it off by default to avoid raises
until we figure out wtf is up with ib ledger processing or wtv..
We might as well start standardizing on `brokerd` init such that it can
be used more generally in client code (such as the `.accounting.cli`
stuff).
Deats of `broker_init()` impl:
- loads appropriate py pkg module,
- reads any declared `__enable_modules__: listr[str]` which will be
passed to `tractor.ActorNursery.start_actor(enabled_modules=<this>)`
- loads the `.brokers._daemon._setup_persistent_brokerd
As expected the `accounting.cli` tools can now import directly from this
new location and use the common daemon fixture definition.
After #520 we've moved to better supporting explicit venues for cex
backends which is important where a provider offers both spot and
derivatives markets (kraken, binance, kucoin) and we need to distinguish
which is being traded given a common asset pair (eg. BTC/USDT). So, make
this work for `kraken`'s brokerd such that requests and pre-existing
live order are (un)packed to/from EMS messaging form.
Since crypto backends now also may expand an FQME like `xbteur.kraken`
-> `xbteur.spot.kraken` (by filling in the venue token), we need to use
this identifier when looking up per-market order dialogs or submitting
new requests. The simple fix is to simply look up that expanded from
from the `Feed.flumes` table which is always keyed by the `MktPair.fqme:
str` - the expanded form.
Drop the older `dict[str, ChainMap]` prototype we had since the new
`OrderDialogs` built-out while adding `binance` order support is more
refined and general. Also, handle new and now expect `.spot` venue token
in FQMEs since kraken too has futes markets that we'll likely want to
support eventually.
Use dynamic lookups instead by mapping to the correct http session and
endpoints path using the venue routing/mode key. This let's us simplify
from 3 methods down to a single `Client._api()` which either can be
passed the `venue: str` explicitly by the caller (as is needed in the
`._cache_pairs()` case) or falls back to the client's current
`.mkt_mode: str` setting B)
Deatz:
- add couple more tables to suffice all authed-endpoint use cases:
- `.venue2configkey: dict[str, str]` which maps the venue key to the
`brokers.toml` subsection which should be used for auth creds and
testnet config.
- `.confkey2venuekeys: dict[str, list[str]]` which maps each config
subsection key to the list of venue name keys for doing config to
venues lookup.
- always build out testnet sessions for spot and futes venues (though if
not set the sessions obviously won't ever be used).
- add and use new `config.ConfigurationError` custom exceptions when api
creds are missing.
- rename `action: str` to `method: str` in `._api()` since it's the
proper ReST term and switch what was "method" to be `endpoint: str`.
- mask out `.get_positions()` since we can get that from a user stream
wss request (and are doing that).
- (in theory) import and use spot testnet url as necessary.
Do parsing of the `'symbol'` and check for an `_<expiry>` suffix, in
which case we re-format in capitalized FQME style, do the
`Client._pairs[str, Pair]` lookup and then send the `Pair.bs_fqme` in
the `Order.fqme: str` field.
It's finally a decent little design / interface and definitely can be
used in other backends like `kraken` which rolled something lower level
but more or less the same without a wrapper class.
As you'd expect query and sync the EMS with existing live orders
reported by the market venue by packing them in `Status` msgs and
sending over the order dialog stream before starting the handler tasks.
XXX CAVEAT:
- there appears to be no way (at least on the usdtm market/venue) to
distinguish between different contracts such as perps vs. the
quarterlies?
- for now we just assume that the perp is being used since
there's no indicator otherwise in the 'symbol' field?
- we should maybe open an issue with the futures-connector project to
see how they'd recommend solving this discrepancy?
Only a couple tweaks to make this work according to the docs:
https://binance-docs.github.io/apidocs/futures/en/#modify-order-trade
- use a PUT request.
- provide the original user id in a `'origClientOrderId'` msg field.
- don't expect the same oid in the PUT response.
Other broker-mode related details:
- don't call `OrderDialogs.add_msg()` until after the existing check
since we want to check against the *last* msgs contents not the new
request.
- ensure we pass the `modify=True` flag in the edit case.
There was one trick which was that it seems that binance will often send
the account/position update event over the user stream *before* the
actual clearing (aka FILLED) order update event, so make sure we put an
entry in the `dialogs: OrderDialogs` as soon as an order request comes
in such that even if the account update arrives first the
`BrokerdPosition` msg can be relayed without delay / order event order
considerations.
Was using the wrong key before from our old code (not sure how that
slipped back in.. prolly doing too many git stashes XD), so fix that to
properly match against order update events with 'ORDER_TRADE_UPDATE'.
Also, don't match on the types we want to *cast to*, that's not how
match syntax works (facepalm), so we have to typecast prior to EMS msg
creation / downstream logic.
Further,
- try not bothering with binance's own internal `'orderId'` field
tracking since they seem to support just using your own user version
for all ctl endpoints? (thus we only need to track the EMS `.oid`s B)
- log all event update msgs for now.
- pop order dialogs on 'closed' statuses.
- wrap cancel requests in an error handler block since it seems the EMS
is double sending requests from the client?
For crypto derivatives (at least futes), yes they are margined, but
generally not around a single unit of vlm (like equities or commodities
futes) so don't pre-set the order mode allocator to use a #unit limit,
$limit is fine.
Since the request handler task will work concurrently across venues
(spot, futes, margin) we need to be sure that we look up the correct
venue to update the order dialog and this is naturally determined by the
FQME-style symbol in the `BrokerdOrder` msg; the best way to map that
symbol-key to the correct venue/`Pair` is by using said `._pairs:
ChainMap`.
Further, handle limit order errors by catching and relaying back an
error response to the EMS. Fix the "account name" to be `binance.usdtm`
so that we can eventually and explicitly support all venues by name.
Untested fully but has ostensibly working position and balance loading
(by delegating entirely to binance's internals for that) and an MVP ems
order request handler; still need to fill out the order status update
task implementation..
Notes:
- uses user data stream for all per account balance and position tracking.
- no support yet for `piker.accounting` position tracking.
- no support yet for full order / position real-time update via user
stream.
Since we need them for accounting and since we can get them directly
from the usdtm futes `exchangeInfo` ep, just preload all asset info that
we can during initial `Pair` caching. Cache the asset infos inside a new per venue
`Client._venues2assets: dict[str, dict[str, Asset | None]]` and mostly
be pedantic with the spot asset list for now since futes seems much
smaller and doesn't include transaction precision info.
Further:
- load a testnet http session if `binance.use_testnet.futes = true`.
- add testnet support for all non-data endpoints.
- hardcode user stream methods to work for usdtm futes for the moment.
- add logging around order request calls.
As just added for binance move to using an explicit `.<venue>.kraken`
style for spot markets which makes the current spot symbology expand to
`<PAIR>.SPOT` from the new `Pair.bs_fqme: str`. Reasons for why are
laid out in the equivalent patch for binance. Obviously this also primes
for supporting kraken's futures venue APIs as well 🏄https://docs.futures.kraken.com/#introduction
Detalles:
- add `.spot.kraken` parsing to `get_mkt_info()` so that if the venue
token is not passed by caller we implicitly expand it in.
- change `normalize()` to only return the `quote: dict` not the topic
key.
- rewrite live feed msg loop to use `match:` syntax B)
Since there are indeed multiple futures (perp swaps) contracts including
a set with expiry, we need a way to distinguish through search and
`FutesPair` lookup which contract we're requesting. To solve this extend
the `FutesPair` and `SpotPair` to include a `.bs_fqme` field similar to
`MktPair` and key the `Client._pairs: ChainMap`'s backing tables with
these expanded fqmes. For example the perp swap now expands to
`btcusdt.usdtm.perp` which fills in the venue as `'usdtm'` (the
usd-margined fututes market) and the expiry as `'perp'` (as before).
This allows distinguishing explicitly from, for ex., coin-margined
contracts which could instead (since we haven't added the support yet)
fqmes of the sort `btcusdt.<coin>m.perp.binance` thus making it explicit
and obvious which contract is which B)
Further we interpolate the venue token to `spot` for spot markets going
forward, which again makes cex spot markets explicit in symbology; we'll
need to add this as well to other cex backends ;)
Other misc detalles:
- change USD-M futes `MarketType` key to `'usdtm_futes'`.
- add `Pair.bs_fqme: str` for all pair subtypes with particular
special contract handling for futes including quarterlies, perps and
the weird "DEFI" ones..
- drop `OHLC.bar_wap` since it's no longer in the default time-series
schema and we weren't filling it in here anyway..
- `Client._pairs: ChainMap` is now a read-only fqme-re-keyed view into
the underlying pairs tables (which themselves are ideally keyed
identically cross-venue) which we populate inside `Client.exch_info()`
which itself now does concurrent pairs info fetching via a new
`._cache_pairs()` using a `trio` task per API-venue.
- support klines history query across all venues using same
`Client.mkt_mode_req[Client.mkt_mode]` style as we're doing for
`.exch_info()` B)
- use the venue specific klines history query limits where documented.
- handle new FQME venue / expiry fields inside `get_mkt_info()` ep such
that again the correct `Client.mkt_mode` is selected based on parsing
the desired spot vs. derivative contract.
- do venue-specific-WSS-addr lookup based on output from
`get_mkt_info()`; use usdtm venue WSS addr if a `FutesPair` is loaded.
- set `topic: str` to the `.bs_fqme` value in live feed quotes!
- use `Pair.bs_fqme: str` values for fuzzy-search input set.
The beginning of supporting multi-markets through a common API client.
Change to futes market mode in the client if `.perp.` is matched in the
fqme. Currently the exchange info and live feed ws impl will swap out
for their usd-margin futures market equivalent (endpoints).
Not sure why it seemed like futures pairs didn't have this field but add
it to the parent `Pair` type as well as drop the overridden
`.price/size_tick` fields instead doing the same as in spot as well.
Also moves the `MarketType: Literal` (for the `Client.mkt_mode: str`)
and adds a pair type lookup table for exchange info loading.
Add the usd-futes "Pair" type and thus ability to load all exchange
(info for) contracts settled in USDT. Luckily we don't seem to have to
modify anything in the `Client` interface (yet) other then a new
`.mkt_mode: str` which determines which endpoint set to make requests.
Obviously data received from endpoints will likely need diff handling as
per below.
Deats:
- add a bunch more API and WSS top level domains to `.api` with comments
- start a `.binance.schemas` module to house the structs for loading
different `Pair` subtypes depending on target market: `SpotPair`,
`FutesPair`, .. etc. and implement required `MktPair` fields on the
new futes type for compatibility with the clearing layer.
- add `Client.mkt_mode: str` and a method lookup for endpoint parent
paths depending on market via `.mkt_req: dict`
Also related to live feeds,
- drop `Struct` typecasting instead opting for specific fields both for
speed and simplicity atm.
- breakout `subscribe()` into module level acm from being embedded
closure.
- for now swap over the ws feed to be strictly the futes ep (while
testing) and set the `.mkt_mode = 'usd_futes'`.
- hack in `Client._pairs` to only load `FutesPair`s until we figure out
whether we want separate `Client` instances per market or not..
Instead of having a buncha logic branches for 'get', 'post', etc. just
pass the `method: str` and do a attr lookup on the `asks` sesh.
Also, adjust the `trades_dialogue()` ep to switch to paper mode when no
client API key is detected/loaded.
First draft originally by @guilledk but update by myself 2 years later
xD. Will crash at runtime but at least has the machinery to setup signed
requests for auth-ed endpoints B)
Also adds a generic `NoSignature` error for when credentials are not
present in `brokers.toml` but user is trying to access auth-ed eps with
the client.
Since we only ever want to do incremental y-range calcs based on the
price always skip any tick types emitted by the data daemon which aren't
defined in the fundamental set. Further, toss in a new `debug_n_trade:
bool` toggle which by default turns off all loggin and profiler calls;
if you want to do profiling this has to now be adjusted manually!
Since crypto backends now also may expand an FQME like `xbteur.kraken`
-> `xbteur.spot.kraken` (by filling in the venue token), we need to use
this identifier when looking up per-market order dialogs or submitting
new requests. The simple fix is to simply look up that expanded from
from the `Feed.flumes` table which is always keyed by the `MktPair.fqme:
str` - the expanded form.
This was actually incorrect prior, we were rounding triggered limit
orders with the `.size_tick` value's digits when we should have been
using the `.price_tick` (facepalm). So fix that and compute the rounding
number of digits (as passed to the round(<value>, ndigits=<here>)`
builtin) and store it in the `DarkBook.triggers` tuples so that at
trigger/match time the round call is done *just prior* to msg send to
`brokerd` given the last known live L1 queue price.
Not sure how this lasted so long without complaint (literally since we
added history 1m OHLC it seems; guess it means most backends are pretty
tolerant XD ) but we've been sending 2 cancels per order (dialog) due to
the mirrored lines on each chart: 1s and 1m. This fixes that by
reworking the `OrderMode` methods to be a bit more sane and less
conflated with the graphics (lines) layer.
Deatz:
- add new methods:
- `.oids_from_lines()` line -> oid extraction,
- `.cancel_orders()` which makes the order client cancel requests from
a `oids: list[str]`.
- re-impl `.cancel_all_orders()` and `.cancel_orders_under_cursor()` to
use the above methods thus fixing the original bug B)
Since we want to be able to support user-configurable vnc socketaddrs,
this preps for passing the piker client direct into the vnc hacker
routine so that we can (eventually load) and read the ib brokers config
settings into the client and then read those in the `asyncvnc` task
spawner.
It's been getting setup in the `brokerd` daemon-actor spawn task for
a while now and worker tasks already get a ref to that global log
instance so they don't need to care (in data or trading) task spawn
endpoints.
Also move to the new `open_trade_dialog()` naming for working broker
backends B)
Discovered due to originally having a history loading bug between
btcusdt futes display where the same time series was being loaded into
the graphics system, this avoids the issue where 2 (or more) curves are
measured to have the same dispersion and thus do not get added as unique
entries to the `overlay_table: dict[float, tuple]` during the scaling
phase..
Practically speaking this should never really be a problem if the curves
(and their backing timeseries) are indeed unique but keying the
overlay table by the dispersion and the `Viz` is a minimal performance
hit when looping the sorted table and is a lot nicer then you **do want
to show** duplicate curves then having one overlay just not be ranged
correctly at all XD
Instead of effectively (and poorly) duplicating the trade dialog setup
logic, just use the new helper we exposed in the EMS module B)
Also, handle paper accounts that have no ledger / positions existing.
As part of bringing the brokerd agnostic APIs up to date and modernizing
wrapping CLIs, this adds a new sub-cmd to allow more or less directly
calling the `.get_mkt_info()` broker mod endpoint and dumping the both
the backend specific `Pair`-ish and `.accounting.MktPair` normalized
version to console.
Deatz:
- make the click config's `brokermods` entry a `dict`
- make `.brokers.core.mkt_info()` strip the broker name part from the
input fqme before calling the backend.
Connecting to a `brokerd` daemon's trading dialog via a helper `@acm`
func is handy so that arbitrary trading middleware clients **and** the
ems can setup a trading dialog and, at the least, query existing
position state; this is in fact our immediate need when simply querying
for an account's position status in the `.accounting.cli.ledger` cli.
It's now exposed (for now) as `.clearing._ems.open_brokerd_dialog()` and
is called by the `Router.maybe_open_brokerd_dialog()` for every new
relay allocation or paper-account engine instance.
Changed from the old `store clone` to instead simply load any shm buffer
matching a user provided `FQME: str` pattern; writing to parquet file is
only done if an explicit option flag is passed by user.
Implement new `iter_dfs_from_shms()` generator which allows interatively
loading both 1m and 1s buffers delivering the `Path`, `ShmArray` and
`polars.DataFrame` instances per matching file B)
Also add a todo for a `NativeStorageClient.clear_range()` method.
Also adjust sizing such that the history buffer will backfill the last
six years by default (in 1m OHLC) and the hft buffer will do only 3 days
worth. Also ensure the fsp layer passes the src shm's buffer size when
allocating since the size is now required by allocators in the shm apis.
Avoid unnecessarily re-rendering the wrong (1min OHLC history) chart
and/or other such charts with update tasks listening to the sampler
stream. Instead only redraw in tasks which are updating vizs which match
the actual details of the backfill event.
We can probably also eventually match against a range tuple (emitted in
the msg) and then have the task further only update the formatter layer
unless the range is actually in view?
It's no longer part of the default OHLCV array-buffer schema and just
generally we should be processing and managing **any** non source data
in the FSP subsystem(s) despite it maybe being provided as a default by
some backends.
Explains why stuff always seemed wrong before XD
Previously whenever a time-gappy asset (like a stock due to it's venue
operating hours) was being loaded, we weren't querying for a "durations
worth" of bars and this was causing all sorts of actual gaps in our
data set that shouldn't exist..
Fix that by always attempting to retrieve a min aggregate-time's
worth/duration of bars/datums in the history manager. Actually,
i implemented this in both the feed and api layers for this backend
since it doesn't seem to strictly work just implementing it at the
`Client.bars()` level, not sure why but..
Also, buncha `ruff` linting cleanups and fix the logger nameeee, lel.
For now, just detect and fill in gaps (via fresh backend queries)
*in the shm buffer* but eventually i'm pretty sure we can just write
these direct to the parquet file as well.
Use the new `.data._timeseries.detect_null_time_gap()` to find and fill
in the `ShmArray` index range, re-check it and enter a prompt if it
didn't totally fill.
Also,
- do a massive cleanup and removal of all unused/commented code.
- drop the duplicate frames tracking, don't think we need it after
removing multi-frame concurrent queries.
- change backfill loop variable `end_dt` -> `last_start_dt` which is
more semantically correct.
- fix logic to backfill any missing sub-sequence portion for any frame
query that overruns the shm buffer prependable space by detecting
the available rows left to insert and only push those.
- add a new `shm_push_in_between()` helper to match.
It took a little while (and a lot of commenting out of old no longer
needed code) but, this gets tsdb (from parquet file) loading *before*
final backfilling from the most recent history frame until the most
recent tsdb time stamp!
More or less all the convoluted concurrency shit we had for coping with
`marketstore` IPC junk is no longer needed, particularly all the query
size limits and accompanying load loops.. The recent frame loading
technique/order *has* now changed though since we'd like to show charts
asap once tsdb history loads.
The new load sequence is as follows:
- load mr (most recent) frame from backend.
- load existing history (one shot) from the "tsdb" aka parquet files
with `polars`.
- backfill the gap part from the mr frame back to the tsdb start
incrementally by making (hacky) `ShmArray.push(start=<blah>)` calls
and *not* updating the `._first.value` while doing it XD
Dirtier deatz:
- make `tsdb_backfill()` run per timeframe in a separate task.
- drop all the loop through timeframes and insert `dts_per_tf` crap.
- only spawn a subtask for the `start_backfill()` call which in turn
only does the gap backfilling as mentioned above.
- mask out all the code related to being limited to certain query sizes
(over gRPC) as was restricted by marketstore.. not gonna go through
what all of that was since it's probably getting deleted in a follow
up commit.
- buncha off-by-one tweaks to do with backfilling the gap from mr frame
to tsdb start.. mostly tinkered it to get it all right but seems to be
working correctly B)
- still use the `broadcast_all()` msg stuff when doing the gap backfill
though don't have it really working yet on the UI side (since
previously we were relying on the shm first/last values.. so this will
be "coming soon" :)
For OHLCV time series we normally presume a uniform sampling period
(1s or 60s by default) and it's handy to have tools to ensure a series
is gapless or contains expected gaps based on (legacy) market hours.
For this we leverage `polars`:
- add `.nativedb.with_dts()` a datetime-from-epoch-time-column frame
"column-expander" which inserts datetime-casted, epoch-diff and
dt-diff columns.
- add `.nativedb.detect_time_gaps()` which filters to any larger then
expected sampling period rows.
- wrap the above (for now) in a `piker store anal` (analysis) cmd which
atm always enters a breakpoint for tinkering.
Supporting storage client additions:
- add a `detect_period()` helper for extracting expected OHLC time step.
- add new `NativedbStorageClient` methods and attrs to provide for the above:
- `.mk_path()` to **only** deliver a parquet-file path for use in
other methods.
- `._dfs` to house cached `pl.DataFrame`s loaded from `.parquet` files.
- `.as_df()` which loads cached frames or loads them from disk and
then caches (for next use).
- `_write_ohlcv()` a private-sync version of the public equivalent
meth since we don't currently have any actual async file IO
underneath; add a flag for whether to return as a `numpy.ndarray`.
- drop buncha cruft from `store ls` cmd and make it work for
multi-backend fqme listing.
- including adding an `.address` to the mkts client which shows the
grpc socketaddr details.
- change defauls to new `'nativedb'.
- drop 'marketstore' from built-in backend list (for now)
It was a concurrency-hack mess somewhat due to all sorts of limitations
imposed by marketstore (query size limits, strange datetime/timestamp
errors, slow table loads for large queries..) and we can drastically
simplify. There's still some issues with getting new backfills (not yet
in storage) correctly prepended: there's sometimes little gaps due to shm
races when reading history indexing vs. when the live-feed startup
finishes.
We generally need tests for all this and likely a better rework of the
feed layer's init such that we're showing history in chart afap instead
of waiting on backfills or the live feed to come up.
Much more to come B)
Turns out no backend (including kraken) requires it and really this
kinda of measure should be implemented and recorded from our fsp layer
instead of (hackily) sometimes expecting it to be in "source data".
After much frustration with a particular tsdb (cough) this instead
implements a new native-file (and apache tech based) backend which
stores time series in parquet files (for now) using the `polars` apis
(since we plan to use that lib as well for processing).
Note this code is currently **very** rough and in draft mode.
Details:
- add conversion routines for going from `polars.DataFrame` to
`numpy.ndarray` and back.
- lay out a simple file-name as series key symbology:
`fqme.<datadescriptions>.parquet`, though probably it will evolve.
- implement the entire `StorageClient` interface as it stands.
- adjust `storage.cli` cmds to instead expect to use this new backend,
which means it's a complete mess XD
Main benefits/motivation:
- wayy faster load times with no "datums to load limit" required.
- smaller space footprint and we haven't even touched compression
settings yet!
- wayyy more compatible with other systems which can lever the apache
ecosystem.
- gives us finer grained control over the filesystem usage so we can
choose to swap out stuff like the replication system or networking
access.
Turns out just (over)writing `.parquet` files with >= 1M datums is like
less then a second, and we can likely speed up appends using
`fastparquet` (usage coming soon).
Includes:
- a new `clone` CLI subcmd to test this all out by ad-hoc copy of
(literally hardcoded to a daemon-actor specific shm allocation X) an
existing `/dev/shm/<ShmArray>` and push to `.parquet` file.
- code to convert from our `ShmArray.array: np.ndarray` ->
`polars.DataFrame` (thanks SO).
- timing checks around the file IO and np -> polars conversion.
- a `read` subcmd which i was using to test the sync `pymarketstore`
client against our async one to see if the issues from
https://github.com/pikers/piker/issues/443 were resolved, but nope!
Turns out trying to switch to the old sync client and going back to
using the old json-RPC API (after having had to patch the upstream repo
to not import gRPC machinery to avoid crashes..) I'm basically getting
the exact same issues.
New tinkering results does possibly tell some new stuff:
- the EOF error seems to indeed be due to trying fetch records which haven't been
written (properly) - like asking for a `end=<epoch_int>` that is
earlier then the earliest record.
- the "snappy input corrupt" error seems to have something to do with
the `Params.end` field not being an `int` and/or the int precision not
being chosen correctly?
- toying with this a bunch manually shows that the internals of the
client (particularly `.build_query()` stuff) is parsing/calcing the
`Epoch` and `Nanoseconds` values out incorrectly.. which is likely
part of the problem.
- we also changed `anyio_marketstore.MarketStoreclient.build_query()`
logic when removing `pandas` a while back, which also seems to be
part of the problem on the async side, however reverting those
changes also didn't fix the issue entirely; likely something else
more subtle going on (maybe with the write vs. read `Epoch` field
type we pass?).
Despite all this malarky, we're already underway more or less obsoleting
this whole thing with a much less complex approach of using apache
parquet files and modern filesystem tools to get a more flexible and
numerics-native dataframe-oriented tsdb B)
Turns out the reason we were originally making the `time: float` column in our
ohlcv arrays was bc that's what **only** ib uses XD (and/or 🤦)
Instead we changed the default field type to be an `int` (which is also
more correct to avoid `float` rounding/precision discrepancies) and thus
**do not need to override it** in all other (crypto) backends (except
`ib`). Now we only do the customization (via `._ohlc_dtype`) to `float`
only for `ib` for now (though pretty sure we can also not do that
eventually as well..)!
Use `def_iohlcv_fields` for a name and instead of copying and inserting
the index field pop it for the non-index version. Drop creating
`np.dtype()` instances since `numpy`'s apis accept both input forms so
this is simpler on our end.
Turns out you can mix and match `click` with `typer` so this moves what
was the `.data.cli` stuff into `storage.cli` and uses the integration
api to make it all work B)
New subcmd: `piker store`
- add `piker store ls` which lists all fqme keyed time-series from backend.
- add `store delete` to remove any such key->time-series.
- now uses a nursery for multi-timeframe concurrency B)
Mask out all the old `marketstore` specific subcmds for now (streaming,
ingest, storesh, etc..) in anticipation of moving them into
a subpkg-module and make sure to import the sub-cmd module in our top
level cli package.
Other `.storage` api tweaks:
- drop the reraising with custom error (for now).
- rename `Storage` -> `StorageClient` (or should it be API?).
To kick off our (tsdb) storage backends this adds our first implementing
a new `Storage(Protocol)` client interface. Going foward, the top level
`.storage` pkg-module will now expose backend agnostic APIs and helpers
whilst specific backend implementations will adhere to that middle-ware
layer.
Deats:
- add `.storage.marketstore.Storage` as the first client implementation,
moving all needed (import) dependencies out from
`.service.marketstore` as well as `.ohlc_key_map` and `get_client()`.
- move root `conf.toml` loading from `.data.history` into
`.storage.__init__.open_storage_client()` which now takes in a `name:
str` and does all the work of loading the correct backend module, its
config, and determining if a service-instance can be contacted and
a client loaded; in the case where this fails we raise a new
`StorageConnectionError`.
- add a new `.storage.get_storagemod()` just like we have for brokers.
- make `open_storage_client()` also return the backend module such that
the history-data layer can make backend specific calls as needed (eg.
ohlc_key_map).
- fall back to a basic non-tsdb backfill when `open_storage_client()`
raises the new connection error.
The plan is to offer multiple tsdb and other storage backends (for
a variety of use cases) and expose them similarly to how we do for
broker and data providers B)