It's been getting setup in the `brokerd` daemon-actor spawn task for
a while now and worker tasks already get a ref to that global log
instance so they don't need to care (in data or trading) task spawn
endpoints.
Also move to the new `open_trade_dialog()` naming for working broker
backends B)
Discovered due to originally having a history loading bug between
btcusdt futes display where the same time series was being loaded into
the graphics system, this avoids the issue where 2 (or more) curves are
measured to have the same dispersion and thus do not get added as unique
entries to the `overlay_table: dict[float, tuple]` during the scaling
phase..
Practically speaking this should never really be a problem if the curves
(and their backing timeseries) are indeed unique but keying the
overlay table by the dispersion and the `Viz` is a minimal performance
hit when looping the sorted table and is a lot nicer then you **do want
to show** duplicate curves then having one overlay just not be ranged
correctly at all XD
Instead of effectively (and poorly) duplicating the trade dialog setup
logic, just use the new helper we exposed in the EMS module B)
Also, handle paper accounts that have no ledger / positions existing.
As part of bringing the brokerd agnostic APIs up to date and modernizing
wrapping CLIs, this adds a new sub-cmd to allow more or less directly
calling the `.get_mkt_info()` broker mod endpoint and dumping the both
the backend specific `Pair`-ish and `.accounting.MktPair` normalized
version to console.
Deatz:
- make the click config's `brokermods` entry a `dict`
- make `.brokers.core.mkt_info()` strip the broker name part from the
input fqme before calling the backend.
Connecting to a `brokerd` daemon's trading dialog via a helper `@acm`
func is handy so that arbitrary trading middleware clients **and** the
ems can setup a trading dialog and, at the least, query existing
position state; this is in fact our immediate need when simply querying
for an account's position status in the `.accounting.cli.ledger` cli.
It's now exposed (for now) as `.clearing._ems.open_brokerd_dialog()` and
is called by the `Router.maybe_open_brokerd_dialog()` for every new
relay allocation or paper-account engine instance.
Changed from the old `store clone` to instead simply load any shm buffer
matching a user provided `FQME: str` pattern; writing to parquet file is
only done if an explicit option flag is passed by user.
Implement new `iter_dfs_from_shms()` generator which allows interatively
loading both 1m and 1s buffers delivering the `Path`, `ShmArray` and
`polars.DataFrame` instances per matching file B)
Also add a todo for a `NativeStorageClient.clear_range()` method.
Also adjust sizing such that the history buffer will backfill the last
six years by default (in 1m OHLC) and the hft buffer will do only 3 days
worth. Also ensure the fsp layer passes the src shm's buffer size when
allocating since the size is now required by allocators in the shm apis.
Avoid unnecessarily re-rendering the wrong (1min OHLC history) chart
and/or other such charts with update tasks listening to the sampler
stream. Instead only redraw in tasks which are updating vizs which match
the actual details of the backfill event.
We can probably also eventually match against a range tuple (emitted in
the msg) and then have the task further only update the formatter layer
unless the range is actually in view?
It's no longer part of the default OHLCV array-buffer schema and just
generally we should be processing and managing **any** non source data
in the FSP subsystem(s) despite it maybe being provided as a default by
some backends.
Explains why stuff always seemed wrong before XD
Previously whenever a time-gappy asset (like a stock due to it's venue
operating hours) was being loaded, we weren't querying for a "durations
worth" of bars and this was causing all sorts of actual gaps in our
data set that shouldn't exist..
Fix that by always attempting to retrieve a min aggregate-time's
worth/duration of bars/datums in the history manager. Actually,
i implemented this in both the feed and api layers for this backend
since it doesn't seem to strictly work just implementing it at the
`Client.bars()` level, not sure why but..
Also, buncha `ruff` linting cleanups and fix the logger nameeee, lel.
For now, just detect and fill in gaps (via fresh backend queries)
*in the shm buffer* but eventually i'm pretty sure we can just write
these direct to the parquet file as well.
Use the new `.data._timeseries.detect_null_time_gap()` to find and fill
in the `ShmArray` index range, re-check it and enter a prompt if it
didn't totally fill.
Also,
- do a massive cleanup and removal of all unused/commented code.
- drop the duplicate frames tracking, don't think we need it after
removing multi-frame concurrent queries.
- change backfill loop variable `end_dt` -> `last_start_dt` which is
more semantically correct.
- fix logic to backfill any missing sub-sequence portion for any frame
query that overruns the shm buffer prependable space by detecting
the available rows left to insert and only push those.
- add a new `shm_push_in_between()` helper to match.
It took a little while (and a lot of commenting out of old no longer
needed code) but, this gets tsdb (from parquet file) loading *before*
final backfilling from the most recent history frame until the most
recent tsdb time stamp!
More or less all the convoluted concurrency shit we had for coping with
`marketstore` IPC junk is no longer needed, particularly all the query
size limits and accompanying load loops.. The recent frame loading
technique/order *has* now changed though since we'd like to show charts
asap once tsdb history loads.
The new load sequence is as follows:
- load mr (most recent) frame from backend.
- load existing history (one shot) from the "tsdb" aka parquet files
with `polars`.
- backfill the gap part from the mr frame back to the tsdb start
incrementally by making (hacky) `ShmArray.push(start=<blah>)` calls
and *not* updating the `._first.value` while doing it XD
Dirtier deatz:
- make `tsdb_backfill()` run per timeframe in a separate task.
- drop all the loop through timeframes and insert `dts_per_tf` crap.
- only spawn a subtask for the `start_backfill()` call which in turn
only does the gap backfilling as mentioned above.
- mask out all the code related to being limited to certain query sizes
(over gRPC) as was restricted by marketstore.. not gonna go through
what all of that was since it's probably getting deleted in a follow
up commit.
- buncha off-by-one tweaks to do with backfilling the gap from mr frame
to tsdb start.. mostly tinkered it to get it all right but seems to be
working correctly B)
- still use the `broadcast_all()` msg stuff when doing the gap backfill
though don't have it really working yet on the UI side (since
previously we were relying on the shm first/last values.. so this will
be "coming soon" :)
For OHLCV time series we normally presume a uniform sampling period
(1s or 60s by default) and it's handy to have tools to ensure a series
is gapless or contains expected gaps based on (legacy) market hours.
For this we leverage `polars`:
- add `.nativedb.with_dts()` a datetime-from-epoch-time-column frame
"column-expander" which inserts datetime-casted, epoch-diff and
dt-diff columns.
- add `.nativedb.detect_time_gaps()` which filters to any larger then
expected sampling period rows.
- wrap the above (for now) in a `piker store anal` (analysis) cmd which
atm always enters a breakpoint for tinkering.
Supporting storage client additions:
- add a `detect_period()` helper for extracting expected OHLC time step.
- add new `NativedbStorageClient` methods and attrs to provide for the above:
- `.mk_path()` to **only** deliver a parquet-file path for use in
other methods.
- `._dfs` to house cached `pl.DataFrame`s loaded from `.parquet` files.
- `.as_df()` which loads cached frames or loads them from disk and
then caches (for next use).
- `_write_ohlcv()` a private-sync version of the public equivalent
meth since we don't currently have any actual async file IO
underneath; add a flag for whether to return as a `numpy.ndarray`.
- drop buncha cruft from `store ls` cmd and make it work for
multi-backend fqme listing.
- including adding an `.address` to the mkts client which shows the
grpc socketaddr details.
- change defauls to new `'nativedb'.
- drop 'marketstore' from built-in backend list (for now)
It was a concurrency-hack mess somewhat due to all sorts of limitations
imposed by marketstore (query size limits, strange datetime/timestamp
errors, slow table loads for large queries..) and we can drastically
simplify. There's still some issues with getting new backfills (not yet
in storage) correctly prepended: there's sometimes little gaps due to shm
races when reading history indexing vs. when the live-feed startup
finishes.
We generally need tests for all this and likely a better rework of the
feed layer's init such that we're showing history in chart afap instead
of waiting on backfills or the live feed to come up.
Much more to come B)
Turns out no backend (including kraken) requires it and really this
kinda of measure should be implemented and recorded from our fsp layer
instead of (hackily) sometimes expecting it to be in "source data".
After much frustration with a particular tsdb (cough) this instead
implements a new native-file (and apache tech based) backend which
stores time series in parquet files (for now) using the `polars` apis
(since we plan to use that lib as well for processing).
Note this code is currently **very** rough and in draft mode.
Details:
- add conversion routines for going from `polars.DataFrame` to
`numpy.ndarray` and back.
- lay out a simple file-name as series key symbology:
`fqme.<datadescriptions>.parquet`, though probably it will evolve.
- implement the entire `StorageClient` interface as it stands.
- adjust `storage.cli` cmds to instead expect to use this new backend,
which means it's a complete mess XD
Main benefits/motivation:
- wayy faster load times with no "datums to load limit" required.
- smaller space footprint and we haven't even touched compression
settings yet!
- wayyy more compatible with other systems which can lever the apache
ecosystem.
- gives us finer grained control over the filesystem usage so we can
choose to swap out stuff like the replication system or networking
access.
Turns out just (over)writing `.parquet` files with >= 1M datums is like
less then a second, and we can likely speed up appends using
`fastparquet` (usage coming soon).
Includes:
- a new `clone` CLI subcmd to test this all out by ad-hoc copy of
(literally hardcoded to a daemon-actor specific shm allocation X) an
existing `/dev/shm/<ShmArray>` and push to `.parquet` file.
- code to convert from our `ShmArray.array: np.ndarray` ->
`polars.DataFrame` (thanks SO).
- timing checks around the file IO and np -> polars conversion.
- a `read` subcmd which i was using to test the sync `pymarketstore`
client against our async one to see if the issues from
https://github.com/pikers/piker/issues/443 were resolved, but nope!
Turns out trying to switch to the old sync client and going back to
using the old json-RPC API (after having had to patch the upstream repo
to not import gRPC machinery to avoid crashes..) I'm basically getting
the exact same issues.
New tinkering results does possibly tell some new stuff:
- the EOF error seems to indeed be due to trying fetch records which haven't been
written (properly) - like asking for a `end=<epoch_int>` that is
earlier then the earliest record.
- the "snappy input corrupt" error seems to have something to do with
the `Params.end` field not being an `int` and/or the int precision not
being chosen correctly?
- toying with this a bunch manually shows that the internals of the
client (particularly `.build_query()` stuff) is parsing/calcing the
`Epoch` and `Nanoseconds` values out incorrectly.. which is likely
part of the problem.
- we also changed `anyio_marketstore.MarketStoreclient.build_query()`
logic when removing `pandas` a while back, which also seems to be
part of the problem on the async side, however reverting those
changes also didn't fix the issue entirely; likely something else
more subtle going on (maybe with the write vs. read `Epoch` field
type we pass?).
Despite all this malarky, we're already underway more or less obsoleting
this whole thing with a much less complex approach of using apache
parquet files and modern filesystem tools to get a more flexible and
numerics-native dataframe-oriented tsdb B)
Turns out the reason we were originally making the `time: float` column in our
ohlcv arrays was bc that's what **only** ib uses XD (and/or 🤦)
Instead we changed the default field type to be an `int` (which is also
more correct to avoid `float` rounding/precision discrepancies) and thus
**do not need to override it** in all other (crypto) backends (except
`ib`). Now we only do the customization (via `._ohlc_dtype`) to `float`
only for `ib` for now (though pretty sure we can also not do that
eventually as well..)!
Use `def_iohlcv_fields` for a name and instead of copying and inserting
the index field pop it for the non-index version. Drop creating
`np.dtype()` instances since `numpy`'s apis accept both input forms so
this is simpler on our end.
Turns out you can mix and match `click` with `typer` so this moves what
was the `.data.cli` stuff into `storage.cli` and uses the integration
api to make it all work B)
New subcmd: `piker store`
- add `piker store ls` which lists all fqme keyed time-series from backend.
- add `store delete` to remove any such key->time-series.
- now uses a nursery for multi-timeframe concurrency B)
Mask out all the old `marketstore` specific subcmds for now (streaming,
ingest, storesh, etc..) in anticipation of moving them into
a subpkg-module and make sure to import the sub-cmd module in our top
level cli package.
Other `.storage` api tweaks:
- drop the reraising with custom error (for now).
- rename `Storage` -> `StorageClient` (or should it be API?).
To kick off our (tsdb) storage backends this adds our first implementing
a new `Storage(Protocol)` client interface. Going foward, the top level
`.storage` pkg-module will now expose backend agnostic APIs and helpers
whilst specific backend implementations will adhere to that middle-ware
layer.
Deats:
- add `.storage.marketstore.Storage` as the first client implementation,
moving all needed (import) dependencies out from
`.service.marketstore` as well as `.ohlc_key_map` and `get_client()`.
- move root `conf.toml` loading from `.data.history` into
`.storage.__init__.open_storage_client()` which now takes in a `name:
str` and does all the work of loading the correct backend module, its
config, and determining if a service-instance can be contacted and
a client loaded; in the case where this fails we raise a new
`StorageConnectionError`.
- add a new `.storage.get_storagemod()` just like we have for brokers.
- make `open_storage_client()` also return the backend module such that
the history-data layer can make backend specific calls as needed (eg.
ohlc_key_map).
- fall back to a basic non-tsdb backfill when `open_storage_client()`
raises the new connection error.
The plan is to offer multiple tsdb and other storage backends (for
a variety of use cases) and expose them similarly to how we do for
broker and data providers B)
Was borked on linux if you didn't provide the setting in `conf.toml` due
to some logic errors. Fix that by rejigging `DpiAwareFont` internal
variables:
- add new `._font_size_calc_key: str` which was the old `._font_size`
and is only used when no explicit font size is set by the user in the
`conf.toml` config:
- this is the "key" that is used to lookup a calculation function
which attempts to compute a best fit font size given the measured
system displays DPI settings and dimensions.
- make the `._font_size: int` the **actual** font size integer that is
cached and passed to `Qt` to set the size.
- this is overridden by user config now if defined.
- change the input kwarg `font_size: str` to the constructor to better
change the input kwarg `font_size: str` to the constructor to better
named private `_font_size_key: str` which gets set to the new
`._font_size_calc_key`.
Also, adjust all client code which instantiates `DpiAwareFont` to use
the new `_font_size_key` kwarg input so nothing breaks XD
Was only really borked for higher-precision but lower priced assets
(like TLOS or peeneez) which have a `MktPair.price_tick_digits >= 2`.
The issue was using the wrong attr, the `size_tick_digits`..
Including changing to `LinkedSplits.mkt: MktPair` and adding an explicit
setter method for setting it and being sure that nothing breaks
in the display system init!
For this commit we leave in warning access to `LinkedSplits.symbol` but
will remove in following commit.
Only stuff left was the allocator stuff. Drop the top level subpkg
exports and finally kill off the awkwardly named
`Symbol.lot_size_digits` properties XD
Expose a bunch more util funcs at subpkg top level, do some typing in
allocator method internals.
Contract matching in live setup was borked; switch to
`MktPair.dst.atype` matching, don't override the `cmdty` "venue" (a
weird special case) in `get_mkt_info()` otherwise lookup will fail..
Stash it for now in the (now mutable by default) `.shm_write_opts` and
have the new `Flume._has_vlm: bool` (only set to false internally by
feed layer) which can be read via new public `.has_vlm()` predicate.
Move out the old `.ui/_fsp` helper logic to this flume method.
Using the new `._ahab.start_ahab_service()` mngr of course, and now
support user config overrides (such that our defaults can be modified by
a keen user, say using a config file, or for testing). This is where the
functionality moved out of the `pikerd` init has been moved - instead of
being triggered by bool flag inputs to that factory.
For marketstore actually support overriding the entire yaml config via
runtime `_yaml_config_str: str` formatting with any passed user dict,
primarily focussing on supporting override of the sockaddrs for testing.
Allows for using the `Services.cancel_service()` api for explicit
cancellation in tests and eventually for remote teardown. Change
`.start_ahab()` to an `@acm` `start_ahab_service()` and just yield back
the same values we were returning prior. Also fix the logging (level) to
actually reflect what's passed in - we weren't using the correct name
/ instance from the `.sevice` subpkg..
`Flume.mkt.fqme` might not be exactly the same as the local
version now since we've had to add some hacks to certain backends
(cough ib) to handle `MktPair.src` not being set as an `Asset` (yet).
Since `open_trade_ledger()` now requires a sort we pass in a combo of
the std `pendulum.parse()` for API records and a custom flex parser for
flex entries pulled offline.
Add special handling for `MktPair.src` such that when it's a fiat (like
it should always be for most legacy assets) we try to get the fqme
without that `.src` token (i.e. not mnqusd) to avoid breaking
roundtripping of live feed requests (due to new symbology) as well as
the current tsdb table key set..
Do a wholesale renaming of fqsn -> fqme in most of the rest of the
backend modules.
Turns out that reading **and** writing with `tomlkit` is just wayya slow
for large documents like ledger files so move to using the `tomli`
sibling pkg `tomli-w` which seems to much improve on the latency, though
obviously longer run we're likely going to want:
- a better algorithm for only back loading records using as little
history as possible
- a different serialization format for production maybe something
like apache parquet?
The only issue with using a non-style-preserving writer is that we don't
necessarily get TOML conf ordering for free (without first ordering it
ourselves), and thus this patch also adds much more general date-time
sorting machinery which is now **required** when using
`open_trades_ledger()` via a `tx_sort: Callable`. By default we now
provide `.accounting._ledger.iter_by_dt()` (exposed in the subpkg mod)
which conducts dynamic "datetime key detection" based parsing of records
based on a `parsers: dict[str, Callabe]` input table. The default should
handle most use cases including all currently supported live backends
(kraken, ib) as well as our paper engine ledger-records format.
Granulars:
- adjust `Position.iter_clears()` to use new `iter_by_dt(key=lambda ..)`
signature.
- add `tomli-w` to setup and our `tomlkit` fork to requirements file.
- move `.write_config()` to bottom of class defn.
- fix closed pos popping to not error if pp was already popped..
Since porting all backends to the new `FeedInit` + `MktPair` + `Asset`
style init, we can now just directly pass a `MktPair` instance to the
history endpoint(s) since it's always called *after* the live feed
`.stream_quotes()` ep B)
This has a lot of benefits including allowing brokerd backends to have
more flexible, pre-processed market endpoint meta-data that piker has
already validated; makes handling special cases in much more straight
forward as well such as forex pairs from legacy brokers XD
First pass changes all crypto backends to expect this new input, ib will
come next after handling said special cases..
Since most (legacy) stock brokers design their symbology without
including the target exchange's source asset name - normally a fiat
currency like USD - this adds an option for rendering market endpoints
without that token for simpler use in backends for such brokers.
As an example IB doesn't expect a `mnq/usd.cme.ib` symbol and instead
presumes that since the CME lists all assets in USD then the source
asset is implied.
Impl details:
- add `MktPair.pair: str` which replaces `.key` as a better name.
- offer a `without_src: bool` to a new `.get_fqme()` getter method
which will render everything the same minus the src token.
- expose the new flag through both the new `.get_fqme()` and
`.get_bs_fqme()` methods and wrap those both under the original
property names `.bs_fqme` and `.fqme`.
Previously the subscription response handling was a bit sloppy what with
ignoring the welcome msg; this now correctly expects the correct startup
sequence. Also this avoids warn logging on pong messages by expecting
them in the msg loop and further drops the `KucoinMsg` struct and
instead changes the msg loop to expect `dict`s and only cast to structs
on live feed msgs that we actually process/relay.
Mostly renaming from the old acronym. This also contains necessary
conf.toml loading in order to call `open_storage_client()` which now
does not have default network contact info.
Previously we were passing the `fqme: str` which isn't as extensive nor
were we able to pass `MktPair` direct to backend history manager-loading
routines (which should be able to rely on always receiving it since
currently `stream_quotes()` is always called first for setup).
This also starts a slight bit of configuration oriented tsdb info
loading (via a new `conf.toml`) such that a user can decide to host
their (marketstore) db on a remote host and our container spawning and
client code will do the right startup automatically based on the config.
|-> Related to this I've added some comments about doing storage
backend module loading which should get actually written out as part of
patches coming in #486 (or something related).
Don't allow overruns again in history context since it seems it was
never a problem?
As per the new market info packing schema this patch almost gets it
completely compatible and useful via implementing the `get_mkt_info()`
backend module endpoint B)
There's still some questions around `MktPair.src` since all the contract
search machinery in the ib api isn't expecting a fiat currency in the
symbol key: for ex. `mnq/usd.cme.20230616.ib` has no handling for the
`[/]usd` part. For now i'm just excluding the `.src` since it requires
extra parsing on quotes-feed requests even though this is also currently
breaking forex pairs (idealpro or wtv). I think ideally we do move to
a `dst/src.<venue>.<etc..>` style but it's going to require adjustments
to all the existing crypto backends..
This also allows dropping the old `mk_init_msgs()` closure.
We need to allow overruns during the async multi-broker context spawning
init bc some backends might take longer then others to setup (eg.
binance vs. kucoin) and result in some context (stream) being overrun by
the time we get to the `.open_stream()` phase. Ideally, we can maybe
adjust the concurrent setup to be more of a task-per-provider style to
avoid this in the future - which would also in theory result in
more-immediate per-provider setup in terms showing ready feeds asap.
Also, does a bunch of renaming from fqsn -> fqme and drops the lower
casing of input symbols instead expecting the caller to know what the
data backend it's requesting is going to be able to handle in terms of
symbology.
Instead of pre-converting and mapping piker style fqmes to
`KucoinMktPair`s make `Client._pairs` keyed by the kucoin native market
ids and instead also create a `._fqmes2mktids: bidict[str, str]` for
doing lookups to the native pair from the fqme.
Also, adjust any remaining `fqsn` naming to fqme.
So that styling is preserved on write but requires that we pop `None`
values (in this case any unset `.expiry` transactions) due to `tomkit`
having no support for writing them as values?
So that we aren't creating blank files for legacy configs (as we do name
changes or wtv). Further change `.get_conf_path()` to validate against
new `account.` prefix and a god `conf.toml` file.
Allows for direct loading of an "account file configuration" contents
without having to pass the explicit config dir path. In this case we are
also rewriting the `pps.<brokername>.<accnt_name>.toml` file names to
instead have a `account.` prefix, but providing this helper function
allows such changes more easily in the future - since callers won't have
to use the lower level `.load()` input signature.
Also add some todo comments about moving to `tomlkit`.
I figure we might as well support multiple types of distributed
multi-host setups; why not allow running the API (gateway) and thus vnc
server on a diff host and allowing clients to connect and do their thing
B)
Deatz:
- make `ib._util.data_reset_hack()` take in a `vnc_host` which gets
proxied through to the `asyncvnc` client.
- pull `ib_insync.client.Client` host value and pass-through to data
reset machinery, presuming the vnc server is running in the same
container (and/or the same host).
- if no vnc connection **and** no i3ipc trick can be used, just report
to the user that they need to remove the data throttle manually.
- fix `feed.get_bars()` to handle throttle cases the same based on error
msg matching, not error the code and add a max `_failed_resets` count
to trigger bailing on the query loop.
No longer need to implement connection timeout logic in the streaming
code, instead we just `async for` that bby B)
Further refining:
- better `KucoinTrade` msg parsing and handling with object cases.
- make `subscribe()` do sub request in a loop wand wair for acks.
Previously you couldn't have a brokerd backend which defined
`.trades_dialogue()` but which could also indicate that the paper
clearing engine should be used. This adds that support by allowing the
endpoint task to return a simple `'paper'` string, in which case the ems
will boot a paperboi.
The obvious useful case for this is if you have a broker you want to use
but do not have actual broker credentials setup (yet) with that provider
in your `brokers.toml`; demonstrated here with the adjustment to
`kraken`'s startup to no longer raise a runtime error B)
To fit with the rest of the new requirements added in `.data.validate`
this adds `FeedInit` init including `MktPair` and `Asset` loading for
all spot currencies provided by `kucoin`.
Deatz:
- add a `Currency` struct and accompanying `Client.get_currencies()` for
storing all asset infos.
- implement `.get_mkt_info()` which loads all necessary accounting and
mkt meta-data structs including adding `.price/size_tick` fields to
the `KucoinMktPair`.
- on client boot, async spawn requests to cache both symbols and currencies.
- pass `subscribe()` as the `fixture` arg to `open_autorecon_ws()`
instead of opening it manually.
Other:
- tweak `Client._request` to not expect the prefixed `'/'` for the
`endpoint: str`.
- change the `api_v` arg to just be `api: str`.
Since it's a bit weird having service specific implementation details
inside the general service `._daemon` mod, and since i'd mentioned
trying this re-org; let's do it B)
Requires enabling the new mod in both `pikerd` and `brokerd` and
obviously a bit more runtime-loading of the service modules in the
`brokerd` service eps to avoid import cycles.
Also moved `_setup_persistent_brokerd()` into the new mod since the
naming would place it there even though the implementation really
wouldn't (longer run) since we want to split up `.data.feed` layer
backend-invoked eps into a separate actor eventually from the "actual"
`brokerd` which will be the actor running **only** the trade control eps
(eg. trades_dialogue()` and friends).
`trio`'s internals don't allow for async generator (and thus by
consequence dynamic reset of async exit stacks containing `@acm`s)
interleaving since doing so corrupts the cancel-scope stack. See details
in:
- https://github.com/python-trio/trio/issues/638
- https://trio-util.readthedocs.io/en/latest/#trio_util.trio_async_generator
- `trio._core._run.MISNESTING_ADVICE`
We originally tried to address this using
`@trio_util.trio_async_generator` in backend streaming code but for
whatever reason stopped working recently (at least for me) and it's more
or less implemented the same way as this patch but with more layers and
an extra dep. I also don't want us to have to address this problem again
if/when that lib isn't able to keep up to date with wtv `trio` is
doing..
So instead this is a complete rewrite of the conc design of our
auto-reconnect ws API to move all reset logic and msg relay into a bg
task which is respawned on reset-requiring events: user spec-ed msg recv
latency, network errors, roaming events.
Deatz:
- drop all usage of `AsyncExitStack` and no longer require client code
to (hackily) call `NoBsWs._connect()` on msg latency conditions,
intead this is all done behind the scenes and the user can instead
pass in a `msg_recv_timeout: float`.
- massively simplify impl of `NoBsWs` and move all reset logic into a
new `_reconnect_forever()` task.
- offer use of `reset_after: int` a count value that determines how many
`msg_recv_timeout` events are allowed to occur before reconnecting the
entire ws from scratch again.
Since we have made `MktPair.bs_mktid` mean something else now, change
all the feed setup var names to instead be more representative of the
actual value: `bs_fqme: str` and use the new `MktPair.bs_fqme` where
necessary.
The legacy version was a `dict` of `dicts` vs. now we want to be handed
a `list[FeedInit]`; process both in a factored way.
Drop `FeedInit.bs_mktid` since it's already defined on `.mkt.bs_mktid`
and we don't really need it top level.
More or less a replacement for what @guilledk did with the initial
attempt at a "broker check" type script a while back except in this case
we're going to always run this validation routine and it now uses a new
`FeedInit` struct to ensure backends are delivering the right schema-ed
data during startup. Also allows us to stick deprecation warnings / and
or strict API compat errors all in one spot (at least for live feeds).
Factors out a bunch of `MktPair` related adapter-logic into a new
`.validate.valiate_backend()` which warns to the backend implementer via
log msgs all the problems outstanding. Ideally we do our backend module
endpoint scan-and-complain regarding missing feature support from here
as well (eg. search, broker/trade ctl, ledger processing, etc.).
It needed some work..
- Make `unpack_fqme()` always return a 4-tuple handling the venue and
suffix parts more generally.
- add `Asset.Asset.guess_from_mkt_ep_key()` a like-it-sounds hack at
trying to render a `.dst: Asset` for most most purposes throughout the
stack.
- always try to preprocess the input `fqme: str` with `unpack_fqme()` in
`MktPair.from_fqme()` and use the new `Asset` method (above) to make
up a `.dst: Asset` pulling as much meta-info we can from the caller.
- add `MktPair.bs_fqme` to get the thing without the broker part..
- add an `'unknown'` value to the `_derivs` def.
- drop `Symbol.from_fqsn()` and `unpack_fqsn()` more generally (yes
BREAKING).
- only preload necessary (one for clearing, all for ledger sync)
`MktPair` info from the backend using `.get_mkt_info()`, build the
`mkt_by_fqme: dict[str, MktPair]` and pass it to
`TransactionLedger.iter_trans()`.
- use new `TransactionLedger.update_from_t()` method on clears.
- sanity check all `mkt_by_fqme` entries against `Flume.mkt` values
when we open a data feed.
- rename `PaperBoi._syms` -> `._mkts`.
Turns out we actually had further pp entry bugs due to *not quantizing*
the size inside `.minimize_clears()` method calcs; fix that using
`Position.sys.mkt.quantize()` as is done in `Position.calc_size()`.
Fix `PpTable.write_config()` to drop from the TOML config any
`closed: dict[str, Position]` entries delivered by `.dump_active()`.
Add a more detailed doc string for our position type and a little todo
for the `.bep` B)
Since ledger records are often provided (and thus stored) from most
backends *without* containing the info we normally need for accounting
defined by `MktPair`, this extends the ledger method to take in a table
that allows assigning the `Transaction.sys` from an fqme lookup. This
way client code (like the paper engine and new ledger mgmt tools) can
do the mkt info lookup before hand and then load both ledger
`Transactions` and positions via the `PpTable` and get correct
accounting calculations, always :fingers_crossed:
Also adds `TransactionLedger.update_from_t(t: Transaction)` to allow
updating directly from an existing tran instead of making the user cast
to a `dict` first. Includes fix to `.to_dict()` to always pop the `.sym`
again to avoid client code having to do so.
Wow not sure how that happened, but we should probably use the correct
market precision info for the correct parameter..
Also, use `@lru_cache` on new `get_mkt_info()` ep, seems to work?
In `.feed` and `._sampling` move to using the new
`tractor.Context.open_stream(allow_overruns: bool)` (cough, A BREAKING
CHANGE).
Also set `Flume.mkt` during construction in `.feed.open_feed()`.
Might as well try and flip it over to the new type; make appropriate
dict serialization changes in `.to_msg()`. Alias back to `.symbol:
Symbol` with a property.
Frickin ib, they give you the `0.001` (or wtv) in the
`ContractDetails.minSize: float` but won't accept fractional sizes
through the API.. Either way, it's probably not sane to be supporting
fractional order sizes for legacy instruments by default especially
since it in theory affects a lot of the clearing outcomes by having ib
do wtv magical junk behind the scenes to make it work..
Order mode previously was just willy-nilly sending `float` prices
(particularly on order edits) which are generated from the associated
level line. This actually uses the `MktPair.price_tick: Decimal` to
ensure the value is rounded correctly before submission to the ems..
Also adjusts the order mode init to expect a table of tables of startup
position messages, with the inner table being keyed by fqme per msg.
Tried a couple libs and ended up sticking with `rich` (since it's the
sibling lib to `typer`) but also (initially) implemented a version with
`blessings` that I ended up commenting out (and will likely remove).
Adjusted the CLI I/O a slight bit as well:
- require a fully qualified account name of the form:
`<brokername>.<accountname>` and error on non-matching input.
- dump positions summary lines as humanized size, ppu and cost basis
values per line.
When processing paper trades ledgers we normally won't have specific
`MktPair` info for the backend market we're simulating, as such we
need to look up this info when updating pps.toml files such that we
get precision info correct (particularly in the case of cryptos!) and
can also run paper ledger processing without running the simulated
clearing loop. In order to make it happen we lookup any `get_mkt_info()`
ep on the backend and pass the output to the `force_mkt` input of the
`PpTable.update_from_trans()` method.
This will (likely) act as a new backend query endpoint for other `piker`
(client) code to lookup `MktPair` info from each backend. To start it
also returns the backend-broker's local `Pair` (or wtv other type) as
well.
The main motivation for this is for our paper engine which can require
the mkt info when processing paper-trades ledgers which do not contain
appropriate info to compute position metrics.
Instead of stripping the broker part just use the full fqme for all
`Transaction.bs_mktid: str` values since it makes indexing the `PpTable`
much easier with less key mangling..
Change the root-service-task entrypoint to accept the level and
setup a console log as is now expected for all sub-services. Cast all
backend delivered startup `BrokerdPosition` msgs and log them to
console.
If you want a sub-actor to write console logs (with the right level) the
`get_console_log()` call has to be made somewhere during service task
startup. Previously this wasn't well formalized nor used (depending on
daemon) so passing `loglevel` to the service's root-task-endpoint (eg.
`_setup_persistent_brokerd()`) encourages that the daemon's logging is
configured during init according to the spawner's requesting logging
config. The previous `get_console_log()` call happening inside
`maybe_spawn_daemon()` wasn't actually doing anything in the target
daemon XD, so obviously remove that and instead passthrough loglevel
to the ctx endpoints and service manager methods.
Turns out we don't hookup our eventkit handler until after the
`load_aio_clients()` is complete, which means we can't get
`ib_insync.Client.apiError` events unless inside the asyncio side task.
So I guess try to report any such errors during API scan (note the
duplicate client id case is a special one from ibis itself) even though
we're not going to catch them trio side. The hack to work around this is
to just increment the client id value with the `connect_retries` led `i`
value even though that will break on more then 3 clients attached to an
API endpoint lul ..
Further adjustments that were to the end of trying to fix this proper:
- add `remove_handler_on_err()` cm to disconnect a handler when the trio
side of the channel closes.
- actually connect to client api erros in our `Client.inline_errors()`
- increase connect timeout to a sec.
- change the trio-asyncio proxy response-msg loop over to `match:`
syntax and raise on unhandled msgs from eventkit handlers.
We previously only offered a sync API (which was recently renamed to
`.<meth>_nowait()` style) since initially all order control was from our
`OrderMode` Qt driven UI/UX. This adds the equivalent async methods for
both testing as well as eventual auto-strat driven control B)
Also includes a bunch of renaming:
- `OrderBook` -> `OrderClient`.
- better internal renaming of the client's mem chan vars and add a ref
`._ems_stream: tractor.MsgStream`.
- drop `get_orders()` factory, just always check for the actor-global
instance and always set the ems stream on that client (in case old one
was closed).
This will end up being super handy for testing our accounting subsystems
as well as providing unified and simple cli utils for managing ledgers
and position tracking. Allows loading the paper boi without starting
a data feed and instead just trigger ledger and pps loading without
starting the entire clearing engine.
Deatz:
- only init `PaperBoi` and start clearing loop (tasks) if a non-`None`
fqme is provided, ow just `Context.started()` the existing pps msgs
as loaded from the ledger.
- always update both the ledger and pp table on startup and pass
a single instance of each obj to the `PaperBoi` for reuse (without
opening and closing backing config files since we now have
`.write_config()`).
- drop the global `_positions` dict, it's not needed any more if we use
a `PaperBoi.ppt: PpTable` which persists with the engine actor's
lifetime.
Add a new `class TransactionLedger(collections.UserDict)` for managing
ledger (files) from a `dict`-like API. The main motivations being easy
conversion between `dict` <-> `Transaction` obj forms as well as dynamic
(toml) file updates via a set of methods:
- `.write_config()` to render and write state to the local toml file.
- `.iter_trans()` to allow iterator style conversion to `Transaction`
form for each entry.
- `.to_trans()` for the dict output from the above.
Some adjustments to `Transaction` namely making `.sym/.sys` optional for
now so that paper engine entries can be loaded (offline) without
connecting to the emulated broker backend. Move to using `pathlib.Path`
throughout for bootyful toml file mgmt B)
When loading a `Position` from a pps file we might not have the entire
`MktPair` field-set loaded (though going forward that shouldn't really
ever happen except in the case of a legacy `pps.toml`), in which case we
can check if the `.fqme: str` value loaded from the transaction is
longer and use that instead - presuming it must have more mkt meta-data
filled out.
Also includes some more `fqsn` -> `fqme` renames.
Been meaning to do this port for a while and since it makes passing
around file handles (presumably alongside the in mem obj form) a lot
simpler/nicer and the implementations of all the config file handling
much more terse with less presumptions about the form of filename/dir
`str` values all over the place B)
moar technically, let's us:
- drop remaining `.config` usage of `os.path`.
- return `Path`s from most routines.
- adds a special case to `get_conf_path()` such that if the input name
contains a `pps.` pattern, we avoid validating the name; this is going
to be used by new `.accounting.open_pps()` code which will instead
write a separate TOML file for each account B)
Previous we were re-processing all ledgers for every position msg
received from the API, per client.. Instead do that once in a first pass
and drop all key-miss lookups for `bs_mktid`s; it should never happen.
Better typing for in-routine vars, convert pos msg/objects to `dict`
prior to logging so it's sane to read on console. Skip processing
specifically option contracts for now.
Turns out `binance` is pretty great with their schema since they have
more or less the same data schema for their exchange info ep which we
wrap in a `Pair` struct:
https://binance-docs.github.io/apidocs/spot/en/#exchange-information
That makes it super easy to provide the most general case for filling
out a `MktPair` with both `.src/dst: Asset` to maintain maximum
meta-data B)
Deatz:
- adjust `Pair` to have `.size/price_tick: Decimal` by parsing out
the values from the filters field; TODO: we should probably just rewrite
the input `.filter` at init time so we can keep the frozen style.
- rename `Client.mkt_info()` (was `.symbol_info` to `.exch_info()`
better matching the ep name and have it build, cache, and return
a `dict[str, Pair]`; allows dropping `.cache_symbols()`
- only pass the `mkt_info: MktPair` field in the init msg!
Accept a msg with any of:
- `.src: Asset` and `.dst: Asset`
- `.src: str` and `.dst: str`
- `.src: Asset` and `.dst: str`
but not the final combo tho XD
Also, fix `.key` to properly cast any `.src: Asset` to string!
If user has loaded from a flex report then we don't want the API records
from the same period to override those; instead just update with any
missing fields from the API schema.
Also, always `str`-ify the contract id (what is set for the `.bs_mktid`
*before* packing into transaction type to ensure when serialized to
`pps.toml` there are no discrepancies at the codec level.. smh
Instead adjust `load_aio_clients()` to only reload clients detected as
non-loaded or disconnected (2 birds), and avoid use of the global module
table which could result in stale disconnected clients persisting on
multiple `brokerd` client reconnects, resulting in error.
To make nested `msgspec.Struct`s work we need to tell the codec that the
`.symbol` is some struct def, since we don't really need to enforce that
(yet) we're just going to enc/dec as `str` until we further formalize
and/or need something more complex.
Initial attempt at getting the sampling and shm layer to use the new mkt
info meta-data type. Draft out a potential `BackendInitMsg:
msgspec.Struct` for validating the init msg returned from the
`stream_quotes()` start value; obvs don't actually use it yet.
To be compat with the `Symbol` (for now) and generally allow for reading
the (derivative) contract specific part of the fqme. Adjust
`contract_info: list[str]` and make `src: str = ''` by default.
Add `MktPair` handling block for when a backend delivers
a `mkt_info`-field containing init msg. Adjust the original
`Symbol`-style `'symbol_info'` msg processing to do `Decimal` defaults
and convert to `MktPair` including slapping in a hacky `_atype: str`
field XD
General initial name changes to `bs_mktid` and `_fqme` throughout!
For `price_tick` and `size_tick` we read in `str` and decimal-ize
and now correctly fail over to default values of the same type..
Also, always treat `bs_mktid` field as a `str` in TOML form.
Drop the strange `clears: dict` var from the loading code (not sure why
that was left in smh) and better name `toml_clears_list` for the
TOML-loaded-pre-transaction sequence.
Handle case where `'dst'` field is just a `str` (in which case delegate to
`.from_fqme()`) as well as do `Asset` loading and use our
`Struct.copy()` to enforce type-casting to (for eg. `Decimal`s) such
that we'll now capture typing errors despite IPC transport.
Change `Symbol.tick_size` and `.lot_tick_size` defaults to decimal
for proper casting and type `MktPair.atype: str` since `msgspec` can't
cast to `AssetTypeName` without special handling..
Allows building a `MktPair` from the backend specific `Pair` for
eventual use in the data feed layer. Also adds `Pair.price/tick_size` to
get to the expected tick precision info format.
Add a logic branch for now that switches on an instance check.
Generally swap over all `Position.symbol` and `Transaction.sym` refs to
`MktPair`. Do a wholesale rename of all `.bsuid` var names to
`.bs_mktid`.
Instead let's name it `.sys` for "system", the thing we use to conduct
the "transactions" ..
Also rename `.bsuid` -> `.bs_mktid` for "backend system market id`
which is more explicit, easier to remember and read.
Prepping to entirely replace `Symbol`; this adds a buncha docs/comments,
better implementation for representing and parsing the FQME: "fully
qualified market endpoint".
Deatz:
- make `.src` an optional field until we figure out how we're going
to support loading source assets from all backends sensibly..
- implement `MktPair.fqme: str` (what was previously called `fqsn`)
using a new util func: `maybe_cons_tokens()`.
- `Symbol.brokers` and expect only `.broker` usage.
- remap anything with `fqsn` in the name to `fqme` with aliases from the
old name.
- implement `unpack_fqme()` with `match:` syntax B)
- add `MktPair.tick_size_digits`, `.lot_size_digits`, `.fqsn`, `.key` for
backward compat.
- make all fqme generation related fields empty `str`s by default.
- add `MktPair.resolved: bool` a flag indicating whether or not `.dst`
is an `Asset` instance or just a string and, `.bs_mktid` the field
to hold the "backend system market id" per broker.
Try out using our new internal type for storing info about kraken's asset
infos now stored in the `Client.assets: dict[str, Asset]` table. Handle
a server error when requesting such info msgs.
Drop everything we can in terms of methods and attrs from `Symbol`:
- kill `.tokens()`, `.front_feed()`, `.tokens()`, `.nearest_tick()`,
`.front_fqsn()`, instead moving logic from these methods into
dependents (and obviously removing any usage from rest of code base,
coming in follow up commits).
- rename `.quantize_size()` -> `.quantize()`.
- re-implement `.brokers`, `.lot_size_digits`, `.tick_size_digits` as
`@property` methods; for the latter two, allows us to minimize to only
accepting min tick decimal values on alternative constructor class
methods and to drop the equivalent instance vars.
- map `_fqsn` related variable names to new and preferred `_fqme`.
We also juggle around some utility functions, moving limited precision
related `decimal.Decimal` routines to the top of module and soon-to-be
legacy `fqsn` related routines to the bottom.
`MktPair` draft type extensions:
- drop requirements for `src_type`, and offer the optional `.dst_type`
field as either a `str` or (new `typing.Literal`) `AssetTypeName`.
- define an equivalent `.quantize()` as (re)defined in `Symbol` but with
`quantity_type: str` field which specifies whether to use the price or
the size precision.
- add a lot more docs, a `.key` property for the "symbol" name, draft
property for a `.fqme: str`
- allow `.src` and `.dst` to be of type `str | Asset`
Add a new `Asset` to capture "things which can be used in markets and/or
transactions" XD
- defines a `.name`, `.atype: AssetTypeName` a financial category tag, `tx_tick:
Decimal` the precision limit for transactions and of course
a `.quantime()` method for doing accounting arithmetic on a given tech
stack.
- define the `atype: AssetTypeName` type as a finite set of `str`s
expected to be used in various ways for default settings in other
parts of the data and order control layers..
Our issue was not having the correct value set on each
`Symbol.lot_tick_size`.. and then doing PPU calcs with the default set
for legacy mkts..
Also,
- actually write `pps.toml` on broker mode exit.
- drop `get_likely_pair()` and import from pp module.
Not sure how this worked before but, the PPU calculation critically
requires that the order of clearing transactions are in the correct
chronological order! Fix this by sorting `trans: dict[str, Transaction]`
in the `PpTable.update_from_trans()` method.
Also, move the `get_likely_pair()` parser from the `kraken` backend here
for future use particularly when we revamp the asset-transaction
processing layer.