Not only improves startup latency but also avoids a bug where the rt
buffer was being tsdb-history prepended *before* the backfilling of
recent data from the backend was complete resulting in our of order
frames in shm.
Factor the multi-sample-rate region UI connecting into a new helper
`link_views_with_region()` which reads in the shm buffer offsets from
the `Feed` and appropriately connects the fast and slow chart handlers
for the linear region graphics. Add detailed comments writeup for the
inter-sampling transform algebra.
If a history manager raises a `DataUnavailable` just assume the sample
rate isn't supported and that no shm prepends will be done. Further seed
the shm array in such cases as before from the 1m history's last datum.
Also, fix tsdb -> shm back-loading, cancelling tsdb queries when either
no array-data is returned or a frame is delivered which has a start time
no lesser then the least last retrieved. Use strict timeframes for every
`Storage` API call.
Turns out querying for a high freq timeframe (like 1sec) will still
return a lower freq timeframe (like 1Min) SMH, and no idea if it's the
server or the client's fault, so we have to explicitly check the sample
step size and discard lower freq series-results. Do this inside
`Storage.read_ohlcv()` and return an empty `dict` when the wrong time
step is detected from the query result.
Further enforcements,
- both `.load()` and `read_ohlcv()` now require an explicit `timeframe:
int` input to guarantee the time step of the output array.
- drop all calls `.load()` with non-timeframe specific input.
Our default sample periods are 60s (1m) for the history chart and 1s for
the fast chart. This patch adds concurrent loading of both (or more)
different sample period data sets using the existing loading code but
with new support for looping through a passed "timeframe" table which
points to each shm instance.
More detailed adjustments include:
- breaking the "basic" and tsdb loading into 2 new funcs:
`basic_backfill()` and `tsdb_backfill()` the latter of which is run
when the tsdb daemon is discovered.
- adjust the fast shm buffer to offset with one day's worth of 1s so
that only up to a day is backfilled as history in the fast chart.
- adjust bus task starting in `manage_history()` to deliver back the
offset indices for both fast and slow shms and set them on the
`Feed` object as `.izero_hist/rt: int` values:
- allows the chart-UI linked view region handlers to use the offsets
in the view-linking-transform math to index-align the history and
fast chart.
Allows keeping mutex state around data reset requests which (if more
then one are sent) can cause a throttling condition where ib's servers
will get slower and slower to conduct a reconnect. With this you can
have multiple ongoing contract requests without hitting that issue and
we can go back to having a nice 3s timeout on the history queries before
activating the hack.
When a network outage or data feed connection is reset often the
`ib_insync` task will hang until some kind of (internal?) timeout takes
place or, in some (worst) cases it never re-establishes (the event
stream) and thus the backend needs to restart or the live feed will
never resume..
In order to avoid this issue once and for all this patch implements an
additional (extremely simple) task that is started with the real-time
feed and simply waits for any market data reset events; when detected
restarts the `open_aio_quote_stream()` call in a loop using
a surrounding cancel scope.
Been meaning to implement this for ages and it's finally working!
Allows for easier restarts of certain `trio` side tasks without killing
the `asyncio`-side clients; support via flag.
Also fix a bug in `Client.bars()`: we need to return the duration on the
empty bars case..
This allows the history manager to know the decrement size for
`end_dt: datetime` on the next query if a no-data / gap case was
encountered; subtract this in `get_bars()` in such cases. Define the
expected `pendulum.Duration`s in the `.api._samplings` table.
Also add a bit of query latency profiling that we may use later to more
dynamically determine timeout driven data feed resets. Factor the `162`
error cases into a common exception handler block.
When we get a timeout or a `NoData` condition still return a tuple of
empty sequences instead of `None` from `Client.bars()`. Move the
sampling period-duration table to module level.
It doesn't seem to be any slower on our least throttled backend
(binance) and it removes a bunch of hard to get correct frame
re-ordering logic that i'm not sure really ever fully worked XD
Commented some issues we still need to resolve as well.
Manual tinker-testing demonstrated that triggering data resets
completely independent of the frame request gets more throughput and
further, that repeated requests (for the same frame after cancelling on
the `trio`-side) can yield duplicate frame responses. Re-work the
dual-task structure to instead have one task wait indefinitely on the
frame response (and thus not trigger duplicate frames) and the 2nd data
reset task poll for the first task to complete in a poll loop which
terminates when the frame arrives via an event.
Dirty deatz:
- make `get_bars()` take an optional timeout (which will eventually be
dynamically passed from the history mgmt machinery) and move request
logic inside a new `query()` closure meant to be spawned in a task
which sets an event on frame arrival, add data reset poll loop in the
main/parent task, deliver result on nursery completion.
- handle frame request cancelled event case without crash.
- on no-frame result (due to real history gap) hack in a 1 day decrement
case which we need to eventually allow the caller to control likely
based on measured frame rx latency.
- make `wait_on_data_reset()` a predicate without output indicating
reset success as well as `trio.Nursery.start()` compat so that it can
be started in a new task with the started values yielded being
a cancel scope and completion event.
- drop the legacy `backfill_bars()`, not longer used.
Adjust all history query machinery to pass a `timeframe: int` in seconds
and set default of 60 (aka 1m) such that history views from here forward
will be 1m sampled OHLCV. Further when the tsdb is detected as up load
a full 10 years of data if possible on the 1m - backends will eventually
get a config section (`brokers.toml`) that allow user's to tune this.
The `Store.load()`, `.read_ohlcv()` and `.write_ohlcv()` and
`.delete_ts()` now can take a `timeframe: Optional[float]` param which
is used to look up the appropriate sampling period table-key from
`marketstore`.
Allow data feed sub-system to specify the timeframe (aka OHLC sample
period) to the `open_history_client()` delivered history fetching API.
Factor the data keycombo hack into a new routine to be used also from
the history backfiller code when request latency increases; there is
a first draft at trying to use the feed reset to speed up 1m frame
throttling by timing out on the history frame response, but it needs
a lot of fine tuning.
This is a simpler (and oddly more `trio`-nic and/or SC) way to handle
the cancelled-before-acked race for order dialogs. Will allow keeping
the `.req` field as solely an `Order` msg.
When the client is faster then a `brokerd` at submitting and cancelling
an order we run into the case where we need to specify that the EMS
cancels the order-flow as soon as the brokerd's ack arrives. Previously
we were stashing a `BrokerdCancel` msg as the `Status.req` msg (to be
both tested for as a "already cancelled" and sent immediately on ack arrival to
the broker), but for such
cases we can't use that msg to find the fqsn (since only the client side
msgs have it defined) which is required by the new
`Router.client_broadcast()`.
So, Since `Status.req` is supposed to be a client-side flow msg anyway,
and we need the fqsn for client broadcasting, we change this `.req`
value to the client's submitted `Cancel` msg (thus rectifying the
missing `Router.client_broadcast()` fqsn input issue) and build the
`BrokerdCancel` request from that `Cancel` inline in the relay loop
from the `.req: Cancel` status msg lookup.
Further we allow `Cancel` msgs to define an `.account` and adjust the
order mode loop to expect `Cancel` source requests in cancelled status
updates.
Except for paper accounts (in which case we need a trades dialog and
paper engine per symbol to enable simulated clearing) we can rely on the
instrument feed (symbol name) to be the caching key. Utilize
`tractor.trionics.maybe_open_context()` and the new key-as-callable
support in the paper case to ensure we have separate paper clearing
loops per symbol.
Requires https://github.com/goodboy/tractor/pull/329
With the refactor of the dark loop into a daemon task already-open order
relaying from a `brokerd` was broken since no subscribed clients were
registered prior to the relay loop sending status msgs for such existing
live orders. Repair that by adding one more synchronization phase to the
`Router.open_trade_relays()` task: deliver a `client_ready: trio.Event`
which is set by the client task once the client stream has been
established and don't start the `brokerd` order dialog relay loop until
this event is ready.
Further implementation deats:
- factor the `brokerd` relay caching back into it's own `@acm` method:
`maybe_open_brokerd_dialog()` since we do want (but only this) stream
singleton-cached per broker backend.
- spawn all relay tasks on every entry for the moment until we figure
out what we're caching against (any client pre-existing right, which
would mean there's an entry in the `.subscribers` table?)
- rename `_DarkBook` -> `DarkBook` and `DarkBook.orders` -> `.triggers`
This enables "headless" dark order matching and clearing where an `emsd`
daemon subactor can be left running with active dark (or other
algorithmic) orders which will still trigger despite to attached-controlling
ems-client.
Impl details:
- rename/add `Router.maybe_open_trade_relays()` which now does all work
of starting up ems-side long living clearing and relay tasks and the
associated data feed; make is a `Nursery.start()`-able task instead of
an `@acm`.
- drop `open_brokerd_trades_dialog()` and move/factor contents into the
above method.
- add support for a `router.client_broadcast('all', msg)` to wholesale
fan out a msg to all clients.
Establishes a more formalized subscription based fan out pattern to ems
clients who subscribe for order flow for a particular symbol (the fqsn
is the default subscription key for now).
Make `Router.client_broadcast()` take a `sub_key: str` value which
determines the set of clients to forward a message to and drop all such
manually defined broadcast loops from task (func) code. Also add
`.get_subs()` which (hackily) allows getting the set of clients for
a given sub key where any stream that is detected as "closed" is
discarded in the output. Further we simplify to `Router.dialogs:
defaultdict[str, set[tractor.MsgStream]]` and `.subscriptions` as maps
to sets of streams for much easier broadcast management/logic using set
operations inside `.client_broadcast()`.
This patch was originally to fix a bug where new clients who
re-connected to an `emsd` that was running a paper engine were not
getting updates from new fills and/or cancels. It turns out the solution
is more general: now, any client that creates a order dialog will be
subscribing to receive updates on the order flow set mapped for that
symbol/instrument as long as the client has registered for that
particular fqsn with the EMS. This means re-connecting clients as well
as "monitoring" clients can see the same orders, alerts, fills and
clears.
Impl details:
- change all var names spelled as `dialogues` -> `dialogs` to be
murican.
- make `Router.dialogs: dict[str, defaultdict[str, list]]` so that each
dialog id (oid) maps to a set of potential subscribing ems clients.
- add `Router.fqsn2dialogs: dict[str, list[str]]` a map of fqsn entries to
sets of oids.
- adjust all core task code to make appropriate lookups into these 2 new
tables instead of being handed specific client streams as input.
- start the `translate_and_relay_brokerd_events` task as a daemon task
that lives with the particular `TradesRelay` such that dialogs cleared
while no client is connected are still processed.
- rename `TradesRelay.brokerd_dialogue` -> `.brokerd_stream`
- broadcast all status msgs to all subscribed clients in the relay loop.
- always de-reg each client stream from the `Router.dialogs` table on close.
Not sure what exactly happened but it seemed clears weren't working in
some cases without this, also there's no point in spinning the simulated
clearing loop if we're handling a non-clearing tick type.
We haven't been using it for a while and the supposed (remembered)
latency issue on interaction doesn't seem existing after applying the
cache mode. This allows dropping some internal state-logic and generally
simplifying the show-on-hover checks.
Further add `.show_markers()` and `.hide_markers()` as explicit methods
that can be called externally by UI business logic.
Bit of a face palm but obviously `LevelLine.delete()` also removes any
`._marker` from the view which makes it disappear permanently when
moving from non-zero to zero to non-zero positions.. We don't really
need to delete the line since it can be re-used so just remove that
code.
Further this patch removes marker style setting logic from within the
`pp_line()` factory and instead expects the caller to set the correct
"direction" (for long / short) afterward.
- Every time a symbol is switched on chart we need to wait until the
search bar sidepane has been added beside the slow chart before
determining the offset for the pp line's arrow/labels; trigger this in
`GodWidget.load_symbol()` -> required monkeypatching on a
`.mode: OrderMode` to the `.rt_linked` for now..
- Drop the search pane widget removal from the current linked chart,
seems faster?
- On the slow chart override the `LevelMarker.scene_x()` callback to
adjust for the case where no L1 labels are shown beside the y-axis.
Also adds a `GodWidget.resize_all()` helper method which resizes all
sub-widgets and charts to their default ratios and/or parent-widget
dependent defaults using the detected available space on screen. This is
a "default layout" config method that eventually we'll probably want
allow users to customize.
In other words instead of some static view size previously determined by
the accompanying (slow) chart's height, (recursively) calculate the
number of displayed rows and compute the minimal height needed. This
still caps the view at the height of the chart such that the view will
switch to scroll bar mode when too many results are shown and can't all
be fit in the vertical space.
Deats:
- add a ``CompleterView.iter_df_rows()`` which recursively iterates all
rows in depth-first order making it simple to compute the absolute
number of result rows in view and thus the minimal number of pixels to
show all results.
- always pass the height in the `.on_resize()` handler to ensure
triggering the height logic when new results are generated in the
search loop.
Scales the "view" instance that holds search results to the size of the
accompanying "slow chart" for which the search pane is a "sidepane".
A lot of mucking about was required due to resizing of the view
seemingly feeding back into window resizing and further implementing the
sizing logic such that the parent `QSplitter` can be resized as the
user's whim as well.
Details,
- add a `CompleterView._init: bool` which is set once (and only once)
after startup where the first display of the current symbol/feed is
shown allowing and a single *width* padding applied once at startup
to ensure we don't have an awkward line to the right of the longest
result.
- in `.resize_to_results()` only apply a minimum height to the view
using `.setMinimumHeight()` with a down-scaled (`0.91` for now) height
value from input.
- re-implement `CompleterView.show_matches()` to accept and optional
width, heigh tuple and when not supplied pull the slow chart's
dimensions and pass as input to the resize method.
- Make `SearchWidget` x dim sizing policy "fixed".
- register the `SearchWidget` for resize events with god.
- add `.show_only_cache_entries()` for easy results clearing.
- add `.space_dims()` to retrieve slow linked-charts dimensions.
- implement `SearchWidget.on_resize()` which is the caller of all the
previously mentioned resizing routines.
- do resizing and cache entry showing on search loop startup and be sure
to clear to cache when the user selects a symbol-feed with Enter.
It ended up being what'd you expect, races on the accessing shm buffer
data by the UI during the whole "mega-async-startup-everything" phase XD
So we add the following list of ad-hoc startup steps:
- do `.default_view()` on the slow chart after the fast chart is mostly
fully spawned with the intention being to capture the state where the
historical buffer is mostly loaded before sizing the view to the
graphical form of the data.
- resize slow chart sidepanes from the fast chart just before sleeping
forever (and after order mode has booted).
Turns out god widget resizes aren't triggered implicitly by window
resizes, so instead, hook into the window by moving what was our useless
method to that class. Further we explicitly define and declare that our
window has a `.godwidget: GodWidget` and set it up in the bootstrap
phase - in `run_qutractor()` during `trio` guest mode configuration.
Further deatz:
- retype the runtime/bootstrap routines to take a qwidget "type" not an
instance, and drop the whole implicit `.main_widget` stuff.
- delegate into the `GodWidget.on_win_resize()` for any window resize
which then triggers all the custom resize callbacks we already had in
place.
- privatize `ChartnPane.sidepane` so that it can't be mutated willy
nilly without calling `.set_sidepane()`.
- always adjust splitter sizes inside `LinkeSplits.add_plot()`.
More or less moves all the UI related position "nav" logic and graphics
item management into a new `._position.Nav` composite type + api for
high level mgmt of position graphics indicators across multiple charts
(fast and slow).
The slow (history) chart requires it's own y-range checker logic which
needs to be run in 2 cases:
- the last datum is in view and goes outside the previous mx/mn in view
- the chart is incremented a step
Since we need this duplicate logic this patch also factors the incremental
graphics update info "reading" into a new `DisplayState.incr_info()`
method that can be configured to a chart and input state and returns all
relevant "graphics update measure" in a tuple (for now).
Use this method throughout the rest of the display loop for both fast
and slow chart checks and in the `increment_history_view()` slow chart
task.
Use the new `Feed.get_ds_info()` method in a poll loop to definitively
get the inter-chart sampling info and avoid races with shm buffer
backfilling.
Also, factor the history increment closure-task into
`graphics_update_loop()` which will make it clearer how to factor
all the "should we update" logic into some `DisplayState` API.
If you spawn a brokerd set and no `ib` data feed was started (via our
`.data.feed.Feed` api) then there will be no active client loaded and
thus wont' be connected. So in these cases just return nothing, and
I guess we'll figure out real connection failures later?
Add an update call to the display loop to consistently update the last
datum in the history view chart. Compute the inter-chart sampling ratio
and use it to sync the linear region.
Add a first draft of a working `pyqtgraph.LinearRegionItem` link between
a history view chart (+ data set) and the normal real-time "HFT" chart
set.
Add the history view (aka more downsampled data view) chart set to the
rt/hft set's splitter as it's "first widget". Hook up linear region
callbacks to enable syncing between charts including compenstating for
the downsampling rate ration (in this case hardcoded 60 since 1s to 1M,
but we'll actually compute it going forward obvs).
More to come dawgys..
Adds an additional `GodWidget.hist_linked: LinkedSplits` alongside the
renamed `.rt_linked` to enable 2 sets of linked charts with different
sampled data sets/flows. The history set is added without "all the
fixins" for now (i.e. no order mode sidepane or search integration) such
that it is merely a top level chart which shows a much longer term
history and can be added to the UI via embedding the entire history
linked-splits instance into the real-time linked set's splitter.
Further impl deats:
- adjust the `GodWidget._chart_cache: dict[str, tuple]]` to store both
linked split chart sets per symbol so that symbol switching will
continue to work with the added history chart (set).
- rework `.load_symbol()` to operate on both the real-time (HFT) chart
set and the history set.
- rework `LinkedSplits.set_split_sizes()` to compensate for the history
chart and do more detailed height calcs arithmetic to make it appear
by default as a minor sub-chart.
- adjust `LinkedSplits.add_plot()` and `ChartPlotWidget` internals to allow
adding a plot without a sidepane and/or container `ChartnPane`
composite widget by checking for a `sidepane == False` input.
- make `.default_view()` accept a manual y-axis offset kwarg.
- adjust search mode to provide history linked splits to
`.set_chart_symbol()` call.
As part of supporting a "history view" chart which shows downsampled
datums alongside our 1s (or higher) sampled OHLC we need a separate
buffer to store a the slower history from broker backends. This begins
that design by allocating 2 buffers:
- `rt_shm: ShmArray` which maps to a `/dev/shm/` file with `_rt` suffix
- `hist_shm: ShmArray` which maps to a file with `_hist` suffix
Deliver both of these shms back from both `manage_history()` and load
them as `Feed.rt_shm`/`.hist_shm` on the client side.
Impl deats:
- init the rt buffer with the first datum from loaded history and
assign all OHLC values to that row's 'close' and the vlm to 0.
- pass the hist buffer to the backfiller task
- only spawn **one** global sampler array-row increment task per
`brokerd` and pass in the 1s delay which we presume is our lowest
OHLC sample rate for now.
- drop `open_sample_step_stream()` and just move its body contents into
`Feed.index_stream()`
Instead of worrying about the increment period per shm subscription,
just use the value passed as input and presume the caller knows that
only one task is necessary and that the wakeup (sampling) period should
be the shortest that is needed.
It's very unlikely we don't want at least a 1s sampling (both in terms
of task switching cost and general usage) which will eventually ship as
the default "real-time" feed "timeframe". Further, this "fast" increment
sampling task can handle all lower sampling periods (eg. 1m, 5m, 1H)
based on the current implementation just the same.
Also, add a global default sample period as `_defaul_delay_s` for use in
other internal modules.