A slight facepalm but, the main issue was a simple indexing logic error:
we need to slice with `tsdb_history[-shm._first.value:]` to push most
recent history not oldest.. This allows cleanup of tsdb backfill loop as
well.
Further, greatly simply `diff_history()` time slicing by using the
classic `numpy` conditional slice on the epoch field.
This had a bug prior where the end of a frame (a partial) wasn't being
sliced correctly and we'd get odd gaps showing up in the backfilled from
`brokerd` vs. tsdb end index. Repair this by doing timeframe aware index
diffing in `diff_history()` which seems to resolve it. Also, use the
frame-result's `end_dt: datetime` for the loop exit condition.
Sync per-symbol sampler loop start to subscription registers such that
the loop can't start until the consumer's stream subscription is added;
the task-sync uses a `trio.Event`. This patch also drops a ton of
commented cruft.
Further adjustments needed to get parity with prior functionality:
- pass init msg 'symbol_info' field to the `Symbol.broker_info: dict`.
- ensure the `_FeedsBus._subscriptions` table uses the broker specific
(without brokername suffix) as keys for lookup so that the sampler
loop doesn't have to append in the brokername as a suffix.
- ensure the `open_feed_bus()` flumes-table-msg returned sent by
`tractor.Context.started()` uses the `.to_msg()` form of all flume
structs.
- ensure `maybe_open_feed()` uses `tractor.MsgStream.subscribe()` on all
`Flume.stream`s on cache hits using the
`tractor.trionics.gather_contexts()` helper.
Orient shm-flow-arrays around the new idea of a `Flume` which provides
access, mgmt and basic measure of real-time data flow sets (see water
flow management semantics).
- We discard the previous idea of a "init message" which contained all
the shm attachment info and instead send a startup message full of
`Flume.to_msg()`s which are symmetrically loaded on the caller actor
side.
- Create data-flows "entries" for every passed in fqsn such that the consumer gets back
streams and shm for each, now all wrapped in `Flume` types. For now we
allocate `brokermod.stream_quotes()` tasks 1-to-1 for each fqsn
(instead of expecting each backend to do multi-plexing, though we
might want that eventually) as well a `_FeedsBus._subscriber` entry
for each. The pause/resume management loop is adjusted to match.
Previously `Feed`s were allocated 1-to-1 with each fqsn.
- Make `Feed` a `Struct` subtype instead of a `@dataclass` and move all
flow specific attrs to the new `Flume`:
- move `.index_stream()`, `.get_ds_info()` to `Flume`.
- drop `.receive()`: each fqsn entry will now require knowledge of
separate streams by feed users.
- add multi-fqsn tables: `.flumes`, `.streams` which point to the
appropriate per-symbol entries.
- Async load all `Flume`s from all contexts and all quote streams using
`tractor.trionics.gather_contexts()` on the client `open_feed()` side.
- Update feeds test to include streaming 2 symbols on the same (binance)
backend.
This is to prep for multi-symbol feeds and charts so we accept
a sequence of fqsns to the top level entrypoints as well as the
`.data.feed.open_feed()` API (though we're not actually supporting true
multiplexed feeds nor shm lookups per fqsn yet).
Allows starting UI apps and passing the `pikerd` registry socket-addr
args via `--host` or `--port` such that a separate actor tree can be
started by selecting an unused port. This is handy when hacking new
features but while also wishing to run a more stable version of the code
for trading on the same host.
Drop all attempts at rewiring `ViewBox` signals, monkey-patching
relayee handlers, and generally modifying event source public
attributes. Instead take a much simpler approach where the event source
graphics object simply has it's handler dynamically overridden by
a broadcaster function which relays to all consumers using a Python
loop.
The benefits of this much simplified approach include:
- avoiding the tedious and often complex (re)connection of signals between
the source plot and the overlayed consumers.
- requiring zero modification of the public interface of any of the
publisher or consumer `ViewBox`s, no decoration, extra signal
definitions (eg. previous `mouseDragEventRelay` or the like).
- only a single dynamic method override on the event source graphics object
(`ViewBox`) which does the broadcasting work and requires no
modification to handler implementations.
Detailed `.ui._overlay` changes:
- drop `mk_relay_signal()`, `enable_relays()` which removes signal/slot
hacking methodology.
- drop unused `ComposedGridLayout.grid` and `.reverse`, change some
method names: `.insert()` -> `.insert_plotitem()`, `append()` ->
`.append_plotitem()`.
- in `PlotOverlay`, again drop all signal/slot rewiring in
`.add_plotitem()` and instead add our new closure based python-loop in
`broadcast()` routine which is used to override the event-source
object's handler.
- comment out all the auxiliary/want-to-have event source selection
methods for now.
Mainly this involves instantiating our overriden `PlotItem` in a few
places and tweaking type annots. A further detail is that inside
the fsp sub-chart creation code we hide some axes for overlays in the
flows subchart; these were previously somehow hidden implicitly?
Fork out our patch set submitted to upstream in multiple PRs (since they
aren't moving and/or aren't a priority to core) which can be seen in
full from the following diff:
https://github.com/pyqtgraph/pyqtgraph/compare/master...pikers:pyqtgraph:graphics_pin
Move these type extensions into the internal `.ui._pg_overrides` module.
The changes are related to both `pyqtgraph.PlotItem` and `.AxisItem` and
were driven for our need for multi-view overlays (overlaid charts with
optionally synced axis and interaction controls) as documented in the PR
to upstream: https://github.com/pyqtgraph/pyqtgraph/pull/2162
More specifically,
- wrt to `AxisItem` we added lru caching of tick values as per:
https://github.com/pyqtgraph/pyqtgraph/pull/2160.
- wrt to `PlotItem` we adjusted some of the axis management code, namely
adding a standalone `.removeAxis()` and modifying the `.setAxisItems()` logic
to use it in: https://github.com/pyqtgraph/pyqtgraph/pull/2162
as well as some tweaks to `.updateGrid()` to loop through all possible
axes when grid setting.
Details of the original patch to upstream are in:
https://github.com/pyqtgraph/pyqtgraph/pull/2281
Instead of trying to land this we've opted to just copy out that version
of `.debug.Profiler` into our own internals (luckily the class is
entirely self-contained) until such a time when we choose to find
a better dependency as per https://github.com/pikers/piker/issues/337
To make it easier to manually read/decipher long ledger files this adds
`dict` sorting based on record-type-specific (api vs. flex report)
datetime processing prior to ledger file write.
- break up parsers into separate routines for flex and api record
processing.
- add `parse_flex_dt()` for special handling of the weird semicolon
stamps in flex reports.
There never was any underlying db bug, it was a hardcoded timeframe in
the column series write key.. Now we always assert a matching timeframe
in results.
Not only improves startup latency but also avoids a bug where the rt
buffer was being tsdb-history prepended *before* the backfilling of
recent data from the backend was complete resulting in our of order
frames in shm.
Factor the multi-sample-rate region UI connecting into a new helper
`link_views_with_region()` which reads in the shm buffer offsets from
the `Feed` and appropriately connects the fast and slow chart handlers
for the linear region graphics. Add detailed comments writeup for the
inter-sampling transform algebra.
If a history manager raises a `DataUnavailable` just assume the sample
rate isn't supported and that no shm prepends will be done. Further seed
the shm array in such cases as before from the 1m history's last datum.
Also, fix tsdb -> shm back-loading, cancelling tsdb queries when either
no array-data is returned or a frame is delivered which has a start time
no lesser then the least last retrieved. Use strict timeframes for every
`Storage` API call.
Turns out querying for a high freq timeframe (like 1sec) will still
return a lower freq timeframe (like 1Min) SMH, and no idea if it's the
server or the client's fault, so we have to explicitly check the sample
step size and discard lower freq series-results. Do this inside
`Storage.read_ohlcv()` and return an empty `dict` when the wrong time
step is detected from the query result.
Further enforcements,
- both `.load()` and `read_ohlcv()` now require an explicit `timeframe:
int` input to guarantee the time step of the output array.
- drop all calls `.load()` with non-timeframe specific input.
Our default sample periods are 60s (1m) for the history chart and 1s for
the fast chart. This patch adds concurrent loading of both (or more)
different sample period data sets using the existing loading code but
with new support for looping through a passed "timeframe" table which
points to each shm instance.
More detailed adjustments include:
- breaking the "basic" and tsdb loading into 2 new funcs:
`basic_backfill()` and `tsdb_backfill()` the latter of which is run
when the tsdb daemon is discovered.
- adjust the fast shm buffer to offset with one day's worth of 1s so
that only up to a day is backfilled as history in the fast chart.
- adjust bus task starting in `manage_history()` to deliver back the
offset indices for both fast and slow shms and set them on the
`Feed` object as `.izero_hist/rt: int` values:
- allows the chart-UI linked view region handlers to use the offsets
in the view-linking-transform math to index-align the history and
fast chart.
Allows keeping mutex state around data reset requests which (if more
then one are sent) can cause a throttling condition where ib's servers
will get slower and slower to conduct a reconnect. With this you can
have multiple ongoing contract requests without hitting that issue and
we can go back to having a nice 3s timeout on the history queries before
activating the hack.
When a network outage or data feed connection is reset often the
`ib_insync` task will hang until some kind of (internal?) timeout takes
place or, in some (worst) cases it never re-establishes (the event
stream) and thus the backend needs to restart or the live feed will
never resume..
In order to avoid this issue once and for all this patch implements an
additional (extremely simple) task that is started with the real-time
feed and simply waits for any market data reset events; when detected
restarts the `open_aio_quote_stream()` call in a loop using
a surrounding cancel scope.
Been meaning to implement this for ages and it's finally working!
Allows for easier restarts of certain `trio` side tasks without killing
the `asyncio`-side clients; support via flag.
Also fix a bug in `Client.bars()`: we need to return the duration on the
empty bars case..
This allows the history manager to know the decrement size for
`end_dt: datetime` on the next query if a no-data / gap case was
encountered; subtract this in `get_bars()` in such cases. Define the
expected `pendulum.Duration`s in the `.api._samplings` table.
Also add a bit of query latency profiling that we may use later to more
dynamically determine timeout driven data feed resets. Factor the `162`
error cases into a common exception handler block.
When we get a timeout or a `NoData` condition still return a tuple of
empty sequences instead of `None` from `Client.bars()`. Move the
sampling period-duration table to module level.
It doesn't seem to be any slower on our least throttled backend
(binance) and it removes a bunch of hard to get correct frame
re-ordering logic that i'm not sure really ever fully worked XD
Commented some issues we still need to resolve as well.
Manual tinker-testing demonstrated that triggering data resets
completely independent of the frame request gets more throughput and
further, that repeated requests (for the same frame after cancelling on
the `trio`-side) can yield duplicate frame responses. Re-work the
dual-task structure to instead have one task wait indefinitely on the
frame response (and thus not trigger duplicate frames) and the 2nd data
reset task poll for the first task to complete in a poll loop which
terminates when the frame arrives via an event.
Dirty deatz:
- make `get_bars()` take an optional timeout (which will eventually be
dynamically passed from the history mgmt machinery) and move request
logic inside a new `query()` closure meant to be spawned in a task
which sets an event on frame arrival, add data reset poll loop in the
main/parent task, deliver result on nursery completion.
- handle frame request cancelled event case without crash.
- on no-frame result (due to real history gap) hack in a 1 day decrement
case which we need to eventually allow the caller to control likely
based on measured frame rx latency.
- make `wait_on_data_reset()` a predicate without output indicating
reset success as well as `trio.Nursery.start()` compat so that it can
be started in a new task with the started values yielded being
a cancel scope and completion event.
- drop the legacy `backfill_bars()`, not longer used.
Adjust all history query machinery to pass a `timeframe: int` in seconds
and set default of 60 (aka 1m) such that history views from here forward
will be 1m sampled OHLCV. Further when the tsdb is detected as up load
a full 10 years of data if possible on the 1m - backends will eventually
get a config section (`brokers.toml`) that allow user's to tune this.
The `Store.load()`, `.read_ohlcv()` and `.write_ohlcv()` and
`.delete_ts()` now can take a `timeframe: Optional[float]` param which
is used to look up the appropriate sampling period table-key from
`marketstore`.
Allow data feed sub-system to specify the timeframe (aka OHLC sample
period) to the `open_history_client()` delivered history fetching API.
Factor the data keycombo hack into a new routine to be used also from
the history backfiller code when request latency increases; there is
a first draft at trying to use the feed reset to speed up 1m frame
throttling by timing out on the history frame response, but it needs
a lot of fine tuning.
This is a simpler (and oddly more `trio`-nic and/or SC) way to handle
the cancelled-before-acked race for order dialogs. Will allow keeping
the `.req` field as solely an `Order` msg.
When the client is faster then a `brokerd` at submitting and cancelling
an order we run into the case where we need to specify that the EMS
cancels the order-flow as soon as the brokerd's ack arrives. Previously
we were stashing a `BrokerdCancel` msg as the `Status.req` msg (to be
both tested for as a "already cancelled" and sent immediately on ack arrival to
the broker), but for such
cases we can't use that msg to find the fqsn (since only the client side
msgs have it defined) which is required by the new
`Router.client_broadcast()`.
So, Since `Status.req` is supposed to be a client-side flow msg anyway,
and we need the fqsn for client broadcasting, we change this `.req`
value to the client's submitted `Cancel` msg (thus rectifying the
missing `Router.client_broadcast()` fqsn input issue) and build the
`BrokerdCancel` request from that `Cancel` inline in the relay loop
from the `.req: Cancel` status msg lookup.
Further we allow `Cancel` msgs to define an `.account` and adjust the
order mode loop to expect `Cancel` source requests in cancelled status
updates.
Except for paper accounts (in which case we need a trades dialog and
paper engine per symbol to enable simulated clearing) we can rely on the
instrument feed (symbol name) to be the caching key. Utilize
`tractor.trionics.maybe_open_context()` and the new key-as-callable
support in the paper case to ensure we have separate paper clearing
loops per symbol.
Requires https://github.com/goodboy/tractor/pull/329
With the refactor of the dark loop into a daemon task already-open order
relaying from a `brokerd` was broken since no subscribed clients were
registered prior to the relay loop sending status msgs for such existing
live orders. Repair that by adding one more synchronization phase to the
`Router.open_trade_relays()` task: deliver a `client_ready: trio.Event`
which is set by the client task once the client stream has been
established and don't start the `brokerd` order dialog relay loop until
this event is ready.
Further implementation deats:
- factor the `brokerd` relay caching back into it's own `@acm` method:
`maybe_open_brokerd_dialog()` since we do want (but only this) stream
singleton-cached per broker backend.
- spawn all relay tasks on every entry for the moment until we figure
out what we're caching against (any client pre-existing right, which
would mean there's an entry in the `.subscribers` table?)
- rename `_DarkBook` -> `DarkBook` and `DarkBook.orders` -> `.triggers`
This enables "headless" dark order matching and clearing where an `emsd`
daemon subactor can be left running with active dark (or other
algorithmic) orders which will still trigger despite to attached-controlling
ems-client.
Impl details:
- rename/add `Router.maybe_open_trade_relays()` which now does all work
of starting up ems-side long living clearing and relay tasks and the
associated data feed; make is a `Nursery.start()`-able task instead of
an `@acm`.
- drop `open_brokerd_trades_dialog()` and move/factor contents into the
above method.
- add support for a `router.client_broadcast('all', msg)` to wholesale
fan out a msg to all clients.
Establishes a more formalized subscription based fan out pattern to ems
clients who subscribe for order flow for a particular symbol (the fqsn
is the default subscription key for now).
Make `Router.client_broadcast()` take a `sub_key: str` value which
determines the set of clients to forward a message to and drop all such
manually defined broadcast loops from task (func) code. Also add
`.get_subs()` which (hackily) allows getting the set of clients for
a given sub key where any stream that is detected as "closed" is
discarded in the output. Further we simplify to `Router.dialogs:
defaultdict[str, set[tractor.MsgStream]]` and `.subscriptions` as maps
to sets of streams for much easier broadcast management/logic using set
operations inside `.client_broadcast()`.
This patch was originally to fix a bug where new clients who
re-connected to an `emsd` that was running a paper engine were not
getting updates from new fills and/or cancels. It turns out the solution
is more general: now, any client that creates a order dialog will be
subscribing to receive updates on the order flow set mapped for that
symbol/instrument as long as the client has registered for that
particular fqsn with the EMS. This means re-connecting clients as well
as "monitoring" clients can see the same orders, alerts, fills and
clears.
Impl details:
- change all var names spelled as `dialogues` -> `dialogs` to be
murican.
- make `Router.dialogs: dict[str, defaultdict[str, list]]` so that each
dialog id (oid) maps to a set of potential subscribing ems clients.
- add `Router.fqsn2dialogs: dict[str, list[str]]` a map of fqsn entries to
sets of oids.
- adjust all core task code to make appropriate lookups into these 2 new
tables instead of being handed specific client streams as input.
- start the `translate_and_relay_brokerd_events` task as a daemon task
that lives with the particular `TradesRelay` such that dialogs cleared
while no client is connected are still processed.
- rename `TradesRelay.brokerd_dialogue` -> `.brokerd_stream`
- broadcast all status msgs to all subscribed clients in the relay loop.
- always de-reg each client stream from the `Router.dialogs` table on close.
Not sure what exactly happened but it seemed clears weren't working in
some cases without this, also there's no point in spinning the simulated
clearing loop if we're handling a non-clearing tick type.
We haven't been using it for a while and the supposed (remembered)
latency issue on interaction doesn't seem existing after applying the
cache mode. This allows dropping some internal state-logic and generally
simplifying the show-on-hover checks.
Further add `.show_markers()` and `.hide_markers()` as explicit methods
that can be called externally by UI business logic.
Bit of a face palm but obviously `LevelLine.delete()` also removes any
`._marker` from the view which makes it disappear permanently when
moving from non-zero to zero to non-zero positions.. We don't really
need to delete the line since it can be re-used so just remove that
code.
Further this patch removes marker style setting logic from within the
`pp_line()` factory and instead expects the caller to set the correct
"direction" (for long / short) afterward.
- Every time a symbol is switched on chart we need to wait until the
search bar sidepane has been added beside the slow chart before
determining the offset for the pp line's arrow/labels; trigger this in
`GodWidget.load_symbol()` -> required monkeypatching on a
`.mode: OrderMode` to the `.rt_linked` for now..
- Drop the search pane widget removal from the current linked chart,
seems faster?
- On the slow chart override the `LevelMarker.scene_x()` callback to
adjust for the case where no L1 labels are shown beside the y-axis.
Also adds a `GodWidget.resize_all()` helper method which resizes all
sub-widgets and charts to their default ratios and/or parent-widget
dependent defaults using the detected available space on screen. This is
a "default layout" config method that eventually we'll probably want
allow users to customize.
In other words instead of some static view size previously determined by
the accompanying (slow) chart's height, (recursively) calculate the
number of displayed rows and compute the minimal height needed. This
still caps the view at the height of the chart such that the view will
switch to scroll bar mode when too many results are shown and can't all
be fit in the vertical space.
Deats:
- add a ``CompleterView.iter_df_rows()`` which recursively iterates all
rows in depth-first order making it simple to compute the absolute
number of result rows in view and thus the minimal number of pixels to
show all results.
- always pass the height in the `.on_resize()` handler to ensure
triggering the height logic when new results are generated in the
search loop.
Scales the "view" instance that holds search results to the size of the
accompanying "slow chart" for which the search pane is a "sidepane".
A lot of mucking about was required due to resizing of the view
seemingly feeding back into window resizing and further implementing the
sizing logic such that the parent `QSplitter` can be resized as the
user's whim as well.
Details,
- add a `CompleterView._init: bool` which is set once (and only once)
after startup where the first display of the current symbol/feed is
shown allowing and a single *width* padding applied once at startup
to ensure we don't have an awkward line to the right of the longest
result.
- in `.resize_to_results()` only apply a minimum height to the view
using `.setMinimumHeight()` with a down-scaled (`0.91` for now) height
value from input.
- re-implement `CompleterView.show_matches()` to accept and optional
width, heigh tuple and when not supplied pull the slow chart's
dimensions and pass as input to the resize method.
- Make `SearchWidget` x dim sizing policy "fixed".
- register the `SearchWidget` for resize events with god.
- add `.show_only_cache_entries()` for easy results clearing.
- add `.space_dims()` to retrieve slow linked-charts dimensions.
- implement `SearchWidget.on_resize()` which is the caller of all the
previously mentioned resizing routines.
- do resizing and cache entry showing on search loop startup and be sure
to clear to cache when the user selects a symbol-feed with Enter.
It ended up being what'd you expect, races on the accessing shm buffer
data by the UI during the whole "mega-async-startup-everything" phase XD
So we add the following list of ad-hoc startup steps:
- do `.default_view()` on the slow chart after the fast chart is mostly
fully spawned with the intention being to capture the state where the
historical buffer is mostly loaded before sizing the view to the
graphical form of the data.
- resize slow chart sidepanes from the fast chart just before sleeping
forever (and after order mode has booted).
Turns out god widget resizes aren't triggered implicitly by window
resizes, so instead, hook into the window by moving what was our useless
method to that class. Further we explicitly define and declare that our
window has a `.godwidget: GodWidget` and set it up in the bootstrap
phase - in `run_qutractor()` during `trio` guest mode configuration.
Further deatz:
- retype the runtime/bootstrap routines to take a qwidget "type" not an
instance, and drop the whole implicit `.main_widget` stuff.
- delegate into the `GodWidget.on_win_resize()` for any window resize
which then triggers all the custom resize callbacks we already had in
place.
- privatize `ChartnPane.sidepane` so that it can't be mutated willy
nilly without calling `.set_sidepane()`.
- always adjust splitter sizes inside `LinkeSplits.add_plot()`.
More or less moves all the UI related position "nav" logic and graphics
item management into a new `._position.Nav` composite type + api for
high level mgmt of position graphics indicators across multiple charts
(fast and slow).
The slow (history) chart requires it's own y-range checker logic which
needs to be run in 2 cases:
- the last datum is in view and goes outside the previous mx/mn in view
- the chart is incremented a step
Since we need this duplicate logic this patch also factors the incremental
graphics update info "reading" into a new `DisplayState.incr_info()`
method that can be configured to a chart and input state and returns all
relevant "graphics update measure" in a tuple (for now).
Use this method throughout the rest of the display loop for both fast
and slow chart checks and in the `increment_history_view()` slow chart
task.
Use the new `Feed.get_ds_info()` method in a poll loop to definitively
get the inter-chart sampling info and avoid races with shm buffer
backfilling.
Also, factor the history increment closure-task into
`graphics_update_loop()` which will make it clearer how to factor
all the "should we update" logic into some `DisplayState` API.