Bleh/🤦, the ``end_dt`` in scope is not the "earliest" frame's
`end_dt` in the async response queue.. Parse the queue's latest epoch
and use **that** to compare to the last last pushed datetime index..
Add more detailed logging to help debug any (un)expected datetime index
gaps.
When the tsdb has a last datum that is in the past less then a "frame's
worth" of sample steps we need to slice out only the data from the
latest frame that doesn't overlap; this fixes that slice logic..
Previously i dunno wth it was doing..
When the market isn't open the feed layer won't create a subscriber
entry in the sampler broadcast loop and so if a manual call to
``broadcast()`` is made (like when trying to update a chart from
a history prepend) we need to handle that case and just broadcast
a random `-1` for now..BD
Expect each backend to deliver a `config: dict[str, Any]` which provides
concurrency controls to `trimeter`'s batch task scheduler such that
backends can define their own concurrency limits.
The dirty deats in this patch include handling history "gaps" where
a query returns a history-frame-result which spans more then the typical
frame size (in seconds). In such cases we reset the target frame index
(datetime index sequence implemented with a `pendulum.Period`) using
a generator protocol `.send()` such that the sequence can be dynamically
re-indexed starting at the new (possibly) pre-gap datetime. The new gap
logic also allows us to detect out of order frames easier and thus wait
for the next-in-order to arrive before making more requests.
Use the new `open_history_client()` endpoint/API and expect backends to
provide a history "getter" routine that can be called to load historical
data into shm even when **not** using a tsdb. Add logic for filling in
data from the tsdb once the backend has provided data up to the last
recorded in the db. Add logic for avoiding overruns of the shm buffer
with more-then-necessary queries of tsdb data.
It turns out (i guess not so shockingly?) that `marketstore` doesn't
always teardown "gracefully" under SIGINT (seems to hang if there are
open client connections which are also in the midst of teardown?) so
this instead first tries the SIGINT and then fails over to a SIGKILL
(destroy loop) which seems to be much more reliable to ensure shutdown
without any downside - in terms of a "hard kill".
Originally i was thinking the issue was root perms related (which get
relegated solely to the `marketstored` daemon actor after spawn) but
actually it was indeed the signalling / application layer causing the
hold-up/latency on teardown. There's a bunch of lingering (now
commented) code which tried to solve this non-problem as well as a bunch
logging/prints to help decipher the root of the issue - this will all
get cleaned out shortly.
If `marketstore` is detected try to only load most recent missing data
from the data provider (broker) and the rest from the tsdb and push it
all to shm for display in the UI. If the provider/broker doesn't have
the history client endpoint, just use the old one for now so we can
start to incrementally add support. Don't start the ohlc step
incrementer task until the backend signals that the feed is live.
Add some basic `numpy` epoch slice logic to generate append and prepend
arrays to write to the db.
Mooar cool things,
- add a `Storage.delete_ts()` method to wipe a column series from the db
easily.
- don't attempt to read in any OHLC series by default on client load
- add some `pyqtgraph` profiling and drop manual latency measures
- if no db series for the fqsn exists write the entire shm array
Also, Start tinkering with `tractor.trionics.ipython_embed()`
In effort to get back to a usable REPL around the mkts client
this adds usage of the new `tractor` integration api as well as logic
for skipping backfilling if existing tsdb arrays are found.
Starts a wrapper around the `marketstore` client to do basic ohlcv query
and retrieval and prototypes out write methods for ohlc and tick.
Try to connect to `marketstore` automatically (which will fail if not
started currently) but we will eventually first do a service query.
Further:
- get `pikerd` working with and without `--tsdb` flag.
- support spawning `brokerd` with no real-time quotes.
- bring back in "fqsn" support that was originally not
in this history before commits factoring.
Not sure how I missed mapping the 5995 grpc port 🤦; done now.
Also adds graceful teardown using SIGINT with included container
logging relayed to the piker console B).
Found this caused breakage on `kraken` orders which triggered the
"insufficient funds" error response. Makes sense since they won't
generate an order id if the order can't ever be submitted.
The most important changes include:
- iterating the new `Flow` type and updating graphics
- adding detailed profiling
- increasing the min uppx before graphics updates are throttled
- including the L1 spread in y-range calcs so that you never have the
bid/ask go "out of view"..
- pass around `Flow`s instead of shms
- drop all the old prototyped downsampling code
If manually managing an overlay you'll likely call `.overlay_plotitem()`
and then a plotting method so we need to accept a plot item input so
that the chart's pi doesn't get assigned incorrectly in the `Flow` entry
(though it is by default if no input is provided).
More,
- add a `Flow.graphics` field and set it to the `pg.GraphicsObject`.
- make `Flow.maxmin()` return `None` in the "can't calculate" cases.
- set shm refs on `Flow` entries.
- don't run a graphics cycle on 'update' msgs from the engine
if the containing chart is hidden.
- drop `volume` from flows map and disable auto-yranging
once $vlm comes up.
Allows for removing resize callbacks for a flow/overlay that you wish to
remove from view (eg. unit volume after dollar volume is up) and thus
less general interaction callback overhead for any plot you don't wish
to show or resize.
Further,
- drop the `autoscale_linked_plots` block for now since with
multi-view-box overlays each register their own vb resize slots
- pull the graphics object from the chart's `Flow` map inside
`.maybe_downsample_graphics()`
This new type wraps a shm data flow and will eventually include things
like incremental path-graphics updates and serialization + bg downsampling
techniques. The main immediate motivation was to get a cached y-range max/min
calc going since profiling revealed the `numpy` equivalents were
actually quite slow as the data set grows large. Likely we can use all
this to drive a streaming mx/mn routine that's always launched as part
of each on-host flow.
This is our official foray into use of `msgspec.Struct` B) and I have to
say, pretty impressed; we'll likely completely ditch `pydantic` from
here on out.
We don't need update graphics on every x-range change since that's what
the display loop does. Instead, only on manual changes do we make manual
calls into `.update_graphics_from_array()` and be sure to iterate all
linked subplots and all their embedded graphics.
The pg profiler seems to have trouble with early `return`s in function
calls (likely muckery with the GC/`.__delete__()`) so let's just try
to avoid it for now until we either fix it (probably by implementing as
a ctx mngr) or use diff one.
Ugh, turns out the wacky `ChartView.maxmin` callback stuff we did (for
determining y-range sizings) currently requires that the volume array
has a "bars in view" result.. so let's make that keep working without
rendering the graphics for the curve (since we're disabling them once
$vlm comes up).
As with the `BarItems` graphics, this makes it possible to pass in a "in
view" range of array data that can be *only* rendered improving
performance for large(r) data sets. All the other normal behaviour is
kept (i.e a persistent, (pre/ap)pendable path can still be maintained)
if a ``view_range`` is not provided.
Further updates,
- drop the `.should_ds_or_redraw()` and `.maybe_downsample()` predicates
instead moving all that logic inside `.update_from_array()`.
- disable the "cache flipping", which doesn't seem to be needed to avoid
artifacts any more?
- handle all redraw/dowsampling logic in `.update_from_array()`.
- even more profiling.
- drop path `.reserve()` stuff until we better figure out how it's
supposed to work.
Drop all the logic originally in `.update_ds_line()` which is now done
internal to our `FastAppendCurve`. Add incremental update of the
flattened OHLC -> line curve (unfortunately using `np.concatenate()` for
the moment) and maintain a new `._ds_line_xy` arrays tuple which keeps
the internal state. Add `.maybe_downsample()` as per the new interaction
update method requirement. Draft out some fast path curve stuff like in
our line graphic. Short-circuit bars path updates when we downsample to
line. Oh, and add a ton more profiling in prep for getting
all this stuff faf.
Build out an interface that makes it super easy to downsample curves
using the m4 algorithm while keeping our incremental `QPainterPath`
update feature. A lot of hard work and tinkering went into getting this
working all in-thread correctly and there are quite a few details..
New interface methods:
- `.x_uppx()` which returns the x-axis "view units per pixel"
- `.px_width()` which returns the total (rounded) x-axis pixels spanned
by the curve in view.
- `.should_ds_or_redraw()` a predicate which checks internal state to
see if either downsampling of the curve should take place, or the curve
should have all downsampling removed and be redrawn with source array
data.
- `.downsample()` the actual ds processing routine which delegates into
the m4 algo impl.
- `.maybe_downsample()` a simple update method which can be called by
the view box when the user changes the zoom level.
Implementation details/changes:
- make `.update_from_array()` check for downsample (or revert to source
aka de-downsample) conditions exist and then downsample and re-draw
path graphics accordingly.
- in order to even further speed up path appends (since our main
bottleneck is measured to be `QPainter.drawPath()` calls with large
paths which are frequently updates), add a secondary path `.fast_path`
which is the path that is real-time updates by incremental appends and
which is painted separately for speed in `.pain()`.
- drop all the `QPolyLine` stuff since it was tested to be much slower
in general and especially so for append-updates.
- stop disabling the cache settings on updates since it doesn't seem to
be required any more?
- more move toward deprecating and removing all lingering interface
requirements from `pg.PlotCurveItem` (like `.xData`/`.yData`).
- adjust `.paint()` and `.boundingRect()` to compensate for the new
`.fast_path`
- add a butt-load of profiling B)
Pretty sure this was most of the cause of the stale (more downsampled)
curves showing when zooming in and out from bars mode quickly. All this
stuff needs to get factored out into a new abstraction anyway, but
i think this get's mostly correct functionality.
Only draw new ds curve on uppx steps >= 4 and stop adding/removing
graphics objects from the scene; doesn't seem to speed anything up
afaict. Add better reporting of ds scale changes.
Only if the uppx increases by more then 2 we redraw the entire line
otherwise just ds with previous params and update the current curve.
This *should* avoid strange lower sample rate artefacts from showing on
updates.
Summary:
- stash both uppx and px width in `._dsi` (downsample info)
- use the new `ohlc_to_m4_line()` flags
- add notes about using `.reserve()` and friends
- always delete last `._array` ref prior to line updates
In an effort to try and make `QPainterPath.reserve()` work, add internal
logic to use the same object without de-allocating memory from
a previous path write/creation.
Note this required the addition of a `._redraw` flag (to be used in
`.clear()` and a small patch to `pyqtgraph.functions.arrayToQPath` to
allow passing in an existing path (thus reusing the same underlying mem
alloc) which will likely be first pushed to our fork.
We were previously ad-hoc scaling up the px count/width to get more
detail at lower uppx values. Add a log scaling sigmoid that range scales
between 1 < px_width < 16.
Add in a flag to use the mxmn OH tracer in `ohlc_flatten()` if desired.
Helpers to quickly convert ohlc struct-array sequences into lines
for consumption by the m4 downsampler. Strip trailing zero entries
from the `ds_m4()` output if found (avoids lines back to origin).
This makes the `'r'` hotkey snap the last bar to the middle of the pp
line arrow marker no matter the zoom level. Now we also boot with
approximately the most number of x units on screen that keep the bars
graphics drawn in full (just before downsampling to a line).
Moved some internals around to get this all in place,
- drop `_anchors.marker_right_points()` and move it to a chart method.
- change `.pre_l1_x()` -> `.pre_l1_xs()` and just have it return the
two view-mapped x values from the former method.
Instead of using a guess about how many x-indexes to reset the last
datum in-view to, calculate and shift the latest index such that it's
just before any L1 spread labels on the y-axis. This makes the view
placement "widget aware" and gives a much more cross-display UX.
Summary:
- add `ChartPlotWidget.pre_l1_x()` which returns a `tuple` of
x view-coord points for the absolute x-pos and length of any L1
line/labels
- make `.default_view()` only shift to see the xlast just outside
the l1 but keep whatever view range xfirst as the first datum in view
- drop `LevelLine.right_point()` since this is now just a
`.pre_l1_x()` call and can be retrieved from the line's internal chart
ref
- drop `._style.bars_from/to_..` vars since we aren't using hard coded
offsets any more
`ChartPlotWidget.curve_width_pxs()` now can be used to get the total
horizontal (x) pixels on screen that are occupied by the current curve
graphics for a given chart. This will be used for downsampling large
data sets to the pixel domain using M4.
Probably the best place to root the profiler since we can get a better
top down view of bottlenecks in the graphics stack.
More,
- add in draft M4 downsampling code (commented) after getting it mostly
working; next step is to move this processing into an FSP subactor.
- always update the vlm chart last y-axis sticky
- set call `.default_view()` just before inf sleep on startup
Obviously determining the x-range from indices was wrong and was the
reason for the incorrect (downsampled) output size XD. Instead correctly
determine the x range and start value from the *values of* the input
x-array. Pretty sure this makes the implementation nearly production
ready.
Relates to #109
All the refs are in the comments and original sample code from infinite
has been reworked to expect the input x/y arrays to already be sliced
(though we can later support passing in the start-end indexes if
desired).
The new routines are `ds_m4()` the python top level API and `_m4()` the
fast `numba` implementation.
- the chart's uppx (units-per-pixel) is > 4 (i.e. zoomed out a lot)
- don't shift the chart (to keep the most recent step in view) if the
last datum isn't in view (aka the user is probably looking at history)
When a bars graphic is zoomed out enough you get a high uppx, datum
units-per-pixel, and there is no point in drawing the 6-lines in each
bar element-graphic if you can't see them on the screen/display device.
Instead here we offer converting to a `FastAppendCurve` which traces
the high-low outline and instead display that when it's impossible to see the
details of bars - approximately when the uppx >= 2.
There is also some draft-commented code in here for downsampling the
outlines as zoom level increases but it's not fully working and should
likely be factored out into a higher level api anyway.
In effort to start getting some graphics speedups as detailed in #109,
this adds a `FastAppendCurve`to every `BarItems` as a `._ds_line` which
is only displayed (instead of the normal mult-line bars curve) when the
"width" of a bar is indistinguishable on screen from a line -> so once
the view coordinates map to > 2 pixels on the display device.
`BarItems.maybe_paint_line()` takes care of this scaling detection logic and is
called by the associated view's `.sigXRangeChanged` signal handler.
The graphics update loop is much easier to grok when all the UI
components which potentially need to be updated on a cycle are arranged
together in a high-level composite namespace, thus this new
`DisplayState` addition. Create and set this state on each
`LinkedSplits` chart set and add a new method `.graphics_cycle()` which
let's a caller trigger a graphics loop update manually. Use this method
in the fsp graphics manager such that a chain can update new history
output even if there is no real-time feed driving the display loop (eg.
when a market is "closed").
As per https://github.com/erdewit/ib_insync/pull/454 the more correct
way to do this is with `.reqContractDetailsAsync()` which we wrap with
`Client.con_deats()` and which works just as well. Further drop all the
`dict`-ifying that was being done in that method and instead always
return `ContractDetails` object in an fqsn-like explicitly keyed `dict`.
ib has a throttle limit for "hft" bars but contained in here is some
hackery using ``xdotool`` to reset data farms auto-magically B)
This copies the working script into the ib backend mod as a routine and
now uses `trio.run_process()` and calls into it from the `get_bars()`
history retriever and then waits for "data re-established" events to be
received from the client before making more history queries.
TL;DR summary of changes:
- relay ib's "system status" events (like for data farm statuses)
as a new "event" msg that can be processed by registers of
`Client.inline_errors()` (though we should probably make a new
method for this).
- add `MethodProxy.status_event()` which allows a proxy user to register
for a particular "system event" (as mentioned above), which puts
a `trio.Event` entry in a small table can be set by an relay task if
there are any detected waiters.
- start a "msg relay task" when opening the method proxy which does
the event setting mentioned above in the background.
- drop the request error handling around the proxy creation, doesn't
seem necessary any more now that we have better error propagation from
`asyncio`.
- add event waiting logic around the data feed reset hackzorin.
- change the order relay task to only log system events for now (though
we need to do some better parsing/logic to get tws-external order
updates to work again..
Found an issue (that was predictably brushed aside XD) where the
`ib_insync.util.df()` helper was changing the timestamps on bars data to
be way off (probably a `pandas.Timestamp` timezone thing?).
Anyway, dropped all that (which will hopefully let us drop `pandas` as
a hard dep) and added a buncha timestamp checking as well as start/end
datetime return values using `pendulum` so that consumer code can know
which "slice" is output.
Also added some WIP code to work around "no history found" request
errors where instead now we try to increment backward another 200
seconds - not sure if this actually correct yet.
Make the throttle error propagate through to `trio` again by adding
`dict`-msg support between the two loops such that errors can be
re-raised on the `trio` side. This is all integrated into the
`MethoProxy` and accompanying result relay task.
Further fix a longer standing issue where sometimes the `ib_insync`
order entry method will raise a weird assertion error because it detects
some internal order-id state issue.. Just ignore those and make relay
back an error to the ems in such cases.
Add a bunch of notes for todos surrounding data feed reset hackery.
To start we only have futes working but this allows both searching
and loading multiple expiries of the same instrument by specifying
different expiries with a `.<expiry>` suffix in the symbol key (eg.
`mnq.globex.20220617`). This also paves the way for options contracts
which will need something similar plus a strike property. This change
set also required a patch to `ib_insync` to allow retrieving multiple
"ambiguous" contracts from the `IB.reqContractDetailsAcync()` method,
see https://github.com/erdewit/ib_insync/pull/454 for further discussion
since the approach here might change.
This patch also includes a lot of serious reworking of some `trio`-`asyncio`
integration to use the newer `tractor.to_asyncio.open_channel_from()`
api and use it (with a relay task) to open a persistent connection with
an in-actor `ib_insync` `Client` mostly for history requests.
Deats,
- annot the module with a `_infect_asyncio: bool` for `tractor` spawning
- add a futes venu list
- support ambiguous futes contracts lookups so that all expiries will
show in search
- support both continuous and specific expiry fute contract
qualification
- allow searching with "fqsn" keys
- don't crash on "data not found" errors in history requests
- move all quotes msg "topic-key" generation (which should now be
a broker-specific fqsn) and per-contract quote processing into
`normalize()`
- set the fqsn key in the symbol info init msg
- use `open_client_proxy()` in bars backfiller endpoint
- include expiry suffix in position update keys
This adds a new client manager-factory: `open_client_proxy()` which uses
the newer `tractor.to_asyncio.open_channel_from()` (and thus the
inter-loop-task-channel style) a `aio_client_method_relay()` and
a re-implemented `MethodProxy` wrapper to allow transparently calling
`asyncio` client methods from `trio` tasks. Use this proxy in the
history backfiller task and add a new (prototype)
`open_history_client()` which will be used in the new storage management
layer. Drop `get_client()` which was the portal wrapping equivalent of
the same proxy but with a one-task-per-call approach. Oh, and
`Client.bars()` can take `datetime`, so let's use it B)
Use fqsn as input to the client-side EMS apis but strip broker-name
stuff before generating and sending `Brokerd*` msgs to each backend for
live order requests (since it's weird for a backend to expect it's own
name, though maybe that could be a sanity check?).
Summary of fqsn use vs. broker native keys:
- client side pps, order requests and general UX for order management
use an fqsn for tracking
- brokerd side order dialogs use the broker-specific symbol which is
usually nearly the same key minus the broker name
- internal dark book and quote feed lookups use the fqsn where possible
In order to support instruments with lifetimes (aka derivatives) we need
generally need special symbol annotations which detail such meta data
(such as `MNQ.GLOBEX.20220717` for daq futes). Further there is really
no reason for the public api for this feed layer to care about getting
a special "brokername" field since generally the data is coming directly
from UIs (eg. search selection) so we might as well accept a fqsn (fully
qualified symbol name) which includes the broker name; for now a suffix
like `'.ib'`. We may change this schema (soon) but this at least gets us
to a point where we expect the full name including broker/provider.
An additional detail: for certain "generic" symbol names (like for
futes) we will pull a so called "front contract" and map this to
a specific fqsn underneath, so there is a double (cached) entry for that
entry such that other consumers can use it the same way if desired.
Some other machinery changes:
- expect the `stream_quotes()` endpoint to deliver it's `.started()` msg
almost immediately since we now need it deliver any fqsn asap (yes
this means the ep should no longer wait on a "live" first quote and
instead deliver what quote data it can right away.
- expect the quotes ohlc sampler task to add in the broker name before
broadcast to remote (actor) consumers since the backend isn't (yet)
expected to do that add in itself.
- obviously we start using all the new fqsn related `Symbol` apis
Move the core ws message handling into `stream_messages()` and call that
from 2 new stream processors: `process_data_feed_msgs()` and
`process_order_msgs()`. Add comments for hints on how to implement the
order msg parsing as well as `pprint` received msgs to console for now.
Since moving to a "god loop" for graphics, we don't really need to have
a dedicated task for updating graphics on new sample increments. The
only UX difference will be that curves won't be updated until an actual new
rt-quote-event triggers the graphics loop -> so we'll have the chart
"jump" to a new position and new curve segments generated only when new
data arrives. This is imo fine since it's just less "idle" updates
where the chart would sit printing the same (last) value every step.
Instead only update the view increment if a new index is detected by
reading shm.
If we ever want this dedicated task update again this commit can be
easily reverted B)
Break up real-time quote feed and history loading into 2 separate tasks
and deliver a client side `data.Feed` as soon as history is loaded
(instead of waiting for a rt quote - the previous logic). If
a symbol doesn't have history then likely the feed shouldn't be loaded
(since presumably client code will need at least "some" datums history
to do anything) and waiting on a real-time quote is dumb, since it'll
hang if the market isn't open XD. If a symbol doesn't have history we
can always write a zero/null array when we run into that case. This also
greatly speeds up feed loading when both history and quotes are available.
TL;DR summary:
- add a `_Feedsbus.start_task()` one-cancel-scope-per-task method for
assisting with (re-)starting and stopping long running persistent
feeds (basically a "one cancels one" style nursery API).
- add a `manage_history()` task which does all history loading (and
eventually real-time writing) which has an independent signal and
start it in a separate task.
- drop the "sample rate per symbol" stuff since client code doesn't really
care when it can just inspect shm indexing/time-steps itself.
- run throttle tasks in the bus nursery thus avoiding cancelling the
underlying sampler task on feed client disconnects.
- don't store a repeated ref the bus nursery's cancel scope..
To avoid the "trigger finger" issue (darks execing before they should
due to a stale last price state, normally when generating a trigger
predicate..) always iterate the loop and update the last known book
price even when no execs/triggered orders are registered.
You can get a weird "last line segment" artifact if *only* that segment
is drawn and the cache is enabled, so just disable unless in step mode
at startup and re-flash as normal when new path data is appended. Add
a `.disable_cache()` method for the multi-use in the update method. Use
line style on the `._last_line: QLineF` segment as well.
Enables retrieving all "named axes" on a particular "side" of the
overlayed plot items. This is useful for calculating how much space
needs to be allocated for the axes before the view box area starts.
Though it's not per-tick accurate, accumulate the number of "trades"
(i.e. the "clearing rate" - maybe this is a better name?) per bar
inside the `dolla_vlm` fsp and average and report wmas of this in the
`flow_rates` fsp.
Define the flows table as a class var (thus making it a "global" and/or
actor-local state) which can be accessed by any in process task. Add
`Fsp.get_shm()` to allow accessing output streams by source-token + fsp
routine reference and thus providing inter-fsp low level access to
real-time flows.