Commit Graph

575 Commits (30d55fdb275373516dd2133ae6f850a6acd6d7a3)

Author SHA1 Message Date
Tyler Goodlet 452cd7db8a Optionally load `MktPair` in `Flume`s 2023-05-09 14:49:25 -04:00
Tyler Goodlet 2cc80d53ca First stage port of `.data.feed` to `MktPair`
Add `MktPair` handling block for when a backend delivers
a `mkt_info`-field containing init msg. Adjust the original
`Symbol`-style `'symbol_info'` msg processing to do `Decimal` defaults
and convert to `MktPair` including slapping in a hacky `_atype: str`
field XD

General initial name changes to `bs_mktid` and `_fqme` throughout!
2023-05-09 14:49:25 -04:00
Tyler Goodlet 7eb0b1d249 Comment about `Struct.typecast()` conflict with frozen instances 2023-05-09 14:49:25 -04:00
Tyler Goodlet 55b6cba31e Encode a `mktpair` field if passed in msg by caller 2023-05-09 14:49:25 -04:00
Tyler Goodlet 7a8e615fa6 Explicitly decode tick sizes as decimal for symbol loading in `Flume` 2023-05-09 14:49:25 -04:00
Tyler Goodlet 9e2eff507e Drop shm logging levels to debug over warning 2023-05-09 14:49:25 -04:00
Tyler Goodlet 56f736e7ca Drop use of `Symbol.brokers` everywhere 2023-05-09 14:49:25 -04:00
Tyler Goodlet 9f03484c4d Move all fqsn parsing and `Symbol` to new `accounting._mktinfo` 2023-05-09 14:49:25 -04:00
Tyler Goodlet badc30baae Add an inverse of `float_digits()`: `digits_to_dec()` 2023-05-09 14:49:25 -04:00
Tyler Goodlet 0d9acb1cb0 numpy: drop `numpy.float` in py311 2023-05-04 12:01:59 -04:00
jaredgoldman 9bf6f557ed Label private methods accordingly, remove cryptofeeds module 2023-04-12 19:48:46 -04:00
jaredgoldman b14b323068 Remove breakpoint in web_bs,
ensure we only unsub if ws is connected
2023-04-12 19:48:46 -04:00
jaredgoldman ac34ca7cad Add sub method to flow
Stash for checkout of master
2023-04-12 19:48:46 -04:00
Tyler Goodlet 8e91e215b3 WIP - ensure `asyncio` pumps the event loop each send 2023-04-12 19:48:46 -04:00
jaredgoldman c751c36a8b Update trade message format 2023-04-12 19:48:46 -04:00
jaredgoldman ad9d645782 WIP - setup basic history and streaming client 2023-04-12 19:48:46 -04:00
jaredgoldman 5fdec8012d Add cryptofeeds data feed module,
Add Kucoin backend client
wip
2023-04-12 19:48:46 -04:00
Tyler Goodlet 12e196a6f7 Catch `KeyError` on bcast errors which pop the sub
Not sure how i missed this (and left in handling of `list.remove()` and
it ever worked for that?) after the `samplerd` impl in 5ec1a72 but, this
adjusts the remove-broken-subscriber loop to catch the correct
`set.remove()` exception type on a missing (likely already removed)
subscription entry.
2023-03-10 18:20:22 -05:00
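The exception mismatch described above can be illustrated with a minimal sketch. The helper name `drop_broken()` is hypothetical and not part of piker; the only facts relied on are that `set.remove()` raises `KeyError` on a missing entry while `list.remove()` raises `ValueError`.

```python
def drop_broken(subs: set, broken: list) -> int:
    """Remove broken subscribers from `subs`, tolerating missing entries.

    Hypothetical helper for illustration only.
    """
    removed = 0
    for sub in broken:
        try:
            subs.remove(sub)
            removed += 1
        except KeyError:
            # `set.remove()` raises `KeyError` (not `ValueError` like
            # `list.remove()`) when the entry is already gone.
            pass
    return removed
```

When the count of removed entries is not needed, `set.discard()` does the same thing without raising at all.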
Tyler Goodlet 712f1a47a0 Require `step: float` input to `slice_from_time()`
There have been way too many issues when trying to calculate this
dynamically from the input array, so just expect the caller to know what
it's doing and don't bother with ever hitting the error case of
calculating an incorrect value internally.
2023-03-10 18:20:22 -05:00
Tyler Goodlet 29418e9655 Avoid index-from-time slicing including gaps
Not sure why this was ever allowed but, for slicing to the sample
*before* whatever target time stamp is passed in we should definitely
not return the prior index as the slice start since that might
include a very large gap prior to whatever sample is scanned to have
the earliest matching time stamp.

This was essential to fixing overlay intersect points searching in our
``ui.view_mode`` machinery..
2023-03-10 18:20:22 -05:00
Tyler Goodlet 5dd69b2295 Better handle dynamic registry sampler broadcasts
In situations where clients are (dynamically) subscribing *while*
broadcasts are starting to take place we need to handle the
`set`-modified-during-iteration case. This scenario seems to be more
common during races on concurrent startup of multiple symbols. The
solution here is to use another set to take note of subscribers which
are successfully sent-to and then skip them on re-try.

This also contains an attempt to exception-handle throttled stream
overruns caused by higher frequency feeds (like binance) pushing more
quotes than can be handled during (UI) client startup.
2023-03-10 18:20:22 -05:00
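The retry-with-a-sent-set idea from the commit above can be sketched as follows. This is a synchronous simplification under stated assumptions, not piker's sampler code: `broadcast()` and the `send` callable are hypothetical, and the only behaviour relied on is that CPython raises `RuntimeError` when a `set` changes size while it is being iterated.

```python
def broadcast(subs: set, send) -> set:
    """Send to every subscriber in `subs`, retrying if the set is mutated.

    `send` is a hypothetical callable taking one subscriber.
    """
    sent = set()
    while True:
        try:
            for sub in subs:
                if sub in sent:
                    # Already delivered on an earlier pass: skip on re-try.
                    continue
                send(sub)
                sent.add(sub)
            return sent
        except RuntimeError:
            # The set changed size during iteration (a client subscribed
            # mid-broadcast): restart the loop, skipping `sent` entries.
            # Note this sketch would also swallow a `RuntimeError`
            # raised by `send` itself.
            continue
```

Tracking delivered subscribers in a second set means a restart never double-sends, at the cost of one extra membership check per subscriber.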
Tyler Goodlet 441243f83b Attempt to report `piker storage -d <fqsn>` errors
Not really sure there's much we can do besides dump Grpc stuff when we
detect an "error" `str` for the moment..

Either way leave a buncha complaints (como siempre) and do linting
fixups..
2023-03-09 15:37:43 -05:00
Tyler Goodlet b226b678e9 Fix missed `marketstore` mod imports 2023-03-09 15:37:42 -05:00
Tyler Goodlet afac553ea2 Move all docker and external db code to `piker.service` 2023-03-09 15:37:42 -05:00
Tyler Goodlet 93c81fa4d1 Start `piker.service` sub-package
For now just moves everything that was in `piker._daemon` to a subpkg
module but a reorg is coming pronto!
2023-03-09 15:37:42 -05:00
Tyler Goodlet bfe3ea1f59 Set explicit `marketstore` container startup timeout 2023-03-09 15:37:42 -05:00
Tyler Goodlet 56629b6b2e Hardcode `cancel` log level for `ahabd` for now 2023-03-09 15:37:42 -05:00
Tyler Goodlet 7694419e71 Background docker-container logs processing
Previously we would make the `ahabd` supervisor-actor sync to docker
container startup using pseudo-blocking log message processing.

This has issues,
- we're forced to do a hacky "yield back to `trio`" in order to be
  "fake async" when reading the log stream and further,
- blocking on a message is fragile and often slow.

Instead, run the log processor in a background task and in the parent
task poll for the container to be in the client list using a similar
pseudo-async poll pattern. This allows the super to `Context.started()`
sooner (when the container is actually registered as "up") and thus
unblock its (remote) caller faster whilst still doing full log msg
proxying!

Deatz:
- adds `Container.cuid: str` a unique container id for logging.
- correctly proxy through the `loglevel: str` from `pikerd` caller task.
- shield around `Container.cancel()` in the teardown block and use
  cancel level logging in that method.
2023-03-09 15:37:42 -05:00
Tyler Goodlet 8c66f066bd Deliver es specific ahab-super in endpoint startup config 2023-03-09 15:37:42 -05:00
Tyler Goodlet 959e423849 Add warning around detach flag to docker client 2023-03-09 15:37:42 -05:00
Tyler Goodlet 7b196b1b97 Support startup-config overrides to `ahabd` super
With the addition of new `elasticsearch` docker support in
https://github.com/pikers/piker/pull/464, adjustments were made
to container startup sync logic (particularly the `trio` checkpoint
sleep period - which itself is a hack around a sync client api) which
caused a regression in upstream startup logic wherein container error
logs were not being bubbled up correctly causing a silent failure mode:

- `marketstore` container started with corrupt input config
- `ahabd` super code timed out on startup phase due to a larger log
  polling period, skipped processing startup logs from the container,
  and continued on as though the container was started
- history client fails on grpc connection with no clear error on why the
  connection failed.

Here we revert to the old poll period (1ms) to avoid any more silent
failures and further extend supervisor control through a configuration
override mechanism. To address the underlying design issue, this patch
adds support for container-endpoint-callbacks to override supervisor
startup configuration parameters via the 2nd value in their returned
tuple: the already delivered configuration `dict` value.

The current exposed values include:
    {
        'startup_timeout': 1.0,
        'startup_query_period': 0.001,
        'log_msg_key': 'msg',
    },

This allows for container specific control over the startup-sync query
period (the hack mentioned above) as well as the expected log msg key
and of course the startup timeout.
2023-03-09 15:37:42 -05:00
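A minimal sketch of how such an override lookup might work, assuming the endpoint callback returns a tuple whose second value is the configuration `dict` (as the message states). The names `DEFAULTS` and `resolve_startup_config()` are hypothetical, and the default values are simply the ones listed above.

```python
# Default supervisor startup parameters (values taken from the list above).
DEFAULTS = {
    'startup_timeout': 1.0,
    'startup_query_period': 0.001,
    'log_msg_key': 'msg',
}


def resolve_startup_config(endpoint_result: tuple) -> dict:
    """Merge per-container overrides over the supervisor defaults.

    Hypothetical helper: the 2nd tuple value is assumed to be the
    container's config `dict`, which may hold unrelated keys as well.
    """
    config = endpoint_result[1]
    return {
        key: config.get(key, default)
        for key, default in DEFAULTS.items()
    }
```

For example, a container that is slow to boot could return `{'startup_timeout': 240.0}` in its config while keeping the default query period and log key.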
Tyler Goodlet fe0695fb7b First draft storage layer cli
Adds a `piker storage` subcmd with a `-d` flag to wipe a particular
fqsn's time series (both 1s and 60s). Obviously this needs to be
extended much more but provides a start point.
2023-03-09 15:37:42 -05:00
Tyler Goodlet 3a4794e9d1 Backward-compat: don't require `'lot_tick_size'`
In order to support existing `pps.toml` files in the wild which don't
have the `asset_type, price_tick_size, lot_tick_size` fields, we need to
only optionally read them and instead expect that backends will write
the fields going forward (coming in follow patches).

Further this makes some small asset-size (vlm accounting) quantization
related adjustments:
- rename `Symbol.decimal_quant()` -> `.quantize_size()` since that is
  explicitly what this method is doing.
- and expect an input `size: float` which we cast to decimal instead of
  doing it inside the `.calc_size()` caller code.
- drop `Symbol.iterfqsns()` which wasn't being used anywhere at all..

Additionally, this drafts out a new replacement market-trading-pair data
type to eventually replace `.data._source.Symbol` -> `MktPair` which we
aren't using yet, but serves as the documentation-driven motivator ;)
and, it relates to https://github.com/pikers/piker/issues/467.
2023-03-02 19:22:19 -05:00
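The size quantization described above (take a `size: float`, cast it to decimal, round to the lot tick size) might look roughly like this. It is a standalone sketch, not the actual `Symbol.quantize_size()` method: the free-function form, the `lot_tick_size` parameter and the rounding mode are assumptions.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def quantize_size(size: float, lot_tick_size: Decimal) -> Decimal:
    """Round a float `size` to the precision of `lot_tick_size`.

    Assumed signature for illustration; the rounding mode is a guess.
    """
    # Go through `str()` so the decimal value matches the float's
    # printed representation instead of its binary expansion.
    return Decimal(str(size)).quantize(
        lot_tick_size,
        rounding=ROUND_HALF_EVEN,
    )
```

Doing the cast inside the helper keeps caller code free of `Decimal` handling, which matches the stated motivation for moving it out of the `.calc_size()` caller.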
Guillermo Rodriguez f5b8b9a14f
Add sym registry to PaperBoi as well as a sym ref on Transaction
Add decimal quantize API to Symbol to simplify by-broker truncation
Add symbol info to `pps.toml`
Move _assert call to outside the _async_main context manager
Minor indentation and styling changes, also convert a few prints to log calls
Fix multi write / race condition on open_pps call
Switch open_pps to not write by default
Fix integer math kraken syminfo _tick_size initialization
2023-03-01 21:06:48 -03:00
jaredgoldman 342aec648b Skip zero test and change to use Path when creating a config folder in marketstore 2023-02-28 13:51:47 -05:00
jaredgoldman 4b72d3ba99 Add backpressure setting back as it wasn't altering test behaviour 2023-02-28 13:51:47 -05:00
algorandpa 0dec2b9c89 Enable backpressure during data-feed layer startup to avoid overruns 2023-02-25 18:59:39 -05:00
Guillermo Rodriguez 47bf45f30e
Merge pull request #464 from pikers/elasticsearch_integration
Elasticsearch integration
2023-02-24 16:38:37 -03:00
Esmeralda Gallardo b96e2c314a
Minor style changes and removed unnecessary comments 2023-02-24 15:11:15 -03:00
Esmeralda Gallardo f96d6a04b6
Fixed UnboundLocalError on _ahab. Added test for marketstore's initialization 2023-02-22 13:28:07 -03:00
Guillermo Rodriguez acc6249d88
Remove unnecessary arguments to some pikerd functions, fix container init error
by switching from log reading to querying es health endpoint, fix install on ci
and add more logging.
2023-02-21 20:45:10 -03:00
Esmeralda Gallardo b5cdf14036
Modified elasticsearch file name to 'elastic' to avoid name errors. Applied changes suggested in the pr. 2023-02-21 13:34:29 -03:00
Guillermo Rodriguez bf9ca4a4a8
Generalize ahab to support elasticsearch logs and init procedure 2023-02-21 13:34:29 -03:00
Guillermo Rodriguez 17a4fe4b2f
Trim unnecessary stuff left from marketstore copy, also fix elastic config name for docker build, add elasticsearch to dependencies 2023-02-21 13:34:28 -03:00
Esmeralda Gallardo 0dc24bd475
Added dockerfile, yaml file and script to start an elasticsearch docker instance. 2023-02-21 13:34:26 -03:00
Tyler Goodlet e01220af14 Type annot tweaks to feeds mod 2023-02-21 10:54:18 -05:00
Tyler Goodlet ebf53e32bd Fix return type annot for `slice_from_time()` 2023-02-13 12:27:58 -05:00
Tyler Goodlet 433697cc4f Add cached refs to last 1d xy outputs
For the purposes of avoiding another full format call we can stash the
last rendered 1d xy pre-graphics formats as
`IncrementalFormatter.x/y_1d: np.ndarray`s and allow readers in the viz
and render machinery to use this data easily for things like "only
drawing the last uppx's worth of data as a line". Also add
a `.flat_index_ratio: float` which can be used similarly as a scalar
applied to indexes into the src array but instead when indexing
(flattened) 1d xy formatted outputs. Finally, this drops the way
overdone/noisy `.__repr__()` meth we had XD
2023-02-13 12:27:58 -05:00
Tyler Goodlet d622b4157c Only draw up to 2nd last datum for OHLC bars paths 2023-02-13 12:27:58 -05:00
Tyler Goodlet 92ce1b3304 Only handle hist discrepancies when market is open
We obviously don't want to be debugging a sample-index issue if/when the
market for the asset is closed (since we'll be guaranteed to have
a mismatch, lul). Pass in the `feed_is_live: trio.Event` throughout the
backfilling routines to allow first checking for the live feed being active
so as to avoid breakpointing on false +ves. Also, add a detailed warning
log message for when *actually* investigating a mismatch.
2023-02-13 12:27:58 -05:00
Tyler Goodlet a8e1796a8b Comment bad x-range bp for now 2023-02-13 12:27:58 -05:00
Tyler Goodlet 5ced05aab0 Breakpoint bad (-ve or too large) x-ranges to m4
This should never really happen but when it does it appears to be a race
with writing startup pre-graphics-formatter array data where we get
`x_end` epoch value subtracting some really small offset value (like
`-/+0.5`) or the opposite where the `x_start` is epoch and `x_end` is
small.

This adds a warning msg and `breakpoint()` as well as guards around the
entire downsampling code path so that when resumed the downsampling
cycle should just be skipped and avoid a crash.
2023-02-13 12:27:58 -05:00
Tyler Goodlet 7afc9301ac Handle last-in-view time slicing edge case
Whenever the last datum is in view `slice_from_time()` needs to always
spec the final array index (i.e. the len - 1 value we set as
`read_i_max`) to avoid a uniform-step arithmetic error where gaps in the
underlying time series cause an index that's too low to be returned.
2023-02-13 12:27:58 -05:00
Tyler Goodlet 12c6d58c2a Drop bp blocks from formatters mod 2023-02-13 12:27:58 -05:00
Tyler Goodlet 63f0567418 Drop `Flume.index_stream()`, `._sampling.open_sample_stream()` replaces it 2023-02-13 12:27:58 -05:00
Tyler Goodlet 6a0c36922e Drop `._index_step` from formatters and instead defer to `Viz.index_step()` 2023-02-12 13:55:26 -05:00
Tyler Goodlet fc17187ff4 Drop edge case from `slice_from_time()`
Doesn't seem like we really need to handle the situation where the start
or stop input time stamps are outside the index range of the data since
the new binary search handling via `numpy.searchsorted()` covers this
case at minimal runtime cost and with an equally correct output. Allows
us to drop some other indexing endpoint internal variables as well.
2023-02-12 13:55:26 -05:00
Tyler Goodlet a7d78a3f40 Use left-style index search on RHS scan as well 2023-02-12 13:55:26 -05:00
Tyler Goodlet cdec4782f0 Add commented append slice-len sanity check 2023-02-12 13:55:26 -05:00
Tyler Goodlet ed1f64cf43 Fix gap detection on RHS; always bin-search on overshot time range 2023-02-12 13:55:26 -05:00
Tyler Goodlet 50ef4efccb Align step curves the same as OHLC bars 2023-02-12 13:55:26 -05:00
Tyler Goodlet 51f2461e8b Add `IncrementalFormatter.x_offset: np.ndarray`
Define the x-domain coords "offset" (determining the curve graphics
per-datum placement) for each formatter such that there's only one place
to change it when needed. Obviously each graphics type has its own
dimensionality and this is reflected by the array shapes on each
subtype.
2023-02-12 13:55:26 -05:00
Tyler Goodlet 444768d30f Adjust OHLC bar x-offsets to be time span matched
Previously we were drawing with the middle of the bar on each index with
arms to either side: +/- some arm length. Instead this changes so that
each bar is drawn *after* each index/timestamp such that in graphics
coords the bar span more correctly matches the time span in the
x-domain. This makes the linked region between slow and fast chart
directly match (without any transform) for epoch-time indexing such that
the last x-coord in view on the fast chart is no more than the
next time step in (downsampled) slow view.

Deats:
- adjust in `._pathops.path_arrays_from_ohlc()` and take a `bar_w` bar
  width input (normally taken from the data step size).
- change `.ui._ohlc.bar_from_ohlc_row()` and
  `BarItems.draw_last_datum()` to match.
2023-02-12 13:55:26 -05:00
Tyler Goodlet 24b384f3ef Set `path_arrays_from_ohlc(use_time_index=True)` on epoch indexing
Allows easily switching between normal array `int` indexing and time
indexing by just flipping the `Viz._index_field: str`.

Also, guard all the x-data audit breakpoints with a time indexing
condition.
2023-02-12 13:55:26 -05:00
Tyler Goodlet 93330954c2 Ugh, use `bool` flag to determine index field.. 2023-02-12 13:55:26 -05:00
Tyler Goodlet 3019c35e30 Move `Viz` layer to new `.ui` mod 2023-02-12 13:41:18 -05:00
Tyler Goodlet 3638ae8d3e Drop unused `read_src_from_key: bool` to `.format_to_1d()` 2023-02-12 13:41:18 -05:00
Tyler Goodlet 0663880a6d Fix formatter xy ndarray first prepend case
First allocation vs. first "prepend" of source data to an xy `ndarray`
format **must be mutex** in order to avoid a double prepend.

Previously when both blocks were executed we'd end up with
a `.xy_nd_start` that was decremented (at least) twice as much as it
should be on the first `.format_to_1d()` call which is obviously
incorrect (and causes problems for m4 downsampling as discussed below).
Further, since the underlying `ShmArray` buffer indexing is managed
(i.e. write-updated) completely independently from the incremental
formatter updates and internal xy indexing, we can't use
`ShmArray._first.value` and instead need to use the particular `.diff()`
output's prepend length value to decrement the `.xy_nd_start` on updates
after initial alloc.

Problems this resolves with m4:
- m4 uses a x-domain diff to calculate the number of "frames" to
  downsample to, this is normally based on the ratio of pixel columns on
  screen vs. the size of the input xy data.
- previously using an int-index (not epoch time) the max diff between
  first and last index would be the size of the input buffer and thus
  would never cause a large mem allocation issue (though it may have
  been inefficient in terms of needed size).
- with an epoch time index this max diff could explode if you had some
  near-now epoch time stamp **minus** an x-allocation value: generally
  some value in `[0.5, -0.5]` which would result in a massive frames
  count and thus internal `np.ndarray()` allocation causing either
  a crash in `numba` code or actual system mem over allocation.

Further, put in some more x value checks that trigger breakpoints if we
detect values that caused this issue - we'll remove em after this has
been tested enough.
2023-02-12 13:41:18 -05:00
Tyler Goodlet 3bed142d15 Handle time-indexing for fill arrows
Call into a reworked `Flume.get_index()` for both the slow and fast
chart and do time index clipping to last datum where necessary.
2023-02-12 13:41:18 -05:00
Tyler Goodlet 7aef31701b Add some commented debug prints for default fmtr 2023-02-12 13:41:18 -05:00
Tyler Goodlet 135627e142 Slice to an extra index around each timestamp input 2023-02-12 13:41:18 -05:00
Tyler Goodlet 44f50e3d0e Implement `stop_t` gap adjustments; the good lord said it is the problem 2023-02-12 13:41:18 -05:00
Tyler Goodlet 5ab4e5493e Add gap detection for `stop_t`, though only report atm 2023-02-12 13:41:18 -05:00
Tyler Goodlet 98438e29ef Drop `Flume.view_data()` 2023-02-12 13:41:18 -05:00
Tyler Goodlet d649a7d1fa Drop old breakpoint 2023-02-12 13:41:18 -05:00
Tyler Goodlet 2669ced629 Drop `_slice_from_time()` 2023-02-12 13:41:18 -05:00
Tyler Goodlet f2c0987a04 Use uniform step arithmetic in `slice_from_time()`
If we presume that time indexing uses a uniform step we can calculate
the exact index (using `//`) for the input time presuming the data
set has zero gaps. This gives a massive speedup over `numpy` fancy
indexing and (naive) `numba` iteration. Further in the case where time
gaps are detected, we can use `numpy.searchsorted()` to binary search
for the nearest expected index at lower latency.

Deatz,
- comment-disable the call to the naive `numba` scan impl.
- add an optional `step: int` input (calced if not provided).
- add todos for caching binary search results in the gap detection
  cases.
- drop returning the "absolute buffer indexing" slice since the caller
  can always just use the read-relative slice to acquire it.
2023-02-12 13:41:18 -05:00
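The uniform-step lookup with a binary-search fallback described above can be sketched like this. It is a simplified illustration under stated assumptions, not the actual `slice_from_time()`: the helper names are hypothetical, `times` is assumed to be a sorted 1d array of epoch stamps, and `step` is passed in explicitly.

```python
import numpy as np


def index_from_time(times: np.ndarray, t: float, step: float) -> int:
    """Return the index of the last sample stamped at or before `t`."""
    last = len(times) - 1

    # Uniform-step guess: exact whenever the series has no gaps.
    i = int((t - times[0]) // step)
    i = min(max(i, 0), last)

    # If the guessed sample does not cover `t`, a gap (or an
    # out-of-range input) was hit: fall back to a binary search.
    if not (0 <= t - times[i] < step):
        i = int(np.searchsorted(times, t, side='right')) - 1
        i = min(max(i, 0), last)

    return i


def slice_from_time_sketch(
    times: np.ndarray,
    start_t: float,
    stop_t: float,
    step: float,
) -> slice:
    """Map an epoch time range to a read-relative index slice."""
    return slice(
        index_from_time(times, start_t, step),
        index_from_time(times, stop_t, step) + 1,
    )
```

The arithmetic path costs a single division per lookup, while the `numpy.searchsorted()` fallback only runs when the guess lands on the wrong side of a gap.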
Tyler Goodlet 0bdb7261d1 Flip over to epoch-time based x-domain indexing 2023-02-12 13:41:17 -05:00
Tyler Goodlet 12857a258b Adjust all `slice_from_time()` calls to not expect mask 2023-02-12 13:41:17 -05:00
Tyler Goodlet 46808fbb89 Rewrite `slice_from_time()` using `numba`
Gives approx a 3-4x speedup using plain old iterate-with-for-loop style
though still not really happy with this .5 to 1 ms latency..

Move the core `@njit` part to a `_slice_from_time()` with a pure python
func with orig name around it. Also, drop the output `mask` array since
we can generally just use the slices in the caller to accomplish the
same input array slicing, duh..
2023-02-12 13:41:17 -05:00
Tyler Goodlet a3844f9922 Use step size to determine bar gaps 2023-02-12 13:41:17 -05:00
Tyler Goodlet a33f58a61a Move `Flume.slice_from_time()` to `.data._pathops` mod func 2023-02-12 13:41:17 -05:00
Tyler Goodlet d5844ce8ff Delegate formatter `.index_field` to the parent `Viz` 2023-02-12 13:41:17 -05:00
Tyler Goodlet bf88b40a50 Facepalm**2: fix array-read-slice, like actually..
We need to subtract the first index in the array segment read, not the
first index value in the time-sliced output, to get the correct offset
into the non-absolute (`ShmArray.array` read) array..

Further we **do** need the `&` between the advanced indexing conditions
and this adds profiling to see that it is indeed real slow (like 20ms
ish even when using `np.where()`).
2023-02-12 13:41:17 -05:00
Tyler Goodlet e4a0d4ecea Markup OHLC->path gen with `numba` issue # 2023-02-12 13:41:17 -05:00
Tyler Goodlet 031d7967de Facepalm: actually return latest index on time slice fail.. 2023-02-12 13:41:17 -05:00
Tyler Goodlet 2e67e98b4d Go with explicit `.data._m4` mod name
Since it's a notable and self-contained graphics compression algo, might
as well give it a dedicated module B)
2023-02-12 13:41:17 -05:00
Tyler Goodlet 7124a131dd Move (unused) path gen routines to `.ui._pathops` 2023-02-12 13:41:17 -05:00
Tyler Goodlet 9052ed5ddf Move qpath-ops routines back to separate mod 2023-02-12 13:41:17 -05:00
Tyler Goodlet 7ec21c7f3b Rename `.ui._pathops.py` -> `.ui._formatters.py` 2023-02-12 13:41:17 -05:00
Tyler Goodlet 382a619a03 Fix from-time index slicing?
Apparently we want an `|` for the advanced indexing logic?
Also, fix `read_slc` start to not always be 0 XD
2023-02-12 13:41:17 -05:00
Tyler Goodlet 7f3f6f871a Move path ops routines to top of mod
Planning to put the formatters into a new mod and aggregate all path
gen/op helpers into this module.

Further tweaks include:
- moving `path_arrays_from_ohlc()` back to module level
- slice out the last xy datum for `OHLCBarsAsCurveFmtr` 1d formatting
- always copy the new x-value from the source to `.x_nd`
2023-02-12 13:41:17 -05:00
Tyler Goodlet 6ea04f850d Drop diff state tracking in formatter
This was a major cause of error (particularly trying to get epoch
indexing working) and really isn't necessary; instead just have
`.diff()` always read from the underlying source array for current
index-step diffing and append/prepend slice construction.

Allows us to,
- drop `._last_read` state management and thus usage.
- better handle startup indexing by setting `.xy_nd_start/stop` to
  `None` initially so that the first update can be done in one large
  prepend.
- better understand and document the step curve "slice back to previous
  level" logic which is now heavily commented B)
- drop all the `slice_to_head` stuff and instead allow each
  formatter to choose its 1d segmenting.
2023-02-12 13:41:17 -05:00
Tyler Goodlet f3bab826f6 Comment out bps for time indexing 2023-02-12 13:41:17 -05:00
Tyler Goodlet ac1f37a2c2 Expect `index_field: str` in all graphics objects 2023-02-12 13:41:17 -05:00
Tyler Goodlet 166d14af69 Simplify formatter update methodology
Don't expect values (array + slice) to be returned and applied by
`.incr_update_xy_nd()` and instead presume this will be implemented
internally in each (sub)formatter.

Attempt to simplify some incr-update routines, (particularly in the step
curve formatter, though most of it was reverted to just a simpler form
of the original implementation XD) including:
- dropping the need for the `slice_to_head: int` control.
- using the `xy_nd_start/stop` index counters over custom lookups.
2023-02-12 13:41:17 -05:00
Tyler Goodlet 696c6f8897 First attempt, field-index agnostic formatting
Remove hardcoded `'index'` field refs from all formatters in a first
attempt at moving towards epoch-time alignment (though don't actually
use it yet).

Adjustments to the formatter interface:
- property for `.xy_nd` the x/y nd arrays.
- property for `.xy_slice`, the nd format array(s) start->stop index
  slice.

Internal routine tweaks:
- drop `read_src_from_key` and always pass full source array on updates
  and adjust handlers to expect to have to index the data field of
  interest.
- set `.last_read` right after update calls instead of after 1d
  conversion.
- drop `slice_to_head` array read slicing.
- add some debug points for testing 'time' indexing (though not used
  here yet).
- add `.x_nd` array update logic for when the `.index_field` is not
  'index' - i.e. when we begin to try and support epoch time.
- simplify some new y_nd updates to not require use of `np.broadcast()`
  where possible.
2023-02-12 13:41:17 -05:00
Tyler Goodlet 6cacd7d18b Make `Viz.slice_from_time()` take input array
Probably means it doesn't need to be a `Flume` method but it's
convenient to expect the caller to pass in the `np.ndarray` with
a `'time'` field instead of a `timeframe: str` arg; also, return the
slice mask instead of the sliced array as output (again allowing the
caller to do any slicing). Also, handle the slice-outside-time-range
case by just returning the entire index range with a `None` mask.

Adjust `Viz.view_data()` to instead do timeframe (for rt vs. hist shm
array) lookup and equiv array slicing with the returned mask.
2023-02-12 13:41:17 -05:00
Tyler Goodlet 5b08e9cba3 Add breakpoint on -ve range for now 2023-02-12 13:41:17 -05:00
Tyler Goodlet d3f5ff1b4f Go back to hard-coded index field
Turns out https://github.com/numba/numba/issues/8622 is real
and the suggested `numba.literally` hack doesn't seem to work..
2023-02-12 13:41:16 -05:00
Tyler Goodlet e45bc4c619 Move `ui._compression`/`._pathops` to `.data` subpkg
Since these modules no longer contain Qt specific code we might
as well include them in the data sub-package.

Also, add `IncrementalFormatter.index_field` as single point to def the
indexing field that should be used for all x-domain graphics-data
rendering.
2023-02-12 13:39:10 -05:00
Tyler Goodlet 8d592886fa Pass `Flume`s throughout FSP-ui and charting APIs
Since higher level charting and fsp management need access to the
new `Flume` indexing apis this adjusts some func sigs to pass through
(and/or create) flume instances:
- `LinkedSplits.add_plot()` and dependents.
- `ChartPlotWidget.draw_curve()` and deps, and it now returns a `Flow`.
- `.ui._fsp.open_fsp_admin()` and `FspAdmin.open_fsp_ui()` related
  methods => now we wrap the destination fsp shm in a flume on the admin
  side which is returned from `.start_engine_method()`.

Drop a bunch of (unused) chart widget methods including some already
moved to flume methods: `.get_index()`, `.in_view()`,
`.last_bar_in_view()`, `.is_valid_index()`.
2023-02-02 13:32:30 -05:00
Tyler Goodlet fcfc0f31f0 Enable backpressure in an effort to prevent bootup overruns 2023-01-30 11:45:29 -05:00
Tyler Goodlet 844626f6dc Move `brokerd` service task to root `.data` mod 2023-01-13 13:21:49 -05:00
Tyler Goodlet 71ca4c8e1f Use actor uid in shm keys for rt quote buffers
Allows running simultaneous data feed services on the same (linux) host
by avoiding file-name collisions instead keying shm buffer sets by the
given `brokerd` instance. This allows, for example, either multiple dev
versions of the data layer to run side-by-side or for the test suite to
be seamlessly run alongside a production instance.
2023-01-13 13:21:49 -05:00
Tyler Goodlet 045b76bab5 Make `Flume.index_stream()` defer to new sampling api 2023-01-13 13:21:49 -05:00
Tyler Goodlet d66fb49077 Don't deliver shms from `start_backfill()`, they're not used 2023-01-13 13:21:49 -05:00
Tyler Goodlet 78c7c8524c Breakpoint when bad 1m history offsets are detected 2023-01-13 13:21:49 -05:00
Tyler Goodlet 5adb234a24 Don't receive sample-index msgs in feed layer 2023-01-13 13:21:49 -05:00
Tyler Goodlet 2778ee1401 Support not registering for sample-index msgs via `sub_for_broadcasts: bool` flag 2023-01-13 13:21:49 -05:00
Tyler Goodlet b3d1b1aa63 Port feed layer to use new `samplerd` APIs
Always use `open_sample_stream()` to register fast and slow quote feed
buffers and get a sampler stream which we use to trigger
`Sampler.broadcast_all()` calls on the service side after backfill
events.
2023-01-13 13:21:15 -05:00
Tyler Goodlet 5ec1a72a3d Implement a `samplerd` singleton actor service
Now spawned under the `pikerd` tree as a singleton-daemon-actor we offer
a slew of new routines in support of this micro-service:

- `maybe_open_samplerd()` and `spawn_samplerd()` which provide the
  `._daemon.Services` integration to conduct service spawning.
- `open_sample_stream()` which is a client-side endpoint which does all
  the work of (lazily) starting the `samplerd` service (if dne) and
  registers shm buffers for update as well as connects a sample-index
  stream for iteration by the caller.
- `register_with_sampler()` which is the `samplerd`-side service task
  endpoint implementing all the shm buffer and index-stream registry
  details as well as logic to ensure a lone service task runs
  `Services.increment_ohlc_buffer()`; it increments at the shortest period
  registered which, for now, is the default 1s duration.

Further impl notes:
- fixes to `Services.broadcast()` to ensure broken streams get discarded
  gracefully.
- we use a `pikerd` side singleton mutex `trio.Lock()` to ensure
  one-and-only-one `samplerd` is ever spawned per `pikerd` actor tree.
2023-01-13 13:21:15 -05:00
Tyler Goodlet 2c76cee928 Begin formalizing `Sampler` singleton API
We're moving toward a single actor managing sampler work and distributed
independently of `brokerd` services such that a user can run samplers on
different hosts than real-time data feed infra. Most of the
implementation details include aggregating `.data._sampling` routines
into a new `Sampler` singleton type.

Move the following methods to class methods:
- `.increment_ohlc_buffer()` to allow a single task to increment all
  registered shm buffers.
- `.broadcast()` for IPC relay to all registered clients/shms.

Further add a new `maybe_open_global_sampler()` which allocates
a service nursery and assigns it to the `Sampler.service_nursery`; this
is prep for putting the step incrementer in a singleton service task
higher up the data-layer actor tree.
2023-01-13 13:21:15 -05:00
Tyler Goodlet 3efb0b5884 Sync 1s (or less) sampler steps using rounded now-epoch 2023-01-13 13:21:15 -05:00
Tyler Goodlet 009bbe456e Always `.error()` log unknown queries for `marketstore` 2023-01-13 13:21:15 -05:00
Tyler Goodlet daf7b3f4a5 Only accept 6 tries for the same duplicate hist frame
When we see multiple history frames that are duplicate to the request
set, bail re-trying after a number of tries (6 just cuz) and return
early from the tsdb backfill loop; presume that this many duplicates
means we've hit the beginning of history. Use a `collections.Counter`
for the duplicate counts. Make sure and warn log in such cases.
2023-01-13 13:21:15 -05:00
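The duplicate-frame bailout can be sketched with `collections.Counter` as below. This is a simplified illustration, not the actual tsdb backfill loop: `backfill()`, the `fetch` callable and the tuple-of-timestamps frame representation are all assumptions.

```python
from collections import Counter


def backfill(fetch, end, max_dupes: int = 6):
    """Pull history frames backwards in time until duplicates pile up.

    `fetch(end)` is a hypothetical callable returning a hashable frame
    (here a tuple of epoch stamps) of data older than `end`.
    """
    frames = []
    seen = set()
    dupes = Counter()

    while True:
        frame = fetch(end)

        if frame in seen:
            dupes[frame] += 1
            if dupes[frame] >= max_dupes:
                # Presume the beginning of history was hit; real code
                # would emit a warning log here before bailing.
                break
            continue

        seen.add(frame)
        frames.append(frame)
        # Request the next (older) frame ending at this frame's start.
        end = frame[0]

    return frames, dupes
```

A `Counter` keeps a per-frame tally, so one noisy duplicate cannot exhaust the retry budget of a different frame.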
Tyler Goodlet b0a6dd46e4 Use recon set on stack closing during reconnect
Hopefully resolves https://github.com/pikers/piker/issues/434
2023-01-13 13:21:15 -05:00
Tyler Goodlet 1c5141f4c6 Fix f-str in duplicate frame msg print 2023-01-13 13:21:15 -05:00
Tyler Goodlet 4cdd2271b0 Drop `tractor` assert bug note 2023-01-13 13:21:15 -05:00
Tyler Goodlet 04c0d77595 Frame ticks in helper routine
Wow, turns out tick framing was totally borked since we weren't framing
on "greater than throttle period long waits" XD

This moves all the framing logic into a common func and calls it in
every case:
- every (normal) "pre throttle period expires" quote receive
- each "no new quote before throttle period expires" (slow case)
- each "no clearing tick yet received" / only burst on clears case
2023-01-13 13:21:15 -05:00
Tyler Goodlet 8e1ceca43d Add some data-flows jargon notes (re: #270) 2023-01-13 13:21:15 -05:00
Tyler Goodlet c85e7790de Rename `._flumes.py` -> `.flows.py` 2023-01-13 13:21:15 -05:00
Tyler Goodlet 2399c618b6 Expand sampler loop shm write lines 2023-01-13 13:21:15 -05:00
Tyler Goodlet 7ec88f8cac Make hist shm token optional to allow for FSPs 2023-01-13 13:21:15 -05:00
Tyler Goodlet eacd44dd65 Move `Flume` to a new `.data._flumes` module 2023-01-13 13:21:15 -05:00
Tyler Goodlet e5e70a6011 Extend `Flume` methods
Add some (untested) data slicing util methods for mapping time ranges to
source data indices:
- `.get_index()` which maps a single input epoch time to an equiv array
  (int) index.
- add `slice_from_time()` which returns a view of the shm data from an
  input epoch range presuming the underlying struct array contains
  a `'time'` field with epoch stamps.
- `.view_data()` which slices out the "in view" data according to the
  current state of the passed in `pg.PlotItem`'s view box.
2023-01-13 13:21:15 -05:00
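The time-to-index mapping mentioned in the commit above could look roughly like the following. This is a stdlib-only sketch over a plain sorted list of epoch stamps; the real data lives in a shm struct array with a `'time'` field, and the helper names `index_for_epoch()` and `slice_for_range()` are hypothetical stand-ins, not the `Flume` methods themselves:

```python
from bisect import bisect_left, bisect_right


def index_for_epoch(times, epoch):
    """Index of the first sample at or after `epoch`.

    Clamped to the last valid index; assumes `times` is sorted
    ascending and non-empty.
    """
    return min(bisect_left(times, epoch), len(times) - 1)


def slice_for_range(times, start, stop):
    """Slice covering all samples with start <= time <= stop."""
    return slice(bisect_left(times, start), bisect_right(times, stop))
```

For example, with `times = [10, 11, 12, 13, 14]`, `times[slice_for_range(times, 11, 13)]` selects the three samples stamped 11 through 13.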
Tyler Goodlet 1ee49df31d Ensure a rt shm buffer without backfill has correct epoch timestamping 2023-01-13 13:21:15 -05:00
Tyler Goodlet f2df32a673 Use throttle period for wait-on-clearing-event timeout 2023-01-13 13:21:15 -05:00
Tyler Goodlet 125e31dbf3 Implement by-type tick-framing in throttler loop
This has been an outstanding idea for a while and changes the framing
format of tick events into a `dict[str, list[dict]]` wherein for each
tick "type" (eg. 'bid', 'ask', 'trade', 'asize'..etc) we create a FIFO
ordered `list` of events (data) and then pack this table into each
(throttled) send. This gives an additional implied downsample reduction
(in terms of iteration on the consumer side) from `N` tick-events to
a (max) `T` tick-types presuming the rx side only needs the latest tick
event.

Drop the `types: set` and adjust clearing event test to use the new
`ticks_by_type` map's keys.
2023-01-13 13:21:15 -05:00
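To illustrate the `dict[str, list[dict]]` framing from the commit above, here is a small sketch. It assumes each tick is a `dict` carrying a `'type'` key and treats `'trade'` and `'last'` as clearing types; both the key name and the clearing set are assumptions for illustration, as are the function names:

```python
# hypothetical set of tick types treated as "clearing" events
CLEARING_TYPES = {'trade', 'last'}


def frame_ticks(ticks, ticks_by_type=None):
    """Pack ticks into a per-type table of FIFO ordered event lists."""
    if ticks_by_type is None:
        ticks_by_type = {}
    for tick in ticks:
        ticks_by_type.setdefault(tick['type'], []).append(tick)
    return ticks_by_type


def has_clearing_tick(ticks_by_type):
    """Clearing-event test using only the table's keys."""
    return not CLEARING_TYPES.isdisjoint(ticks_by_type)


def latest_by_type(ticks_by_type):
    """Consumer side: keep only the most recent event per tick type."""
    return {typ: events[-1] for typ, events in ticks_by_type.items()}
```

A consumer that only wants the latest value iterates over at most `T` types via `latest_by_type()` instead of over all `N` events.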
Tyler Goodlet 715e693564 Improved clearing-tick-burst-oriented throttling
Instead of uniformly distributing the msg send rate for a given
aggregate subscription, choose to be more bursty around clearing ticks
so as to avoid saturating the consumer with L1 book updates vs.
delivering real trade data as-fast-as-possible.

Presuming the consumer is in the "UI land of slow" (eg. modern display
frame rates) such an approach is more useful for seeing "material
changes" in the market as-bursty-as-possible (i.e. more short lived fast
changes in last clearing price vs. many slower changes in the bid-ask
spread queues). Such an approach also lends itself better to multi-feed
overlays which in aggregate tend to scale linearly with the number of
feeds/overlays; centralization of bursty arrival rates allows for
a higher overall throttle rate if used cleverly with framing.
2023-01-13 13:21:15 -05:00
Tyler Goodlet 4300470786 Fix for empty tsdb query result case 2023-01-13 13:21:15 -05:00
Tyler Goodlet cf6e44cb9c Add `NoBsWs.connected()` predicate 2023-01-13 12:39:17 -05:00
Tyler Goodlet 2a158aea2c Rework `_FeedsBus` subscriptions mgmt using `set`
Allows using `set` ops for subscription management and guarantees no
duplicates per `brokerd` actor. New API is simpler for dynamic
pause/resume changes per `Feed`:
- `_FeedsBus.add_subs()`, `.get_subs()`, `.remove_subs()` all accept multi-sub
  `set` inputs.
- `Feed.pause()` / `.resume()` encapsulates management of *only* sending
  a msg on each unique underlying IPC msg stream.

Use new api in sampler task.
2023-01-10 11:09:19 -05:00
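The `set`-based subscription management from the commit above can be sketched as follows. The class `SubsBus` and its internals are hypothetical; only the method names `add_subs()`, `get_subs()` and `remove_subs()` come from the commit message, and the real signatures may differ:

```python
from collections import defaultdict


class SubsBus:
    """Toy subscription table keyed by symbol, one `set` per key."""

    def __init__(self):
        self._subs = defaultdict(set)

    def get_subs(self, key):
        # read without creating an entry for unknown keys
        return self._subs.get(key, set())

    def add_subs(self, key, subs):
        # set union: re-adding an existing sub is a no-op
        self._subs[key] |= subs
        return self._subs[key]

    def remove_subs(self, key, subs):
        # set difference: removing an unknown sub is a no-op
        self._subs[key] -= subs
        return self._subs[key]
```

Using `set` ops here is what gives the "no duplicates" guarantee for free and lets callers add or drop several subscriptions in one call.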
Tyler Goodlet 88870fdda7 Set `brokers: list[str]` from mods when not provided.. 2023-01-10 11:09:19 -05:00
Tyler Goodlet 326f153a47 Catch overruns on throttled feed subs too
Previously we would only detect overruns and drop subscriptions on
non-throttled feed subs, however you can get the same issue with
a wrapping throttler task:
- the intermediate mem chan can be blocked either by the throttler task
  being too slow, in which case we still want to warn about it
- the stream's IPC channel actually breaks and we still want to drop
  the connection and subscription so it doesn't become a source of
  stale backpressure.
2023-01-10 11:09:19 -05:00
Tyler Goodlet f5cd63ad35 Ensure correct stream is set on each `Flume`
Set each quote-stream by matching the provider for each `Flume`; this
results in some flumes mapping to the same (multiplexed) stream.
Monkey-patch the equivalent `tractor.MsgStream._ctx: tractor.Context` on
each broadcast-receiver subscription to allow use by feed bus methods as
well as other internals which need to reference IPC channel/portal info.

Start a `_FeedsBus` subscription management API:
- add `.get_subs()` which returns the list of tuples registered for the
  given key (normally the fqsn).
- add `.remove_sub()` which allows removing by key and tuple value and
  provides encapsulation for sampler task(s) which deal with dropped
  connections/subscribers.
2023-01-10 11:09:19 -05:00
Tyler Goodlet 1e96ca32df Move `maybe_open_feed()` above for readability 2023-01-10 11:09:19 -05:00
Tyler Goodlet 7b9db86753 Multi-`broker` quotes with `Feed.open_multi_stream()`
Adds provider-list-filtered (quote) stream multiplexing support allowing
for merged real-time `tractor.MsgStream`s using an `@acm` interface.
Behind the scenes we are just doing a classic multi-task push to common
mem chan approach.

Details to make it work on `Feed`:
- add `Feed.mods: dict[str, ModuleType]` and
  `Feed.portals: dict[ModuleType, tractor.Portal]` which are both populated
  during init in `open_feed()`
- drop `Feed.portal` and `Feed.name`

Also fix a final lingering tsdb history loading loop termination bug.
2023-01-10 11:09:19 -05:00
Tyler Goodlet 20a396270e `Storage.read_ohlcv()` now returns a `numpy` array 2023-01-10 11:09:19 -05:00
Tyler Goodlet 81516c5204 Finally fix tsdb -> shm backfill loading
A slight facepalm but, the main issue was a simple indexing logic error:
we need to slice with `tsdb_history[-shm._first.value:]` to push most
recent history not oldest.. This allows cleanup of tsdb backfill loop as
well.

Further, greatly simplify `diff_history()` time slicing by using the
classic `numpy` conditional slice on the epoch field.
2023-01-10 11:09:19 -05:00
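Both slicing idioms mentioned in the commit above can be shown on a toy `numpy` struct array. The dtype, the sample data and the helper names `latest()` and `newer_than()` are assumptions for illustration; the actual `diff_history()` signature is not shown in the log:

```python
import numpy as np

# toy history: a struct array with an epoch 'time' field
dtype = [('time', 'i8'), ('close', 'f8')]
tsdb_history = np.array(
    [(t, float(t)) for t in range(100, 110)],
    dtype=dtype,
)


def latest(history, size):
    """Most recent `size` rows (negative-start slice), not the oldest."""
    if size <= 0:
        # guard: `history[-0:]` would return the whole array
        return history[:0]
    return history[-size:]


def newer_than(frame, last_epoch):
    """Conditional (boolean mask) slice on the epoch field."""
    return frame[frame['time'] > last_epoch]
```

Note that `history[:size]` would instead push the oldest rows, which is the kind of indexing mix-up the commit describes fixing.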
Tyler Goodlet d6fb6fe3ae Just drop the pretty repr from our struct for now 2023-01-10 11:09:19 -05:00
Tyler Goodlet 8476d8d056 Fix partial-frame-missing backfill logic
This had a bug prior where the end of a frame (a partial) wasn't being
sliced correctly and we'd get odd gaps showing up in the history backfilled
from `brokerd` vs. the tsdb end index. Repair this by doing timeframe aware index
diffing in `diff_history()` which seems to resolve it. Also, use the
frame-result's `end_dt: datetime` for the loop exit condition.
2023-01-10 11:09:19 -05:00
Tyler Goodlet 29b6b3e54f Port `storesh` cli-cmd machinery to `Flume` apis 2023-01-10 11:09:19 -05:00
Tyler Goodlet 8a01c9e42b Fix broker-tail stripping using `str.removesuffix()` 2023-01-10 11:09:19 -05:00
Tyler Goodlet 7daab6329d Make `Symbol` derive from internal `.types.Struct` 2023-01-10 11:09:19 -05:00
Tyler Goodlet bb6452b969 Further feed syncing fixes wrt `Flume`s
Sync per-symbol sampler loop start to subscription registers such that
the loop can't start until the consumer's stream subscription is added;
the task-sync uses a `trio.Event`. This patch also drops a ton of
commented cruft.

Further adjustments needed to get parity with prior functionality:
- pass init msg 'symbol_info' field to the `Symbol.broker_info: dict`.
- ensure the `_FeedsBus._subscriptions` table uses the broker specific
  (without brokername suffix) as keys for lookup so that the sampler
  loop doesn't have to append in the brokername as a suffix.
- ensure the `open_feed_bus()` flumes-table-msg sent by
  `tractor.Context.started()` uses the `.to_msg()` form of all flume
  structs.
- ensure `maybe_open_feed()` uses `tractor.MsgStream.subscribe()` on all
  `Flume.stream`s on cache hits using the
  `tractor.trionics.gather_contexts()` helper.
2023-01-10 11:09:19 -05:00
Tyler Goodlet 25bfe6f035 Use new |-union style type annots in sampling routines 2023-01-10 11:09:19 -05:00
Tyler Goodlet e7de5404d3 Add `Symbol.fqsn: str` property 2023-01-10 11:09:19 -05:00
Tyler Goodlet 18dc8b08e4 First draft aggregate feedz support
Orient shm-flow-arrays around the new idea of a `Flume` which provides
access, mgmt and basic measure of real-time data flow sets (see water
flow management semantics).

- We discard the previous idea of an "init message" which contained all
  the shm attachment info and instead send a startup message full of
  `Flume.to_msg()`s which are symmetrically loaded on the caller actor
  side.

- Create data-flows "entries" for every passed in fqsn such that the consumer gets back
  streams and shm for each, now all wrapped in `Flume` types. For now we
  allocate `brokermod.stream_quotes()` tasks 1-to-1 for each fqsn
  (instead of expecting each backend to do multi-plexing, though we
  might want that eventually) as well as a `_FeedsBus._subscriber` entry
  for each. The pause/resume management loop is adjusted to match.
  Previously `Feed`s were allocated 1-to-1 with each fqsn.

- Make `Feed` a `Struct` subtype instead of a `@dataclass` and move all
  flow specific attrs to the new `Flume`:
  - move `.index_stream()`, `.get_ds_info()` to `Flume`.
  - drop `.receive()`: each fqsn entry will now require knowledge of
    separate streams by feed users.
  - add multi-fqsn tables: `.flumes`, `.streams` which point to the
    appropriate per-symbol entries.

- Async load all `Flume`s from all contexts and all quote streams using
  `tractor.trionics.gather_contexts()` on the client `open_feed()` side.

- Update feeds test to include streaming 2 symbols on the same (binance)
  backend.
2023-01-10 11:09:18 -05:00
Tyler Goodlet 344a634cb6 Always set fqsn in `Feed.symbols: dict` 2023-01-10 11:09:18 -05:00