It doesn't seem to be any slower on our least throttled backend
(binance) and it removes a bunch of hard to get correct frame
re-ordering logic that i'm not sure really ever fully worked XD
Commented some issues we still need to resolve as well.
Manual tinker-testing demonstrated that triggering data resets
completely independent of the frame request gets more throughput and
further, that repeated requests (for the same frame after cancelling on
the `trio`-side) can yield duplicate frame responses. Re-work the
dual-task structure to instead have one task wait indefinitely on the
frame response (and thus not trigger duplicate frames) and the 2nd data
reset task poll for the first task to complete in a poll loop which
terminates when the frame arrives via an event.
Dirty deatz:
- make `get_bars()` take an optional timeout (which will eventually be
dynamically passed from the history mgmt machinery) and move request
logic inside a new `query()` closure meant to be spawned in a task
which sets an event on frame arrival, add data reset poll loop in the
main/parent task, deliver result on nursery completion.
- handle frame request cancelled event case without crash.
- on no-frame result (due to real history gap) hack in a 1 day decrement
case which we need to eventually allow the caller to control likely
based on measured frame rx latency.
- make `wait_on_data_reset()` a predicate without output indicating
reset success as well as `trio.Nursery.start()` compat so that it can
be started in a new task with the started values yielded being
a cancel scope and completion event.
- drop the legacy `backfill_bars()`, not longer used.
Adjust all history query machinery to pass a `timeframe: int` in seconds
and set default of 60 (aka 1m) such that history views from here forward
will be 1m sampled OHLCV. Further when the tsdb is detected as up load
a full 10 years of data if possible on the 1m - backends will eventually
get a config section (`brokers.toml`) that allow user's to tune this.
The `Store.load()`, `.read_ohlcv()` and `.write_ohlcv()` and
`.delete_ts()` now can take a `timeframe: Optional[float]` param which
is used to look up the appropriate sampling period table-key from
`marketstore`.
Allow data feed sub-system to specify the timeframe (aka OHLC sample
period) to the `open_history_client()` delivered history fetching API.
Factor the data keycombo hack into a new routine to be used also from
the history backfiller code when request latency increases; there is
a first draft at trying to use the feed reset to speed up 1m frame
throttling by timing out on the history frame response, but it needs
a lot of fine tuning.
This is a simpler (and oddly more `trio`-nic and/or SC) way to handle
the cancelled-before-acked race for order dialogs. Will allow keeping
the `.req` field as solely an `Order` msg.
When the client is faster then a `brokerd` at submitting and cancelling
an order we run into the case where we need to specify that the EMS
cancels the order-flow as soon as the brokerd's ack arrives. Previously
we were stashing a `BrokerdCancel` msg as the `Status.req` msg (to be
both tested for as a "already cancelled" and sent immediately on ack arrival to
the broker), but for such
cases we can't use that msg to find the fqsn (since only the client side
msgs have it defined) which is required by the new
`Router.client_broadcast()`.
So, Since `Status.req` is supposed to be a client-side flow msg anyway,
and we need the fqsn for client broadcasting, we change this `.req`
value to the client's submitted `Cancel` msg (thus rectifying the
missing `Router.client_broadcast()` fqsn input issue) and build the
`BrokerdCancel` request from that `Cancel` inline in the relay loop
from the `.req: Cancel` status msg lookup.
Further we allow `Cancel` msgs to define an `.account` and adjust the
order mode loop to expect `Cancel` source requests in cancelled status
updates.
Except for paper accounts (in which case we need a trades dialog and
paper engine per symbol to enable simulated clearing) we can rely on the
instrument feed (symbol name) to be the caching key. Utilize
`tractor.trionics.maybe_open_context()` and the new key-as-callable
support in the paper case to ensure we have separate paper clearing
loops per symbol.
Requires https://github.com/goodboy/tractor/pull/329
With the refactor of the dark loop into a daemon task already-open order
relaying from a `brokerd` was broken since no subscribed clients were
registered prior to the relay loop sending status msgs for such existing
live orders. Repair that by adding one more synchronization phase to the
`Router.open_trade_relays()` task: deliver a `client_ready: trio.Event`
which is set by the client task once the client stream has been
established and don't start the `brokerd` order dialog relay loop until
this event is ready.
Further implementation deats:
- factor the `brokerd` relay caching back into it's own `@acm` method:
`maybe_open_brokerd_dialog()` since we do want (but only this) stream
singleton-cached per broker backend.
- spawn all relay tasks on every entry for the moment until we figure
out what we're caching against (any client pre-existing right, which
would mean there's an entry in the `.subscribers` table?)
- rename `_DarkBook` -> `DarkBook` and `DarkBook.orders` -> `.triggers`
This enables "headless" dark order matching and clearing where an `emsd`
daemon subactor can be left running with active dark (or other
algorithmic) orders which will still trigger despite to attached-controlling
ems-client.
Impl details:
- rename/add `Router.maybe_open_trade_relays()` which now does all work
of starting up ems-side long living clearing and relay tasks and the
associated data feed; make is a `Nursery.start()`-able task instead of
an `@acm`.
- drop `open_brokerd_trades_dialog()` and move/factor contents into the
above method.
- add support for a `router.client_broadcast('all', msg)` to wholesale
fan out a msg to all clients.
Establishes a more formalized subscription based fan out pattern to ems
clients who subscribe for order flow for a particular symbol (the fqsn
is the default subscription key for now).
Make `Router.client_broadcast()` take a `sub_key: str` value which
determines the set of clients to forward a message to and drop all such
manually defined broadcast loops from task (func) code. Also add
`.get_subs()` which (hackily) allows getting the set of clients for
a given sub key where any stream that is detected as "closed" is
discarded in the output. Further we simplify to `Router.dialogs:
defaultdict[str, set[tractor.MsgStream]]` and `.subscriptions` as maps
to sets of streams for much easier broadcast management/logic using set
operations inside `.client_broadcast()`.
This patch was originally to fix a bug where new clients who
re-connected to an `emsd` that was running a paper engine were not
getting updates from new fills and/or cancels. It turns out the solution
is more general: now, any client that creates a order dialog will be
subscribing to receive updates on the order flow set mapped for that
symbol/instrument as long as the client has registered for that
particular fqsn with the EMS. This means re-connecting clients as well
as "monitoring" clients can see the same orders, alerts, fills and
clears.
Impl details:
- change all var names spelled as `dialogues` -> `dialogs` to be
murican.
- make `Router.dialogs: dict[str, defaultdict[str, list]]` so that each
dialog id (oid) maps to a set of potential subscribing ems clients.
- add `Router.fqsn2dialogs: dict[str, list[str]]` a map of fqsn entries to
sets of oids.
- adjust all core task code to make appropriate lookups into these 2 new
tables instead of being handed specific client streams as input.
- start the `translate_and_relay_brokerd_events` task as a daemon task
that lives with the particular `TradesRelay` such that dialogs cleared
while no client is connected are still processed.
- rename `TradesRelay.brokerd_dialogue` -> `.brokerd_stream`
- broadcast all status msgs to all subscribed clients in the relay loop.
- always de-reg each client stream from the `Router.dialogs` table on close.
Not sure what exactly happened but it seemed clears weren't working in
some cases without this, also there's no point in spinning the simulated
clearing loop if we're handling a non-clearing tick type.