Design: data schema for opts ts storage #24
Open
opened 2025-02-06 18:02:20 +00:00 by ntorres · 4 comments
The fundamental “source” data schema we need to store for computing max pain at this moment:
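A minimal sketch of such a schema, assuming the `time`/`oi`/`oi_cal` fields discussed in the comments below (hypothetical field layout, not the exact dtype):

```python
# hypothetical: a minimal polars schema for per-contract OI
# time series; field names follow the discussion below.
import polars as pl

oi_fields = {
    'time': pl.Float64,    # epoch timestamp of the OI update
    'oi': pl.Float64,      # provider-reported open interest
    'oi_cal': pl.Float64,  # OI calculated by piker from the vlm feed
}

# an empty table with that layout
oi_table = pl.DataFrame(schema=oi_fields)
```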
Maybe we need to evaluate which data we want to add to the above data schema.
For the storage we need to add methods, analogous to `write_ohlcv()` and `read_ohlcv()`, for the new data schema: `write_oi()` and `read_oi()` (of course, this can change). Also, add a method like `mk_ohlcv_shm_keyed_filepath()` for the open interest shm keyed filepath: `mk_oi_shm_keyed_filepath()`. A rough sketch of these follows below.
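The names come from the issue itself; the bodies and the `.oi.parquet` naming convention here are assumptions, not piker's actual impl:

```python
# illustrative only: mirrors the ohlcv method naming, the real
# impls would live in piker's storage layer.
from pathlib import Path
import polars as pl

def mk_oi_shm_keyed_filepath(fqme: str, datadir: Path) -> Path:
    # one OI parquet file per mkt, keyed by FQME
    return datadir / f'{fqme}.oi.parquet'

def write_oi(fqme: str, oi: pl.DataFrame, datadir: Path) -> Path:
    path = mk_oi_shm_keyed_filepath(fqme, datadir)
    oi.write_parquet(path)
    return path

def read_oi(fqme: str, datadir: Path) -> pl.DataFrame:
    return pl.read_parquet(mk_oi_shm_keyed_filepath(fqme, datadir))
```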
(title changed from "Add storage support for derivs" to "Design: data schema for opts ts storage")

@ntorres one thing i can see immediately missing is an expiry field 😉
One in-depth question i have (and don't have an answer for yet) is whether we want to orient the table schema such that all contracts can be interleaved in one big continuous table (for say a given strike/expiry/type) and then broken up for processing/viewing via a `polars` unpack, or whether we should be keeping all contracts in entirely separate files for long term storage and then expecting `.parquet` loader code to do the correct thing (parsing of filenames) to make it easy to load multiple contract-mkts at once?
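For the interleaved-table option, the "break up" step could be a one-liner with `polars`; a sketch assuming each row is tagged with an `fqme` column (the fqme values here are made up):

```python
# partition_by() splits the big joined table back out per-mkt,
# keyed by each contract's fqme.
import polars as pl

big = pl.DataFrame({
    'fqme': ['btc-c1.deribit', 'btc-c1.deribit', 'btc-p1.deribit'],
    'time': [1.0, 2.0, 1.0],
    'oi': [10.0, 12.0, 7.0],
})

per_mkt: dict[str, pl.DataFrame] = {
    df['fqme'][0]: df
    for df in big.partition_by('fqme')
}
```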
Ok, yeah after a little thought i figured i'd make a table diagram (see attached) of what i envision being the most useful joined end-table for analysis and processing (like max-pain). I'm hoping to re-render this using a `d2` table (maybe inside a larger diagram) and/or possibly using a real screen shot from …

summary of approaches
1) granular per-mkt named by FQME files

the most file-granular approach would be to keep the table fields relatively simple, with a min of 2 but possibly an optional 3rd:

- `time[_ns]: int|float`, the timestamp of the OI update
- `oi: float|int`, the actual purported open interest of the contract reported by a provider (feed)
- `oi_cal: float|int`, the OI calculated by us (the `piker` options sys) based on a mkt's prior state (initial OI) and an ongoing (real-time) vlm feed (likely ohlcv of some sort)

this would mean we leverage a unique fqme schema to index mkts in the file sys very distinctly, in such a way that:

- which `.parquet` file contains which deriv mkt's data is obvious from glancing at the filesys, say with `visidata` (which yes, supports parquet ;) from the console
- `piker.fsp` and `piker.ui` consumers can be more flexible with logic around certain fields existing optionally from various providers, and then dynamically process/display different output
- mkts can still be joined/processed via `polars` when needed, but with the further flexibility that not all mkts for a given "super scope" need to be loaded into mem if the user doesn't require it

(a loader sketch for this approach is included just below)

2) all-in-one-table NON-granular, which would mean fewer `.parquet` files

not my preferred approach since it means constantly having to load a large table scoped by some grander mkt-property (like provider or expiry or settling asset etc.)

However, this likely would result in a much simpler `.piker.storage` implementation as well as improved loading performance for very large (multi asset) derivatives data sets, since all "sub contract mkts" could be allocated in a single table like originally put in the descr.
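Picking up the granular option, a sketch of what the filename-parsing loader could look like (the `.oi.parquet` suffix convention is an assumption):

```python
# assumption: per-mkt files are named '<fqme>.oi.parquet'; the
# loader globs the data dir and keys each table by its fqme.
from pathlib import Path
import polars as pl

def load_oi_tables(
    datadir: Path,
    pattern: str = '*.oi.parquet',
) -> dict[str, pl.DataFrame]:
    return {
        path.name.removesuffix('.oi.parquet'): pl.read_parquet(path)
        for path in datadir.glob(pattern)
    }
```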
Another impl detail wrt how a `datad` provider can offer OI info for a deriv..

Likely they either:

- provide it like we should, `oi`/`oi_calc` updated with every clearing event (every tick that contains non-zero vlm), or
- … (`deribit`)

how we want to aggregate

given there is likely going to be 2 feeds for most providers, we need a way to merge these to enable getting a single table for processing, such that you can easily get the normal trade event info (ticks) but with at least an added `oi`/`oi_calc` field included as a column.
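One way the merge could work, sketched with a `polars` as-of join (column names assumed): each tick picks up the most recent OI update at-or-before its timestamp.

```python
# sketch: stitch the latest oi value onto every clearing tick so
# downstream processing sees a single table.
import polars as pl

ticks = pl.DataFrame({
    'time': [1.0, 2.5, 4.0],
    'price': [100.0, 101.0, 100.5],
    'size': [2.0, 1.0, 3.0],
}).sort('time')

oi_updates = pl.DataFrame({
    'time': [0.5, 2.0, 3.5],
    'oi': [10.0, 12.0, 11.0],
}).sort('time')

# default strategy is 'backward': last oi at-or-before each tick
merged = ticks.join_asof(oi_updates, on='time')
```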
For a granular per-mkt named by FQME files approach, see Open interest storage: #42
Here we're using the `storage.nativedb` mod to write to a parquet file (8f1e082c91); at the moment it creates a file if one doesn't exist for the `fqme` and appends the last processed row. As discussed above, this is the structure dtype (per the schema in the descr).

The file name is the `fqme` for each instrument; in the `max_pain` we construct this for each incoming msg, a better solution is to pass the `fqme` already built (`fqme` github discussion).

So far this script appends the last row on each update for the instruments, in separate files for a specific `expiry_date`.
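The create-or-append behaviour described above might reduce to something like this (a sketch only, since parquet has no in-place append):

```python
# sketch of the create-if-missing / append-last-row flow; the
# real impl lives in storage.nativedb.
from pathlib import Path
import polars as pl

def append_oi_row(path: Path, row: pl.DataFrame) -> None:
    if path.exists():
        # no cheap parquet append: read, stack, rewrite
        row = pl.concat([pl.read_parquet(path), row])
    row.write_parquet(path)
```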
- add a `write_oi()` and `_write_oi()` interface and impl for writing the `oi` struct to parquet in `storage.nativedb`; also add `mk_oi_shm_keyed_filepath()` for managing the shm file.
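For reference, the "oi struct" could be as small as a `msgspec.Struct` with the fields from the descr (a hypothetical shape, not the committed one):

```python
# hypothetical shape of the oi struct; piker already uses
# msgspec for most of its msging types.
from msgspec import Struct

class OpenInterest(Struct):
    time: float     # epoch time of the update
    oi: float       # provider-reported open interest
    oi_cal: float   # piker-calculated OI from the vlm feed
```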