
LT LDAS Runbook

CRITICAL: lt-ldas runs in parallel with Python aegis-ldas during the G2 migration observation phase. The two daemons write to DIFFERENT archive roots to avoid collision. Do NOT remove the --archive-path=/data/live_data_archive_rust override in docker-compose.lt-ldas.yml until the G2 Day 15 cutover.

What It Is

lt-ldas is the Rust-native LDAS EOD + intraday collection daemon. It replaces the two supervisord-managed processes of the Python aegis-ldas container (the cron-driven EOD job and the 15-minute intraday loop) and is the G2 phase of the Python → Rust trading infrastructure migration.

The binary has two subcommands:

  • lt-ldas eod — one-shot EOD collection: VIX + stocks + options + GEX + IV Rank for a given date (default: previous trading day). Designed to be called by an external scheduler (cron, or a manual docker compose run).
  • lt-ldas intraday — long-running intraday loop: every 15 minutes during market hours (09:35-16:00 ET weekdays), collect options snapshots with _pt_HHMMSS suffix.
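The intraday gating described above can be approximated from a shell when checking whether a cycle should have run. This is a minimal sketch, not the daemon's logic: the function names are hypothetical, both ends of the 09:35-16:00 ET window are assumed inclusive, and the US holiday calendar that the daemon handles natively is ignored here.

```shell
# in_intraday_window HHMM DOW
#   HHMM: wall-clock time in ET, zero-padded (e.g. 0935)
#   DOW:  ISO weekday, 1=Mon ... 7=Sun
# Assumption: 09:35 and 16:00 are both inside the window; holidays are not checked.
in_intraday_window() {
    hhmm=$1
    dow=$2
    # Prefix a 1 so zero-padded values such as 0935 compare as plain decimals.
    t="1$hhmm"
    [ "$dow" -ge 1 ] && [ "$dow" -le 5 ] && [ "$t" -ge 10935 ] && [ "$t" -le 11600 ]
}

# snapshot_suffix HHMMSS: prints the intraday file suffix, e.g. _pt_093512
snapshot_suffix() {
    printf '_pt_%s\n' "$1"
}

# Example call with the current ET time (needs tzdata on the host):
#   in_intraday_window "$(TZ=America/New_York date +%H%M)" \
#                      "$(TZ=America/New_York date +%u)" && echo "market open"
```

Taking the time as arguments rather than reading the clock inside the function keeps the check easy to replay for a past timestamp seen in the logs.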

Key design decisions:

  • Subcommand CLI mirrors Python's 2-process supervisord shape
  • In-proc GEX computation via the crate::data::gex pure-function module (G2 micro handoff, commit 182a8eb95)
  • In-proc IV Rank computation via crate::ivrank::run_daily (no subprocess needed, unlike Python which shells out to aegis-ivrank binary)
  • Parquet schema 1:1 compatible with Python ldas_writer.py (verified: timestamp, symbol, net_gex, abs_gex, zero_gamma_level, gamma_wall, put_wall, gex_regime, underlying_price)
  • Market hours + ET timezone + US holiday calendar handled natively
  • SIGTERM/SIGINT graceful shutdown via tokio::signal

Prerequisites

Central Synology .env at /volume1/aegis/.env MUST provide:

  • POLYGON_API_KEY — Polygon REST API key (for VIX, stocks, options fetch)

Optional (graceful degradation if absent):

  • UNUSUAL_WHALES_API_TOKEN — legacy UW Flow. Rust lt-ldas does NOT implement UW Flow (disabled in Python since 2026-03-26, consistent). Safe to omit.
  • FRED_API_KEY — used by Python historical VIX fallback. Rust uses Polygon exclusively.

Paths that MUST exist on Synology:

  • /volume1/aegis/.env (must contain POLYGON_API_KEY=...)
  • /volume1/aegis/live_data_archive_rust/ (parallel observation archive root, created by deploy workflow)
  • /volume1/aegis/repo/aegis_v3/configs/ldas/lt_ldas.yaml (daemon config)
  • /volume1/aegis/repo/aegis_v3/configs/universe/lt_ldas_symbols.txt (712 symbols)

Archive layout (parallel observation):

/volume1/aegis/live_data_archive/        ← Python canonical (aegis-ldas writes)
/volume1/aegis/live_data_archive_rust/   ← Rust parallel (aegis-lt-ldas-* writes)
    ├── vix/YYYY/MM/DD.parquet
    ├── stocks/SYMBOL/YYYY/MM/DD_poly.parquet
    ├── options/SYMBOL/YYYY/MM/DD.parquet
    ├── options/SYMBOL/YYYY/MM/DD_pt_HHMMSS.parquet  (intraday snapshots)
    ├── gex_summary/YYYY/MM/DD.parquet
    └── iv_rank/YYYY/MM/DD.parquet
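When checking for a specific file by hand, it can help to build the expected path from the layout above instead of typing it. The helper below is a hypothetical sketch (ldas_path is not part of the repository) that simply encodes the tree shown above; DATE is assumed to be in YYYY-MM-DD form.

```shell
# ldas_path ROOT KIND DATE [SYMBOL] [HHMMSS]
#   KIND: vix | stocks | options | gex_summary | iv_rank
#   SYMBOL is required for stocks and options; HHMMSS selects an intraday snapshot.
ldas_path() {
    root=$1; kind=$2; day=$3; symbol=${4:-}; snap=${5:-}
    y=${day%%-*}       # 2026
    md=${day#*-}       # 04-10
    m=${md%%-*}        # 04
    d=${md#*-}         # 10
    case "$kind" in
        vix|gex_summary|iv_rank)
            printf '%s/%s/%s/%s/%s.parquet\n' "$root" "$kind" "$y" "$m" "$d" ;;
        stocks)
            printf '%s/stocks/%s/%s/%s/%s_poly.parquet\n' "$root" "$symbol" "$y" "$m" "$d" ;;
        options)
            if [ -n "$snap" ]; then
                printf '%s/options/%s/%s/%s/%s_pt_%s.parquet\n' "$root" "$symbol" "$y" "$m" "$d" "$snap"
            else
                printf '%s/options/%s/%s/%s/%s.parquet\n' "$root" "$symbol" "$y" "$m" "$d"
            fi ;;
        *)
            echo "unknown kind: $kind" >&2
            return 1 ;;
    esac
}
```

For example, ls -l "$(ldas_path /volume1/aegis/live_data_archive_rust options 2026-04-10 SPY)" checks the EOD options file for SPY on that date.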

How To Start

Preferred: trigger the dedicated workflow.

# Start only the intraday long-running service:
gh workflow run deploy-lt-ldas.yml

# Start intraday + run eod one-shot (smoke test):
gh workflow run deploy-lt-ldas.yml -f run_eod_once=true

# Force recreate containers:
gh workflow run deploy-lt-ldas.yml -f force_recreate=true

Manual (operator on Synology, emergency only):

cd /volume1/aegis/repo/aegis_v3/lt-rust-docker

# Start intraday service:
sudo /usr/local/bin/docker compose \
    --env-file /volume1/aegis/.env \
    -f docker-compose.lt-ldas.yml \
    up -d --build aegis-lt-ldas-intraday

# Run eod once (ad-hoc profile):
sudo /usr/local/bin/docker compose \
    --env-file /volume1/aegis/.env \
    --profile ad-hoc \
    -f docker-compose.lt-ldas.yml \
    run --rm aegis-lt-ldas-eod

How To Stop

cd /volume1/aegis/repo/aegis_v3/lt-rust-docker
sudo /usr/local/bin/docker compose \
    -f docker-compose.lt-ldas.yml stop aegis-lt-ldas-intraday

Delete the container entirely:

sudo /usr/local/bin/docker compose \
    -f docker-compose.lt-ldas.yml rm -f aegis-lt-ldas-intraday

How To Verify It Is Working

Container status (read-only SSH OK):

ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker ps --format 'table {{.Names}}\t{{.Status}}' | grep aegis-lt-ldas"

Intraday log tail (should show 15-min cycle completion during market hours):

ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker logs aegis-lt-ldas-intraday --tail 100 2>&1"

Expected log lines during market hours:

INFO lt_ldas::intraday: starting cycle suffix=_pt_093512
INFO lt_ldas::intraday: collected symbols=712 written=<N>
INFO lt_ldas::intraday: cycle complete, sleeping until next interval

Expected during market-closed:

INFO lt_ldas::intraday: market closed (outside 09:35-16:00 ET), sleeping 60s

Rust archive growth:

ssh fukutani.ryo@192.168.42.252 "find /volume1/aegis/live_data_archive_rust -name '*.parquet' -mmin -60 -type f 2>/dev/null | wc -l"

During market hours this should increment by ~712 files per 15-min cycle.
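For repeated checks, the growth count can be wrapped in a small function. This is a sketch under assumptions: count_recent_parquet is a hypothetical name, and it relies on find supporting -mmin (GNU, BSD and BusyBox find generally do, but the Synology build is not confirmed here).

```shell
# count_recent_parquet ROOT MINUTES
#   Prints how many *.parquet files under ROOT were modified in the last MINUTES minutes.
count_recent_parquet() {
    root=$1
    minutes=$2
    # tr strips the padding that some wc implementations add around the number.
    find "$root" -type f -name '*.parquet' -mmin "-$minutes" 2>/dev/null | wc -l | tr -d ' '
}
```

Running count_recent_parquet /volume1/aegis/live_data_archive_rust 15 shortly after a cycle completes should report a number close to the per-cycle figure quoted above.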

GEX summary freshness check:

ssh fukutani.ryo@192.168.42.252 "ls -lt /volume1/aegis/live_data_archive_rust/gex_summary/2026/04/ 2>/dev/null | head -5"

EOD one-shot verification (after 17:15 ET weekdays):

ssh fukutani.ryo@192.168.42.252 "ls -lt /volume1/aegis/live_data_archive_rust/vix/2026/04/ /volume1/aegis/live_data_archive_rust/gex_summary/2026/04/ 2>/dev/null | head"

Parallel Observation (G2 Day 11-14)

During parallel observation, BOTH implementations write:

  • Python aegis-ldas: canonical /volume1/aegis/live_data_archive/
  • Rust aegis-lt-ldas-intraday + aegis-lt-ldas-eod: parallel /volume1/aegis/live_data_archive_rust/

Observation acceptance criteria

Over 2 business days of EOD runs and 2 business days of intraday runs, the two implementations should satisfy all of the following:

  1. Parquet file count per date: Python and Rust produce the same count of options/stocks/gex_summary/iv_rank files (allow ±1 for race at midnight boundary)
  2. File naming convention: 1:1 match for {SYMBOL}/{YYYY}/{MM}/{DD}.parquet and {SYMBOL}/{YYYY}/{MM}/{DD}_pt_HHMMSS.parquet
  3. Schema compatibility: Python-written parquet can be read by Rust reader, and vice versa (schema-level comparison, see below)
  4. GEX regime symbol distribution: Rust GEX summary's gex_regime column distribution matches Python's within ±1 symbol per regime
  5. IV Rank deterministic match: since Rust lt-ldas's IV Rank is the same algorithm as the aegis-ivrank binary (which Python already calls), the iv_rank parquet column values should be identical
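Until the planned Rust parity binary exists, criterion 1 (file count per date, ±1) can be spot-checked from a shell. The sketch below is an ad-hoc stand-in, not the cutover gate: the function names are hypothetical, it only counts files (no schema or value comparison), and it assumes both archives follow the YYYY/MM/DD naming shown in the archive layout.

```shell
# count_for_date ROOT KIND DATE
#   Counts parquet files for one date under ROOT/KIND, daily files and
#   intraday snapshots alike. DATE is YYYY-MM-DD.
count_for_date() {
    root=$1; kind=$2; day=$3
    y=${day%%-*}; md=${day#*-}; m=${md%%-*}; d=${md#*-}
    find "$root/$kind" -type f -path "*/$y/$m/$d*.parquet" 2>/dev/null | wc -l | tr -d ' '
}

# within_tolerance A B TOL: succeeds when |A - B| <= TOL.
within_tolerance() {
    delta=$(( $1 - $2 ))
    if [ "$delta" -lt 0 ]; then
        delta=$(( -delta ))
    fi
    [ "$delta" -le "$3" ]
}

# check_count_parity PY_ROOT RS_ROOT DATE [TOL]
#   Prints one OK/FAIL line per data type; returns non-zero if any type fails.
check_count_parity() {
    py_root=$1; rs_root=$2; parity_day=$3; tol=${4:-1}
    status=0
    for k in options stocks gex_summary iv_rank; do
        py_n=$(count_for_date "$py_root" "$k" "$parity_day")
        rs_n=$(count_for_date "$rs_root" "$k" "$parity_day")
        if within_tolerance "$py_n" "$rs_n" "$tol"; then
            echo "OK   $k python=$py_n rust=$rs_n"
        else
            echo "FAIL $k python=$py_n rust=$rs_n"
            status=1
        fi
    done
    return "$status"
}
```

A typical call would be check_count_parity /volume1/aegis/live_data_archive /volume1/aegis/live_data_archive_rust 2026-04-11 1. A missing directory counts as zero files, so a data type absent from both archives passes silently; check the printed counts as well as the exit status.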

Parity comparison script

# Compare EOD options counts for a given date (planned Rust binary, G2 Day 15 cutover gate)
lt-ldas-parity \
  --python-archive /volume1/aegis/live_data_archive \
  --rust-archive /volume1/aegis/live_data_archive_rust \
  --date 2026-04-11 \
  --types vix,stocks,options,gex_summary,iv_rank \
  --tolerance-file-count 1 \
  --tolerance-row-count 5

(NOTE: lt-ldas-parity is a Sprint G2 follow-up item — a Rust binary in aegis-bt-rs/src/bin/lt_ldas_parity.rs, to be created before Day 15 cutover. An earlier draft aegis_v3/scripts/compare_ldas_parity.py was removed 2026-04-10 because new Python tooling violates the Python→Rust migration rule; the Rust replacement will mirror lt-wft-parity's pattern — small focused binary, deserializes both sides with the same types, exits 0/1 based on tolerance gate.)

Known nullable column mismatch (RESOLVED 2026-04-10)

Originally Rust wrote zero_gamma_level, gamma_wall, put_wall as Polars nullable (Option) in the gex_summary parquet while Python wrote them as non-nullable float64 with 0.0 fallback. This created a schema mismatch when Python tried to read a Rust-written gex_summary parquet.

Fix applied: write_gex_summary_parquet in aegis_v3/aegis-bt-rs/src/data/ldas_collector.rs now calls .unwrap_or(0.0) on the 3 Option fields before building the Polars Series, making Rust output byte-for-byte compatible with the Python reader. Verified by the G2 nullable fix commit series (see git log --oneline aegis_v3/aegis-bt-rs/src/data/ldas_collector.rs).

Cutover Path (G2 Day 15, NOT active yet)

Switching from parallel observation to Rust-as-source-of-truth requires:

Pre-cutover verification

  1. 2 business days EOD parity green (file count, schema, GEX regime, IV Rank value match, verified by lt-ldas-parity)
  2. 2 business days intraday parity green (cycle count, 15-min cadence, snapshot file count per symbol)
  3. Nullable column fix applied (already resolved 2026-04-10 — see above)
  4. WORK_LOG entry with explicit cutover approval

Cutover sequence

# a. Stop Python aegis-ldas FIRST
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker stop aegis-ldas"

# b. Wait 30s for any in-flight snapshot fetches to flush
sleep 30

# c. Edit docker-compose.lt-ldas.yml to remove --archive-path override
#    and point at canonical path. Change both eod + intraday service commands:
#      - "--archive-path=/data/live_data_archive"
#    And update the volume bind:
#      - /volume1/aegis/live_data_archive:/data/live_data_archive:rw
#    Commit + push + wait for GHA deploy to succeed.

# d. After cutover deploy, Rust writes canonical /volume1/aegis/live_data_archive/
#    Verify the first intraday cycle writes to the canonical tree:
ssh fukutani.ryo@192.168.42.252 "find /volume1/aegis/live_data_archive/options/SPY/2026/04 -name '*_pt_*' -mmin -5 -type f 2>/dev/null | head"

# e. Keep Python aegis-ldas container image present for 1 week as rollback safety.
#    Do NOT docker rm until the 1-week rollback window has elapsed.

EOD scheduling after cutover

The Python container used internal supervisord + cron for EOD trigger. Rust lt-ldas eod is a one-shot subcommand invoked externally. Options:

  1. External cron on Synology host: add a crontab entry that runs docker compose run --rm aegis-lt-ldas-eod at 17:15 ET weekdays
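  2. GitHub Actions scheduled workflow: add a cron to deploy-lt-ldas.yml that triggers run_eod_once=true. GHA cron expressions are evaluated in UTC only, so 17:15 ET corresponds to 21:15 UTC during EDT and 22:15 UTC during EST. A single 21:15 UTC entry would fire at 16:15 ET in winter; either schedule both UTC slots and gate the job on the ET hour, or update the schedule at each DST change.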
  3. Sprint G2 follow-up: add a native lt-ldas eod-cron subcommand that loops until the next 17:15 ET and executes, then exits (supervisord-friendly)

Recommend option 2 for consistency with existing GHA-driven deploy pattern.
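If option 2 is used, one way to keep the run pinned to 17:15 ET all year is to trigger the workflow at both 21:15 and 22:15 UTC and let a guard step skip the slot that does not fall in the 17:00 ET hour. The guard below is a sketch: is_eod_hour is a hypothetical name, and matching on the hour rather than the exact minute is a deliberate assumption to tolerate delayed scheduled runs.

```shell
# is_eod_hour HH DOW
#   HH:  current hour in ET, zero-padded (e.g. 17)
#   DOW: ISO weekday, 1=Mon ... 7=Sun
# Succeeds only during the 17:00 ET hour on a weekday. Holidays are not checked.
is_eod_hour() {
    [ "$1" = "17" ] && [ "$2" -ge 1 ] && [ "$2" -le 5 ]
}

# Example guard in a workflow step (needs tzdata on the runner):
#   if is_eod_hour "$(TZ=America/New_York date +%H)" \
#                  "$(TZ=America/New_York date +%u)"; then
#       echo "EOD slot: proceeding"
#   else
#       echo "not the 17:00 ET slot: skipping"
#   fi
```

During EDT the 21:15 UTC trigger passes the guard and the 22:15 UTC trigger is skipped; during EST it is the other way round.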

Post-cutover Monitoring (24h on-call)

  • Watch intraday container logs for cycle completion every 15 minutes during market hours
  • Watch archive file count growing at the expected rate (~712 files per 15-min cycle)
  • Watch EOD run at 17:15 ET and complete within 45 minutes (Python baseline)
  • Check GEX summary and IV Rank files for the day are present by 18:00 ET

Rollback

During parallel observation

Safe rollback — Rust daemon can be stopped independently without affecting Python or any downstream consumer:

ssh fukutani.ryo@192.168.42.252 \
  "cd /volume1/aegis/repo/aegis_v3/lt-rust-docker && \
   sudo /usr/local/bin/docker compose \
     -f docker-compose.lt-ldas.yml stop aegis-lt-ldas-intraday"

After cutover

Post-cutover rollback requires restarting Python aegis-ldas:

# a. Stop Rust intraday
ssh fukutani.ryo@192.168.42.252 \
  "cd /volume1/aegis/repo/aegis_v3/lt-rust-docker && \
   sudo /usr/local/bin/docker compose \
     -f docker-compose.lt-ldas.yml stop aegis-lt-ldas-intraday"

# b. Start Python aegis-ldas back up
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker start aegis-ldas"

# c. Verify Python daemon resumed intraday cycle
ssh fukutani.ryo@192.168.42.252 \
  "sudo /usr/local/bin/docker logs aegis-ldas --tail 20 2>&1 | grep -iE 'intraday|cycle|snapshot'"

Total rollback time: < 1 minute (if Python image is still present).

Troubleshooting

POLYGON_API_KEY is required for lt-ldas at startup

  • Compose uses ${POLYGON_API_KEY:?...}, which fails loudly if the env var is missing
  • Fix: verify /volume1/aegis/.env contains POLYGON_API_KEY=...
  • The deploy workflow's "Validate POLYGON_API_KEY on Synology" step should catch this before compose up

Intraday container running but no parquet files written

  • Check logs for "market closed" messages (outside 09:35-16:00 ET is expected no-op)
  • Check POLYGON_API_KEY is valid: hit a simple Polygon endpoint manually
  • Check find /volume1/aegis/live_data_archive_rust -name '*.parquet' -mmin -60 returns files

Rust GEX regime column distribution diverges from Python

  • Most likely cause: underlying spot price source differs between implementations
  • Rust reads from stocks/SYMBOL/YYYY/MM/DD_poly.parquet close column (see collector.collect_gex)
  • Python reads from same file via PolygonProvider.get_stock_price
  • Verify both implementations get the same close value for a test symbol

EOD collection runs but gex_summary is empty

  • GEX computation requires both options parquet AND stocks parquet to exist for the date
  • Check: if the eod run was started before the options parquet was flushed, GEX will have 0 symbols
  • Fix: ensure eod runs vix→stocks→options→gex→iv_rank in that order (this is the default)
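To confirm the precondition for a given symbol before re-running GEX, both input files can be tested directly. This is a minimal sketch with a hypothetical helper name; it only checks that the stocks and options parquet files exist and are non-empty, not that their contents are valid.

```shell
# gex_inputs_ready ROOT DATE SYMBOL
#   Succeeds when both the stocks and the EOD options parquet for SYMBOL on
#   DATE (YYYY-MM-DD) exist under ROOT and are non-empty.
gex_inputs_ready() {
    root=$1; day=$2; symbol=$3
    y=${day%%-*}; md=${day#*-}; m=${md%%-*}; d=${md#*-}
    stocks_file="$root/stocks/$symbol/$y/$m/${d}_poly.parquet"
    options_file="$root/options/$symbol/$y/$m/$d.parquet"
    ready=0
    for f in "$stocks_file" "$options_file"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            ready=1
        fi
    done
    return "$ready"
}
```

For example, gex_inputs_ready /volume1/aegis/live_data_archive_rust 2026-04-10 SPY prints the missing input on stderr and returns non-zero when either file is absent.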

Nullable column schema mismatch when reading Rust-written gex_summary from Python

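  • Resolved 2026-04-10: write_gex_summary_parquet now applies .unwrap_or(0.0) to zero_gamma_level, gamma_wall and put_wall (see "Known nullable column mismatch" above)
  • If the mismatch still appears, the file was likely written by a build that predates the fix; check the image version that produced it
  • During parallel observation each implementation normally reads only its own archive, so this should not block either daemon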

IV Rank computation slow

  • Rust IV Rank calls crate::ivrank::run_daily which computes over a 252-day lookback window
  • For 712 symbols the full EOD IV Rank pass takes ~2-5 minutes (Python baseline)
  • If slower, check parquet read performance on the Synology RAID volume

UW Flow not implemented

  • Intentional: Python has had UW Flow disabled since 2026-03-26 (subscription cancelled)
  • Rust lt-ldas does not implement UW Flow collection
  • If UW Flow is needed in the future, implement as a separate subcommand or service

Sprint G2 Follow-up Items

Tracked for after the parallel observation window / cutover:

  1. Nullable column fix (DONE 2026-04-10, kept here for traceability): write_gex_summary_parquet now calls .unwrap_or(0.0) on the 3 Option fields (zero_gamma_level, gamma_wall, put_wall) so the Python reader sees non-nullable float64 columns.
  2. lt-ldas eod-cron subcommand: a built-in scheduler that blocks until next 17:15 ET and runs eod, then exits. Avoids external cron dependency.
  3. lt-ldas-parity Rust binary: parity comparison tool for file counts, schema, and GEX regime distribution during the observation window. Mirrors the scope of lt-wft-parity but for LDAS parquet archives.
  4. Intraday WSS streaming (optional): Rust could use Polygon WebSocket instead of REST polling for lower latency, but Python uses REST and this has no cutover impact.