LT Quote Collector Runbook

CRITICAL: lt-quote-collector runs in parallel with Python aegis-quote-collector during the G3 migration observation phase. The two collectors write to DIFFERENT output directories and listen on DIFFERENT status API ports to avoid race conditions and port binding conflicts. Do NOT remove the --output, --status-api-port, --heartbeat-path, --no-option-cache-path overrides in docker-compose.lt-quote-collector.yml until the G3 Day 13 cutover.

What It Is

lt-quote-collector is the Rust-native daemon that collects quotes for BT fill_ratio calibration. It replaces the Python scripts/run_quote_collector.py, which runs in the aegis-quote-collector container. It is the G3 phase of the Python → Rust trading infrastructure migration.

The daemon loop:

  1. Load universe tier map from reports/universe_tiers.csv
  2. For each cycle (rate-limited by per-API-call throttle):
     • Check market hours, switch between rpm_market and rpm_offmarket RPM
     • For each tier symbol:
        • Skip if in no-option cache (TTL 30 days)
        • Resolve Saxo UIC via existing rest_client
        • Fetch option chain for ATM put selection (DTE 10-35)
        • Select nearest-strike + nearest-expiry ATM put
        • Fetch leg quote (bid/ask/iv/sizes)
        • Append to JSONL output
  3. At cycle end, prune expired no-option cache entries and save
  4. Update heartbeat file every 30s (separate tokio task)
  5. Serve status API on configurable port (GET /api/status)
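The ATM put selection step in the loop above can be illustrated with a short sketch. This is Python for readability, not the daemon's Rust code. The `PutContract` type, the `select_atm_put` name, and the tie-breaking order (nearest expiry inside the DTE window first, then the strike closest to spot) are assumptions made for illustration; the runbook only states "nearest-strike + nearest-expiry" and the 10-35 DTE window.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PutContract:
    """Hypothetical flattened view of one put in an option chain."""
    uic: int
    dte: int        # days to expiry
    strike: float


def select_atm_put(
    chain: Sequence[PutContract],
    spot: float,
    min_dte: int = 10,
    max_dte: int = 35,
) -> Optional[PutContract]:
    """Pick the nearest expiry inside the DTE window, then the strike closest to spot.

    Returns None when no contract falls inside the window.
    """
    eligible = [c for c in chain if min_dte <= c.dte <= max_dte]
    if not eligible:
        return None
    # Sort key: earliest expiry first, then smallest distance from spot.
    return min(eligible, key=lambda c: (c.dte, abs(c.strike - spot)))


chain = [
    PutContract(uic=1, dte=5, strike=180.0),    # below the DTE window
    PutContract(uic=2, dte=20, strike=175.0),
    PutContract(uic=3, dte=20, strike=180.0),
    PutContract(uic=4, dte=27, strike=181.0),
    PutContract(uic=5, dte=60, strike=180.0),   # above the DTE window
]
picked = select_atm_put(chain, spot=181.2)
print(picked)
```

If the real selection weighs strike distance before expiry, only the order of the tuple in the sort key changes.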

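Critical design: throttling is applied per API call, not per symbol. A single symbol can fan out into several calls (for example the option chain fetch and the leg quote fetch), so per-symbol throttle granularity allowed bursts. That was the cause of the 2026-03-21 Python quote collector Docker hang, and per-call throttling prevents a recurrence.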
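To make the difference concrete, here is a minimal sketch of a per-call throttle, written in Python rather than the daemon's Rust. The class name `PerCallThrottle`, the injectable `clock`/`sleep` parameters, and the fixed-interval pacing strategy are assumptions for illustration only; the actual implementation may pace calls differently.

```python
import time
from typing import Callable


class PerCallThrottle:
    """Spaces out individual API calls so the rate never exceeds `rpm`.

    Hypothetical illustration: every outbound request calls wait() first,
    so a symbol that needs three requests pays for three slots.
    """

    def __init__(
        self,
        rpm: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.min_interval = 60.0 / rpm
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # The next slot opens one interval after this call goes out.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay


class FakeClock:
    """Simulated time so the demo finishes instantly."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


fake = FakeClock()
throttle = PerCallThrottle(rpm=15.0, clock=fake.now, sleep=fake.sleep)
for _symbol in ("AAPL", "MSFT"):
    for _call in ("resolve_uic", "option_chain", "leg_quote"):
        throttle.wait()  # one slot per API call, not per symbol
print(fake.t)
```

With a per-symbol throttle the same six calls would consume only two slots, so the three calls belonging to one symbol could go out back to back.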

Prerequisites

Central Synology .env at /volume1/aegis/.env MUST provide:

  • POLYGON_API_KEY — Polygon REST API key (for underlying spot price lookup)
  • SAXOBANK_APP_KEY_LIVE — Saxo OAuth client ID for live environment
  • SAXOBANK_APP_SECRET_LIVE — Saxo OAuth client secret for live environment

Paths that MUST exist on Synology:

  • /volume1/aegis/.env (must contain the three vars above)
  • /volume1/aegis/tokens/saxobank_tokens_live.json (managed by token-keeper, read-only mount)
  • /volume1/aegis/quote_samples_rust/ (parallel observation output dir, created by deploy workflow)
  • /volume1/aegis/repo/aegis_v3/configs/lt_quote_collector.yaml
  • /volume1/aegis/repo/aegis_v3/reports/universe_tiers.csv

Ports:

  • 8093: Python aegis-quote-collector status API (do NOT reuse during observation)
  • 8094: Rust lt-quote-collector status API (parallel observation)

How To Start

Preferred: trigger the dedicated workflow.

gh workflow run deploy-lt-quote-collector.yml

Manual (operator on Synology, emergency only):

cd /volume1/aegis/repo/aegis_v3/lt-rust-docker
sudo /usr/local/bin/docker compose \
    --env-file /volume1/aegis/.env \
    -f docker-compose.lt-quote-collector.yml \
    up -d --build aegis-lt-quote-collector

How To Stop

cd /volume1/aegis/repo/aegis_v3/lt-rust-docker
sudo /usr/local/bin/docker compose \
    -f docker-compose.lt-quote-collector.yml stop aegis-lt-quote-collector

Delete the container entirely:

cd /volume1/aegis/repo/aegis_v3/lt-rust-docker
sudo /usr/local/bin/docker compose \
    -f docker-compose.lt-quote-collector.yml rm -f aegis-lt-quote-collector

How To Verify It Is Working

Container status (read-only SSH OK):

ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker ps --format 'table {{.Names}}\t{{.Status}}' | grep aegis-lt-quote-collector"

Status API (from Synology or via port-forward):

ssh fukutani.ryo@192.168.42.252 "curl -s http://localhost:8094/api/status | jq"

Expected JSON schema:

{
  "status": "collecting",
  "cycle": 42,
  "rpm": 15.0,
  "market_hours": true,
  "total_symbols": 956,
  "no_option_symbols": 15,
  "api_calls_this_cycle": 145,
  "timestamp": "2026-04-10T10:48:00-04:00",
  "uptime_sec": 3600.0
}
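A quick way to confirm that a response matches the schema above is to check field presence and types. The sketch below is a hypothetical operator-side helper (the names `EXPECTED_STATUS_FIELDS` and `status_problems` are not part of the collector); the field list is copied from the example payload, and the types are inferred from its values.

```python
import json
from typing import Any, Dict, List

# Field names and types inferred from the example payload.
EXPECTED_STATUS_FIELDS = {
    "status": str,
    "cycle": int,
    "rpm": (int, float),
    "market_hours": bool,
    "total_symbols": int,
    "no_option_symbols": int,
    "api_calls_this_cycle": int,
    "timestamp": str,
    "uptime_sec": (int, float),
}


def status_problems(payload: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable schema problems (empty if none)."""
    problems = []
    for name, expected in EXPECTED_STATUS_FIELDS.items():
        if name not in payload:
            problems.append(f"missing field: {name}")
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        bool_mismatch = isinstance(value, bool) and expected is not bool
        if bool_mismatch or not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


raw = """
{"status": "collecting", "cycle": 42, "rpm": 15.0, "market_hours": true,
 "total_symbols": 956, "no_option_symbols": 15, "api_calls_this_cycle": 145,
 "timestamp": "2026-04-10T10:48:00-04:00", "uptime_sec": 3600.0}
"""
status = json.loads(raw)
print(status_problems(status))
```

In practice `raw` would be the body returned by the curl command above.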

Heartbeat file mtime (should update every 30s):

ssh fukutani.ryo@192.168.42.252 "stat -c '%Y %n' /volume1/aegis/quote_samples_rust/collector_heartbeat.json"
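The mtime check can also be automated. The following sketch assumes it runs somewhere the heartbeat file is readable (for example on the Synology host); the helper names are hypothetical, and the 60-second threshold is taken from the observation acceptance criteria in this runbook rather than from the collector's configuration.

```python
import os
import tempfile
import time
from typing import Optional

MAX_GAP_SEC = 60.0  # acceptance criteria: no gaps > 60s during market hours


def heartbeat_age_sec(path: str, now: Optional[float] = None) -> float:
    """Seconds elapsed since the heartbeat file was last modified."""
    current = time.time() if now is None else now
    return current - os.stat(path).st_mtime


def is_stale(path: str, now: Optional[float] = None,
             max_gap_sec: float = MAX_GAP_SEC) -> bool:
    """True when the heartbeat has not been touched within max_gap_sec."""
    return heartbeat_age_sec(path, now) > max_gap_sec


# Demo against a throwaway file with a forced mtime.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "collector_heartbeat.json")
    with open(demo_path, "w") as fh:
        fh.write("{}")
    written_at = 1_700_000_000.0
    os.utime(demo_path, (written_at, written_at))
    stale_at_45s = is_stale(demo_path, now=written_at + 45)
    stale_at_61s = is_stale(demo_path, now=written_at + 61)

print(stale_at_45s, stale_at_61s)
```

Against the real file the call would be `is_stale("/volume1/aegis/quote_samples_rust/collector_heartbeat.json")`.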

JSONL output tail:

ssh fukutani.ryo@192.168.42.252 "tail -n 5 /volume1/aegis/quote_samples_rust/live_quote_samples.jsonl | jq -c"

Expected per-record schema (matches Python for cutover compatibility):

{"timestamp":"...","symbol":"AAPL","uic":12345,"dte":20,"strike":180.0,"bid":0.45,"ask":0.55,"mid":0.50,"bid_size":5,"ask_size":10,"iv":0.32}
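Records can be sanity-checked line by line before they are used for parity comparison. The sketch below is a hypothetical helper, not part of either collector. It assumes every key in the example record is always present and numeric, and that `mid` is the arithmetic mean of `bid` and `ask` (which holds for the example values but is not stated explicitly in this runbook).

```python
import json
import math
from typing import Any, Dict, List

REQUIRED_KEYS = (
    "timestamp", "symbol", "uic", "dte", "strike",
    "bid", "ask", "mid", "bid_size", "ask_size", "iv",
)


def record_problems(line: str) -> List[str]:
    """Return schema and consistency problems for one JSONL line."""
    record: Dict[str, Any] = json.loads(line)
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in record]
    if problems:
        return problems
    if record["bid"] > record["ask"]:
        problems.append("crossed quote: bid > ask")
    # Assumption: mid is the simple average of bid and ask.
    expected_mid = (record["bid"] + record["ask"]) / 2
    if not math.isclose(record["mid"], expected_mid, abs_tol=1e-6):
        problems.append("mid does not equal (bid + ask) / 2")
    return problems


sample_line = (
    '{"timestamp":"2026-04-10T10:48:00-04:00","symbol":"AAPL","uic":12345,'
    '"dte":20,"strike":180.0,"bid":0.45,"ask":0.55,"mid":0.50,'
    '"bid_size":5,"ask_size":10,"iv":0.32}'
)
print(record_problems(sample_line))
```

The timestamp in `sample_line` is illustrative only.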

No-option cache inspection:

ssh fukutani.ryo@192.168.42.252 "cat /volume1/aegis/quote_samples_rust/collector_no_option_cache.json | jq '.entries | length'"

Parallel Observation (G3 Day 10-12)

During parallel observation, BOTH collectors run simultaneously:

  • Python aegis-quote-collector writes /volume1/aegis/quote_samples/live_quote_samples.jsonl and serves :8093
  • Rust aegis-lt-quote-collector writes /volume1/aegis/quote_samples_rust/live_quote_samples.jsonl and serves :8094

Observation acceptance criteria

After 3-5 days parallel running, both collectors should exhibit:

  • RPM implied by api_calls_this_cycle / cycle_duration_sec * 60 stays within ±1 of the configured target
  • JSONL record counts within ±5% per symbol over 1-hour windows
  • No-option cache symbol sets ≥ 95% overlap between Python and Rust
  • Heartbeat file mtime updates every 30s (no gaps > 60s during market hours)
  • Zero container restarts (restart count = 0)
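The first three criteria are simple arithmetic and can be expressed as small checks. This is a sketch with hypothetical function names. Two details are assumptions because the criteria leave them open: the ±5% is measured relative to the Python collector's count, and "overlap" is computed as intersection over union.

```python
from typing import Dict, Set


def implied_rpm(api_calls_this_cycle: int, cycle_duration_sec: float) -> float:
    """RPM implied by one cycle: api_calls_this_cycle / cycle_duration_sec * 60."""
    return api_calls_this_cycle / cycle_duration_sec * 60.0


def rpm_on_target(implied: float, target: float, tolerance: float = 1.0) -> bool:
    """True when the implied RPM is within ±tolerance of the configured target."""
    return abs(implied - target) <= tolerance


def count_parity(python_counts: Dict[str, int], rust_counts: Dict[str, int],
                 max_pct: float = 5.0) -> Dict[str, bool]:
    """Per-symbol pass/fail for record counts, relative to the Python count."""
    result = {}
    for symbol in sorted(set(python_counts) | set(rust_counts)):
        py = python_counts.get(symbol, 0)
        rs = rust_counts.get(symbol, 0)
        if py == 0:
            result[symbol] = rs == 0
        else:
            result[symbol] = abs(rs - py) / py * 100.0 <= max_pct
    return result


def overlap_ratio(python_set: Set[str], rust_set: Set[str]) -> float:
    """Intersection over union of the two no-option symbol sets."""
    union = python_set | rust_set
    if not union:
        return 1.0
    return len(python_set & rust_set) / len(union)


rpm = implied_rpm(145, 580.0)  # 145 calls in a 580 s cycle
parity = count_parity({"AAPL": 100, "MSFT": 40}, {"AAPL": 104, "MSFT": 37})
overlap = overlap_ratio({"A", "B", "C", "D"}, {"A", "B", "C", "E"})
print(rpm, rpm_on_target(rpm, 15.0), parity, overlap)
```

In the demo values, MSFT fails parity (a 7.5% difference) and the overlap of 0.6 would fail the 95% criterion.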

Critical parity metric: RPM throttle accuracy

The root cause of the 2026-03-21 incident was that per-symbol throttling allowed API bursts. Verify that this burst behavior does NOT recur in the Rust version:

# Sample api_calls_this_cycle every minute for 30 min during market hours
for i in {1..30}; do
  ssh fukutani.ryo@192.168.42.252 \
    "curl -s http://localhost:8094/api/status | jq '.api_calls_this_cycle, .cycle'"
  sleep 60
done
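The samples collected by the loop above can be turned into a per-minute call rate and compared with the configured target. The sketch below is a hypothetical post-processing helper. It assumes `api_calls_this_cycle` resets to zero at the start of each cycle (suggested by its name but not confirmed here), so it only compares consecutive samples that share the same `cycle` value.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    t_sec: float                # when the sample was taken
    cycle: int                  # .cycle from /api/status
    api_calls_this_cycle: int   # .api_calls_this_cycle from /api/status


def per_minute_rates(samples: List[Sample]) -> List[float]:
    """Calls per minute between consecutive samples of the same cycle."""
    rates = []
    for prev, cur in zip(samples, samples[1:]):
        if cur.cycle != prev.cycle:
            continue  # counter is assumed to reset between cycles
        dt = cur.t_sec - prev.t_sec
        if dt <= 0:
            continue
        delta = cur.api_calls_this_cycle - prev.api_calls_this_cycle
        rates.append(delta / dt * 60.0)
    return rates


def bursts(rates: List[float], target_rpm: float, tolerance: float = 1.0) -> List[float]:
    """Rates that exceed the target by more than the allowed tolerance."""
    return [r for r in rates if r > target_rpm + tolerance]


# Illustrative samples, one per minute.
samples = [
    Sample(0, 42, 10),
    Sample(60, 42, 25),
    Sample(120, 42, 40),
    Sample(180, 43, 3),    # new cycle: pair with previous sample is skipped
    Sample(240, 43, 33),   # 30 calls in one minute
]
rates = per_minute_rates(samples)
print(rates, bursts(rates, target_rpm=15.0))
```

Any non-empty result from `bursts` corresponds to the "Rate limit burst detected" case in Troubleshooting.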

Cutover Path (G3 Day 13, NOT active yet)

Switching from parallel observation to Rust-as-source-of-truth requires:

  1. Pre-cutover verification
     • 3-5 days clean parallel observation (see criteria above)
     • Zero API throttle violations
     • WORK_LOG entry with explicit cutover approval

  2. Cutover sequence

    # a. Stop Python quote collector
    ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker stop aegis-quote-collector"

    # b. Edit docker-compose.lt-quote-collector.yml to use canonical paths
    #    and port 8093:
    #      --output=/data/quote_samples/live_quote_samples.jsonl
    #      --status-api-port=8093
    #      --heartbeat-path=/data/quote_samples/collector_heartbeat.json
    #      --no-option-cache-path=/data/quote_samples/collector_no_option_cache.json
    #    Also update the ports section: "8093:8093"
    #    And the volume section to mount /volume1/aegis/quote_samples:/data/quote_samples:rw
    #    Commit + push + wait for GHA deploy to succeed.

    # c. After cutover deploy, Rust writes the canonical paths and serves :8093.

    # d. Keep Python container image present for 1 week for rollback safety.

  3. Post-cutover monitoring (24h on-call)
     • Watch heartbeat mtime (should update every 30s without gaps)
     • Watch JSONL record count rate (should match pre-cutover baseline)
     • Check API throttle metrics via /api/status

Rollback

During parallel observation

Safe rollback — Rust collector can be stopped independently without affecting Python or any downstream container:

ssh fukutani.ryo@192.168.42.252 \
  "cd /volume1/aegis/repo/aegis_v3/lt-rust-docker && \
   sudo /usr/local/bin/docker compose \
     -f docker-compose.lt-quote-collector.yml stop aegis-lt-quote-collector"

After cutover

Post-cutover rollback requires restarting Python collector:

# a. Stop Rust collector
ssh fukutani.ryo@192.168.42.252 \
  "cd /volume1/aegis/repo/aegis_v3/lt-rust-docker && \
   sudo /usr/local/bin/docker compose \
     -f docker-compose.lt-quote-collector.yml stop aegis-lt-quote-collector"

# b. Start Python collector back up (the stopped container must still exist;
#    docker start does not recreate a container from its image)
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker start aegis-quote-collector"

# c. Verify Python collector resumed JSONL writes
ssh fukutani.ryo@192.168.42.252 \
  "sudo /usr/local/bin/docker logs aegis-quote-collector --tail 20 2>&1 | grep -iE 'rpm|cycle|sampling'"

Total rollback time: < 1 minute (if Python image is still present).

Troubleshooting

"Token cache missing" at startup

  • Expected if token-keeper container is stopped or token cache file is missing
  • The Rust collector consumes the cache (read-only) — it does NOT manage tokens itself
  • Fix: ensure aegis-token-keeper-live or aegis-lt-token-keeper is running

Status API returns 404 or connection refused

  • Check the port matches: 8094 for parallel observation, 8093 after cutover
  • Verify container is running and port is published
  • Check --status-api-port CLI arg in the compose file

Heartbeat mtime not updating

  • Container may be hung in the collection cycle
  • Check container logs for ERROR-level messages
  • Check api_calls_this_cycle via status API — if increasing, collection is alive

Rate limit burst detected (api_calls_this_cycle > expected)

  • CRITICAL — may indicate throttle regression
  • Immediately roll back per the parallel observation rollback procedure above
  • File a Sprint 1d issue to debug the throttle implementation

Unknown Saxo UIC for a symbol

  • Normal for non-optionable or delisted symbols
  • They get cached in collector_no_option_cache.json with 30-day TTL
  • Pruning happens automatically at cycle end
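The TTL-based pruning can be sketched as follows. The on-disk layout of collector_no_option_cache.json is not documented in this runbook beyond a top-level `entries` field, so the structure used here (a mapping from symbol to an object with an ISO-8601 `cached_at` timestamp) is purely an assumption for illustration, as is the function name.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict

TTL = timedelta(days=30)  # no-option cache TTL stated in this runbook


def prune_expired(entries: Dict[str, Dict[str, Any]], now: datetime,
                  ttl: timedelta = TTL) -> Dict[str, Dict[str, Any]]:
    """Keep only entries whose (hypothetical) cached_at is younger than ttl."""
    return {
        symbol: meta
        for symbol, meta in entries.items()
        if now - datetime.fromisoformat(meta["cached_at"]) < ttl
    }


# Hypothetical cache content.
entries = {
    "OLDCO": {"cached_at": "2026-03-01T00:00:00+00:00"},  # 40 days old
    "NEWCO": {"cached_at": "2026-04-01T00:00:00+00:00"},  # 9 days old
}
now = datetime(2026, 4, 10, tzinfo=timezone.utc)
kept = prune_expired(entries, now)
print(sorted(kept))
```

After pruning, a symbol such as OLDCO would be looked up again on the next cycle and re-cached if it still has no options.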

JSONL file growth unbounded

  • Expected — this is append-only historical data for BT calibration
  • Rotation/compression is a separate operational concern, not handled by the collector
  • Recommend: monthly archive rotation via a separate cron or systemd timer