2026-04-13 Monday Market Open: Rust Parallel Observation Smoke Test
Purpose: on Monday 2026-04-13 09:30 ET (13:30 UTC), verify that the 4 Rust parallel observation containers start writing clean artifacts, without repeating the 2026-04-10 Friday failure where 186/187 parity records were lost to the Polygon float volume parse bug (fix commit 35011d07e).
Created: 2026-04-10 (Claude).
Runs: 2026-04-13 09:30 ET onward and daily through Day 10 cutover judgment.
Context
- 2026-04-10 Fri afternoon market session: 187 parity_log.jsonl records written, 100% hit the Polygon float volume parse error (get_spy_history failed: Polygon payload parse error: invalid type: floating point 68441672.317886, expected i64). Regime was null for every scan, entries_proposed=0, exits_proposed=0. The clean parity observation is effectively lost for Day 1.
- Polygon volume-as-float fix was merged at 35011d07e on 2026-04-10 19:38 UTC (after the market had already been running with the broken code for 6+ hours).
- lt-scan-cycle container restarted at 2026-04-10 21:38:18 UTC with args --scenario-multi-yaml=/config/scenarios/lt_wft.yaml,--comparison-log-path=/data/state/lt_rust/lt_wft_comparison.jsonl,--comparison-interval-scans=20,--wft-batch-size=3,--dry-run,--scan-interval=45. That is the version of the binary that will see first market open on Monday.
- Market was closed all weekend (2026-04-11 Sat, 2026-04-12 Sun).
- Monday 2026-04-13 09:30 ET = 13:30 UTC is the first clean multi-scenario observation window.
Pre-open checks (run at 09:00-09:29 ET Monday)
1. Container health (all 4 must be "Up")
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker ps --format 'table {{.Names}}\t{{.Status}}' | grep -E 'aegis-lt-(scan-cycle|ldas-intraday|token-keeper|quote-collector)[[:space:]]'"
Expected — all 4 "Up" with reasonable uptime (days):
aegis-lt-scan-cycle Up 2 days (or similar)
aegis-lt-ldas-intraday Up 2 days
aegis-lt-token-keeper Up 2 days
aegis-lt-quote-collector Up 2 days
If any container is Restarting, Exited, or missing → ABORT the
smoke test and debug before market open. Re-deploy the affected
service via gh workflow run deploy-lt-<name>.yml (one of scan-cycle,
ldas, token-keeper, quote-collector) and wait for completion.
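The "all 4 must be Up" rule can also be checked mechanically. The following is a minimal Python sketch, not part of the existing tooling: it assumes the text comes from docker ps --format '{{.Names}}\t{{.Status}}' (without the table keyword, so fields are separated by a real tab), and the function name unhealthy_containers is hypothetical.

```python
from __future__ import annotations

# Container names taken from the expected output above.
REQUIRED = (
    "aegis-lt-scan-cycle",
    "aegis-lt-ldas-intraday",
    "aegis-lt-token-keeper",
    "aegis-lt-quote-collector",
)


def unhealthy_containers(ps_output: str) -> dict[str, str]:
    """Map each required container that is not "Up" to the reason."""
    seen = {}
    for line in ps_output.splitlines():
        name, _, status = line.partition("\t")
        seen[name.strip()] = status.strip()

    problems = {}
    for name in REQUIRED:
        status = seen.get(name)
        if status is None:
            problems[name] = "missing"
        elif not status.startswith("Up"):
            # Covers "Restarting (...)", "Exited (...)", "Created", etc.
            problems[name] = status
    return problems
```

An empty result means the check passes; any entry is an ABORT condition for the smoke test.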
2. lt-scan-cycle is in multi-scenario mode
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker inspect aegis-lt-scan-cycle --format '{{json .Args}}'"
Expected — must contain --scenario-multi-yaml (not --config):
["--scenario-multi-yaml=/config/scenarios/lt_wft.yaml",
"--state-dir=/data/state/lt_rust/lt_scan_cycle",
"--parity-log=/data/state/lt_rust/parity_log.jsonl",
"--comparison-log-path=/data/state/lt_rust/lt_wft_comparison.jsonl",
"--comparison-interval-scans=20",
"--wft-batch-size=3",
"--dry-run",
"--scan-interval=45"]
If the args start with --config=/config/scenarios/lt_rc.yaml
instead → single-scenario mode is still live and the multi-scenario
promote commit never deployed. Dispatch
deploy-lt-scan-cycle.yml with force_recreate=true and wait.
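For a scripted version of this check, the JSON array printed by docker inspect can be classified directly. This is a sketch under the assumption that the output is a JSON list of strings as shown above; scan_cycle_mode is an illustrative name.

```python
import json


def scan_cycle_mode(args_json: str) -> str:
    """Classify the container args as 'multi', 'single', or 'unknown'."""
    args = json.loads(args_json)
    # Keep only the flag name, dropping any "=value" suffix.
    flags = {arg.split("=", 1)[0] for arg in args}
    if "--scenario-multi-yaml" in flags:
        return "multi"
    if "--config" in flags:
        return "single"
    return "unknown"
```

Only "multi" passes; "single" corresponds to the lt_rc.yaml case described above.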
3. Saxo token cache is fresh
ssh fukutani.ryo@192.168.42.252 "stat -c '%Y %n' /volume1/aegis/tokens/saxobank_tokens_live.json /volume1/aegis/tokens/saxobank_tokens_live_rust.json"
Both files must have mtime within the last 15 min (both the Python canonical and the Rust parallel cache should have refreshed recently). If the Rust cache is > 30 min old → Rust token-keeper is broken, debug before market open.
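Because stat -c '%Y %n' prints epoch seconds, the age has to be computed by hand. A small Python sketch of that arithmetic follows; it assumes one "epoch path" pair per line, and stale_token_files is a hypothetical helper, not an existing script.

```python
from __future__ import annotations


def stale_token_files(
    stat_output: str, now_epoch: float, max_age_min: float = 15.0
) -> dict[str, float]:
    """Return {path: age_in_minutes} for files older than max_age_min."""
    stale = {}
    for line in stat_output.splitlines():
        mtime, _, path = line.strip().partition(" ")
        if not path:
            continue  # skip blank or malformed lines
        age_min = (now_epoch - int(mtime)) / 60
        if age_min > max_age_min:
            stale[path] = round(age_min, 1)
    return stale
```

Pass time.time() from the same host as now_epoch so that clock skew between machines does not distort the ages; calling it again with max_age_min=30 gives the "token-keeper is broken" threshold.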
4. Polygon API key reachable
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker exec aegis-lt-scan-cycle sh -c 'test -n \"\$POLYGON_API_KEY\" && echo OK || echo MISSING'"
Must echo OK. If MISSING, the pt-docker/.env env-file load
broke.
5. Python aegis-wft is still running (non-interference)
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker ps --filter name=aegis-wft --filter status=running --format '{{.Names}} {{.Status}}'"
Must show aegis-wft Up N days (healthy). If missing → the
Rust parallel observation accidentally took out Python PT, which
is a CRITICAL failure.
Market-open checks (09:30-10:00 ET Monday)
6. First scan record: Polygon fix is live
Wait until 09:32 ET (~2 min after open) and inspect the tail of
parity_log.jsonl:
ssh fukutani.ryo@192.168.42.252 "jq -c 'select(.record_kind==\"scan\" and .scan_count > 186) | {scan_count, ts, regime, vix: .vix_value, errors, entries: (.entries_proposed|length), exits: (.exits_proposed|length)}' /volume1/aegis/wft_state/lt_rust/parity_log.jsonl | tail -5"
Pass criteria (the Polygon fix is working):
- regime is non-null ("NORMAL", "CAUTION", or "CRISIS")
- vix is a reasonable number (10-80)
- errors is empty or at most short-lived (not the Polygon
float volume error)
- scan_count is incrementing every ~45 seconds
Fail signal (Polygon bug resurfaced or fix didn't land):
- regime: null for every scan
- Same get_spy_history failed: Polygon payload parse error in
errors
If the fail signal appears, immediately check the container image git sha:
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker inspect aegis-lt-scan-cycle --format '{{.Image}}'"
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker inspect aegis-lt-scan-cycle --format '{{.Config.Labels}}'"
and verify the build post-dates commit 35011d07e. If the
container was built before the fix, re-dispatch
deploy-lt-scan-cycle.yml with force_recreate=true.
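The pass and fail criteria for a single scan record can be expressed as a small predicate. This is a sketch only: the field names regime, vix_value, and errors come from the jq filter above, errors is assumed to be a list of strings, and scan_problems is an illustrative name.

```python
from __future__ import annotations

POLYGON_ERROR = "Polygon payload parse error"
VALID_REGIMES = {"NORMAL", "CAUTION", "CRISIS"}


def scan_problems(record: dict) -> list[str]:
    """List the smoke-test criteria that one scan record violates."""
    problems = []
    if record.get("regime") not in VALID_REGIMES:
        problems.append("regime is null or unexpected")
    vix = record.get("vix_value")
    if not isinstance(vix, (int, float)) or not 10 <= vix <= 80:
        problems.append("vix outside 10-80")
    # Transient errors are tolerated; only the known parse bug is fatal.
    if any(POLYGON_ERROR in str(err) for err in record.get("errors") or []):
        problems.append("polygon parse error present")
    return problems
```

Feeding it each line of parity_log.jsonl parsed with json.loads gives a per-scan verdict; an empty list means the record is clean.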
7. Multi-scenario lanes both running
Look for the startup log from the most recent restart:
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker logs aegis-lt-scan-cycle --since 2h 2>&1 | grep -E 'Priority scenarios|WFT scenarios|Completed startup reconcile|scenarios_loaded' | head -20"
Expected (the Sprint 1a-1d runner is alive):
- Completed startup reconcile for priority scenarios applied_events=N scenarios=1
- Priority scenarios paused because market is closed (pre-open)
then transitions to running after open
- WFT scenarios paused because market is closed (pre-open)
then transitions to running after open
If you see only Priority scenarios logs but no WFT
scenarios logs → multi-scenario wiring is broken, WFT lane not
starting. Debug with
docker logs aegis-lt-scan-cycle --tail 200 2>&1 | grep -E 'error|panic|scenario'.
8. First comparison log record (after 20 WFT batches)
lt_wft_comparison.jsonl gets written every 20 WFT batches. With
--scan-interval=45 and --wft-batch-size=3, that is roughly
45 * 20 / 3 ≈ 300 seconds = 5 min between records. So expect the
first record between 09:35 and 09:40 ET.
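The interval estimate is plain arithmetic, shown here as a short Python sketch. It only reproduces the formula stated in this runbook (scan interval × 20 / batch size) and is not derived from the lt-scan-cycle source; the constant and function names are illustrative.

```python
from datetime import datetime, timedelta

SCAN_INTERVAL_S = 45      # --scan-interval
COMPARISON_INTERVAL = 20  # --comparison-interval-scans
WFT_BATCH_SIZE = 3        # --wft-batch-size


def seconds_between_records(
    scan_interval_s: int = SCAN_INTERVAL_S,
    comparison_interval: int = COMPARISON_INTERVAL,
    wft_batch_size: int = WFT_BATCH_SIZE,
) -> float:
    """Runbook estimate of the gap between comparison log records."""
    return scan_interval_s * comparison_interval / wft_batch_size


market_open = datetime(2026, 4, 13, 9, 30)  # naive ET wall-clock time
gap = seconds_between_records()
first_record = market_open + timedelta(seconds=gap)
print(gap, first_record.strftime("%H:%M"))  # 300.0 09:35
```

The computed 09:35 is the earliest expected time; the 09:35 to 09:40 window leaves room for startup delay.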
ssh fukutani.ryo@192.168.42.252 "ls -la /volume1/aegis/wft_state/lt_rust/lt_wft_comparison.jsonl 2>&1"
If the file does not yet exist at 09:35 ET → wait. If it still does not exist at 09:45 ET → multi-scenario comparison writer is broken.
Inspect the first record:
ssh fukutani.ryo@192.168.42.252 "jq '.' /volume1/aegis/wft_state/lt_rust/lt_wft_comparison.jsonl | head -60"
Expected:
- timestamp: ISO-8601 ET
- date: "2026-04-13"
- scan_count: 20 or similar
- scenarios: array of 11 entries (LT_RC + WFT_A..I + PT_XW)
- each scenario has label, equity, closed_trades,
win_rate_pct, etc. matching the Python
pt_wft_comparison.jsonl schema
If scenarios has fewer than 11 entries → lt_wft.yaml wrapper is
incomplete. Check
/volume1/aegis/repo/aegis_v3/configs/scenarios/lt_wft.yaml.
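A structural check of one comparison record could look like the sketch below. It assumes the record is a JSON object with the fields listed above (date, scenarios, and the per-scenario keys named in this section); comparison_record_problems is a hypothetical helper, and the "etc." fields of the schema are not checked.

```python
from __future__ import annotations

EXPECTED_SCENARIOS = 11  # LT_RC + WFT_A..I + PT_XW
REQUIRED_KEYS = ("label", "equity", "closed_trades", "win_rate_pct")


def comparison_record_problems(record: dict, expected_date: str) -> list[str]:
    """List structural problems in one lt_wft_comparison.jsonl record."""
    problems = []
    if record.get("date") != expected_date:
        problems.append(f"date is {record.get('date')!r}, expected {expected_date!r}")
    scenarios = record.get("scenarios") or []
    if len(scenarios) != EXPECTED_SCENARIOS:
        problems.append(f"{len(scenarios)} scenarios, expected {EXPECTED_SCENARIOS}")
    for index, scenario in enumerate(scenarios):
        missing = [key for key in REQUIRED_KEYS if key not in scenario]
        if missing:
            problems.append(f"scenario {index} missing {missing}")
    return problems
```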
9. First per-scenario parity diff with lt-wft-parity
As soon as both Python pt_wft_comparison.jsonl AND Rust
lt_wft_comparison.jsonl have at least one record for today,
run lt-wft-parity via a one-shot docker run --rm (NOT via
docker exec, because the running aegis-lt-scan-cycle
container only mounts wft_state/lt_rust/ — it can't see the
Python comparison log at wft_state/pt_wft_comparison.jsonl):
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker run --rm --entrypoint lt-wft-parity \
-v /volume1/aegis/wft_state/pt_wft_comparison.jsonl:/data/pt_wft.jsonl:ro \
-v /volume1/aegis/wft_state/lt_rust/lt_wft_comparison.jsonl:/data/lt_wft.jsonl:ro \
lt-rust-docker-aegis-lt-scan-cycle \
--python-log /data/pt_wft.jsonl \
--rust-log /data/lt_wft.jsonl \
--date 2026-04-13 \
--tolerance-equity 50 \
--tolerance-trade-count 0"
The --rm flag ensures the one-shot container is cleaned up
immediately. The --entrypoint lt-wft-parity override is
REQUIRED — the lt-rust-docker image has a wrapper entrypoint
(lt-rust-entrypoint.sh) that defaults to launching
lt-shadow, so without the override the one-shot container
tries to start the shadow daemon and crashes before reaching
lt-wft-parity. The two file-level bind mounts (:ro) expose
just the specific JSONL files without touching the directory
structure the parallel-observation scan-cycle container relies
on.
Verified working: 2026-04-10 23:17 UTC smoke-tested this
exact command against --date 2026-04-13. Result: both sides
correctly reported "NO RECORD FOR 2026-04-13" (expected —
Friday's data exists for 2026-04-10, not 2026-04-13) and the
binary exited with Overall: PASS (zero gated failures on an
empty-record-set input). This confirms the docker run pattern,
the mounts, and the binary path all work end-to-end.
Expected on Monday morning: both sides start from equity
~$32,000 with 0 closed_trades, so the diff should be trivially
within tolerance. Overall: PASS expected.
Fail signal: Python "NO RECORD FOR 2026-04-13" or Rust
"NO RECORD FOR 2026-04-13" — means the comparison log writer
for that side isn't running. If Python is missing →
aegis-wft container is broken. If Rust is missing → see check 8.
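To make the two tolerance flags concrete, here is an illustrative Python sketch of a per-scenario comparison. It is only a plausible reading of --tolerance-equity 50 and --tolerance-trade-count 0; the actual gating logic inside lt-wft-parity may differ, and the function name and input shape (dicts keyed by scenario label) are assumptions.

```python
from __future__ import annotations


def parity_failures(
    python_side: dict[str, dict],
    rust_side: dict[str, dict],
    tolerance_equity: float = 50.0,
    tolerance_trade_count: int = 0,
) -> dict[str, list[str]]:
    """Compare scenarios by label; return {label: [reasons]} for failures."""
    failures = {}
    for label in sorted(set(python_side) | set(rust_side)):
        py, rs = python_side.get(label), rust_side.get(label)
        if py is None or rs is None:
            failures[label] = ["missing on one side"]
            continue
        reasons = []
        if abs(py["equity"] - rs["equity"]) > tolerance_equity:
            reasons.append("equity")
        if abs(py["closed_trades"] - rs["closed_trades"]) > tolerance_trade_count:
            reasons.append("closed_trades")
        if reasons:
            failures[label] = reasons
    return failures
```

With both sides at roughly $32,000 and 0 closed trades, as expected on Monday morning, this returns an empty dict.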
Note: lt-wft-parity is built into the lt-rust-docker-aegis-lt-scan-cycle container image as of commit d07c99bd5 (2026-04-10) and verified working on 2026-04-10 23:06 UTC after the --remove-orphans recovery. If the image does not have the binary, re-dispatch deploy-lt-scan-cycle.yml with force_recreate=true.
Note on docker run vs docker exec: the alternative would be to bind-mount the Python path into the running container, but that requires a compose file edit + force_recreate which touches the live observation loop. The one-shot docker run --rm pattern is fully out-of-band and does not risk the live container state.
10. lt-ldas-intraday writes first 15-min cycle
The intraday collector runs every 15 minutes between 09:35 and 16:00 ET. The first cycle of the day is 09:35 ET.
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker logs aegis-lt-ldas-intraday --since 1h 2>&1 | grep -E 'cycle|snapshot|Inside intraday window' | tail -10"
Expected at ~09:35 ET:
- Inside intraday window, running cycle
- cycle start + per-symbol snapshot logs
- cycle complete with N parquet files written
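Verify the Rust archive received new parquet files. The -newer test needs an existing reference file on the host, so this assumes /tmp/monday_anchor (the same anchor the 10:00 ET checkpoint uses) was created before the open, for example with touch /tmp/monday_anchor. If the anchor is missing, find only prints an error and the count is meaningless:
ssh fukutani.ryo@192.168.42.252 "find /volume1/aegis/live_data_archive_rust/options -name '13*_pt_*.parquet' -newer /tmp/monday_anchor | wc -l"
Should be > 0 within 5 min after 09:35 ET.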
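For reference when reading the logs, the expected cycle times can be enumerated. This sketch assumes the cycles are aligned to the 09:35 ET start and spaced exactly 15 minutes apart, which the runbook implies but does not state outright; intraday_cycle_times is an illustrative name.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def intraday_cycle_times(
    first: str = "09:35", last_allowed: str = "16:00", step_min: int = 15
) -> list[str]:
    """Enumerate HH:MM cycle start times (ET) within the intraday window."""
    fmt = "%H:%M"
    current = datetime.strptime(first, fmt)
    end = datetime.strptime(last_allowed, fmt)
    times = []
    while current <= end:
        times.append(current.strftime(fmt))
        current += timedelta(minutes=step_min)
    return times


cycles = intraday_cycle_times()
print(len(cycles), cycles[0], cycles[-1])  # 26 09:35 15:50
```

Under these assumptions a full session has 26 cycles, which gives a rough upper bound when counting "cycle complete" log lines at end of day.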
11. lt-quote-collector resumes market-hours throttle
Pre-open the collector should be in off-market mode (rpm=60).
After 09:30 ET it switches to market-hours mode (rpm=15 per the
2026-03-21 incident prevention — slower during market hours).
ssh fukutani.ryo@192.168.42.252 "cat /volume1/aegis/quote_samples_rust/collector_heartbeat.json 2>/dev/null | python3 -m json.tool"
Expected at 09:31 ET:
- status: "collecting"
- market_hours: true
- rpm: 15.0 (not 60)
- api_calls_this_cycle: incrementing
If market_hours: false at 09:35 ET or later → the clock
detection is broken.
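The heartbeat expectations can be checked against the parsed JSON as follows. This is a sketch assuming the heartbeat file contains the status, market_hours, and rpm fields listed above; heartbeat_problems is a hypothetical helper.

```python
from __future__ import annotations


def heartbeat_problems(heartbeat: dict, market_open: bool) -> list[str]:
    """Compare a collector heartbeat with the expected throttle mode."""
    # rpm=15 during market hours, rpm=60 off-market (per this runbook).
    expected_rpm = 15.0 if market_open else 60.0
    problems = []
    if heartbeat.get("market_hours") is not market_open:
        problems.append(f"market_hours is {heartbeat.get('market_hours')!r}")
    if heartbeat.get("rpm") != expected_rpm:
        problems.append(f"rpm is {heartbeat.get('rpm')!r}, expected {expected_rpm}")
    if market_open and heartbeat.get("status") != "collecting":
        problems.append(f"status is {heartbeat.get('status')!r}")
    return problems
```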
Checkpoint (10:00 ET / 14:00 UTC, 30 min after open)
Run the daily summary for the observation window:
ssh fukutani.ryo@192.168.42.252 "
echo '=== parity_log scan count ==='
jq -c 'select(.record_kind==\"scan\" and (.ts | startswith(\"2026-04-13\")))' /volume1/aegis/wft_state/lt_rust/parity_log.jsonl | wc -l
echo '=== parity_log error rate ==='
jq -c 'select(.record_kind==\"scan\" and (.ts | startswith(\"2026-04-13\"))) | .errors | length' /volume1/aegis/wft_state/lt_rust/parity_log.jsonl | awk '{s+=\$1; n++} END {printf \"%.1f%% (%d errors in %d scans)\n\", s/n*100, s, n}'
echo '=== comparison log record count ==='
wc -l /volume1/aegis/wft_state/lt_rust/lt_wft_comparison.jsonl
echo '=== ldas intraday files written today ==='
find /volume1/aegis/live_data_archive_rust/options -name '13*_pt_*.parquet' -newer /tmp/monday_anchor 2>/dev/null | wc -l
echo '=== quote collector cycles ==='
jq '.cycle' /volume1/aegis/quote_samples_rust/collector_heartbeat.json
"
Pass criteria (Day 1 clean baseline):
- scan_count ≥ 30 (30 min / 45s/scan ≈ 40)
- error rate < 5%
- comparison log records ≥ 5 (first record at 09:35-09:40, one every ~5 min)
- ldas intraday: at least 1 file written at 09:35 ET
- quote collector cycles ≥ 5
If all green: update WORK_LOG with "2026-04-13 Day 1 clean baseline observation started" and begin the 5-business-day clock.
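The checkpoint thresholds can be applied to the five numbers printed by the summary command. The sketch below is illustrative: the thresholds are the ones listed in this section, the inputs are assumed to be plain counts copied from the summary output, and checkpoint_failures is not an existing tool.

```python
from __future__ import annotations


def checkpoint_failures(
    scans: int, errors: int, comparison_records: int, ldas_files: int, quote_cycles: int
) -> list[str]:
    """Return the Day 1 checkpoint criteria that are not met."""
    failures = []
    if scans < 30:
        failures.append(f"only {scans} scans (need >= 30)")
    # Guard the division; zero scans is already reported above.
    error_rate = errors / scans * 100 if scans else 100.0
    if error_rate >= 5:
        failures.append(f"error rate {error_rate:.1f}% (need < 5%)")
    if comparison_records < 5:
        failures.append(f"only {comparison_records} comparison records (need >= 5)")
    if ldas_files < 1:
        failures.append("no ldas intraday files written")
    if quote_cycles < 5:
        failures.append(f"only {quote_cycles} quote collector cycles (need >= 5)")
    return failures
```

An empty list corresponds to "all green".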
Mid-day and end-of-day checks
Every 2-3 hours during market hours (11:30 ET, 14:00 ET,
15:45 ET), re-run step 9 (lt-wft-parity) and step 10 for
intraday progress. Log the Overall: PASS/FAIL verdict + per-scenario
deltas to WORK_LOG. Any FAIL triggers immediate
investigation (parity drift is the G1 cutover blocker signal).
After market close (16:00 ET / 20:00 UTC), run the full-day diff
via the same docker run --rm pattern:
ssh fukutani.ryo@192.168.42.252 "sudo /usr/local/bin/docker run --rm --entrypoint lt-wft-parity \
-v /volume1/aegis/wft_state/pt_wft_comparison.jsonl:/data/pt_wft.jsonl:ro \
-v /volume1/aegis/wft_state/lt_rust/lt_wft_comparison.jsonl:/data/lt_wft.jsonl:ro \
lt-rust-docker-aegis-lt-scan-cycle \
--python-log /data/pt_wft.jsonl \
--rust-log /data/lt_wft.jsonl \
--date 2026-04-13 \
--tolerance-equity 50 \
--tolerance-trade-count 0 \
--json" > /tmp/monday_parity.json
jq '.overall, (.scenarios[] | {label, failures})' /tmp/monday_parity.json
Attach this JSON to the WORK_LOG entry. This is the first formal Day 1 per-scenario parity snapshot — save it as the baseline for the 5-business-day observation window.
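For the WORK_LOG entry, the saved report can be reduced to a verdict plus the failing scenarios. This sketch assumes the --json output has a top-level overall field and a scenarios array whose entries carry label and failures, as the jq filter above implies; the exact schema is not confirmed here and summarize_parity is a hypothetical helper.

```python
from __future__ import annotations

import json


def summarize_parity(report_text: str) -> tuple[str | None, dict[str, list]]:
    """Return (overall verdict, {label: failures}) from a parity JSON report."""
    report = json.loads(report_text)
    failing = {
        scenario["label"]: scenario["failures"]
        for scenario in report.get("scenarios", [])
        if scenario.get("failures")  # skip scenarios with no failures
    }
    return report.get("overall"), failing
```

Typical use would be summarize_parity(open("/tmp/monday_parity.json").read()).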
Failure playbook
| Symptom | Action |
|---|---|
| Container not running | gh workflow run deploy-lt-<name>.yml -f force_recreate=true, wait for completion |
| Polygon parse error recurrence | Verify container image post-dates commit 35011d07e; force_recreate if older |
| Multi-scenario WFT lane silent | Check lt_wft.yaml, verify 11 scenarios loaded, re-read Sprint 1b wiring |
| Python aegis-wft stopped | ⛔ CRITICAL — restore Python container via deploy-pt.yml before anything else (Rust is dry-run, Python is real money) |
| lt-wft-parity command not found | Container image predates commit d07c99bd5, force_recreate |
| Both logs present but parity FAIL | Dig into the specific scenario's equity/trades delta, check whether Python had a bugfix merged that Rust didn't |
Next steps after Day 1
If Day 1 is clean, repeat steps 6-10 daily Tue-Fri (Day 2-5 of
the 5-business-day observation window). If all 5 days are
Overall: PASS, the cutover judgment meeting happens on the
Monday of Week 3 (2026-04-20).
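The dates of the observation window follow from simple weekday arithmetic, sketched below. The helper business_days is illustrative and deliberately ignores exchange holidays, so it is only a cross-check of the calendar dates named above.

```python
from __future__ import annotations

from datetime import date, timedelta


def business_days(start: date, count: int) -> list[date]:
    """Return the first `count` weekdays on or after `start` (no holidays)."""
    days = []
    current = start
    while len(days) < count:
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days.append(current)
        current += timedelta(days=1)
    return days


window = business_days(date(2026, 4, 13), 5)
# First Monday strictly after the last observation day.
judgment_day = window[-1] + timedelta(days=7 - window[-1].weekday())
print(window[0], window[-1], judgment_day)  # 2026-04-13 2026-04-17 2026-04-20
```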