Add MANIFOLD_MATCHER_VERSION="v3_outcome_guard" tag persisted to
manifold_match_audit.matcher_version so metrics can isolate current-matcher
stats from pre-versioning records, whose accepted matches the outcome
guard would now reject.
- schema: add matcher_version column + index; idempotent startup backfill
tagging NULL rows as legacy_pre_outcome_guard (no outcome types) or
v2_outcome_guard_no_version (has outcome type, version not persisted)
- save_manifold_audit: write matcher_version on every new record
- get_manifold_matches: split summary into current_version / all_time /
legacy; recent_matches now carry matcher_version
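The backfill tagging rule above can be sketched as a pure function. This is a minimal illustration, not the actual migration: the helper name backfill_version is hypothetical, and it assumes a row counts as "has outcome type" when either outcome-type column is non-NULL.

```python
from typing import Optional

MANIFOLD_MATCHER_VERSION = "v3_outcome_guard"

def backfill_version(matcher_version: Optional[str],
                     poly_outcome_type: Optional[str],
                     mfld_outcome_type: Optional[str]) -> str:
    """Return the tag a manifold_match_audit row should carry after backfill.

    Hypothetical helper: the real backfill presumably runs as SQL at startup.
    """
    if matcher_version is not None:
        # Already tagged rows are left alone, which keeps the backfill idempotent.
        return matcher_version
    if poly_outcome_type is None and mfld_outcome_type is None:
        # Record predates the outcome guard entirely.
        return "legacy_pre_outcome_guard"
    # Outcome types were stored, but the version tag was not yet persisted.
    return "v2_outcome_guard_no_version"
```

Running the function twice over the same row yields the same tag, which is the property the idempotent startup backfill relies on.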
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reject false-positive matches where Jaccard overlap is high but the outcome is
not equivalent (e.g. Poly nomination vs Manifold "If X is nominee, will he win").
- _is_conditional(): detect conditional Manifold markets (If/Conditional on/
Assuming/Given that prefixes + mid-sentence " if ...," clauses) -> reject with
reason "conditional_market".
- _classify_outcome(): classify into nomination|primary_win|general_win|
conditional|other; reject when poly/mfld types differ or either is conditional
-> reason "outcome_mismatch: poly=... manifold=...".
- Persist poly_outcome_type/mfld_outcome_type on ManifoldMatchResult, in
manifold_match_audit (CREATE + idempotent ALTER), save_manifold_audit() and
the bayesian call site.
- Tests covering classification, conditional detection and the Graham Platner
regression (now rejected); valid nomination<->nomination still accepted.
Untouched: _MATCH_THRESHOLD (0.40), MANIFOLD_LOGODDS_WEIGHT, edge thresholds,
exposure, trading logic.
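A rough sketch of the two checks described above, for orientation only. The regular expressions and the function names is_conditional and outcome_guard are assumptions made for illustration; the real _is_conditional() and _classify_outcome() may use different patterns.

```python
import re
from typing import Optional

# Assumed patterns: leading conditional prefixes and a mid-sentence " if ...," clause.
_PREFIX = re.compile(r"^\s*(if|conditional on|assuming|given that)\b", re.IGNORECASE)
_MID_IF = re.compile(r"\sif\s[^,]*,", re.IGNORECASE)

def is_conditional(question: str) -> bool:
    """True when the Manifold question looks like a conditional market."""
    return bool(_PREFIX.search(question) or _MID_IF.search(question))

def outcome_guard(poly_type: str, mfld_type: str) -> Optional[str]:
    """Return a rejection reason, or None when the outcome types are compatible."""
    if "conditional" in (poly_type, mfld_type) or poly_type != mfld_type:
        return f"outcome_mismatch: poly={poly_type} manifold={mfld_type}"
    return None
```

With these assumed patterns, "If X is nominee, will he win" is flagged as conditional, while a nomination market paired with another nomination market passes the guard.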
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds feat_fg_lo / feat_mom_lo / feat_news_lo / feat_mfld_lo / feat_btc_dom_lo
to every trade, all normalized to log-odds contribution for direct comparability.
- fg / mom / btc_dom: raw probability-delta × 2 → log-odds
- news / mfld: already log-odds (LOGODDS_WEIGHT already applied), no scaling
- btc_dom tracked separately in bayesian.py instead of bundled in total_adj
- reasoning string updated to fg_lo= / mom_lo= notation for self-documentation
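The unit conversion in the bullets above amounts to the following sketch. The function name to_logodds_contribution and the short feature labels used as keys are hypothetical; only the ×2 rule and the no-scaling rule come from this commit.

```python
# Features stored as raw probability deltas, converted with the x2 rule above.
PROB_DELTA_FEATURES = ("fg", "mom", "btc_dom")
# Features that already arrive in log-odds (LOGODDS_WEIGHT already applied).
LOGODDS_FEATURES = ("news", "mfld")

def to_logodds_contribution(feature: str, raw: float) -> float:
    """Normalize one feature adjustment to a log-odds contribution."""
    if feature in PROB_DELTA_FEATURES:
        return raw * 2.0
    if feature in LOGODDS_FEATURES:
        return raw
    raise ValueError(f"unknown feature: {feature}")
```

For example, a momentum delta of 0.03 becomes a mom_lo of 0.06, while a news adjustment of 0.12 is stored unchanged.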
Schema: 5 new DOUBLE PRECISION columns + 2 partial indexes
Stack: TradingSignal → Order → Trade → save_trade all carry feat fields
Startup: backfill_feature_columns() recovers fg/mom/news/mfld from old
reasoning strings (×2 applied to fg/mom); btc_dom_lo stays NULL for legacy
API: /api/metrics/features — triggered/material split per feature with
two-level thresholds (0.05 for fg/mom/btc_dom, 0.10 for news/mfld)
API: /api/trades/legacy — exposes pre-Phase-1 trades (edge_net IS NULL)
API: _enrich_trade backward-compat: reads DB columns first, falls back to
reasoning regex with unit conversion for pre-Phase-6 trades
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
schema.sql
trades: + close_pnl, resolution (market outcome storage)
metrics_daily: + unrealized_pnl_est, realized_pnl, open/closed/resolved_count
db.py
close_paper_position(): accepts resolution; computes close_pnl in SQL
BUY_YES: (resolution − entry_price) × shares
BUY_NO: ((1 − resolution) − entry_price) × shares
save_daily_metrics(): persists new columns
compute_metrics_from_db(): single DB query for all metrics; no in-memory state
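For reference, the close_pnl formulas above written as plain Python. This is only an illustration of the arithmetic; the commit computes the value in SQL, and the function below is a hypothetical stand-in.

```python
def close_pnl(side: str, entry_price: float, shares: float, resolution: float) -> float:
    """PnL at resolution, where resolution is 1.0 for YES and 0.0 for NO."""
    if side == "BUY_YES":
        return (resolution - entry_price) * shares
    if side == "BUY_NO":
        return ((1.0 - resolution) - entry_price) * shares
    raise ValueError(f"unknown side: {side}")
```

A BUY_YES of 100 shares at 0.40 that resolves YES yields (1.0 - 0.40) × 100 = 60; the same position resolving NO loses the 40 paid at entry.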
tracker.py — complete rewrite (stateless)
Removed self._trades, self._daily_returns, compute_metrics(), _compute_sharpe(),
check_promotion_thresholds(), _empty_metrics()
update_daily_summary() now reads compute_metrics_from_db() every cycle
Safe across pod restarts: always reflects full DB history
paper.py
close_position(): passes resolution to close_paper_position()
api/main.py /api/summary
Added unrealized_pnl_est (estimated, open trades) and realized_pnl (exact,
closed+resolved) as separate fields alongside total_pnl
win_rate: null if < 5 resolved trades (previously a proxy based on entry_price < 0.5)
calibration_score: Brier-based, null if < 10 resolved trades
resolved_count exposed as field
Each field annotated with: exact/estimated, source, null conditions
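The null conditions above might look like the sketch below. Several details are assumptions, not taken from the code: a "win" is treated as close_pnl > 0, and the Brier score is the plain mean squared error between final_prob and the resolution. How the API maps the Brier score to calibration_score is not specified here.

```python
from typing import Optional, Sequence, Tuple

MIN_RESOLVED_FOR_WIN_RATE = 5
MIN_RESOLVED_FOR_CALIBRATION = 10

def win_rate(close_pnls: Sequence[float]) -> Optional[float]:
    """Share of resolved trades with positive PnL; None below 5 resolved trades."""
    if len(close_pnls) < MIN_RESOLVED_FOR_WIN_RATE:
        return None
    return sum(1 for pnl in close_pnls if pnl > 0) / len(close_pnls)

def brier_score(pairs: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Mean squared error over (final_prob, resolution) pairs; None below 10 pairs."""
    if len(pairs) < MIN_RESOLVED_FOR_CALIBRATION:
        return None
    return sum((prob - outcome) ** 2 for prob, outcome in pairs) / len(pairs)
```

Returning None rather than a proxy keeps small-sample values from being read as real performance figures.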
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds run_legacy_scan() that executes once at startup before the trading loop:
1. Re-keys every open DB position using the current market_family_key()
2. Groups by new family key; KEEP = highest edge_net, CLOSE_RECOMMENDED = all other siblings
3. Manifold re-query for positions whose family key changed; if corrected
probability contradicts the trade direction → CLOSE_RECOMMENDED
4. Logs full report (KEEP / REVIEW / CLOSE_RECOMMENDED) before any closures
5. In paper mode: auto-closes all CLOSE_RECOMMENDED positions
For the existing Ohio bug:
- Democrats win Ohio governor (629557): CLOSE_RECOMMENDED
family changed ohio-democrat-2026 → ohio-gubernatorial-2026
Manifold re-query confirms prob=0.05 contradicts BUY_YES (inversion bug)
$X returned to cash at break-even
- Republicans win Ohio governor (629558): KEEP
higher edge_net (0.349 > 0.247)
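Steps 1 and 2 of the scan, applied to the Ohio pair, can be sketched as follows. The function name classify_positions and the dict layout are hypothetical, and the sketch deliberately leaves out the Manifold re-query (step 3) and the REVIEW verdict.

```python
from collections import defaultdict
from typing import Dict, List

def classify_positions(positions: List[dict]) -> Dict[int, str]:
    """Keep the highest-edge_net position per family; recommend closing the rest."""
    by_family = defaultdict(list)
    for pos in positions:
        by_family[pos["family_key"]].append(pos)

    verdicts = {}
    for group in by_family.values():
        keeper = max(group, key=lambda p: p["edge_net"])
        for pos in group:
            verdicts[pos["market_id"]] = "KEEP" if pos is keeper else "CLOSE_RECOMMENDED"
    return verdicts

# Both Ohio positions share one family after re-keying.
ohio = [
    {"market_id": 629557, "family_key": "ohio-gubernatorial-2026", "edge_net": 0.247},
    {"market_id": 629558, "family_key": "ohio-gubernatorial-2026", "edge_net": 0.349},
]
verdicts = classify_positions(ohio)
```

Here verdicts marks 629558 as KEEP and 629557 as CLOSE_RECOMMENDED, matching the report above.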
Infrastructure:
- schema.sql: closed_at TIMESTAMPTZ, close_reason TEXT on trades
- db.py: all open-position queries filter WHERE closed_at IS NULL
+ close_paper_position(market_id, reason)
- paper.py: close_legacy_position(market_id, reason) → float
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 1 — Real net edge (paper.py, bayesian.py, risk/manager.py, db.py):
- Trade records now store edge_gross, edge_net, prior_prob, final_prob,
mid_price, spread_estimate, commission, family_key
- edge_net = edge_gross - SPREAD_ESTIMATE(0.02) - COMMISSION_RATE(0.02)
NOTE: both constants are heuristics, not exact Polymarket exchange costs
- Execution gate changed from edge_gross > MIN_EDGE to edge_net > regime_min_edge
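The Phase 1 gate reduces to the arithmetic below. The constants and formula are the ones stated above; the function names are hypothetical.

```python
SPREAD_ESTIMATE = 0.02   # heuristic, not an exact Polymarket cost
COMMISSION_RATE = 0.02   # heuristic, not an exact Polymarket cost

def edge_net(edge_gross: float) -> float:
    """Gross edge minus the estimated spread and commission."""
    return edge_gross - SPREAD_ESTIMATE - COMMISSION_RATE

def should_execute(edge_gross: float, regime_min_edge: float) -> bool:
    """New gate: the net edge must exceed the regime threshold."""
    return edge_net(edge_gross) > regime_min_edge
```

A gross edge of 0.13 nets to 0.09, so it no longer clears a 0.10 regime threshold even though the gross value does.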
Phase 2 — Market families (polymarket.py):
- market_family_key(market) groups related markets:
texas-republican-2026, fed-april-2026, openai-2026, etc.
- At most 1 trade per family per cycle; occupied_families propagated via main.py
- Family key logged on every TRADE and SKIP line
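The one-trade-per-family rule can be sketched like this. The function name and dict layout are hypothetical, and the sketch assumes candidates arrive already ordered by preference, so the first candidate seen in a family wins.

```python
from typing import Iterable, List, Set, Tuple

def filter_by_family(candidates: List[dict],
                     occupied_families: Iterable[str]) -> Tuple[List[dict], Set[str]]:
    """Accept at most one candidate per family, skipping occupied families."""
    occupied = set(occupied_families)
    accepted = []
    for cand in candidates:
        if cand["family_key"] in occupied:
            continue  # would be reported as SKIP_FAMILY
        occupied.add(cand["family_key"])
        accepted.append(cand)
    return accepted, occupied
```

The returned set can be passed into the next call so that occupied families carry over, in the spirit of the occupied_families propagation described above.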
Phase 3 — GNews priority (news.py, bayesian.py, main.py):
- NewsClient.get_freshness() returns 1.0/0.75/0.40/0.10 by cache age
- gnews_priority(market, news) = uncertainty × volume_score × freshness
- Politics markets sorted by priority DESC before eval so best markets get
the 5-query/cycle GNews budget first
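A sketch of how the priority ordering interacts with the query budget. It assumes the three factors are already computed per market; how uncertainty and volume_score are derived, and the cache-age cut-offs behind the freshness values, are not covered here. The helper pick_gnews_markets and the dict layout are hypothetical.

```python
from typing import List

GNEWS_BUDGET_PER_CYCLE = 5

def gnews_priority(uncertainty: float, volume_score: float, freshness: float) -> float:
    """Product of the three factors named above."""
    return uncertainty * volume_score * freshness

def pick_gnews_markets(markets: List[dict], budget: int = GNEWS_BUDGET_PER_CYCLE) -> List[str]:
    """Return the ids of the markets that receive a GNews query this cycle."""
    ranked = sorted(
        markets,
        key=lambda m: gnews_priority(m["uncertainty"], m["volume_score"], m["freshness"]),
        reverse=True,
    )
    return [m["id"] for m in ranked[:budget]]
```

Markets ranked below the budget cut-off are the ones that would surface as SKIP_GNEWS_PRIORITY in a cycle with more than five politics candidates.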
Phase 4 — Regime min-edge by category/horizon (bayesian.py):
- politics >60d → 0.12, 30-60d → 0.10, <30d → 0.08
- tech / crypto / finance → 0.10
- All thresholds applied to edge_net (not edge_gross)
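The Phase 4 thresholds can be written as a small lookup. The function shape, the category strings, the handling of exactly 30 and 60 days, and the None return for other categories are all assumptions made for this sketch.

```python
from typing import Optional

def regime_min_edge(category: str, days_to_resolution: float) -> Optional[float]:
    """Minimum edge_net required for a market, by category and horizon."""
    if category == "politics":
        if days_to_resolution > 60:
            return 0.12
        if days_to_resolution >= 30:  # assumed: 30 and 60 fall in the middle band
            return 0.10
        return 0.08
    if category in ("tech", "crypto", "finance"):
        return 0.10
    return None  # assumed: categories not listed above have no threshold here
```

Longer-dated politics markets demand a larger net edge, which fits the idea that more can change before resolution.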
Phase 5 — Observability (bayesian.py, main.py):
- Structured skip labels: SKIP_UNSUPPORTED, SKIP_NO_SIGNALS,
SKIP_PRIOR_EXTREME, SKIP_FAMILY, SKIP_GNEWS_PRIORITY, SKIP_EDGE_NET
- TRADE lines now include family_key, edge_gross, edge_net, regime_min, days
- schema.sql: 8 new cols on trades, 7 new cols on signals (via ALTER TABLE ... ADD COLUMN IF NOT EXISTS)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>