feat(bot): 5-phase strategy upgrade (net edge, families, GNews priority, regimes)

Phase 1 — Real net edge (paper.py, bayesian.py, risk/manager.py, db.py):
- Trade records now store edge_gross, edge_net, prior_prob, final_prob,
  mid_price, spread_estimate, commission, family_key
- edge_net = edge_gross - SPREAD_ESTIMATE(0.02) - COMMISSION_RATE(0.02)
  NOTE: both constants are heuristics, not exact Polymarket exchange costs
- Execution gate changed from edge_gross > MIN_EDGE to edge_net > regime_min_edge
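
A minimal sketch of the net-edge gate described above, not the code in paper.py or bayesian.py. The two constants carry the values from this message; the function names net_edge and passes_gate are hypothetical.

```python
# Heuristic cost estimates taken from the commit message; they are not
# exact Polymarket exchange costs.
SPREAD_ESTIMATE = 0.02
COMMISSION_RATE = 0.02


def net_edge(edge_gross: float) -> float:
    """Subtract the estimated spread and commission from the gross edge."""
    return edge_gross - SPREAD_ESTIMATE - COMMISSION_RATE


def passes_gate(edge_gross: float, regime_min_edge: float) -> bool:
    """New gate: compare the net edge (not the gross edge) to the regime threshold."""
    return net_edge(edge_gross) > regime_min_edge


if __name__ == "__main__":
    # A gross edge of 0.13 leaves roughly 0.09 net, below a 0.10 threshold.
    print(passes_gate(0.13, 0.10))
    # A gross edge of 0.15 leaves roughly 0.11 net, above a 0.10 threshold.
    print(passes_gate(0.15, 0.10))
```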

Phase 2 — Market families (polymarket.py):
- market_family_key(market) groups related markets:
    texas-republican-2026, fed-april-2026, openai-2026, etc.
- At most 1 trade per family per cycle; occupied_families propagated via main.py
- Family key logged on every TRADE and SKIP line
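
A rough sketch of the one-trade-per-family rule, assuming candidates arrive already ordered by preference. The helper select_one_per_family and its parameters are hypothetical; only the idea of an occupied_families set comes from this commit.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def select_one_per_family(
    candidates: Iterable[T],
    family_key: Callable[[T], str],
    occupied_families: set[str],
) -> list[T]:
    """Keep the first candidate of each family and skip the rest.

    occupied_families is mutated in place so the caller can carry it
    through the remainder of the cycle.
    """
    selected: list[T] = []
    for candidate in candidates:
        key = family_key(candidate)
        if key in occupied_families:
            # In the bot this case would be reported as SKIP_FAMILY.
            continue
        occupied_families.add(key)
        selected.append(candidate)
    return selected


if __name__ == "__main__":
    families = {"paxton": "texas-republican-2026", "cornyn": "texas-republican-2026",
                "cut25": "fed-april-2026"}
    occupied: set[str] = set()
    print(select_one_per_family(families, families.__getitem__, occupied))
```

Mutating the set in place mirrors the "propagated via main.py" wording, but how the real code passes it around is not shown here.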

Phase 3 — GNews priority (news.py, bayesian.py, main.py):
- NewsClient.get_freshness() returns 1.0/0.75/0.40/0.10 by cache age
- gnews_priority(market, news) = uncertainty × volume_score × freshness
- Politics markets sorted by priority DESC before eval so best markets get
  the 5-query/cycle GNews budget first
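
A small sketch of the priority ordering, assuming the three factors are already computed per market. The Candidate container, priority_score, and pick_for_gnews are hypothetical names; how uncertainty, volume_score, and the freshness tiers are derived is not stated in this message and is left out.

```python
from dataclasses import dataclass

GNEWS_BUDGET_PER_CYCLE = 5  # query budget stated in the commit message


@dataclass
class Candidate:
    """Hypothetical container for the three factors of the priority product."""
    name: str
    uncertainty: float
    volume_score: float
    freshness: float


def priority_score(c: Candidate) -> float:
    """Product form from the commit message: uncertainty x volume_score x freshness."""
    return c.uncertainty * c.volume_score * c.freshness


def pick_for_gnews(candidates: list[Candidate],
                   budget: int = GNEWS_BUDGET_PER_CYCLE) -> list[Candidate]:
    """Sort by priority, highest first, and keep as many as the budget allows."""
    ranked = sorted(candidates, key=priority_score, reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    tiers = [1.0, 0.75, 0.40, 0.10, 1.0, 0.75]
    markets = [Candidate(f"m{i}", 0.5, 1.0, f) for i, f in enumerate(tiers)]
    print([c.name for c in pick_for_gnews(markets)])
```

Markets that fall outside the budget would presumably be the ones reported as SKIP_GNEWS_PRIORITY.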

Phase 4 — Regime min-edge by category/horizon (bayesian.py):
- politics >60d → 0.12, 30-60d → 0.10, <30d → 0.08
- tech, crypto/finance → 0.10
- All thresholds applied to edge_net (not edge_gross)
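
The threshold table above can be sketched as a lookup. This is an illustration, not the function in bayesian.py: the name regime_min_edge, the category strings, treating exactly 30 and 60 days as part of the middle band, and returning None for unlisted categories are all assumptions.

```python
from typing import Optional


def regime_min_edge(category: str, days_to_resolution: float) -> Optional[float]:
    """Return the minimum net edge required for a category and horizon.

    Returns None for categories the commit message does not list
    (an assumption made for this sketch).
    """
    if category == "politics":
        if days_to_resolution > 60:
            return 0.12
        if days_to_resolution >= 30:  # assumed: 30 and 60 belong to the middle band
            return 0.10
        return 0.08
    if category in {"tech", "crypto", "finance"}:
        return 0.10
    return None


if __name__ == "__main__":
    for cat, days in [("politics", 90), ("politics", 45), ("politics", 10), ("tech", 200)]:
        print(cat, days, regime_min_edge(cat, days))
```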

Phase 5 — Observability (bayesian.py, main.py):
- Structured skip labels: SKIP_UNSUPPORTED, SKIP_NO_SIGNALS,
  SKIP_PRIOR_EXTREME, SKIP_FAMILY, SKIP_GNEWS_PRIORITY, SKIP_EDGE_NET
- TRADE lines now include family_key, edge_gross, edge_net, regime_min, days
- schema.sql: 8 new cols on trades, 7 new cols on signals (via ALTER TABLE ... ADD COLUMN IF NOT EXISTS)
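
One way to keep the skip labels consistent is an enum plus a single formatter. This is a sketch only: the label strings come from this message, while SkipReason, format_skip, and the key=value line layout are hypothetical and may not match the bot's actual log format.

```python
from enum import Enum


class SkipReason(str, Enum):
    """Skip labels listed in the commit message."""
    UNSUPPORTED = "SKIP_UNSUPPORTED"
    NO_SIGNALS = "SKIP_NO_SIGNALS"
    PRIOR_EXTREME = "SKIP_PRIOR_EXTREME"
    FAMILY = "SKIP_FAMILY"
    GNEWS_PRIORITY = "SKIP_GNEWS_PRIORITY"
    EDGE_NET = "SKIP_EDGE_NET"


def format_skip(reason: SkipReason, family_key: str, **fields: object) -> str:
    """Build one log line: label, family key, then extra fields sorted by name."""
    parts = [reason.value, f"family={family_key}"]
    parts += [f"{name}={value}" for name, value in sorted(fields.items())]
    return " ".join(parts)


if __name__ == "__main__":
    print(format_skip(SkipReason.EDGE_NET, "fed-april-2026",
                      edge_net=0.07, regime_min=0.1))
```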

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
chemavx
2026-04-16 15:34:46 +00:00
parent a0cbdc0256
commit 63d9f637ff
8 changed files with 620 additions and 141 deletions
@@ -5,6 +5,7 @@ Docs: https://docs.polymarket.com
import asyncio
import logging
import os
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone, timedelta
from typing import Optional
@@ -15,6 +16,158 @@ log = logging.getLogger(__name__)
POLYMARKET_API = "https://clob.polymarket.com"
GAMMA_API = "https://gamma-api.polymarket.com"
# ─────────────────────────────────────────────────────────────────────────────
# Phase 2 — Market family classification helpers
# Used by market_family_key() below.
# ─────────────────────────────────────────────────────────────────────────────
_YEAR_RE = re.compile(r"\b(202\d|203\d)\b")
_MONTH_RE = re.compile(
    r"\b(january|february|march|april|may|june|july|august|"
    r"september|october|november|december)\b",
    re.IGNORECASE,
)
_FED_TRIGGER_RE = re.compile(
    r"\b(federal reserve|interest rate|bps|basis point|fed\s+(rate|meeting|decision))",
    re.IGNORECASE,
)
_US_STATE_RE = re.compile(
    r"\b(Alabama|Alaska|Arizona|Arkansas|California|Colorado|Connecticut|"
    r"Delaware|Florida|Georgia|Hawaii|Idaho|Illinois|Indiana|Iowa|Kansas|"
    r"Kentucky|Louisiana|Maine|Maryland|Massachusetts|Michigan|Minnesota|"
    r"Mississippi|Missouri|Montana|Nebraska|Nevada|New\s+Hampshire|"
    r"New\s+Jersey|New\s+Mexico|New\s+York|North\s+Carolina|North\s+Dakota|"
    r"Ohio|Oklahoma|Oregon|Pennsylvania|Rhode\s+Island|South\s+Carolina|"
    r"South\s+Dakota|Tennessee|Texas|Utah|Vermont|Virginia|Washington|"
    r"West\s+Virginia|Wisconsin|Wyoming)\b",
    re.IGNORECASE,
)
_PARTY_RE = re.compile(r"\b(Republican|Democrats?|Democratic|GOP)\b", re.IGNORECASE)
_ELECTION_TYPE_RE = re.compile(
    r"\b(presidential|president|mayoral|mayor|gubernatorial|governor|"
    r"senate|congress(?:ional)?|primary|election)\b",
    re.IGNORECASE,
)
# Ordered list of (pattern, place_slug) for named places. Most are outside
# the US; Los Angeles is the exception. First match in list order wins.
# Checked after the US state + party rule, so a question that names both a
# US state and a party is keyed by that rule and never reaches this list.
_NAMED_PLACES: list[tuple[re.Pattern, str]] = [
    (re.compile(r"\bColomb", re.IGNORECASE), "colombia"),
    (re.compile(r"\bSeoul\b", re.IGNORECASE), "seoul"),
    (re.compile(r"\bBusan\b", re.IGNORECASE), "busan"),
    (re.compile(r"\bGyeonggi\b", re.IGNORECASE), "gyeonggi"),
    (re.compile(r"\bChungcheong", re.IGNORECASE), "chungcheong"),
    (re.compile(r"\bSouth\s+Korean?\b", re.IGNORECASE), "south-korea"),
    (re.compile(r"\bLos\s+Angeles\b", re.IGNORECASE), "los-angeles"),
    (re.compile(r"\bCuba\b", re.IGNORECASE), "cuba"),
    (re.compile(r"\bLebanon\b", re.IGNORECASE), "lebanon"),
    (re.compile(r"\bIsrael\b", re.IGNORECASE), "israel"),
    (re.compile(r"\bUkraine\b", re.IGNORECASE), "ukraine"),
    (re.compile(r"\bRussia\b", re.IGNORECASE), "russia"),
]
# Ordered list of (pattern, company_slug) for tech/company markets.
_NAMED_COMPANIES: list[tuple[re.Pattern, str]] = [
    (re.compile(r"\bopenai\b", re.IGNORECASE), "openai"),
    (re.compile(r"\banthropic\b", re.IGNORECASE), "anthropic"),
    (re.compile(r"\bnvidia\b", re.IGNORECASE), "nvidia"),
    (re.compile(r"\bapple\b", re.IGNORECASE), "apple"),
    (re.compile(r"\bmicrosoft\b", re.IGNORECASE), "microsoft"),
    (re.compile(r"\bgoogle\b", re.IGNORECASE), "google"),
    (re.compile(r"\btesla\b", re.IGNORECASE), "tesla"),
    # \bmeta\b does NOT match MetaMask (no word boundary mid-compound-word)
    (re.compile(r"\bmeta\b", re.IGNORECASE), "meta"),
]


def _end_month(market: "Market") -> str:
    """Return market end_date formatted as YYYY-MM, or '' if unparseable."""
    raw = market.end_date
    if not raw:
        return ""
    try:
        dt = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        return dt.strftime("%Y-%m")
    except (ValueError, TypeError):
        return ""


def market_family_key(market: "Market") -> str:
    """
    Return a stable slug that groups related markets together.

    Markets in the same family share an underlying event (same election,
    same Fed meeting decision, same company). The bot allows at most one
    open position per family per cycle to avoid correlated exposure.

    Priority order (first match wins):
      1. Fed / interest-rate decision  → fed-{month}-{year}
      2. US state + party election     → {state}-{party}-{year}
      3. Named non-US city/country     → {place}-{event_type}-{year}
      4. Named tech company            → {company}-{year}
      5. Fallback                      → {category}-{end_YYYY-MM}

    Examples:
      "Will Ken Paxton win the 2026 Texas Republican Primary"
          → texas-republican-2026
      "Will the Fed decrease rates by 25 bps after April 2026 meeting"
          → fed-april-2026
      "Will OpenAI IPO by December 31 2026?"
          → openai-2026
    """
    q = market.question

    # Prefer year from question text; fall back to end_date year if absent
    year_m = _YEAR_RE.search(q)
    if year_m:
        year = year_m.group(1)
    else:
        end_m = _end_month(market)  # e.g. "2026-06"
        year = end_m[:4] if end_m else "unknown"

    # 1. Fed / interest-rate meeting
    if _FED_TRIGGER_RE.search(q):
        month_m = _MONTH_RE.search(q)
        if month_m:
            return f"fed-{month_m.group(1).lower()}-{year}"
        return f"fed-{year}"

    # 2. US state + party (primary, senate, governor, etc.)
    state_m = _US_STATE_RE.search(q)
    party_m = _PARTY_RE.search(q)
    if state_m and party_m:
        state = re.sub(r"\s+", "-", state_m.group(1).lower())
        raw_party = party_m.group(1).lower()
        # "democrat" prefix covers "democrat", "democrats", "democratic"
        party = "democrat" if "democrat" in raw_party else "republican"
        return f"{state}-{party}-{year}"

    # 3. Named non-US city / country
    for place_re, place_slug in _NAMED_PLACES:
        if place_re.search(q):
            etype_m = _ELECTION_TYPE_RE.search(q)
            if etype_m:
                raw_etype = etype_m.group(1).lower()
                # Normalise synonyms
                etype = {
                    "president": "presidential",
                    "mayor": "mayoral",
                    "governor": "gubernatorial",
                }.get(raw_etype, raw_etype)
            else:
                etype = "event"
            return f"{place_slug}-{etype}-{year}"

    # 4. Named tech company
    for company_re, company_slug in _NAMED_COMPANIES:
        if company_re.search(q):
            return f"{company_slug}-{year}"

    # 5. Fallback: category + end_date month
    end_month = _end_month(market)
    base = market.category if market.category else "misc"
    return f"{base}-{end_month}" if end_month else f"{base}-{year}"


@dataclass
class Market: