fix(critical): complementary market family grouping + Manifold inversion guard

PHASE 1 — market_family_key() general election fix
General elections now group by office, not by party, so complementary
markets ("Republicans win Ohio governor" / "Democrats win Ohio governor")
share the same family key (ohio-gubernatorial-2026).  The second market
is blocked by the occupied_families check rather than traded as independent.

Primaries still keep the party (texas-republican-2026) because each party
runs its own separate primary race.
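
The grouping rule above can be sketched as follows. This is a minimal illustration, not the repository's market_family_key(): the three regexes, the ETYPE_SLUG table, and the explicit year parameter are simplified assumptions standing in for _US_STATE_RE, _PARTY_RE, _ELECTION_TYPE_RE and the year handling, none of which are shown in full in this commit.

```python
import re
from typing import Optional

# Hypothetical, simplified patterns (assumptions for illustration only).
STATE_RE = re.compile(r"\b(ohio|texas)\b", re.I)
PARTY_RE = re.compile(r"\b(republicans?|democrats?|democratic)\b", re.I)
ETYPE_RE = re.compile(r"\b(primary|governor|senate)\b", re.I)
ETYPE_SLUG = {"governor": "gubernatorial"}


def family_key(question: str, year: int = 2026) -> Optional[str]:
    """Return a family key for a US state election question, or None."""
    state_m = STATE_RE.search(question)
    party_m = PARTY_RE.search(question)
    etype_m = ETYPE_RE.search(question)
    if not state_m or not (party_m or etype_m):
        return None

    state = state_m.group(1).lower()
    etype = etype_m.group(1).lower() if etype_m else None

    if party_m and etype == "primary":
        # Primary: each party runs its own race, so the party stays in the key.
        party = "democrat" if "democrat" in party_m.group(1).lower() else "republican"
        return f"{state}-{party}-{year}"

    if etype:
        # General election: key on the office, so both parties share one family.
        return f"{state}-{ETYPE_SLUG.get(etype, etype)}-{year}"

    # Party but no election type: fall back to the party-based key.
    party = "democrat" if "democrat" in party_m.group(1).lower() else "republican"
    return f"{state}-{party}-{year}"
```

Under these assumptions, "Republicans win Ohio governor" and "Democrats win Ohio governor" both map to ohio-gubernatorial-2026, while a Texas Republican primary question maps to texas-republican-2026.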

PHASE 2 — Manifold party inversion guard
_detect_party() identifies the winning side in both the Polymarket question
and the matched Manifold title.  If they are confirmed opposites (republican
vs democrat), the probability is inverted (1 - prob) before use.

Full audit log per query:
  poly_question / manifold_title / manifold_url / match_score /
  prob_raw / inverted / prob_final

Root cause of Ohio Manifold:0.95 on both sides: both queries matched the
same Manifold market ("Republicans win Ohio governor" prob=0.95).  For the
"Democrats win" query the inversion now produces prob_final=0.05 instead of
blindly applying 0.95 to the wrong direction.
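
The Ohio case can be replayed with a small standalone sketch. detect_party() mirrors the _detect_party() helper added in this commit; adjusted_probability() is a hypothetical wrapper introduced only for this example and does not exist in the codebase.

```python
import re
from typing import Optional

REPUBLICAN_WORDS = frozenset(["republican", "republicans", "gop"])
DEMOCRAT_WORDS = frozenset(["democrat", "democrats", "democratic"])


def detect_party(text: str) -> Optional[str]:
    """Return 'republican', 'democrat', or None if no party word is present."""
    words = set(re.findall(r"[a-zA-Z]+", text.lower()))
    if words & REPUBLICAN_WORDS:
        return "republican"
    if words & DEMOCRAT_WORDS:
        return "democrat"
    return None


def adjusted_probability(
    poly_question: str, manifold_title: str, prob_raw: float
) -> tuple[float, bool]:
    """Hypothetical helper: return (prob_final, inverted) for one match."""
    poly_party = detect_party(poly_question)
    manifold_party = detect_party(manifold_title)
    # Invert only when both parties are detected and they differ.
    inverted = (
        poly_party is not None
        and manifold_party is not None
        and poly_party != manifold_party
    )
    return ((1.0 - prob_raw) if inverted else prob_raw), inverted


title = "Republicans win Ohio governor"
rep_prob, rep_inv = adjusted_probability("Republicans win Ohio governor", title, 0.95)
dem_prob, dem_inv = adjusted_probability("Democrats win Ohio governor", title, 0.95)
```

Here the Republican query keeps 0.95 and the Democrat query is inverted to roughly 0.05. Note that a text naming both parties is classified as republican, because that set is checked first.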

PHASE 4 — startup contradiction scan
get_open_position_details() added to db.py.  main.py checks all open
positions at startup, warns on any family with >1 position, and recommends
keeping the one with the highest edge_net.  No auto-close.
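
A minimal in-memory sketch of the scan, assuming each position is a dict with the columns returned by get_open_position_details(). find_contradictions() and the sample rows are hypothetical and only illustrate the grouping and the highest-edge_net recommendation; the real check lives inline in main.py and only logs warnings.

```python
from collections import defaultdict


def find_contradictions(positions: list[dict]) -> dict[str, tuple[dict, list[dict]]]:
    """Map each family_key with >1 position to (recommended_keep, members)."""
    by_family: dict[str, list[dict]] = defaultdict(list)
    for pos in positions:
        fk = pos.get("family_key")
        if fk:  # positions without a family key are ignored
            by_family[fk].append(pos)

    report = {}
    for fk, members in by_family.items():
        if len(members) > 1:
            # Treat a missing edge_net as 0.0, as the startup scan does.
            keep = max(members, key=lambda p: p.get("edge_net") or 0.0)
            report[fk] = (keep, members)
    return report


# Hypothetical sample rows for illustration.
positions = [
    {"market_id": "m1", "family_key": "ohio-gubernatorial-2026", "edge_net": 0.14},
    {"market_id": "m2", "family_key": "ohio-gubernatorial-2026", "edge_net": 0.09},
    {"market_id": "m3", "family_key": "texas-republican-2026", "edge_net": None},
]
report = find_contradictions(positions)
```

With these rows, only ohio-gubernatorial-2026 is reported, and m1 is the position recommended to keep.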

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
chemavx
2026-04-17 10:26:29 +00:00
parent 0cdb0758c4
commit ebdcff5a6e
4 changed files with 161 additions and 30 deletions
+17
@@ -89,6 +89,23 @@ class Database:
             )
         return {r["family_key"] for r in rows if r["family_key"]}

+    async def get_open_position_details(self) -> list[dict]:
+        """Return one row per open position with family_key and direction.
+
+        Used at startup to detect positions that share a family_key (same
+        underlying event), which indicates a contradictory paper trade entered
+        before the general-election family fix was deployed.
+        """
+        async with self._pool.acquire() as conn:
+            rows = await conn.fetch("""
+                SELECT DISTINCT ON (market_id)
+                    market_id, question, direction, edge_net, family_key, timestamp
+                FROM trades
+                WHERE paper = TRUE
+                ORDER BY market_id, timestamp DESC
+            """)
+        return [dict(r) for r in rows]
+
     async def get_recent_trades(self, limit: int = 100) -> list[dict]:
         async with self._pool.acquire() as conn:
             rows = await conn.fetch(
+88 -25
@@ -4,15 +4,18 @@ Manifold Markets client — cross-platform prediction market probability signals
 For each Polymarket question, searches Manifold for a matching binary market
 by keyword overlap and returns its probability as a calibration signal.

-Used for politics and tech markets where Manifold often has independent
-probability estimates that diverge from Polymarket.
+Inversion guard: if the Manifold market's winning side (Republican / Democrat)
+is the complement of the Polymarket question's winning side, the probability is
+automatically inverted (1 - prob). This prevents "Democrats win Ohio governor"
+from consuming the probability of a Manifold market titled "Republicans win Ohio
+governor" without adjustment.
+
+Rejection guard: if the match score falls below _MATCH_THRESHOLD the market is
+rejected, even if inversion would otherwise apply. All decisions are logged at
+INFO so they can be audited per-cycle.

 Cache TTL: 30 minutes (Manifold markets move slowly vs our 60 s cycle).
 Match threshold: >= 0.25 keyword overlap ratio between significant tokens.
-
-Weight choice: MANIFOLD_LOGODDS_WEIGHT = 0.6 in bayesian.py means a 30 pp
-divergence (Manifold 0.75 vs Poly 0.45) produces edge_gross ≈ 0.19, which
-clears the politics far-horizon regime threshold of 0.12 after costs.
 """
 import logging
 import re
@@ -40,6 +43,10 @@ _STOP_WORDS = frozenset([
     "before", "during", "until", "against", "between", "through",
 ])

+# Mutually exclusive political parties used for complement detection
+_REPUBLICAN_WORDS = frozenset(["republican", "republicans", "gop"])
+_DEMOCRAT_WORDS = frozenset(["democrat", "democrats", "democratic"])
+

 def _significant_words(text: str) -> set[str]:
     words = re.findall(r"[a-zA-Z]+", text.lower())
@@ -52,14 +59,37 @@ def _build_search_query(question: str, max_words: int = 6) -> str:
     return " ".join(sig[:max_words])


-def _best_match(poly_question: str, results: list[dict]) -> Optional[dict]:
-    """Return best-matching open binary Manifold market, or None if below threshold."""
+def _detect_party(text: str) -> Optional[str]:
+    """Return 'republican', 'democrat', or None if no party detected."""
+    words = set(re.findall(r"[a-zA-Z]+", text.lower()))
+    if words & _REPUBLICAN_WORDS:
+        return "republican"
+    if words & _DEMOCRAT_WORDS:
+        return "democrat"
+    return None
+
+
+def _best_match_with_audit(
+    poly_question: str,
+    results: list[dict],
+) -> tuple[Optional[dict], float, bool]:
+    """
+    Find the best-matching open binary Manifold market.
+
+    Returns (match, score, needs_inversion):
+      match           — best result dict, or None if below threshold
+      score           — keyword overlap score of best candidate (even if rejected)
+      needs_inversion — True when Manifold market favours the OPPOSITE party/side
+                        to the Polymarket question (probability should be 1 - prob)
+    """
     poly_words = _significant_words(poly_question)
+    poly_party = _detect_party(poly_question)
     if not poly_words:
-        return None
+        return None, 0.0, False

     best_score = 0.0
     best: Optional[dict] = None
+    best_needs_inv = False

     for result in results:
         if result.get("outcomeType") != "BINARY":
@@ -76,10 +106,18 @@ def _best_match(poly_question: str, results: list[dict]) -> Optional[dict]:
         if score > best_score:
             best_score = score
             best = result
+            manifold_party = _detect_party(title)
+            # Inversion is warranted only when both sides are unambiguously detected
+            # and they are confirmed opposites (republican ≠ democrat).
+            best_needs_inv = (
+                poly_party is not None
+                and manifold_party is not None
+                and poly_party != manifold_party
+            )

     if best_score >= _MATCH_THRESHOLD and best is not None:
-        return best
-    return None
+        return best, best_score, best_needs_inv
+    return None, best_score, False


 class ManifoldClient:
@@ -94,8 +132,10 @@ class ManifoldClient:
         """
         Return Manifold probability for a matching market, or None.

-        Searches by keyword overlap. Returns None if no match exceeds
-        _MATCH_THRESHOLD or on any API error (caller degrades gracefully).
+        Probability is already adjusted for party-direction inversion when
+        the matched Manifold market is the complement of our question.
+
+        Full audit log is emitted at INFO for every resolved query.
         """
         now = time.monotonic()
         cached = self._cache.get(question)
@@ -114,22 +154,45 @@ class ManifoldClient:
             )
             resp.raise_for_status()
             results = resp.json()
-            match = _best_match(question, results)
-            prob = float(match["probability"]) if match else None
-            self._cache[question] = (now, prob)
-
-            if prob is not None:
-                log.info(
-                    "Manifold match: %-50s%.3f | %s",
-                    question[:50], prob, match.get("question", "")[:60],
-                )
-            else:
-                log.debug("Manifold no match for: %s (query=%r)", question[:50], query)
-            return prob
         except Exception as e:
             log.warning("Manifold API error for %r: %s", question[:40], e)
             self._cache[question] = (now, None)
             return None
+
+        match, score, needs_inv = _best_match_with_audit(question, results)
+
+        if match is None:
+            log.info(
+                "Manifold no_match: %-50s | best_score=%.2f < %.2f | query=%r",
+                question[:50], score, _MATCH_THRESHOLD, query,
+            )
+            self._cache[question] = (now, None)
+            return None
+
+        prob_raw = float(match["probability"])
+        prob_final = (1.0 - prob_raw) if needs_inv else prob_raw
+
+        # Build market URL from slug (best-effort; may be missing)
+        slug = match.get("slug", "")
+        creator = match.get("creatorUsername", "")
+        url = f"https://manifold.markets/{creator}/{slug}" if slug else "n/a"
+
+        log.info(
+            "Manifold %s: %-50s\n"
+            "  poly_question:  %s\n"
+            "  manifold_title: %s\n"
+            "  manifold_url:   %s\n"
+            "  match_score: %.2f | prob_raw=%.3f | inverted=%s | prob_final=%.3f",
+            "MATCH_INVERTED" if needs_inv else "MATCH",
+            question[:50],
+            question,
+            match.get("question", ""),
+            url,
+            score, prob_raw, needs_inv, prob_final,
+        )
+
+        self._cache[question] = (now, prob_final)
+        return prob_final

     async def close(self) -> None:
         await self._client.aclose()
+32 -5
@@ -132,20 +132,47 @@ def market_family_key(market: "Market") -> str:
             return f"fed-{month_m.group(1).lower()}-{year}"
         return f"fed-{year}"

-    # 2. US state + party (primary, senate, governor, etc.)
+    # 2. US state + election event
+    # Key design: general elections group by office, not by party, so
+    # "Republicans win Ohio governor" and "Democrats win Ohio governor"
+    # share the same family (ohio-gubernatorial-2026) and the bot can only
+    # hold one position. Primaries keep the party because each party runs
+    # its own primary (texas-republican-2026 is distinct from texas-democrat-2026).
     state_m = _US_STATE_RE.search(q)
     party_m = _PARTY_RE.search(q)
-    if state_m and party_m:
+    etype_m = _ELECTION_TYPE_RE.search(q)
+    if state_m and (party_m or etype_m):
         state = re.sub(r"\s+", "-", state_m.group(1).lower())
-        raw_party = party_m.group(1).lower()
-        # "democrat" prefix covers "democrat", "democrats", "democratic"
+        is_primary = etype_m is not None and "primary" in etype_m.group(1).lower()
+
+        if party_m and is_primary:
+            # Primary race: party is the disambiguation (each party has its own primary)
+            raw_party = party_m.group(1).lower()
+            party = "democrat" if "democrat" in raw_party else "republican"
+            return f"{state}-{party}-{year}"
+
+        if etype_m:
+            # General election: family = office, not party
+            # "Republicans win Ohio governor" == "Democrats win Ohio governor" → same race
+            raw_etype = etype_m.group(1).lower()
+            etype = {
+                "president": "presidential",
+                "mayor": "mayoral",
+                "governor": "gubernatorial",
+            }.get(raw_etype, raw_etype)
+            return f"{state}-{etype}-{year}"
+
+        # Has party but no election type — preserve old behaviour (e.g. "Texas Republican")
+        raw_party = party_m.group(1).lower()  # type: ignore[union-attr]
         party = "democrat" if "democrat" in raw_party else "republican"
         return f"{state}-{party}-{year}"

     # 3. Named non-US city / country
     for place_re, place_slug in _NAMED_PLACES:
         if place_re.search(q):
-            etype_m = _ELECTION_TYPE_RE.search(q)
+            if etype_m is None:
+                etype_m = _ELECTION_TYPE_RE.search(q)
             if etype_m:
                 raw_etype = etype_m.group(1).lower()
                 # Normalise synonyms
+24
@@ -202,6 +202,30 @@ async def main() -> None:
     if PAPER_MODE:
         await executor.initialize()

+    # Contradiction scan: warn if any two open positions share a family_key.
+    # This can happen when the family logic was less strict on a prior deploy.
+    # Bot does NOT auto-close — operator decides which position to keep.
+    positions = await db.get_open_position_details()
+    family_map: dict[str, list[dict]] = {}
+    for pos in positions:
+        fk = pos.get("family_key") or ""
+        if fk:
+            family_map.setdefault(fk, []).append(pos)
+    for fk, members in family_map.items():
+        if len(members) > 1:
+            best = max(members, key=lambda p: p.get("edge_net") or 0.0)
+            log.warning(
+                "CONTRADICTION family=%s has %d open positions — recommend keeping market_id=%s (edge_net=%.3f):",
+                fk, len(members), best["market_id"], best.get("edge_net") or 0.0,
+            )
+            for m in members:
+                marker = "KEEP" if m["market_id"] == best["market_id"] else "REVIEW"
+                log.warning(
+                    "  [%s] %s | dir=%s | edge_net=%.3f | %s",
+                    marker, m["market_id"], m["direction"],
+                    m.get("edge_net") or 0.0, m["question"][:60],
+                )
+
     try:
         await run_trading_loop(poly, external, strategy, risk, executor, metrics, db)
     finally: