Commit Graph

5 Commits

Author SHA1 Message Date
ChemaVX f7d62345b8 fix: relevance scoring per topic + URL keyword filter for child pages
Build & Deploy ResearchOwl / build-and-push (push) Successful in 6s
processor.py: simplify the _score_quality prompt to a single axis,
  "how relevant is this text to topic X?", instead of averaging
  relevance + density + credibility, which let off-topic but
  well-written content pass through
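
A minimal sketch of why averaging let off-topic text through, using made-up numbers. The per-axis scores, the 0.5 cutoff, and both function names are assumptions for illustration; none of them come from processor.py.

```python
# Hypothetical per-axis scores for a well-written but off-topic chunk.
scores = {"relevance": 0.2, "density": 0.9, "credibility": 0.9}
THRESHOLD = 0.5  # assumed cutoff, not taken from the repository


def averaged_score(axes: dict[str, float]) -> float:
    """Old behaviour as described in the message: mean of the three axes."""
    return sum(axes.values()) / len(axes)


def relevance_only_score(axes: dict[str, float]) -> float:
    """New behaviour as described in the message: relevance is the only axis."""
    return axes["relevance"]


# High density and credibility pull the mean above the cutoff,
# while the relevance-only score stays below it.
passes_when_averaged = averaged_score(scores) >= THRESHOLD
passes_on_relevance = relevance_only_score(scores) >= THRESHOLD
```

With these numbers the averaged score clears the cutoff while the relevance-only score does not, which is the failure mode the commit describes.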

exhaustive.py: pre-compute topic keywords (stopword-filtered) at
  scraper init; filter child URLs (discovered during crawl, depth>0)
  to only add ones whose URL path or title contains a topic keyword;
  seed URLs (depth=0, from DDG/Wikipedia/Reddit) are always included
  since those searches are already topic-scoped
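
The child-URL filter described above could look roughly like the sketch below. The stopword list, the function names, and the substring-matching rule are all assumptions; the commit message does not show how exhaustive.py actually tokenizes topics or matches keywords.

```python
import re
from urllib.parse import urlparse

# Tiny illustrative stopword list; the real list is not shown in the commit.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with"}


def topic_keywords(topic: str) -> set[str]:
    """Lowercase the topic, split on non-alphanumerics, drop stopwords."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    return {w for w in words if w not in STOPWORDS}


def should_crawl(url: str, title: str, depth: int, keywords: set[str]) -> bool:
    """Seed URLs (depth 0) always pass; child URLs need a keyword match."""
    if depth == 0:
        return True
    # Only the URL path and the link title are searched, per the message.
    haystack = f"{urlparse(url).path} {title}".lower()
    return any(kw in haystack for kw in keywords)


# Computed once up front, mirroring "pre-compute ... at scraper init".
keywords = topic_keywords("History of the Roman Empire")
```

Plain substring matching is the simplest choice here, but it also matches partial words (for example "roman" inside "romance"), so a stricter variant might compare whole tokens instead.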

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 20:52:43 +00:00
ChemaVX 0c7176dd0b fix: add /process command, log quality filtering, improve Reddit headers
Build & Deploy ResearchOwl / build-and-push (push) Successful in 5s
- bot.py: add cmd_process handler to manually trigger chunk processing
  on the last session; register CommandHandler("process")
- processor.py: log exceptions from asyncio.gather instead of silently
  dropping them; add per-chunk quality score debug logging; warn, with
  an actionable hint, when the quality threshold filters out every chunk;
  raise fallback score to 0.6 so Ollama failures don't filter chunks
- exhaustive.py: replace bot User-Agent with full browser UA + headers
  for REDDIT_HEADERS; downgrade Reddit 403 from warning to info since
  server IPs are routinely blocked; use content_type=None on json()
  to avoid aiohttp content-type mismatch errors
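
The processor.py change can be sketched as follows, assuming the scores are gathered with `return_exceptions=True` and that a failed call falls back to 0.6. The `score_chunk` and `score_all` functions and the logger name are hypothetical stand-ins; only the 0.6 fallback value comes from the message.

```python
import asyncio
import logging

logger = logging.getLogger("processor_sketch")

FALLBACK_SCORE = 0.6  # value named in the commit message


async def score_chunk(chunk: str) -> float:
    """Hypothetical scorer; fails on empty input to simulate a model error."""
    if not chunk:
        raise RuntimeError("model call failed")
    return 0.9


async def score_all(chunks: list[str]) -> list[float]:
    # return_exceptions=True hands failures back as values instead of
    # raising, so each one can be logged rather than silently dropped.
    results = await asyncio.gather(
        *(score_chunk(c) for c in chunks), return_exceptions=True
    )
    scores = []
    for index, result in enumerate(results):
        if isinstance(result, Exception):
            logger.warning("scoring chunk %d failed: %r", index, result)
            scores.append(FALLBACK_SCORE)
        else:
            logger.debug("chunk %d quality score: %.2f", index, result)
            scores.append(result)
    return scores


scores = asyncio.run(score_all(["some text", ""]))
```

Because the fallback is applied per chunk, one failed model call leaves the other chunks' scores untouched.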

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 20:37:39 +00:00
ChemaVX bb8171359d fix: scraper - DDG per-query instances, Wikipedia bilingual seed, Reddit throttling
Build & Deploy ResearchOwl / build-and-push (push) Successful in 6s
2026-04-27 20:22:16 +00:00
ChemaVX 6a88b7ab10 ci: rewrite workflow with internal registry + BuildKit (polymarket-bot pattern)
Build & Deploy ResearchOwl / build-and-push (push) Successful in 1m4s
2026-04-27 14:00:05 +00:00
ChemaVX ba08536337 feat: initial ResearchOwl
Build & Deploy ResearchOwl / build (push) Failing after 1m38s
2026-04-27 13:49:07 +00:00