researchowl

9 Commits

Author	SHA1	Message	Date
ChemaVX and Claude Sonnet 4.6	54b3841d32	feat: generate all outputs in Spanish. Add "Escribe SIEMPRE en español" ("ALWAYS write in Spanish") at the start of all system prompts (podcast, blog, report, thread) so Ollama generates content in Spanish. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 08:40:38 +00:00
ChemaVX and Claude Sonnet 4.6	d0e55ddb50	feat: Claude Haiku for relevance scoring, fallback to Ollama. [CI: Build & Deploy ResearchOwl / build-and-push (push), successful in 45s] processor.py: split _score_quality into _score_with_claude and _score_with_ollama; if ANTHROPIC_API_KEY is set, use Claude Haiku (claude-haiku-4-5) with max_tokens=10 for fast, accurate 0-10 relevance scoring; fall back to Ollama on any error. requirements.txt: add anthropic>=0.40.0. k8s: add ANTHROPIC_API_KEY to researchowl-secrets and mount it in the deployment; restore QUALITY_THRESHOLD to 0.4 (Claude scoring is accurate enough to use the threshold). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 08:04:12 +00:00
ChemaVX and Claude Sonnet 4.6	5feff6073e	fix: send new message if edit_text fails silently in /process. [CI: Build & Deploy ResearchOwl / build-and-push (push), successful in 7s] If the bot restarted between sending the progress message and the completion callback, edit_text may fail silently (Conflict/stale ref). Store the completion text and send it with reply_text as a fallback so the user always sees the result. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 10:53:59 +00:00
ChemaVX and Claude Sonnet 4.6	c4fb33fbf5	fix: WAL mode for concurrent reads, skipped stats, anti-repetition prompts. [CI: Build & Deploy ResearchOwl / build-and-push (push), successful in 5s] database.py: enable PRAGMA journal_mode=WAL + synchronous=NORMAL so /status reads from concurrent connections see committed data without blocking behind the scraper's writes; add 'skipped' to get_session_stats. bot.py: show skipped count in fmt_progress and cmd_status; use 'or 0' to guard against NULL from SUM(); label active research in /status. processor.py: raise generate() temperature default to 0.7 and add repeat_penalty=1.15/repeat_last_n=128 to Ollama options to stop qwen2.5:3b from looping; scoring prompt keeps temperature=0.1. generator.py: rewrite all prompts with explicit "NEVER repeat" constraints and distinct-content rules per section; podcast prompt now asks for spoken-word style (no formal headers); reduce thread to 12-18 tweets (was 15-25) to fit model context; pass temperature=0.7. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 10:15:30 +00:00
ChemaVX and Claude Sonnet 4.6	f7d62345b8	fix: relevance scoring per topic + URL keyword filter for child pages. [CI: Build & Deploy ResearchOwl / build-and-push (push), successful in 6s] processor.py: simplify the _score_quality prompt to a single axis ("how relevant is this text to topic X?") instead of averaging relevance + density + credibility, which let off-topic but well-written content pass through. exhaustive.py: pre-compute topic keywords (stopword-filtered) at scraper init; filter child URLs (discovered during crawl, depth>0) to only add ones whose URL path or title contains a topic keyword; seed URLs (depth=0, from DDG/Wikipedia/Reddit) are always included since those searches are already topic-scoped. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-27 20:52:43 +00:00
ChemaVX and Claude Sonnet 4.6	0c7176dd0b	fix: add /process command, log quality filtering, improve Reddit headers. [CI: Build & Deploy ResearchOwl / build-and-push (push), successful in 5s] bot.py: add cmd_process handler to manually trigger chunk processing on the last session; register CommandHandler("process"). processor.py: log exceptions from asyncio.gather instead of silently dropping them; add per-chunk quality score debug logging; warn when all chunks are filtered by the quality threshold, with an actionable hint; raise fallback score to 0.6 so Ollama failures don't filter chunks. exhaustive.py: replace bot User-Agent with full browser UA + headers for REDDIT_HEADERS; downgrade Reddit 403 from warning to info since server IPs are routinely blocked; use content_type=None on json() to avoid aiohttp content-type mismatch errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-27 20:37:39 +00:00
ChemaVX	bb8171359d	fix: scraper - DDG per-query instances, Wikipedia bilingual seed, Reddit throttling. [CI: Build & Deploy ResearchOwl / build-and-push (push), successful in 6s]	2026-04-27 20:22:16 +00:00
ChemaVX	6a88b7ab10	ci: rewrite workflow with internal registry + BuildKit (polymarket-bot pattern). [CI: Build & Deploy ResearchOwl / build-and-push (push), successful in 1m4s]	2026-04-27 14:00:05 +00:00
ChemaVX	ba08536337	feat: initial ResearchOwl. [CI: Build & Deploy ResearchOwl / build (push), failing after 1m38s]	2026-04-27 13:49:07 +00:00
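
Commit c4fb33fbf5 describes enabling WAL journaling so that /status reads are not blocked by the scraper's writes, and guarding SUM() results with 'or 0'. The following is a minimal sketch of that idea using Python's standard sqlite3 module, not the repository's actual database.py. The open_db and demo helpers, the chunks table, and its skipped column are hypothetical names chosen for illustration.

```python
import os
import sqlite3
import tempfile


def open_db(path: str) -> sqlite3.Connection:
    """Open a connection with WAL journaling and synchronous=NORMAL.

    Hypothetical helper; the real database.py may be structured differently.
    """
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"WAL not enabled, journal mode is {mode!r}")
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn


def demo() -> int:
    """Read committed data from one connection while another holds an open write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # WAL needs a file-backed database
        writer = open_db(path)
        reader = open_db(path)

        # Hypothetical schema: one row per chunk, flagged when skipped.
        writer.execute(
            "CREATE TABLE chunks (id INTEGER PRIMARY KEY, skipped INTEGER)"
        )
        writer.execute("INSERT INTO chunks (skipped) VALUES (1)")
        writer.commit()

        # Start a second write and leave it uncommitted.
        writer.execute("INSERT INTO chunks (skipped) VALUES (1)")

        # The reader sees only the committed row and does not wait for the writer.
        row = reader.execute("SELECT SUM(skipped) FROM chunks").fetchone()
        total = row[0] or 0  # SUM() yields NULL (None) on an empty table

        writer.rollback()
        reader.close()
        writer.close()
        return total
```

Calling demo() returns the skipped count visible to the reader while the writer's second insert is still pending. The journal mode is stored in the database file, so it persists across connections, whereas synchronous is a per-connection setting and has to be applied each time a connection is opened.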