16 Commits

Author SHA1 Message Date
ChemaVX b5518ac95a feat: scheduler /watch — watched_topics + scheduler loop + /watch /unwatch /watches
Build & Deploy ResearchOwl / build-and-push (push) Successful in 5s
2026-05-04 07:48:05 +00:00
ChemaVX b33ae202b8 feat: per-call Claude cost tracking, api_usage table + /costs
Build & Deploy ResearchOwl / build-and-push (push) Successful in 6s
2026-05-03 20:06:06 +00:00
ChemaVX 65917518ce ci: retrigger build for a681627
Build & Deploy ResearchOwl / build-and-push (push) Successful in 5s
2026-05-03 17:14:50 +00:00
ChemaVX a681627d2e feat: TTL purge — purge_old_sessions + /purge command + startup hook
Build & Deploy ResearchOwl / build-and-push (push) Successful in 5s
2026-05-03 16:56:37 +00:00
ChemaVX 7704f071d6 feat: retry+backoff in scraper, ProgressReporter in bot
Build & Deploy ResearchOwl / build-and-push (push) Successful in 6s
2026-05-03 16:40:37 +00:00
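
Commit 7704f071d6 has no body, so the retry+backoff behaviour it names is not described here. As a rough illustration only, a generic retry loop with exponential backoff might look like the sketch below. The helper name `retry_with_backoff`, its parameters, and the default delays are all assumptions and are not taken from the scraper.

```python
import asyncio
import random


async def retry_with_backoff(op, *, attempts=4, base_delay=0.5, max_delay=8.0,
                             sleep=asyncio.sleep):
    """Call the async callable `op` until it succeeds or attempts run out.

    Hypothetical helper: name, parameters, and defaults are illustrative.
    """
    for attempt in range(attempts):
        try:
            return await op()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff (base, 2x, 4x, ...) capped at max_delay,
            # plus up to 10% random jitter to spread out retries.
            delay = min(max_delay, base_delay * 2 ** attempt)
            await sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable only so the loop can be exercised without real waiting; production code would normally leave it at `asyncio.sleep`.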
ChemaVX e66d728d68 fix: wrap YouTubeTranscriptApi in run_in_executor with 30s timeout
Build & Deploy ResearchOwl / build-and-push (push) Successful in 5s
The synchronous get_transcript() call was blocking the asyncio event
loop indefinitely, freezing the entire bot (including Telegram polling).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 12:59:40 +00:00
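
The fix in e66d728d68 moves a blocking call off the event loop and bounds the wait. A minimal sketch of that pattern, using only asyncio, is shown below. `fetch_transcript_blocking` is a hypothetical stand-in for the synchronous transcript call, and returning `None` on timeout is an assumption about how a caller might handle it.

```python
import asyncio
import time


def fetch_transcript_blocking(video_id):
    """Hypothetical stand-in for a synchronous call such as get_transcript()."""
    time.sleep(0.05)  # simulate blocking network I/O
    return [{"text": f"transcript for {video_id}"}]


async def fetch_transcript(video_id, timeout=30.0,
                           blocking_fn=fetch_transcript_blocking):
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so the event loop
    # stays free for other tasks (for example, Telegram polling).
    future = loop.run_in_executor(None, blocking_fn, video_id)
    try:
        return await asyncio.wait_for(future, timeout=timeout)
    except asyncio.TimeoutError:
        # The worker thread cannot be interrupted and may keep running,
        # but the caller is no longer stuck waiting on it.
        return None
```

Note that the timeout frees the awaiting coroutine, not the thread: a call that truly never returns still occupies one executor worker.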
ChemaVX 65b1739943 feat: Claude Haiku for content generation, Ollama fallback
Build & Deploy ResearchOwl / build-and-push (push) Successful in 6s
Use Claude Haiku (via ANTHROPIC_API_KEY) for all output generation.
Falls back to Ollama qwen2.5:3b if no API key is set.
Also translates all user-turn prompts to Spanish for consistency.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 09:06:06 +00:00
ChemaVX 54b3841d32 feat: generate all outputs in Spanish
Add "Escribe SIEMPRE en español" at the start of all system prompts
(podcast, blog, report, thread) so Ollama generates content in Spanish.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 08:40:38 +00:00
ChemaVX d0e55ddb50 feat: Claude Haiku for relevance scoring, fallback to Ollama
Build & Deploy ResearchOwl / build-and-push (push) Successful in 45s
processor.py: split _score_quality into _score_with_claude and
  _score_with_ollama; if ANTHROPIC_API_KEY is set, use Claude Haiku
  (claude-haiku-4-5) with max_tokens=10 for fast, accurate 0-10
  relevance scoring; falls back to Ollama on any error

requirements.txt: add anthropic>=0.40.0

k8s: ANTHROPIC_API_KEY added to researchowl-secrets and mounted in
  deployment; QUALITY_THRESHOLD restored to 0.4 (Claude scoring
  is accurate enough to use the threshold)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 08:04:12 +00:00
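
The primary-then-fallback scoring described in d0e55ddb50 can be sketched without calling either model. In the sketch below, `primary` and `fallback` are hypothetical callables standing in for `_score_with_claude` and `_score_with_ollama`. Mapping the 0-10 reply onto 0.0-1.0 is an assumption, made because the commit compares scores against a threshold of 0.4.

```python
import re


def parse_score(reply):
    """Extract the first number from a model reply and map 0-10 onto 0.0-1.0."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no score in reply: {reply!r}")
    return min(float(match.group()), 10.0) / 10.0


def score_relevance(text, topic, primary, fallback):
    """Try `primary` first; on any error (including an unparsable reply) use `fallback`.

    Both arguments are hypothetical callables (text, topic) -> str.
    """
    try:
        return parse_score(primary(text, topic))
    except Exception:
        return parse_score(fallback(text, topic))
```

Catching a broad `Exception` mirrors the commit's "falls back to Ollama on any error"; a stricter version might catch only network and parsing errors.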
ChemaVX 5feff6073e fix: send new message if edit_text fails silently in /process
Build & Deploy ResearchOwl / build-and-push (push) Successful in 7s
If the bot restarted between sending the progress message and the
completion callback, edit_text may fail silently (Conflict/stale ref).
Store completion text and reply_text as fallback so the user always
sees the result.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-28 10:53:59 +00:00
ChemaVX c4fb33fbf5 fix: WAL mode for concurrent reads, skipped stats, anti-repetition prompts
Build & Deploy ResearchOwl / build-and-push (push) Successful in 5s
database.py: enable PRAGMA journal_mode=WAL + synchronous=NORMAL so
  /status reads from concurrent connections see committed data without
  blocking behind the scraper's writes; add 'skipped' to get_session_stats

bot.py: show skipped count in fmt_progress and cmd_status; use 'or 0'
  to guard against NULL from SUM(); label active research in /status

processor.py: raise generate() temperature default to 0.7 + add
  repeat_penalty=1.15/repeat_last_n=128 to Ollama options to stop
  qwen2.5:3b from looping; scoring prompt keeps temperature=0.1

generator.py: rewrite all prompts with explicit "NEVER repeat"
  constraints and distinct-content rules per section; podcast prompt
  now asks for spoken-word style (no formal headers); reduce thread
  to 12-18 tweets (was 15-25) to fit model context; pass temperature=0.7

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-28 10:15:30 +00:00
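
The database.py and bot.py changes in c4fb33fbf5 can be illustrated with the standard sqlite3 module. The sketch below enables WAL, shows a second connection reading committed data while a write is still uncommitted, and applies the `or 0` guard to `SUM()`. The `chunks` table and its columns are a hypothetical schema, not the project's.

```python
import os
import sqlite3
import tempfile

# WAL requires a file-backed database; in-memory databases do not support it.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn = sqlite3.connect(db_path)
# journal_mode is stored in the database file; the PRAGMA returns the mode in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
# synchronous is a per-connection setting.
conn.execute("PRAGMA synchronous=NORMAL")

# Hypothetical schema for illustration.
conn.execute("CREATE TABLE chunks (session_id INTEGER, skipped INTEGER)")
conn.commit()

# SUM() over zero rows yields NULL (None in Python), hence the `or 0` guard.
row = conn.execute(
    "SELECT SUM(skipped) FROM chunks WHERE session_id = ?", (1,)
).fetchone()
skipped = row[0] or 0

conn.execute("INSERT INTO chunks VALUES (1, 1)")
conn.commit()
conn.execute("INSERT INTO chunks VALUES (1, 1)")  # left uncommitted on purpose

# A second connection reads the last committed snapshot without waiting
# for the writer's open transaction.
reader = sqlite3.connect(db_path)
visible = reader.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]

reader.close()
conn.close()
```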
ChemaVX f7d62345b8 fix: relevance scoring per topic + URL keyword filter for child pages
Build & Deploy ResearchOwl / build-and-push (push) Successful in 6s
processor.py: simplify _score_quality prompt to single axis —
  "how relevant is this text to topic X?" — instead of averaging
  relevance + density + credibility, which let off-topic but
  well-written content pass through

exhaustive.py: pre-compute topic keywords (stopword-filtered) at
  scraper init; filter child URLs (discovered during crawl, depth>0)
  to only add ones whose URL path or title contains a topic keyword;
  seed URLs (depth=0, from DDG/Wikipedia/Reddit) are always included
  since those searches are already topic-scoped

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 20:52:43 +00:00
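
The child-URL filter described in f7d62345b8 can be sketched as follows. The stopword list, the minimum token length, and the function names `topic_keywords` and `keep_url` are assumptions made for illustration; only the rule itself (seed URLs always pass, child URLs need a topic keyword in the path or title) comes from the commit message.

```python
import re
from urllib.parse import urlparse

# Tiny illustrative stopword list; a real one would be longer.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "on", "for", "to",
             "de", "la", "el", "en", "y"}


def topic_keywords(topic):
    """Lowercase the topic, split into words, drop stopwords and very short tokens."""
    words = re.findall(r"\w+", topic.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}


def keep_url(url, title, depth, keywords):
    """Seed URLs (depth 0) always pass; child URLs need a keyword in path or title."""
    if depth == 0:
        return True
    haystack = f"{urlparse(url).path} {title}".lower()
    return any(k in haystack for k in keywords)


keywords = topic_keywords("History of the Roman Empire")
print(keep_url("https://example.com/roman-roads", "", 1, keywords))
print(keep_url("https://example.com/cookies", "Privacy policy", 1, keywords))
```

Substring matching is deliberately loose here, so a keyword such as "roman" would also match "romance"; whole-word matching would be stricter at the cost of missing hyphenated or concatenated URL slugs.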
ChemaVX 0c7176dd0b fix: add /process command, log quality filtering, improve Reddit headers
Build & Deploy ResearchOwl / build-and-push (push) Successful in 5s
- bot.py: add cmd_process handler to manually trigger chunk processing
  on the last session; register CommandHandler("process")
- processor.py: log exceptions from asyncio.gather instead of silently
  dropping them; add per-chunk quality score debug logging; warn when
  all chunks filtered by quality threshold with actionable hint;
  raise fallback score to 0.6 so Ollama failures don't filter chunks
- exhaustive.py: replace bot User-Agent with full browser UA + headers
  for REDDIT_HEADERS; downgrade Reddit 403 from warning to info since
  server IPs are routinely blocked; use content_type=None on json()
  to avoid aiohttp content-type mismatch errors

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 20:37:39 +00:00
ChemaVX bb8171359d fix: scraper - DDG per-query instances, Wikipedia bilingual seed, Reddit throttling
Build & Deploy ResearchOwl / build-and-push (push) Successful in 6s
2026-04-27 20:22:16 +00:00
ChemaVX 6a88b7ab10 ci: rewrite workflow with internal registry + BuildKit (polymarket-bot pattern)
Build & Deploy ResearchOwl / build-and-push (push) Successful in 1m4s
2026-04-27 14:00:05 +00:00
ChemaVX ba08536337 feat: initial ResearchOwl
Build & Deploy ResearchOwl / build (push) Failing after 1m38s
2026-04-27 13:49:07 +00:00