The "html" field in Ghost Admin API v5 (Lexical editor) is read-only.
Content must be sent via mobiledoc with an HTML card, which Ghost
accepts in all v5 versions and renders without conversion. Added
diagnostic logs and validation for empty HTML.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
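
For reference, the mobiledoc-with-HTML-card approach described in the commit above could look roughly like the sketch below. The helper name html_to_mobiledoc and the example post title are hypothetical; the document layout (version 0.3.1, one "html" card, one card section) follows the mobiledoc format, and the Admin API is assumed to take the document as a JSON string.

```python
import json


def html_to_mobiledoc(html: str) -> str:
    """Wrap raw HTML in a single mobiledoc HTML card and return it as JSON text.

    Hypothetical helper; it also rejects empty HTML before anything is sent.
    """
    if not html or not html.strip():
        raise ValueError("refusing to publish empty HTML")
    doc = {
        "version": "0.3.1",
        "atoms": [],
        "cards": [["html", {"html": html}]],  # card 0 carries the HTML verbatim
        "markups": [],
        "sections": [[10, 0]],  # section type 10 = card section, pointing at card 0
    }
    return json.dumps(doc)


# Example request body for creating a post (title is illustrative).
post_payload = {
    "posts": [{"title": "Example", "mobiledoc": html_to_mobiledoc("<p>Hello</p>")}]
}
```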
The synchronous get_transcript() call was blocking the asyncio event
loop indefinitely, freezing the entire bot (including Telegram polling).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
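
One common way to keep a synchronous call from blocking the event loop, which may or may not be how this commit does it, is to run it in a worker thread with a timeout. In the sketch below get_transcript is a stand-in that only sleeps, and fetch_transcript, its video_id parameter and the 30 second default are assumptions.

```python
import asyncio
import time


def get_transcript(video_id: str) -> str:
    """Stand-in for the blocking call; sleeps to simulate slow network I/O."""
    time.sleep(0.2)
    return f"transcript for {video_id}"


async def fetch_transcript(video_id: str, timeout: float = 30.0) -> str:
    # Run the blocking function in a worker thread so the event loop stays
    # responsive, and stop waiting for it after `timeout` seconds.
    return await asyncio.wait_for(asyncio.to_thread(get_transcript, video_id), timeout)


async def demo() -> tuple[str, int]:
    """Show that other coroutines keep running while the transcript is fetched."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.02)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    text = await fetch_transcript("abc")
    hb.cancel()
    return text, ticks
```

Note that the timeout only stops the await; the worker thread itself keeps running until the blocking call returns.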
Use Claude Haiku (via ANTHROPIC_API_KEY) for all output generation.
Falls back to Ollama qwen2.5:3b if no API key is set.
Also translates all user-turn prompts to Spanish for consistency.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add "Escribe SIEMPRE en español" ("ALWAYS write in Spanish") at the
start of all system prompts (podcast, blog, report, thread) so Ollama
generates content in Spanish.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
processor.py: split _score_quality into _score_with_claude and
_score_with_ollama; if ANTHROPIC_API_KEY is set, use Claude Haiku
(claude-haiku-4-5) with max_tokens=10 for fast, accurate 0-10
relevance scoring; falls back to Ollama on any error
requirements.txt: add anthropic>=0.40.0
k8s: ANTHROPIC_API_KEY added to researchowl-secrets and mounted in
deployment; QUALITY_THRESHOLD restored to 0.4 (Claude scoring
is accurate enough to use the threshold)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
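
A minimal sketch of the Claude-first, Ollama-fallback dispatch described in the commit above. The two scorers are passed in as plain callables so the example runs without network access; parse_score, score_quality and the division by 10 (to compare a 0-10 reply against a 0-1 threshold) are assumptions, not the actual processor.py code.

```python
import os
import re
from typing import Callable

Scorer = Callable[[str, str], str]  # (text, topic) -> raw model reply


def parse_score(reply: str) -> float:
    """Pull the first number out of a model reply and map 0-10 onto 0.0-1.0."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no score in reply: {reply!r}")
    return min(max(float(match.group()) / 10.0, 0.0), 1.0)


def score_quality(text: str, topic: str,
                  score_with_claude: Scorer, score_with_ollama: Scorer) -> float:
    # Prefer Claude when an API key is configured.
    if os.environ.get("ANTHROPIC_API_KEY"):
        try:
            return parse_score(score_with_claude(text, topic))
        except Exception:
            pass  # any Claude-side error falls through to Ollama
    return parse_score(score_with_ollama(text, topic))
```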
If the bot restarted between sending the progress message and the
completion callback, edit_text may fail silently (Conflict/stale ref).
Store the completion text and fall back to reply_text so the user
always sees the result.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
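
The edit-then-reply fallback from the commit above might be sketched as follows, assuming the failure surfaces as an exception and that the progress message exposes async edit_text and reply_text methods (as python-telegram-bot's Message does). The function name deliver_result and its return values are hypothetical.

```python
import asyncio


async def deliver_result(progress_message, text: str) -> str:
    """Try to edit the progress message in place; otherwise send a new reply.

    Returns which path was taken so callers can log it.
    """
    try:
        await progress_message.edit_text(text)
        return "edited"
    except Exception:
        # Stale or conflicting message reference: send the stored text as a reply.
        await progress_message.reply_text(text)
        return "replied"
```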
database.py: enable PRAGMA journal_mode=WAL + synchronous=NORMAL so
/status reads from concurrent connections see committed data without
blocking behind the scraper's writes; add 'skipped' to get_session_stats
bot.py: show skipped count in fmt_progress and cmd_status; use 'or 0'
to guard against NULL from SUM(); label active research in /status
processor.py: raise generate() temperature default to 0.7 + add
repeat_penalty=1.15/repeat_last_n=128 to Ollama options to stop
qwen2.5:3b from looping; scoring prompt keeps temperature=0.1
generator.py: rewrite all prompts with explicit "NEVER repeat"
constraints and distinct-content rules per section; podcast prompt
now asks for spoken-word style (no formal headers); reduce thread
to 12-18 tweets (was 15-25) to fit model context; pass temperature=0.7
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
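
The database.py and bot.py changes in the commit above can be illustrated with the standard sqlite3 module. This is only a sketch: the pages table, its columns, and the function names connect and skipped_count are hypothetical, while the two PRAGMA statements and the "or 0" guard come from the commit message.

```python
import sqlite3


def connect(path: str) -> sqlite3.Connection:
    """Open the database with WAL journaling so readers do not block on writers."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")    # applies to file-backed databases
    conn.execute("PRAGMA synchronous=NORMAL")  # fewer fsyncs; a usual pairing with WAL
    return conn


def skipped_count(conn: sqlite3.Connection, session_id: int) -> int:
    """Sum the skipped flags for one session (hypothetical `pages` table)."""
    row = conn.execute(
        "SELECT SUM(skipped) FROM pages WHERE session_id = ?", (session_id,)
    ).fetchone()
    # SUM() over zero rows yields NULL (None in Python), hence the `or 0` guard.
    return row[0] or 0
```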
processor.py: simplify _score_quality prompt to single axis —
"how relevant is this text to topic X?" — instead of averaging
relevance + density + credibility, which let off-topic but
well-written content pass through
exhaustive.py: pre-compute topic keywords (stopword-filtered) at
scraper init; filter child URLs (discovered during crawl, depth>0)
to only add ones whose URL path or title contains a topic keyword;
seed URLs (depth=0, from DDG/Wikipedia/Reddit) are always included
since those searches are already topic-scoped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
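
The child-URL filter described in the commit above could be sketched like this. The stopword list, the minimum keyword length, and the names topic_keywords and should_enqueue are illustrative assumptions; only the rule itself (seeds at depth 0 always pass, deeper URLs need a keyword in the path or title) is taken from the commit message.

```python
import re
from urllib.parse import urlparse

# Tiny illustrative stopword list; the real one is presumably longer.
STOPWORDS = {"the", "of", "and", "in", "on", "for", "to", "a", "an", "with"}


def topic_keywords(topic: str) -> set[str]:
    """Lowercase the topic, split into words, and drop stopwords and short tokens."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}


def should_enqueue(url: str, title: str, depth: int, keywords: set[str]) -> bool:
    # Seed URLs come from topic-scoped searches, so they are always kept.
    if depth == 0:
        return True
    # Child URLs must mention a topic keyword in the URL path or the link title.
    haystack = f"{urlparse(url).path} {title}".lower()
    return any(keyword in haystack for keyword in keywords)
```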
- bot.py: add cmd_process handler to manually trigger chunk processing
on the last session; register CommandHandler("process")
- processor.py: log exceptions from asyncio.gather instead of silently
dropping them; add per-chunk quality score debug logging; warn when
all chunks filtered by quality threshold with actionable hint;
raise fallback score to 0.6 so Ollama failures don't filter chunks
- exhaustive.py: replace bot User-Agent with full browser UA + headers
for REDDIT_HEADERS; downgrade Reddit 403 from warning to info since
server IPs are routinely blocked; use content_type=None on json()
to avoid aiohttp content-type mismatch errors
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
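
The processor.py item in the commit above (logging exceptions from asyncio.gather rather than dropping them) might look roughly like the sketch below. Using the 0.6 fallback score for a failed chunk at this point is an assumption, as are the names score_all and score_chunk and the logger name.

```python
import asyncio
import logging

log = logging.getLogger("processor")
FALLBACK_SCORE = 0.6  # value from the commit message; where it applies is assumed


async def score_all(chunks: list[str], score_chunk) -> list[float]:
    """Score chunks concurrently; log failures and substitute the fallback score."""
    results = await asyncio.gather(
        *(score_chunk(chunk) for chunk in chunks), return_exceptions=True
    )
    scores = []
    for chunk, result in zip(chunks, results):
        if isinstance(result, BaseException):
            # With return_exceptions=True, failures arrive as values, not raises.
            log.error("scoring failed for chunk %r: %r", chunk[:40], result)
            scores.append(FALLBACK_SCORE)
        else:
            log.debug("chunk %r scored %.2f", chunk[:40], result)
            scores.append(result)
    return scores
```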