fix: WAL mode for concurrent reads, skipped stats, anti-repetition prompts

database.py: enable PRAGMA journal_mode=WAL + synchronous=NORMAL so
  /status reads from concurrent connections see committed data without
  blocking behind the scraper's writes; add 'skipped' to get_session_stats
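
The WAL behaviour described above can be sketched with the standard-library sqlite3 module rather than aiosqlite. The file name, the two-column table, and the variable names are assumptions for illustration and are not taken from database.py. A writer holds an open transaction while a second connection reads; in WAL mode the reader does not wait and sees only the committed rows.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file and table, for illustration only.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts the connection in autocommit mode,
# so BEGIN/COMMIT below are issued explicitly.
writer = sqlite3.connect(db_path, isolation_level=None)
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("PRAGMA synchronous=NORMAL")
writer.execute("CREATE TABLE sources (id INTEGER PRIMARY KEY, status TEXT)")
writer.execute("INSERT INTO sources (status) VALUES ('scraped')")

# Open a write transaction and leave it uncommitted, as a long scrape might.
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO sources (status) VALUES ('pending')")

# A second connection reads without waiting and sees only committed rows.
reader = sqlite3.connect(db_path, timeout=0.1)
visible_during_write = reader.execute("SELECT COUNT(*) FROM sources").fetchone()[0]

writer.execute("COMMIT")
visible_after_commit = reader.execute("SELECT COUNT(*) FROM sources").fetchone()[0]

writer.close()
reader.close()
```

The journal mode is stored in the database file, so it only has to be set once per file; synchronous is a per-connection setting and has to be issued on each new connection.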

bot.py: show skipped count in fmt_progress and cmd_status; use 'or 0'
  to guard against NULL from SUM(); label active research in /status
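
The reason for the 'or 0' guard can be shown with a small sketch. It uses an in-memory sqlite3 database and a cut-down version of the stats query; the helper name skipped_count and the sample rows are hypothetical. SUM() over zero rows yields NULL, which arrives in Python as None, while COUNT(*) yields 0.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute("CREATE TABLE sources (session_id INTEGER, status TEXT)")
db.executemany(
    "INSERT INTO sources VALUES (?, ?)",
    [(1, "scraped"), (1, "skipped"), (1, "skipped")],
)

# Reduced form of the aggregate; only the columns needed for the demo.
STATS_SQL = """SELECT
    COUNT(*) AS total,
    SUM(CASE WHEN status='skipped' THEN 1 ELSE 0 END) AS skipped
    FROM sources WHERE session_id = ?"""


def skipped_count(session_id: int) -> int:
    """Return the skipped count, mapping SQL NULL (None) to 0."""
    row = db.execute(STATS_SQL, (session_id,)).fetchone()
    return row["skipped"] or 0


populated = skipped_count(1)
# Session 99 has no rows: SUM() is NULL there, COUNT(*) is 0.
empty_row = db.execute(STATS_SQL, (99,)).fetchone()
empty_raw = empty_row["skipped"]
empty_total = empty_row["total"]
empty_guarded = skipped_count(99)
```

Without the guard, formatting the count for a session that has no sources yet would print None instead of 0.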

processor.py: raise generate() temperature default to 0.7 + add
  repeat_penalty=1.15/repeat_last_n=128 to Ollama options to stop
  qwen2.5:3b from looping; scoring prompt keeps temperature=0.1
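
As a rough sketch of the sampling options described above, the request body for Ollama's /api/generate endpoint might be assembled as follows. The function name build_generate_payload and its signature are hypothetical and do not reflect the actual generate() in processor.py; the option keys temperature, repeat_penalty, and repeat_last_n are standard Ollama options. Whether the scoring call also carries the repeat settings is an assumption here.

```python
import json


def build_generate_payload(prompt: str,
                           model: str = "qwen2.5:3b",
                           temperature: float = 0.7) -> dict:
    """Build a JSON-serialisable body for a non-streaming generate request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            # Higher temperature gives more varied output.
            "temperature": temperature,
            # Penalise tokens seen in the last 128 tokens to discourage loops.
            "repeat_penalty": 1.15,
            "repeat_last_n": 128,
        },
    }


# Content generation uses the default; scoring overrides it for stable output.
content_payload = build_generate_payload("Summarise the findings.")
scoring_payload = build_generate_payload("Rate relevance from 0 to 10.", temperature=0.1)
body = json.dumps(content_payload)
```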

generator.py: rewrite all prompts with explicit "NEVER repeat"
  constraints and distinct-content rules per section; podcast prompt
  now asks for spoken-word style (no formal headers); reduce thread
  to 12-18 tweets (was 15-25) to fit model context; pass temperature=0.7

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ChemaVX
2026-04-28 10:15:30 +00:00
parent f7d62345b8
commit c4fb33fbf5
4 changed files with 115 additions and 73 deletions
+7 -4
@@ -91,6 +91,8 @@ async def get_db() -> aiosqlite.Connection:
     Path(settings.db_path).parent.mkdir(parents=True, exist_ok=True)
     db = await aiosqlite.connect(settings.db_path)
     db.row_factory = aiosqlite.Row
+    await db.execute("PRAGMA journal_mode=WAL")
+    await db.execute("PRAGMA synchronous=NORMAL")
     await db.executescript(SCHEMA)
     await db.commit()
     return db
@@ -140,11 +142,12 @@ class ResearchDB:
     async def get_session_stats(self, session_id: int) -> dict:
         cursor = await self.db.execute(
-            """SELECT
+            """SELECT
                 COUNT(*) as total,
-                SUM(CASE WHEN status='scraped' THEN 1 ELSE 0 END) as scraped,
-                SUM(CASE WHEN status='failed' THEN 1 ELSE 0 END) as failed,
-                SUM(CASE WHEN status='pending' THEN 1 ELSE 0 END) as pending
+                SUM(CASE WHEN status='scraped' THEN 1 ELSE 0 END) as scraped,
+                SUM(CASE WHEN status='failed' THEN 1 ELSE 0 END) as failed,
+                SUM(CASE WHEN status='pending' THEN 1 ELSE 0 END) as pending,
+                SUM(CASE WHEN status='skipped' THEN 1 ELSE 0 END) as skipped
                FROM sources WHERE session_id = ?""",
             (session_id,)
         )