diff --git a/README.md b/README.md index c18d695..f9d05e3 100644 --- a/README.md +++ b/README.md @@ -23,87 +23,120 @@ Nova is a friendly, slightly witty Discord companion that chats naturally in DMs 1. Install dependencies: ```bash npm install - ``` -2. Copy the environment template: + ```markdown + # Discord AI Companion + + Nova is a friendly, slightly witty Discord companion that chats naturally in DMs or when mentioned in servers. It runs on Node.js, uses `discord.js` v14, and supports OpenRouter (recommended) or OpenAI backends for model access, plus lightweight local memory for persistent personality. + + ## Features + - Conversational replies in DMs automatically; replies in servers when mentioned or in a pinned channel. + - Chat model (defaults to `meta-llama/llama-3-8b-instruct` when using OpenRouter) for dialogue and a low-cost embedding model (`nvidia/llama-nemotron-embed-vl-1b-v2` by default). OpenAI keys/models may be used as a fallback. + - Short-term, long-term, and summarized memory layers with cosine-similarity retrieval. + - Automatic memory pruning, importance scoring, and transcript summarization when chats grow long. + - Local SQLite memory file (no extra infrastructure) powered by `sql.js`, plus graceful retries for the model API (OpenRouter/OpenAI). + - Optional "miss u" pings that DM your coder at randomized intervals (default 6–8 hours) when `CODER_USER_ID` is set. You can override the window with `CODER_PING_MIN_MS` and `CODER_PING_MAX_MS` (milliseconds). + - Proactive conversation continuation: Nova can continue a conversation for a quiet user by sending short follow-ups based on recent short-term memory. By default Nova will send a follow-up every 15s of user silence (configurable via `CONTINUATION_INTERVAL_MS`) and will stop after a configurable number of proactive messages (`CONTINUATION_MAX_PROACTIVE`). Users can halt continuation by saying a stop cue (e.g., "gotta go", "brb", "see ya"). 
+ - Dynamic per-message prompt directives that tune Nova's tone (empathetic, hype, roleplay, etc.) before every model call (OpenRouter/OpenAI). + - Lightweight Google scraping for fresh answers without paid APIs (locally cached). + - Guard rails that refuse "ignore previous instructions"-style jailbreak attempts plus a configurable search blacklist. + - The same blacklist applies to everyday conversation—if a user message contains a banned term, Nova declines the topic outright. + + ## Prerequisites + - Node.js 18+ (tested up through Node 25) + - Discord bot token with **Message Content Intent** enabled + - OpenRouter or OpenAI API key + + ## Setup + 1. Install dependencies: + ```bash + npm install + ``` + 2. Copy the environment template: + ```bash + cp .env.example .env + ``` + 3. Fill `.env` with your secrets: + - `DISCORD_TOKEN`: Discord bot token + - `USE_OPENROUTER`: Set to `true` to route requests through OpenRouter (recommended). + - `OPENROUTER_API_KEY`: OpenRouter API key (when `USE_OPENROUTER=true`). + - `OPENROUTER_MODEL`: Optional chat model override for OpenRouter (default `meta-llama/llama-3-8b-instruct`). + - `OPENROUTER_EMBED_MODEL`: Optional embed model override for OpenRouter (default `nvidia/llama-nemotron-embed-vl-1b-v2`). + - `OPENAI_API_KEY`: Optional OpenAI key (used as fallback when `USE_OPENROUTER` is not `true`).
+ - `BOT_CHANNEL_ID`: Optional guild channel ID where the bot can reply without mentions + - `CODER_USER_ID`: Optional Discord user ID to receive surprise DMs every 6–8 hours (configurable) + - `ENABLE_WEB_SEARCH`: Set to `false` to disable Google lookups (default `true`) + - `CONTINUATION_INTERVAL_MS`: (optional) ms between proactive follow-ups (default 15000) + - `CONTINUATION_MAX_PROACTIVE`: (optional) max number of proactive follow-ups (default 10) + - `CODER_PING_MIN_MS` / `CODER_PING_MAX_MS`: (optional) override min/max coder ping window in ms (defaults 6–8 hours) + + ## Running + - Development: `npm run dev` + - Production: `npm start` + + ### Optional PM2 Setup ```bash - cp .env.example .env + npm install -g pm2 + pm2 start npm --name nova-bot -- run start + pm2 save ``` -3. Fill `.env` with your secrets: - - `DISCORD_TOKEN`: Discord bot token - - `USE_OPENROUTER`: Set to `true` to route requests through OpenRouter (recommended). - - `OPENROUTER_API_KEY`: OpenRouter API key (when `USE_OPENROUTER=true`). - - `OPENROUTER_MODEL`: Optional chat model override for OpenRouter (default `meta-llama/llama-3-8b-instruct`). - - `OPENROUTER_EMBED_MODEL`: Optional embed model override for OpenRouter (default `nvidia/llama-nemotron-embed-vl-1b-v2`). - - `OPENAI_API_KEY`: Optional OpenAI key (used as fallback when `USE_OPENROUTER` is not `true`). - - `BOT_CHANNEL_ID`: Optional guild channel ID where the bot can reply without mentions - - `CODER_USER_ID`: Optional Discord user ID to receive surprise DMs every 0–6 hours - - `ENABLE_WEB_SEARCH`: Set to `false` to disable Google lookups (default `true`) + PM2 restarts the bot if it crashes and keeps logs (`pm2 logs nova-bot`). 
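For reference, a complete `.env` covering the variables above might look like this (all values are placeholders; only `DISCORD_TOKEN` plus an OpenRouter or OpenAI key are required):

```ini
DISCORD_TOKEN=your-discord-bot-token
USE_OPENROUTER=true
OPENROUTER_API_KEY=your-openrouter-key
# Optional tuning
OPENROUTER_MODEL=meta-llama/llama-3-8b-instruct
OPENROUTER_EMBED_MODEL=nvidia/llama-nemotron-embed-vl-1b-v2
BOT_CHANNEL_ID=123456789012345678
CODER_USER_ID=123456789012345678
ENABLE_WEB_SEARCH=true
CONTINUATION_INTERVAL_MS=15000
CONTINUATION_MAX_PROACTIVE=10
CODER_PING_MIN_MS=21600000
CODER_PING_MAX_MS=28800000
```

The two ping values shown are the 6- and 8-hour defaults expressed in milliseconds.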
-## Running -- Development: `npm run dev` -- Production: `npm start` + ## File Structure + ``` + src/ + bot.js # Discord client + routing logic + config.js # Environment and tuning knobs + openai.js # Chat + embedding helpers with retry logic + memory.js # Multi-layer memory engine + .env.example + README.md + ``` -### Optional PM2 Setup -```bash -npm install -g pm2 -pm2 start npm --name nova-bot -- run start -pm2 save -``` -PM2 restarts the bot if it crashes and keeps logs (`pm2 logs nova-bot`). + - **Short-term (recency buffer):** Last 10 conversation turns kept verbatim for style and continuity. Stored per user inside `data/memory.sqlite`. + - **Long-term (vector store):** Every user message + bot reply pair becomes an embedding via the configured embed model (`nvidia/llama-nemotron-embed-vl-1b-v2` by default). Embeddings, raw text, timestamps, and heuristic importance scores live in the same SQLite file. Retrieval uses cosine similarity plus a small importance boost; top 5 results feed the prompt. + - **Summary layer:** When the recency buffer grows past ~3000 characters, Nova asks the model to condense the transcript to <120 words, keeps the summary, and trims the raw buffer down to the last few turns. This keeps token usage low while retaining story arcs. + - **Importance scoring:** Messages that mention intent words ("plan", "remember", etc.), run long, or carry emotional weight receive higher scores. When the store exceeds its cap, the lowest-importance/oldest memories are pruned. You can also call `pruneLowImportanceMemories()` manually if needed. -## File Structure -``` -src/ - bot.js # Discord client + routing logic - config.js # Environment and tuning knobs - openai.js # Chat + embedding helpers with retry logic - memory.js # Multi-layer memory engine -.env.example -README.md -``` + - **Embedding math:** the embed model returns a long array of floating-point numbers for each text chunk (OpenAI's `text-embedding-3-small`, for example, returns 1,536). That giant array is a vector map of the message’s meaning; similar moments land near each other in high-dimensional space.
+ - **What gets embedded:** After every user→bot turn, `recordInteraction()` (see [src/memory.js](src/memory.js)) bundles the pair, scores its importance, asks the model API (OpenRouter/OpenAI) for an embedding, and stores `{ content, embedding, importance, timestamp }` inside the SQLite tables. + - **Why so many numbers:** Cosine similarity needs raw vectors to compare new thoughts to past ones. When a fresh message arrives, `retrieveRelevantMemories()` embeds it too, calculates cosine similarity against every stored vector, adds a small importance boost, and returns the top five memories to inject into the system prompt. + - **Self-cleaning:** If the DB grows past the configured limits, low-importance items are trimmed, summaries compress the short-term transcript, and you can delete `data/memory.sqlite` to reset everything cleanly. -- **Short-term (recency buffer):** Last 10 conversation turns kept verbatim for style and continuity. Stored per user inside `data/memory.sqlite`. -- **Long-term (vector store):** Every user message + bot reply pair becomes an embedding via `text-embedding-3-small`. Embeddings, raw text, timestamps, and heuristic importance scores live in the same SQLite file. Retrieval uses cosine similarity plus a small importance boost; top 5 results feed the prompt. -- **Summary layer:** When the recency buffer grows past ~3000 characters, Nova asks OpenAI to condense the transcript to <120 words, keeps the summary, and trims the raw buffer down to the last few turns. This keeps token usage low while retaining story arcs. -- **Importance scoring:** Messages mentioning intent words ("plan", "remember", etc.), showing length, or emotional weight receive higher scores. When the store exceeds its cap, the lowest-importance/oldest memories are pruned. You can also call `pruneLowImportanceMemories()` manually if needed. + ### Migrating legacy `memory.json` + - Keep your original `data/memory.json` in place and delete/rename `data/memory.sqlite` before launching the bot.
+ - On the next start, the new SQL engine auto-imports every user record from the JSON file, logs a migration message, and writes the populated `.sqlite` file. + - After confirming the data landed, archive or remove the JSON backup if you no longer need it. -- **Embedding math:** `text-embedding-3-small` returns 1,536 floating-point numbers for each text chunk. That giant array is a vector map of the message’s meaning; similar moments land near each other in 1,536-dimensional space. -- **What gets embedded:** After every user→bot turn, `recordInteraction()` (see [src/memory.js](src/memory.js)) bundles the pair, scores its importance, asks OpenAI for an embedding, and stores `{ content, embedding, importance, timestamp }` inside the SQLite tables. -- **Why so many numbers:** Cosine similarity needs raw vectors to compare new thoughts to past ones. When a fresh message arrives, `retrieveRelevantMemories()` embeds it too, calculates cosine similarity against every stored vector, adds a small importance boost, and returns the top five memories to inject into the system prompt. -- **Self-cleaning:** If the DB grows past the configured limits, low-importance items are trimmed, summaries compress the short-term transcript, and you can delete `data/memory.sqlite` to reset everything cleanly. + ## Conversation Flow + 1. Incoming message triggers only if it is a DM, mentions the bot, or appears in the configured channel. + 2. The user turn is appended to short-term memory immediately. + 3. The memory engine retrieves relevant long-term memories and summary text. + 4. A compact system prompt injects personality, summary, and relevant memories before passing short-term history to the model API (OpenRouter/OpenAI). + 5. The reply is sent back to Discord. If Nova wants to send a burst of thoughts, she emits the `` token and the runtime fans it out into multiple sequential Discord messages. + 6. Long chats automatically summarize; low-value memories eventually get pruned. 
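The retrieval in step 3 boils down to simple cosine math. A minimal sketch of the idea (function names and the importance-boost weight here are illustrative assumptions, not the actual `src/memory.js` API):

```javascript
// Cosine similarity between two embedding vectors: dot product divided by
// the product of their magnitudes. Identical directions score 1.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored memory against the query embedding, add a small
// importance boost (the 0.1 weight is an assumption), and keep the top K.
function rankMemories(queryEmbedding, memories, topK = 5, boost = 0.1) {
  return memories
    .map((memory) => ({
      ...memory,
      score: cosineSimilarity(queryEmbedding, memory.embedding) + boost * memory.importance,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

The top-ranked entries are what get injected into the system prompt as relevant memories.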
-### Migrating legacy `memory.json` -- Keep your original `data/memory.json` in place and delete/rename `data/memory.sqlite` before launching the bot. -- On the next start, the new SQL engine auto-imports every user record from the JSON file, logs a migration message, and writes the populated `.sqlite` file. -- After confirming the data landed, archive or remove the JSON backup if you no longer need it. + Nova may also enter a proactive continuation mode after replying: if you stay quiet, she can send short, context-aware follow-ups at the configured interval until you stop her with a short phrase like "gotta go" or until she reaches the configured maximum number of follow-ups. -## Conversation Flow -1. Incoming message triggers only if it is a DM, mentions the bot, or appears in the configured channel. -2. The user turn is appended to short-term memory immediately. -3. The memory engine retrieves relevant long-term memories and summary text. -4. A compact system prompt injects personality, summary, and relevant memories before passing short-term history to the model API (OpenRouter/OpenAI). -5. The reply is sent back to Discord. If Nova wants to send a burst of thoughts, she emits the `` token and the runtime fans it out into multiple sequential Discord messages. -6. Long chats automatically summarize; low-value memories eventually get pruned. + ## Dynamic Prompting + - Each turn, Nova inspects the fresh user message (tone, instructions, roleplay cues, explicit “split this” requests) plus the last few utterances. + - A helper (`composeDynamicPrompt` in [src/bot.js](src/bot.js)) emits short directives like “User mood: fragile, be gentle” or “They asked for roleplay—stay in character.” + - These directives slot into the system prompt ahead of memories, so the model gets real-time guidance tailored to the latest vibe without losing the core persona.
-## Dynamic Prompting -- Each turn, Nova inspects the fresh user message (tone, instructions, roleplay cues, explicit “split this” requests) plus the last few utterances. -- A helper (`composeDynamicPrompt` in [src/bot.js](src/bot.js)) emits short directives like “User mood: fragile, be gentle” or “They asked for roleplay—stay in character.” -- These directives slot into the system prompt ahead of memories, so OpenAI gets real-time guidance tailored to the latest vibe without losing the core persona. + ## Local Web Search + - `src/search.js` grabs the standard Google results page with a real browser user-agent, extracts the top titles/links/snippets, and caches them for 10 minutes to stay polite. + - `bot.js` detects when a question sounds “live” (mentions today/news/google/etc.) and injects the formatted snippets into the prompt as "Live intel". No paid APIs involved—it’s just outbound HTTPS from your machine. + - Toggle this via `ENABLE_WEB_SEARCH=false` if you don’t want Nova to look things up. + - Edit `data/filter.txt` to maintain a newline-delimited list of banned keywords/phrases; matching queries are blocked before hitting Google *and* Nova refuses to discuss them in normal chat. + - Every entry in `data/search.log` records which transport (direct or cache) served the lookup so you can audit traffic paths quickly. -## Local Web Search -- `src/search.js` grabs the standard Google results page with a real browser user-agent, extracts the top titles/links/snippets, and caches them for 10 minutes to stay polite. -- `bot.js` detects when a question sounds “live” (mentions today/news/google/etc.) and injects the formatted snippets into the prompt as "Live intel". No paid APIs involved—it’s just outbound HTTPS from your machine. -- Toggle this via `ENABLE_WEB_SEARCH=false` if you don’t want Nova to look things up. 
-- Edit `data/filter.txt` to maintain a newline-delimited list of banned keywords/phrases; matching queries are blocked before hitting Google *and* Nova refuses to discuss them in normal chat. -- Every entry in `data/search.log` records which transport (direct or cache) served the lookup so you can audit traffic paths quickly. + ## Proactive Pings + - When `CODER_USER_ID` is provided, Nova spins up a timer on startup that waits a random duration between the configured min/max interval before DMing that user (defaults to 6–8 hours). Override the window with `CODER_PING_MIN_MS` and `CODER_PING_MAX_MS` in milliseconds. + - Each ping goes through the configured model API (OpenRouter/OpenAI) with the prompt "you havent messaged your coder in a while, and you wanna chat with him!" so responses stay playful and unscripted. + - The ping gets typed out (`sendTyping`) for realism and is stored back into the memory layers so the next incoming reply has context. -## Proactive Pings -- When `CODER_USER_ID` is provided, Nova spins up a timer on startup that waits a random duration (anywhere from immediate to 6 hours) before DMing that user. -- Each ping goes through the configured model API (OpenRouter/OpenAI) with the prompt "you havent messaged your coder in a while, and you wanna chat with him!" so responses stay playful and unscripted. -- The ping gets typed out (`sendTyping`) for realism and is stored back into the memory layers so the next incoming reply has context. + - The bot retries model API requests (OpenRouter/OpenAI) up to 3 times with incremental backoff when rate limited. + - `data/memory.sqlite` is ignored by git but will grow with usage; back it up if you want persistent personality (and keep `data/memory.json` around only if you need legacy migrations). + - To reset persona, delete `data/memory.sqlite` while the bot is offline. -- The bot retries OpenAI requests up to 3 times with incremental backoff when rate limited.
-- `data/memory.sqlite` is ignored by git but will grow with usage; back it up if you want persistent personality (and keep `data/memory.json` around only if you need legacy migrations). -- To reset persona, delete `data/memory.sqlite` while the bot is offline. - -Happy chatting! + Happy chatting! + ``` diff --git a/src/bot.js b/src/bot.js index 802b8d8..a5020b5 100644 --- a/src/bot.js +++ b/src/bot.js @@ -15,6 +15,78 @@ const client = new Client({ }); let coderPingTimer; +const continuationState = new Map(); + +const stopCueRegex = /(\b(gotta go|gotta run|i'?m gonna go|i'?m going to go|i'?m going offline|i'?m logging off|bye|brb|see ya|later|i'?m out|going to bed|goodbye|stop messaging me)\b)/i; + +function startContinuationForUser(userId, channel) { + const existing = continuationState.get(userId) || {}; + existing.lastUserTs = Date.now(); + existing.channel = channel || existing.channel; + existing.active = true; + existing.sending = existing.sending || false; + existing.consecutive = existing.consecutive || 0; + if (existing.timer) clearInterval(existing.timer); + const interval = config.continuationIntervalMs || 15000; + existing.timer = setInterval(async () => { + try { + const now = Date.now(); + const state = continuationState.get(userId); + if (!state || !state.active) return; + if (state.sending) return; + if (now - (state.lastUserTs || 0) < interval) return; + if ((state.consecutive || 0) >= (config.continuationMaxProactive || 10)) { + stopContinuationForUser(userId); + return; + } + state.sending = true; + const incomingText = 'Continue the conversation naturally based on recent context.'; + const { messages } = await buildPrompt(userId, incomingText, {}); + const reply = await chatCompletion(messages, { temperature: 0.7, maxTokens: 200 }); + const finalReply = (reply && reply.trim()) || ''; + if (!finalReply) { + state.sending = false; + return; + } + const chunks = splitResponses(finalReply); + const outputs = chunks.length ? 
chunks : [finalReply]; + const channelRef = state.channel; + for (const chunk of outputs) { + try { + if (channelRef) { + if (channelRef.type !== ChannelType.DM) { + await channelRef.send(`<@${userId}> ${chunk}`); + } else { + await channelRef.send(chunk); + } + } + await appendShortTerm(userId, 'assistant', chunk); + } catch (err) { + console.warn('[bot] Failed to deliver proactive message:', err); + } + } + state.consecutive = (state.consecutive || 0) + 1; + state.lastProactiveTs = Date.now(); + state.sending = false; + await recordInteraction(userId, '[proactive follow-up]', outputs.join(' | ')); + } catch (err) { + console.error('[bot] Continuation loop error for', userId, err); + } + }, interval); + continuationState.set(userId, existing); +} + +function stopContinuationForUser(userId) { + const state = continuationState.get(userId); + if (!state) return; + state.active = false; + if (state.timer) { + clearInterval(state.timer); + delete state.timer; + } + state.consecutive = 0; + continuationState.set(userId, state); +} client.once('clientReady', () => { console.log(`[bot] Logged in as ${client.user.tag}`); @@ -234,7 +306,10 @@ async function buildPrompt(userId, incomingText, options = {}) { function scheduleCoderPing() { if (!config.coderUserId) return; if (coderPingTimer) clearTimeout(coderPingTimer); - const delay = config.maxCoderPingIntervalMs; + const minMs = config.coderPingMinIntervalMs || config.maxCoderPingIntervalMs || 6 * 60 * 60 * 1000; + const maxMs = config.coderPingMaxIntervalMs || (8 * 60 * 60 * 1000); + const delay = Math.floor(Math.random() * (maxMs - minMs + 1)) + minMs; + console.log(`[bot] scheduling coder ping in ${Math.round(delay / 1000 / 60)} minutes`); coderPingTimer = setTimeout(async () => { await sendCoderPing(); scheduleCoderPing(); @@ -289,6 +364,16 @@ client.on('messageCreate', async (message) => { await appendShortTerm(userId, 'user', cleaned); + // If the user indicates they are leaving, stop proactive continuation + if 
(stopCueRegex.test(cleaned)) { + stopContinuationForUser(userId); + const ack = "Got it — I won't keep checking in. Catch you later!"; + await appendShortTerm(userId, 'assistant', ack); + await recordInteraction(userId, cleaned, ack); + await deliverReplies(message, [ack]); + return; + } + if (overrideAttempt) { const refusal = 'Not doing that. I keep my guard rails on no matter what prompt gymnastics you try.'; await appendShortTerm(userId, 'assistant', refusal); @@ -326,6 +411,8 @@ client.on('messageCreate', async (message) => { await recordInteraction(userId, cleaned, outputs.join(' | ')); await deliverReplies(message, outputs); + // enable proactive continuation for this user (will send follow-ups when they're quiet) + startContinuationForUser(userId, message.channel); } catch (error) { console.error('[bot] Failed to respond:', error); if (!message.channel?.send) return; diff --git a/src/config.js b/src/config.js index b0aaf33..1980716 100644 --- a/src/config.js +++ b/src/config.js @@ -20,15 +20,15 @@ export const config = { openRouterKey: process.env.OPENROUTER_API_KEY || '', openrouterReferer: process.env.OPENROUTER_REFERER || '', openrouterTitle: process.env.OPENROUTER_TITLE || '', - // Model selection: OpenRouter model env vars (no OpenAI fallback) chatModel: process.env.OPENROUTER_MODEL || 'meta-llama/llama-3-8b-instruct', embedModel: process.env.OPENROUTER_EMBED_MODEL || 'nvidia/llama-nemotron-embed-vl-1b-v2', - // HTTP timeout for OpenRouter requests (ms) openrouterTimeoutMs: process.env.OPENROUTER_TIMEOUT_MS ? parseInt(process.env.OPENROUTER_TIMEOUT_MS, 10) : 30000, preferredChannel: process.env.BOT_CHANNEL_ID || null, enableWebSearch: process.env.ENABLE_WEB_SEARCH !== 'false', coderUserId: process.env.CODER_USER_ID || null, maxCoderPingIntervalMs: 6 * 60 * 60 * 1000, + coderPingMinIntervalMs: process.env.CODER_PING_MIN_MS ? parseInt(process.env.CODER_PING_MIN_MS, 10) : 6 * 60 * 60 * 1000, + coderPingMaxIntervalMs: process.env.CODER_PING_MAX_MS ? 
parseInt(process.env.CODER_PING_MAX_MS, 10) : 8 * 60 * 60 * 1000, shortTermLimit: 10, memoryDbFile: process.env.MEMORY_DB_FILE ? path.resolve(process.env.MEMORY_DB_FILE) : defaultMemoryDbFile, legacyMemoryFile, @@ -36,4 +36,9 @@ export const config = { memoryPruneThreshold: 0.2, maxMemories: 200, relevantMemoryCount: 5, + // Proactive continuation settings: when a user stops replying, Nova can continue + // the conversation every `continuationIntervalMs` milliseconds until the user + // signals to stop or the `continuationMaxProactive` limit is reached. + continuationIntervalMs: process.env.CONTINUATION_INTERVAL_MS ? parseInt(process.env.CONTINUATION_INTERVAL_MS, 10) : 15000, + continuationMaxProactive: process.env.CONTINUATION_MAX_PROACTIVE ? parseInt(process.env.CONTINUATION_MAX_PROACTIVE, 10) : 10, };
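The ping settings above feed simple interval math in `src/bot.js`. A standalone sketch of the randomized delay formula the patch uses in `scheduleCoderPing`, checked against the default 6–8 hour window from `src/config.js`:

```javascript
// Uniform random delay between minMs and maxMs, inclusive: the same
// formula the patch applies before each coder ping.
function randomPingDelay(minMs, maxMs) {
  return Math.floor(Math.random() * (maxMs - minMs + 1)) + minMs;
}

const SIX_HOURS = 6 * 60 * 60 * 1000;   // default CODER_PING_MIN_MS
const EIGHT_HOURS = 8 * 60 * 60 * 1000; // default CODER_PING_MAX_MS
```

Every sample lands inside the configured window, so pings never arrive sooner than the minimum or later than the maximum.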