Ask the Oracle
How the in-site AI assistant works: RAG over the docs, served by your own Ollama instance.
The amber button in the corner of every page is the Nibiru Oracle — an AI assistant grounded in this very documentation. Ask it about routing, modules, the CLI, the Smarty layer, the meaning of pageAction(). It cites its sources.
What’s powering it
By default, the Oracle runs entirely on your own infrastructure:
| Layer | Backend | Default model |
|---|---|---|
| Chat (answer generation) | Ollama on https://your-ollama-host.example | qwen2.5-coder:14b |
| Embeddings (RAG retrieval) | Ollama on https://your-ollama-host.example | nomic-embed-text |
No paid API keys. No data leaves your network. The 5-GPU Ollama cluster you already run handles the load.
If you’d rather use a paid provider — Claude for chat, OpenAI for embeddings — set LLM_PROVIDER=anthropic and/or EMBED_PROVIDER=openai plus the matching API keys. The code paths are identical.
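As a rough illustration of how the provider switch could be resolved, here is a minimal sketch. The environment variable names and Ollama defaults come from this page; `resolveConfig`, `Env`, and `OracleConfig` are hypothetical names, the fallback model names for Anthropic and OpenAI are the example values from the configuration section (assumed here to act as defaults), and the real adapter in scripts/lib/providers.mjs may be organised differently.

```typescript
// Hypothetical helper, not the actual providers.mjs code.
// Takes an env-like object so it can be exercised without touching process.env.
type Env = Record<string, string | undefined>;

interface OracleConfig {
  ollamaUrl: string;
  llm: { provider: "ollama" | "anthropic"; model: string };
  embed: { provider: "ollama" | "openai"; model: string };
}

function resolveConfig(env: Env): OracleConfig {
  // Anything other than the explicit paid-provider value falls back to Ollama.
  const llmProvider = env.LLM_PROVIDER === "anthropic" ? "anthropic" : "ollama";
  const embedProvider = env.EMBED_PROVIDER === "openai" ? "openai" : "ollama";

  // Fail early when a paid provider is selected without its API key.
  if (llmProvider === "anthropic" && !env.ANTHROPIC_API_KEY) {
    throw new Error("LLM_PROVIDER=anthropic requires ANTHROPIC_API_KEY");
  }
  if (embedProvider === "openai" && !env.OPENAI_API_KEY) {
    throw new Error("EMBED_PROVIDER=openai requires OPENAI_API_KEY");
  }

  return {
    ollamaUrl: env.OLLAMA_BASE_URL ?? "https://your-ollama-host.example",
    llm: {
      provider: llmProvider,
      model:
        llmProvider === "anthropic"
          ? (env.ANTHROPIC_MODEL ?? "claude-haiku-4-5-20251001") // assumed default
          : (env.OLLAMA_CHAT_MODEL ?? "qwen2.5-coder:14b"),
    },
    embed: {
      provider: embedProvider,
      model:
        embedProvider === "openai"
          ? (env.OPENAI_EMBED_MODEL ?? "text-embedding-3-small") // assumed default
          : (env.OLLAMA_EMBED_MODEL ?? "nomic-embed-text"),
    },
  };
}
```

Because chat and embeddings are resolved independently, mixing providers (for example Claude for chat with Ollama embeddings) needs no extra branching in this sketch.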
How it works
    flowchart LR
      A[User question] --> B[Embed via Ollama<br/>nomic-embed-text]
      B --> C[Cosine search<br/>against pre-computed<br/>doc-chunk index]
      C --> D[Top-K chunks]
      D --> E[Ollama chat<br/>qwen2.5-coder:14b<br/>system + retrieved context]
      E --> F[Answer + source list]
      F --> G[Render in chat UI]

- At build time, the docs site walks every Markdown page, splits it into ~600-token chunks at H2/H3 boundaries, embeds each chunk with nomic-embed-text, and writes the result to public/oracle-index.json. No database needed.
- At request time, the user’s question is embedded the same way, the closest chunks are retrieved by cosine similarity, and they’re stitched into a system prompt for the chat model.
- The chat model answers in the user’s language, citing source chunks by URL.
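The request-time step above (cosine search, then stitching the top-K chunks into a system prompt) can be sketched roughly as follows. The `Chunk` shape, the function names, and the prompt wording are assumptions made for illustration; the actual layout of public/oracle-index.json and the real prompt are not shown on this page.

```typescript
// Assumed index entry shape: one embedded documentation chunk.
interface Chunk {
  url: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors match nothing
}

// Score every chunk against the query and keep the k best, highest first.
function topK(query: number[], index: Chunk[], k: number): Chunk[] {
  return index
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Stitch the retrieved chunks into a system prompt with numbered sources.
function buildSystemPrompt(chunks: Chunk[]): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.url}\n${c.text}`)
    .join("\n\n");
  return `Answer using only the documentation excerpts below and cite their URLs.\n\n${context}`;
}
```

A linear scan like this is usually adequate for an index of a few hundred chunks, which is why no vector database is needed at this scale.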
| File | Purpose |
|---|---|
| scripts/lib/providers.mjs | Shared chat + embedding adapter (Ollama / Anthropic / OpenAI). |
| scripts/build-oracle-index.mjs | Builds public/oracle-index.json at build time. |
| public/oracle-index.json | The committed/build-output embedding index. |
| src/pages/api/oracle.ts | The SSR endpoint the chat widget POSTs to. Also serves a GET for diagnostics. |
| src/components/CosmicHeader.astro | The floating launcher + chat UI. |
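For a sense of what the build script's chunking step might involve, here is a minimal sketch of splitting Markdown at H2/H3 boundaries with a size cap. It is not the code in scripts/build-oracle-index.mjs: the function names are hypothetical, tokens are approximated by whitespace-separated words, and headings inside fenced code blocks are not handled.

```typescript
interface DocChunk {
  heading: string; // nearest H2/H3 title, empty for text before the first one
  text: string;
}

// Start a new chunk at every line that begins with "## " or "### ".
function splitAtHeadings(markdown: string): DocChunk[] {
  const chunks: DocChunk[] = [];
  let current: DocChunk = { heading: "", text: "" };
  for (const line of markdown.split("\n")) {
    const match = /^#{2,3}\s+(.*)$/.exec(line);
    if (match) {
      if (current.text.trim() !== "") chunks.push(current);
      current = { heading: match[1].trim(), text: "" };
    } else {
      current.text += line + "\n";
    }
  }
  if (current.text.trim() !== "") chunks.push(current);
  return chunks;
}

// Break an oversized section into pieces of at most maxWords words.
// Whitespace is normalised, which is acceptable for text that is only embedded.
function capSize(chunk: DocChunk, maxWords: number): DocChunk[] {
  const words = chunk.text.split(/\s+/).filter((w) => w.length > 0);
  const parts: DocChunk[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    parts.push({
      heading: chunk.heading,
      text: words.slice(i, i + maxWords).join(" "),
    });
  }
  return parts;
}

// Word count stands in for the ~600-token target mentioned above (an assumption).
function chunkMarkdown(markdown: string, maxWords: number = 600): DocChunk[] {
  const result: DocChunk[] = [];
  for (const section of splitAtHeadings(markdown)) {
    for (const part of capSize(section, maxWords)) {
      result.push(part);
    }
  }
  return result;
}
```

Keeping the heading on every piece means a chunk cut from the middle of a long section still carries enough context to be cited meaningfully.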
One-time setup on your Ollama instance
Pull the two models the Oracle uses:

    curl https://your-ollama-host.example/api/pull -d '{"name":"qwen2.5-coder:14b"}'
    curl https://your-ollama-host.example/api/pull -d '{"name":"nomic-embed-text"}'

qwen2.5-coder:14b is already installed (verified live). nomic-embed-text is the missing piece; without it the Oracle runs in chat-only (no-RAG) mode.
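To confirm that both pulls succeeded, one option is to inspect the model list that Ollama returns from GET /api/tags. The sketch below works on the parsed JSON body of that response; `missingModels` and `normalizeModelName` are hypothetical helpers, and only the `name` field of each entry is assumed.

```typescript
// Minimal view of the /api/tags response: a list of installed models.
interface TagsResponse {
  models: { name: string }[];
}

// Ollama lists models with an explicit tag, so "nomic-embed-text"
// shows up as "nomic-embed-text:latest". Normalise before comparing.
function normalizeModelName(name: string): string {
  return name.indexOf(":") === -1 ? name + ":latest" : name;
}

// Return the required models that are not installed yet.
function missingModels(tags: TagsResponse, required: string[]): string[] {
  const installed = tags.models.map((m) => normalizeModelName(m.name));
  return required.filter(
    (r) => installed.indexOf(normalizeModelName(r)) === -1
  );
}
```

Fed the body of `curl https://your-ollama-host.example/api/tags` and the two model names above, an empty result would indicate that both models are available.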
Configuring it
The Oracle reads its config from environment variables. Sensible defaults are baked in.

    # Default mode (Ollama on your Ollama instance)
    LLM_PROVIDER=ollama                                 # default
    OLLAMA_BASE_URL=https://your-ollama-host.example    # default
    OLLAMA_CHAT_MODEL=qwen2.5-coder:14b                 # default
    OLLAMA_EMBED_MODEL=nomic-embed-text                 # default

    # Optional fallbacks
    LLM_PROVIDER=anthropic
    ANTHROPIC_API_KEY=sk-ant-...
    ANTHROPIC_MODEL=claude-haiku-4-5-20251001

    EMBED_PROVIDER=openai
    OPENAI_API_KEY=sk-...
    OPENAI_EMBED_MODEL=text-embedding-3-small

    # Behaviour
    ORACLE_TOP_K=6
    ORACLE_MAX_TOKENS=800

Diagnostics endpoint
GET /api/oracle returns the current config (no secrets):

    curl https://nibiru-framework.com/api/oracle

    {
      "status": "ok",
      "llm": {
        "provider": "ollama",
        "ollamaUrl": "https://your-ollama-host.example",
        "model": "qwen2.5-coder:14b"
      },
      "embed": {
        "provider": "ollama",
        "ollamaUrl": "https://your-ollama-host.example",
        "model": "nomic-embed-text"
      },
      "index": {
        "present": true,
        "chunks": 177,
        "provider": "ollama",
        "model": "nomic-embed-text"
      }
    }

Handy for verifying a freshly-deployed container is using the backend you expected.
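One check worth automating against this payload is that the index was built with the same embedding model that serves queries, since vectors produced by different models are not comparable. The sketch below assumes the response shape shown in the example above; the `Diagnostics` type and the `diagnose` function are hypothetical and not part of the endpoint.

```typescript
// Shape inferred from the example response; fields may differ in practice.
interface Diagnostics {
  status: string;
  llm: { provider: string; ollamaUrl?: string; model: string };
  embed: { provider: string; ollamaUrl?: string; model: string };
  index: { present: boolean; chunks: number; provider: string; model: string };
}

// Return a list of human-readable problems; an empty list means the
// deployment looks consistent.
function diagnose(d: Diagnostics): string[] {
  const problems: string[] = [];
  if (d.status !== "ok") {
    problems.push(`status is "${d.status}", expected "ok"`);
  }
  if (!d.index.present || d.index.chunks === 0) {
    problems.push("embedding index is missing or empty, so retrieval cannot work");
  } else if (
    d.index.provider !== d.embed.provider ||
    d.index.model !== d.embed.model
  ) {
    problems.push(
      `index was built with ${d.index.provider}/${d.index.model} ` +
        `but queries are embedded with ${d.embed.provider}/${d.embed.model}`
    );
  }
  return problems;
}
```

Running this in a post-deploy step would surface a stale index after switching EMBED_PROVIDER, which is otherwise easy to miss because the chat side keeps working.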
Privacy
- In the default Ollama config, questions and conversation history are sent only to your Ollama server. The docs site does not store them, and nothing is sent to Anthropic or OpenAI.
- The OpenAI key (if set) is used only for embeddings.
- No analytics or cookies are set by the Oracle widget itself.
Why a Nibiru-trained model?
The roadmap (see AI Roadmap) is to fine-tune a LoRA on the Training Corpus export so the chat model itself is Nibiru-native. When that’s ready, the Oracle’s OLLAMA_CHAT_MODEL flips to the fine-tuned model and the system prompt simplifies. Same code, smarter answers.
Try it
Open the Oracle (the amber planet, bottom-right) and try one of these:
- “How do I create a new module?”
- “What does pageAction do?”
- “Show me how to handle a JSON endpoint.”
- “Wie schreibe ich eine Migration?” (German works.)
- “認証フローを教えて” (Japanese works.)