Nibiru docs v0.9.2

Ask the Oracle

How the in-site AI assistant works: RAG over the docs, served by your own Ollama instance.


The amber button in the corner of every page is the Nibiru Oracle, an AI assistant grounded in this very documentation. Ask it about routing, modules, the CLI, the Smarty layer, or the meaning of pageAction(). It cites its sources.

By default, the Oracle runs entirely on your own infrastructure:

| Layer | Backend | Default model |
| --- | --- | --- |
| Chat (answer generation) | Ollama on https://your-ollama-host.example | qwen2.5-coder:14b |
| Embeddings (RAG retrieval) | Ollama on https://your-ollama-host.example | nomic-embed-text |

No paid API keys. No data leaves your network. The 5-GPU Ollama cluster you already run handles the load.

If you’d rather use a paid provider (Claude for chat, OpenAI for embeddings), set LLM_PROVIDER=anthropic and/or EMBED_PROVIDER=openai and supply the matching API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY). The code paths are identical.
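
To make the provider switch concrete, here is a minimal sketch of how one embedding call could be shaped for either backend. It is not the code in scripts/lib/providers.mjs: the helper name `buildEmbedRequest`, the `HttpRequest` type, and the option names are hypothetical. The request shapes follow Ollama's `/api/embeddings` endpoint (`model`, `prompt`) and OpenAI's `/v1/embeddings` endpoint (`model`, `input`).

```typescript
// Values mirror the EMBED_PROVIDER settings described on this page.
type EmbedProvider = "ollama" | "openai";

// Hypothetical plain description of an HTTP POST, so the sketch needs no network.
interface HttpRequest {
  url: string;
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical helper: builds the request for a single embedding call.
function buildEmbedRequest(
  provider: EmbedProvider,
  text: string,
  opts: { baseUrl: string; model: string; apiKey?: string }
): HttpRequest {
  // Drop trailing slashes so the joined URL has exactly one separator.
  const base = opts.baseUrl.replace(/\/+$/, "");

  if (provider === "ollama") {
    // Ollama: POST /api/embeddings with { model, prompt }, no API key.
    return {
      url: base + "/api/embeddings",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: opts.model, prompt: text }),
    };
  }

  // OpenAI: POST /v1/embeddings with { model, input } and a bearer token.
  if (!opts.apiKey) {
    throw new Error("OPENAI_API_KEY is required when EMBED_PROVIDER=openai");
  }
  return {
    url: base + "/v1/embeddings",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + opts.apiKey,
    },
    body: JSON.stringify({ model: opts.model, input: text }),
  };
}
```

Only the request shape differs per provider; the caller embeds text the same way in both cases, which is one plausible reading of "identical code paths".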

```mermaid
flowchart LR
    A[User question] --> B[Embed via Ollama<br/>nomic-embed-text]
    B --> C[Cosine search<br/>against pre-computed<br/>doc-chunk index]
    C --> D[Top-K chunks]
    D --> E[Ollama chat<br/>qwen2.5-coder:14b<br/>system + retrieved context]
    E --> F[Answer + source list]
    F --> G[Render in chat UI]
```

1. At build time, the docs site walks every Markdown page, splits it into ~600-token chunks at H2/H3 boundaries, embeds each chunk with nomic-embed-text, and writes the result to public/oracle-index.json. No database needed.
2. At request time, the user’s question is embedded the same way, the closest chunks are retrieved by cosine similarity, and they’re stitched into a system prompt for the chat model.
3. The chat model answers in the user’s language, citing source chunks by URL.
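
The request-time steps above can be sketched as follows. This is an illustration under assumptions, not the code in src/pages/api/oracle.ts: the `IndexedChunk` shape, the function names, and the prompt wording are all hypothetical, and the real index file may store its fields differently.

```typescript
// Assumed shape of one entry in the pre-computed index (hypothetical field names).
interface IndexedChunk {
  url: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ: " + a.length + " vs " + b.length);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the question embedding and keep the k best.
function topK(query: number[], chunks: IndexedChunk[], k: number): IndexedChunk[] {
  return chunks
    .map((chunk) => ({ chunk: chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Stitch the retrieved chunks into a system prompt (wording is illustrative).
function buildSystemPrompt(chunks: IndexedChunk[]): string {
  const context = chunks
    .map((c, i) => "[" + (i + 1) + "] " + c.url + "\n" + c.text)
    .join("\n\n");
  return (
    "Answer using only the documentation excerpts below. " +
    "Cite sources by URL and reply in the user's language.\n\n" +
    context
  );
}
```

A linear scan like this is reasonable for an index of a few hundred chunks, which is why a flat JSON file can stand in for a vector database here.
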

| File | Purpose |
| --- | --- |
| scripts/lib/providers.mjs | Shared chat + embedding adapter (Ollama / Anthropic / OpenAI). |
| scripts/build-oracle-index.mjs | Builds public/oracle-index.json at build time. |
| public/oracle-index.json | The committed/build-output embedding index. |
| src/pages/api/oracle.ts | The SSR endpoint the chat widget POSTs to. Also serves a GET for diagnostics. |
| src/components/CosmicHeader.astro | The floating launcher + chat UI. |
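
For the build step, splitting a page at H2/H3 boundaries with a token budget might look like the sketch below. It is a simplification, not the logic of scripts/build-oracle-index.mjs: `splitIntoChunks` and `DocChunk` are hypothetical names, tokens are approximated by whitespace-separated words, and headings inside fenced code blocks are not treated specially.

```typescript
// Hypothetical chunk shape: the nearest H2/H3 heading plus the text under it.
interface DocChunk {
  heading: string;
  text: string;
}

// Split Markdown at "## " / "### " headings, and again whenever a section
// grows past maxTokens. Words stand in for tokens, which is an approximation.
function splitIntoChunks(markdown: string, maxTokens: number): DocChunk[] {
  const chunks: DocChunk[] = [];
  let heading = "";
  let words: string[] = [];

  // Emit the words collected so far as one chunk, if there are any.
  const flush = () => {
    if (words.length > 0) {
      chunks.push({ heading: heading, text: words.join(" ") });
      words = [];
    }
  };

  for (const line of markdown.split("\n")) {
    // Matches exactly two or three '#' followed by whitespace (H2 or H3).
    const match = /^#{2,3}\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = match[1].trim();
      continue;
    }
    for (const word of line.split(/\s+/)) {
      if (word === "") continue;
      words.push(word);
      if (words.length >= maxTokens) flush();
    }
  }
  flush();
  return chunks;
}
```

Calling `splitIntoChunks(pageSource, 600)` would approximate the ~600-token budget mentioned above; each resulting chunk would then be embedded and written to the index.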

Pull the two models the Oracle uses:

```bash
curl https://your-ollama-host.example/api/pull -d '{"name":"qwen2.5-coder:14b"}'
curl https://your-ollama-host.example/api/pull -d '{"name":"nomic-embed-text"}'
```

qwen2.5-coder:14b is already installed (verified live). nomic-embed-text is the missing piece; without it the Oracle runs in chat-only (no-RAG) mode.
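
One way to check which models are present is to inspect the JSON returned by Ollama's `GET /api/tags`, which lists installed models under a `models` array with a `name` per entry. The sketch below works on that response body only; `missingModels` and `oracleMode` are hypothetical helpers, and the "unavailable" state for a missing chat model is an assumption, not documented Oracle behaviour.

```typescript
// Subset of the /api/tags response that this sketch relies on.
interface TagsResponse {
  models: { name: string }[];
}

// Ollama treats a name without a tag as ":latest", so compare normalized names.
function normalizeModelName(name: string): string {
  return name.indexOf(":") === -1 ? name + ":latest" : name;
}

// Return the required models that are not installed.
function missingModels(tags: TagsResponse, required: string[]): string[] {
  const installed = tags.models.map((m) => normalizeModelName(m.name));
  return required.filter((r) => installed.indexOf(normalizeModelName(r)) === -1);
}

// Hypothetical mapping from installed models to the mode described above.
function oracleMode(
  tags: TagsResponse,
  chatModel: string,
  embedModel: string
): "rag" | "chat-only" | "unavailable" {
  const missing = missingModels(tags, [chatModel, embedModel]);
  if (missing.indexOf(chatModel) !== -1) return "unavailable"; // assumption
  if (missing.indexOf(embedModel) !== -1) return "chat-only"; // no embeddings, no RAG
  return "rag";
}
```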

The Oracle reads its config from environment variables. Sensible defaults are baked in.

```bash
# Default mode (your own Ollama instance)
LLM_PROVIDER=ollama                               # default
OLLAMA_BASE_URL=https://your-ollama-host.example  # default
OLLAMA_CHAT_MODEL=qwen2.5-coder:14b               # default
OLLAMA_EMBED_MODEL=nomic-embed-text               # default

# Optional fallbacks (uncomment to replace the Ollama defaults above)
# LLM_PROVIDER=anthropic
# ANTHROPIC_API_KEY=sk-ant-...
# ANTHROPIC_MODEL=claude-haiku-4-5-20251001
# EMBED_PROVIDER=openai
# OPENAI_API_KEY=sk-...
# OPENAI_EMBED_MODEL=text-embedding-3-small

# Behaviour
ORACLE_TOP_K=6
ORACLE_MAX_TOKENS=800
```
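
As an illustration of "defaults baked in", the sketch below resolves these variables into a typed config object. It is not the Oracle's actual loader: `loadOracleConfig` and `OracleConfig` are hypothetical, `EMBED_PROVIDER` defaulting to `ollama` is inferred from the table at the top of the page, and treating 6 and 800 as the defaults for `ORACLE_TOP_K` and `ORACLE_MAX_TOKENS` is an assumption based on the values shown above.

```typescript
// Environment passed in explicitly, so the sketch does not depend on a runtime.
type Env = { [key: string]: string | undefined };

interface OracleConfig {
  llmProvider: "ollama" | "anthropic";
  embedProvider: "ollama" | "openai";
  ollamaBaseUrl: string;
  ollamaChatModel: string;
  ollamaEmbedModel: string;
  topK: number;
  maxTokens: number;
}

function parseLlmProvider(raw: string | undefined): "ollama" | "anthropic" {
  if (raw === undefined || raw === "" || raw === "ollama") return "ollama";
  if (raw === "anthropic") return "anthropic";
  throw new Error("Unsupported LLM_PROVIDER: " + raw);
}

function parseEmbedProvider(raw: string | undefined): "ollama" | "openai" {
  if (raw === undefined || raw === "" || raw === "ollama") return "ollama";
  if (raw === "openai") return "openai";
  throw new Error("Unsupported EMBED_PROVIDER: " + raw);
}

// Parse a positive integer setting, falling back when it is unset or blank.
function positiveInt(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = parseInt(raw, 10);
  if (isNaN(n) || n <= 0) {
    throw new Error(name + " must be a positive integer, got: " + raw);
  }
  return n;
}

function loadOracleConfig(env: Env): OracleConfig {
  const llmProvider = parseLlmProvider(env.LLM_PROVIDER);
  const embedProvider = parseEmbedProvider(env.EMBED_PROVIDER);

  // Paid providers need their key; fail early rather than at the first request.
  if (llmProvider === "anthropic" && !env.ANTHROPIC_API_KEY) {
    throw new Error("ANTHROPIC_API_KEY is required when LLM_PROVIDER=anthropic");
  }
  if (embedProvider === "openai" && !env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required when EMBED_PROVIDER=openai");
  }

  return {
    llmProvider: llmProvider,
    embedProvider: embedProvider,
    ollamaBaseUrl: env.OLLAMA_BASE_URL || "https://your-ollama-host.example",
    ollamaChatModel: env.OLLAMA_CHAT_MODEL || "qwen2.5-coder:14b",
    ollamaEmbedModel: env.OLLAMA_EMBED_MODEL || "nomic-embed-text",
    topK: positiveInt(env.ORACLE_TOP_K, 6, "ORACLE_TOP_K"), // assumed default
    maxTokens: positiveInt(env.ORACLE_MAX_TOKENS, 800, "ORACLE_MAX_TOKENS"), // assumed default
  };
}
```

In a Node-based deployment this would typically be called as `loadOracleConfig(process.env)`.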

GET /api/oracle returns the current config (no secrets):

```bash
curl https://nibiru-framework.com/api/oracle
```

```json
{
  "status": "ok",
  "llm": {
    "provider": "ollama",
    "ollamaUrl": "https://your-ollama-host.example",
    "model": "qwen2.5-coder:14b"
  },
  "embed": {
    "provider": "ollama",
    "ollamaUrl": "https://your-ollama-host.example",
    "model": "nomic-embed-text"
  },
  "index": {
    "present": true,
    "chunks": 177,
    "provider": "ollama",
    "model": "nomic-embed-text"
  }
}
```

Handy for verifying that a freshly deployed container is using the backend you expected.
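
A deployment check could parse that response and flag mismatches, as in the sketch below. The `OracleStatus` type mirrors the sample output above; `findStatusProblems` is a hypothetical helper, not part of the Oracle. The index comparison is included because vectors produced by one embedding model are generally not comparable with query vectors from another.

```typescript
// Shape taken from the sample GET /api/oracle response on this page.
interface OracleStatus {
  status: string;
  llm: { provider: string; model: string; ollamaUrl?: string };
  embed: { provider: string; model: string; ollamaUrl?: string };
  index: { present: boolean; chunks: number; provider: string; model: string };
}

// Hypothetical check: returns human-readable problems, or an empty list.
function findStatusProblems(s: OracleStatus, expectedLlmProvider: string): string[] {
  const problems: string[] = [];

  if (s.status !== "ok") {
    problems.push("status is '" + s.status + "', expected 'ok'");
  }
  if (s.llm.provider !== expectedLlmProvider) {
    problems.push(
      "chat provider is '" + s.llm.provider + "', expected '" + expectedLlmProvider + "'"
    );
  }
  if (!s.index.present || s.index.chunks === 0) {
    problems.push("embedding index is missing or empty");
  } else if (s.index.provider !== s.embed.provider || s.index.model !== s.embed.model) {
    // Query vectors and index vectors must come from the same model.
    problems.push(
      "index was built with " + s.index.provider + "/" + s.index.model +
        " but requests embed with " + s.embed.provider + "/" + s.embed.model
    );
  }
  return problems;
}
```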

  • In the default Ollama config, questions and conversation history are sent only to your Ollama server. The docs site does not store them, and nothing is sent to Anthropic or OpenAI.
  • The OpenAI key (if configured) is used only for embedding requests.
  • No analytics or cookies are set by the Oracle widget itself.

The roadmap (see AI Roadmap) is to fine-tune a LoRA on the Training Corpus export so the chat model itself is Nibiru-native. When that’s ready, the Oracle’s OLLAMA_CHAT_MODEL flips to the fine-tuned model and the system prompt simplifies. Same code, smarter answers.

Open the Oracle (the amber planet, bottom-right) and try one of these:

  • “How do I create a new module?”
  • “What does pageAction do?”
  • “Show me how to handle a JSON endpoint.”
  • “Wie schreibe ich eine Migration?” (German works.)
  • “認証フローを教えて” (Japanese works.)