Skip to content

Nibiru docsv0.9.2

Docs MMVC AI module CLI Showcase

Get started
The framework
CLI
Design System
In production
- Real-World Projects
- Patterns from Production
AI in Nibiru

Select language

On this page

Overview
Phase 1 — Today: RAG Oracle ✓
Phase 2 — Next: Public corpus + LoRA recipe
Phase 3 — Then: Hosted LoRA endpoint
Phase 4 — Eventually: Editor agents
Phase 5 — Aspirational: Active learning
How to help

On this page

Overview
Phase 1 — Today: RAG Oracle ✓
Phase 2 — Next: Public corpus + LoRA recipe
Phase 3 — Then: Hosted LoRA endpoint
Phase 4 — Eventually: Editor agents
Phase 5 — Aspirational: Active learning
How to help

Docs/AI in Nibiru/Roadmap

AI Roadmap

Where Nibiru's AI integration is going — the plan that gets us from RAG-grounded Oracle to a fine-tuned LoRA in production.

Stable Reading time ~ 2 min Edit on GitHub

Nibiru’s ambition: be the first PHP framework with a fine-tuned model trained on its own knowledge, served as a first-class part of the dev experience. This page tracks the steps.

Phase 1 — Today: RAG Oracle ✓

Section titled “Phase 1 — Today: RAG Oracle ✓”

Markdown chunker with H2/H3 boundaries.
OpenAI embeddings (text-embedding-3-small).
Vector index as a single JSON file.
Astro endpoint /api/oracle calling Claude with retrieved context.
Floating chat widget on every doc page.
Multilingual (EN/DE/JA/ES/FR) input + output.

Why first. RAG works without training, scales linearly with content size, and is dirt cheap. Every doc edit improves answer quality the same hour.

Phase 2 — Next: Public corpus + LoRA recipe

Section titled “Phase 2 — Next: Public corpus + LoRA recipe”

npm run build:corpus ships in main (instructions/chat/chunks JSONL).
Published Hugging Face dataset (nibiru-framework/docs-corpus).
Reference Axolotl YAML for Llama 3.1 8B.
Reference recipe for Qwen 2.5 7B and Mistral Nemo 12B.
Eval set: 200 hand-graded Nibiru questions with golden answers.

Why second. Once the corpus is reproducible from docs, anyone can train. We treat docs as the source of truth and the corpus as a derivative artifact.

Phase 3 — Then: Hosted LoRA endpoint

Section titled “Phase 3 — Then: Hosted LoRA endpoint”

Train a first-pass LoRA on the public corpus.
Serve via vLLM behind /api/oracle with a feature flag.
Side-by-side UI comparing Claude (RAG) vs LoRA (no RAG) vs LoRA + RAG.
Telemetry: which form does the user prefer per question type?

Why third. Side-by-side comparison reveals where the LoRA helps (idiomatic Nibiru style) and where it hurts (very long context, fresh edits not yet retrained).

Phase 4 — Eventually: Editor agents

Section titled “Phase 4 — Eventually: Editor agents”

PHPStorm plugin: highlight a controller, ask the Oracle to convert it to a module.
CLI agent: ./nibiru ask "rewrite this controller as a JSON endpoint".
PR review bot: explain Nibiru-specific deviations in pull requests on framework forks.

Phase 5 — Aspirational: Active learning

Section titled “Phase 5 — Aspirational: Active learning”

User feedback in the chat widget (👍 / 👎) writes a row to a private dataset.
Weekly review queue surfaces low-rated answers for human annotation.
Improved answers re-enter the corpus on the next training cycle.

How to help

Section titled “How to help”

Ask the Oracle hard questions and rate the answers.
Open issues on the GitHub repo for missing topics.
Contribute translations — every translated doc page is also a parallel corpus row.
Try a LoRA fine-tune on the published corpus and share results.

Previous
Training corpus (LoRA)

Was this page helpful?

Yes Suggest an edit