AI Roadmap
Where Nibiru's AI integration is going — the plan that gets us from RAG-grounded Oracle to a fine-tuned LoRA in production.
Nibiru’s ambition: be the first PHP framework with a fine-tuned model trained on its own knowledge, served as a first-class part of the dev experience. This page tracks the steps.
Phase 1 — Today: RAG Oracle ✓
Section titled “Phase 1 — Today: RAG Oracle ✓”- Markdown chunker with H2/H3 boundaries.
- OpenAI embeddings (
text-embedding-3-small). - Vector index as a single JSON file.
- Astro endpoint
/api/oraclecalling Claude with retrieved context. - Floating chat widget on every doc page.
- Multilingual (EN/DE/JA/ES/FR) input + output.
Why first. RAG works without training, scales linearly with content size, and is dirt cheap. Every doc edit improves answer quality the same hour.
Phase 2 — Next: Public corpus + LoRA recipe
Section titled “Phase 2 — Next: Public corpus + LoRA recipe”-
npm run build:corpusships inmain(instructions/chat/chunks JSONL). - Published Hugging Face dataset (
nibiru-framework/docs-corpus). - Reference Axolotl YAML for Llama 3.1 8B.
- Reference recipe for Qwen 2.5 7B and Mistral Nemo 12B.
- Eval set: 200 hand-graded Nibiru questions with golden answers.
Why second. Once the corpus is reproducible from docs, anyone can train. We treat docs as the source of truth and the corpus as a derivative artifact.
Phase 3 — Then: Hosted LoRA endpoint
Section titled “Phase 3 — Then: Hosted LoRA endpoint”- Train a first-pass LoRA on the public corpus.
- Serve via vLLM behind
/api/oraclewith a feature flag. - Side-by-side UI comparing Claude (RAG) vs LoRA (no RAG) vs LoRA + RAG.
- Telemetry: which form does the user prefer per question type?
Why third. Side-by-side comparison reveals where the LoRA helps (idiomatic Nibiru style) and where it hurts (very long context, fresh edits not yet retrained).
Phase 4 — Eventually: Editor agents
Section titled “Phase 4 — Eventually: Editor agents”- PHPStorm plugin: highlight a controller, ask the Oracle to convert it to a module.
- CLI agent:
./nibiru ask "rewrite this controller as a JSON endpoint". - PR review bot: explain Nibiru-specific deviations in pull requests on framework forks.
Phase 5 — Aspirational: Active learning
Section titled “Phase 5 — Aspirational: Active learning”- User feedback in the chat widget (👍 / 👎) writes a row to a private dataset.
- Weekly review queue surfaces low-rated answers for human annotation.
- Improved answers re-enter the corpus on the next training cycle.
How to help
Section titled “How to help”- Ask the Oracle hard questions and rate the answers.
- Open issues on the GitHub repo for missing topics.
- Contribute translations — every translated doc page is also a parallel corpus row.
- Try a LoRA fine-tune on the published corpus and share results.