Training nibiru-coder

How to register a Nibiru-flavoured chat model on your own Ollama. One Modelfile, one shell script, sixty seconds.

Stable Reading time ~ 3 min Edit on GitHub

The framework’s default chat model is nibiru-coder:1.0 — a Nibiru-flavoured Qwen 2.5 Coder 14B that you register on your Ollama server. The training pipeline lives in application/module/ai/training/.

What `nibiru-coder` is (and isn’t)

nibiru-coder:1.0 is not a LoRA fine-tune. It’s the same qwen2.5-coder:14b weights wrapped with a baked-in system prompt that:

explains MMVC, modules, the dispatcher, the singletons,
enforces Nibiru’s conventions (pageAction, navigationAction, View::assign, Form::create, the spelling of Pageination),
pushes the model toward Nibiru-idiomatic answers instead of generic Laravel / Symfony advice.

System-prompt customisation runs instantly — no GPU training, no dataset preparation. It gives roughly 80 % of the value of a real LoRA at zero training cost. When you have budget for a real fine-tune, see Real LoRA path below.

Build it

./application/module/ai/training/build.sh           # builds nibiru-coder:1.0
./application/module/ai/training/build.sh 1.1       # bump tag for iterations

The script:

Reads the Modelfile next to it.
POSTs to ${OLLAMA_BASE_URL}/api/create (default https://your-ollama-host.example).
Runs a smoke-test chat call to confirm the new tag responds.

After it succeeds, set the model in application/module/ai/settings/ai.ini:

[AI]
chat.model          = "nibiru-coder:1.0"
chat.fallback_model = "qwen2.5-coder:14b"

…and every \Nibiru\Module\Ai\Ai instance in your app talks to it. The fallback ensures nothing breaks if you’ve not built the tag yet.

Iterate on the system prompt

The Modelfile’s SYSTEM """ ... """ block is the lever. Tighten the conventions, add new examples, add citations to specific framework files. Re-run build.sh with a new tag (1.1, 1.2, …) and A/B against the previous tag in your app.

./application/module/ai/training/build.sh 1.1
# Edit ai.ini → chat.model = "nibiru-coder:1.1"
# Compare answers in the Oracle widget or via smoke-test.php

Real LoRA path

When you want a model whose weights know Nibiru — not just its system prompt — the corpus exporter has you covered.

cd docs
npm run build:corpus

Outputs JSONL files under dist/corpus/:

File	Format	Use
`chat.jsonl`	sharegpt-style messages	Axolotl, LLaMA-Factory, Unsloth
`instructions.jsonl`	instruction/input/output	Alpaca-style trainers
`completion.jsonl`	prompt/completion	Legacy text-completion fine-tunes
`chunks.jsonl`	chunk metadata	RAG / evaluation set construction

A pragmatic recipe for an 8B base on a single A100 / 4090:

base_model: meta-llama/Llama-3.1-8B-Instruct
adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules: [q_proj, k_proj, v_proj, o_proj]

datasets:
  - path: docs/dist/corpus/chat.jsonl
    type: sharegpt

sequence_len: 4096
sample_packing: true
gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_bnb_8bit
learning_rate: 0.0002
warmup_ratio: 0.05
bf16: true

After training:

Convert the LoRA to GGUF (llama.cpp’s convert_hf_to_gguf.py).
Build an Ollama Modelfile with FROM ./your-lora.gguf.
./build.sh 2.0 registers it as nibiru-coder:2.0.

The framework code doesn’t change — flip chat.model in ai.ini and you’re on the new weights.

Smoke test

php application/module/ai/training/smoke-test.php

Verifies:

The Ollama server is reachable.
The model responds to a single-turn ask.
Multi-turn conversation context works.
Embeddings work (skipped with a clear message if nomic-embed-text isn’t pulled).

Run after every Modelfile change, before deploying.

Common pitfalls

Modelfile system prompt too long. Some Ollama versions cap system prompts. Keep it under ~3000 tokens.
Forgetting to pull the FROM model. qwen2.5-coder:14b must already be on the server. curl ${OLLAMA_BASE_URL}/api/tags to check.
Tag collisions. Re-running build.sh 1.0 overwrites the existing nibiru-coder:1.0. Use new tags for iteration; pin specific tags in ai.ini for production.
--no-stream confusion. The build script uses stream: false so the response comes back as one JSON. If you change to streamed, parse line-by-line.