

Got the #qwen3coder 30B #LLM to work on the #Framework Desktop. I tested it with prompts like "Write an Angular login page to an OAuth IdP server. Use best practices. Use HttpOnly cookies." and a few others. It thinks for a few seconds and writes the code in about 30 seconds. The code looked okay. I'm satisfied with what I got.

And it's quiet - even when thinking! The monster multi-GPU AI machines look outdated when this costs about the price of a single 5090.

I followed github.com/pablo-ross/strix-ha… to install #llamacpp, with a few changes:
- Skipped the kernel update, since Ubuntu 25.10 already ships a newer kernel.
- Got a sudo group error from distrobox, so I removed --group-add sudo from the distrobox create command.
- Tweaked the run parameters by trial and error. It looks like I can increase the context size a lot.
Current command to run Qwen (-ngl 99 offloads all layers to the GPU, --no-mmap loads the model fully into memory instead of memory-mapping the file, and -n caps the number of generated tokens):
```
distrobox enter llama-rocm-7rc-rocwmma -- ~/llama.cpp/build/bin/llama-cli -m ~/models/qwen3-coder-30B-A3B/BF16/Qwen3-Coder-30B-A3B-Instruct-BF16-00001-of-00002.gguf --no-mmap -ngl 99 --ctx_size 16384 -n 20000
```
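To sanity-check how much room there is to raise the context size, here's a rough back-of-the-envelope KV-cache estimate. The model constants below are my assumptions for Qwen3-Coder-30B-A3B (48 layers, 4 KV heads, head dim 128 per its config.json; verify against your copy), with the cache kept in 16-bit like the BF16 weights:

```python
# Rough KV-cache memory estimate vs. --ctx_size.
# Model constants are assumptions for Qwen3-Coder-30B-A3B; check config.json.
LAYERS = 48      # num_hidden_layers
KV_HEADS = 4     # num_key_value_heads (GQA)
HEAD_DIM = 128   # head_dim
BYTES = 2        # 16-bit K/V cache entries

def kv_cache_bytes(ctx: int) -> int:
    # K and V each store KV_HEADS * HEAD_DIM values per layer per token.
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES * ctx

for ctx in (16384, 65536, 131072):
    print(f"{ctx:>7} tokens -> {kv_cache_bytes(ctx) / 2**30:.1f} GiB")
# 16384 tokens -> 1.5 GiB
```

So the cache grows linearly: on a machine with this much unified memory, even a 128K context only adds on the order of 12 GiB on top of the weights, which matches my impression that the context size can go up a lot.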
#homelab #AI