
Running local LLMs with llama.cpp
A practical llama.cpp setup note covering CUDA builds, server commands, MoE tuning flags, and benchmarking local LLM performance.

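As a starting point, the typical flow the note covers can be sketched as a CUDA-enabled build followed by launching the bundled HTTP server. This is a minimal sketch, not the note's exact commands: the model path is a placeholder, and flag values (context size, GPU layer count, port) are illustrative assumptions you would tune for your hardware.

```shell
# Clone and build llama.cpp with CUDA support (requires the CUDA toolkit).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON     # enable the CUDA backend
cmake --build build --config Release -j

# Serve a local GGUF model over HTTP.
# model path, -ngl (layers offloaded to GPU), -c (context), and --port
# are placeholder values -- adjust for your model and VRAM.
./build/bin/llama-server \
  -m /path/to/model.gguf \
  -ngl 99 \
  -c 8192 \
  --port 8080
```

Once the server is up, it exposes an OpenAI-compatible `/v1/chat/completions` endpoint on the chosen port, which is convenient for benchmarking with existing client tooling.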