NemoClaw with Local Inference
How It Works
OpenClaw agent
↓
https://inference.local
↓ (intercepted by OpenShell proxy at 10.200.0.1:3128)
OpenShell gateway (privacy router — injects credentials, rewrites model)
↓
llama-server on host (your GPU, your model)
↓
Qwen3.5-35B-A3B (MoE, 3B active params per token)

Prerequisites
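The flow above can be exercised with a single request once everything is running. This is a sketch, not a verified command: the proxy address, hostname, and model name are taken from the diagram, and /v1/chat/completions is the OpenAI-compatible endpoint that llama-server exposes; adjust all three to your setup.

```shell
# Send a request the way the OpenClaw agent does: route it through the
# OpenShell proxy (10.200.0.1:3128, from the diagram), which intercepts
# the agent-facing hostname and forwards to llama-server on the host.
curl -x http://10.200.0.1:3128 https://inference.local/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.5-35B-A3B",
    "messages": [{"role": "user", "content": "Say hello."}]
  }'
```

A JSON chat completion in the response confirms that the proxy interception, credential injection, and the llama-server backend are all wired up; a DNS or connection error at this step points to the Known Issues section below.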
Part 1 — Install NemoClaw

Part 2 — Build llama.cpp
Part 3 — Download the Model
Part 4 — Start llama-server
Part 5 — Register the Local Inference Provider
Part 6 — Configure OpenClaw
Part 7 — Test
Performance
Known Issues and Workarounds
Tool call parse error: Failed to parse input at pos N: <tool_call>
inference.local returns DNS resolution error
openshell inference set times out during verification
openshell policy set fails with "filesystem policy cannot be removed"
openclaw config set inference.* fails with "Unrecognized key"
Provider base URL with 127.0.0.1 or localhost does not work
Configuration Reference
llama-server flags
openclaw.json provider block
cmake flags by GPU architecture
Quick Reference