Two-stage embedding router that picks the right CLI from hundreds of tools in ~36 ms with ~150 tokens. Open source, multilingual, no fine-tuning required.
Source on GitHub · MIT licensed · Python 3.10+
Use the router from a Python REPL, a notebook, or your own tool. You type an intent and get back a structured tool-call object: CLI name, params, confidence. You run the command yourself. No magic.
Inside an agent loop the router still returns a structured object, but the user never sees raw JSON. The agent reads the choices, picks one, runs the tool, and reports back in plain language. No prompt-stuffing, no token blowup.
Install the package, build the index, route your first intent. Three steps, about five minutes.
# pip install clibrary-hub
# clibrary-build-index --manifest-dir ./manifests
from clibrary_hub import router
result = router.route("convert demo.mp4 to a 10-fps gif")
print(result["cli"]) # 'video-to-gif'
print(result["params"]) # {'input': 'demo.mp4', ...}
print(result["confidence"]) # 0.94

from clibrary_hub import validator
r = validator.validate_file("manifests/data/sql-runner.json")
if not r.ok:
    for e in r.errors:
        print("ERROR:", e)
for w in r.warnings:
    print("WARN: ", w)

# Or validate every manifest under a tree:
results = validator.validate_dir("manifests/")

from clibrary_hub import router
import subprocess

# llm_pick, flatten, and chat_history are placeholders that your agent supplies.
def handle_user(message: str):
    plan = router.route(message)
    if plan["action"] == "clarify":
        # Let your LLM choose from plan["choices"]
        plan = llm_pick(plan["choices"], chat_history)
    # Build an argv list (no shell) from the chosen CLI and its params
    cmd = [plan["cli"]] + flatten(plan["params"])
    return subprocess.run(cmd, check=True)

WHY EMBEDDING ROUTING
When an agent has 100+ tools, the LLM has to read every description before picking one. Token cost and latency go up, and accuracy goes down.
PERFORMANCE
Evaluated against 2,050 queries spanning 559 CLI tools (CLIbrary + MCP). Numbers from the latest router build (clibrary_top3 strategy, multilingual-e5-base).
| EVAL SET | SIZE | TOP-1 | TOP-3 |
|---|---|---|---|
| in_domain | 500 | 86.8% | 88% |
| paraphrase | 1,500 | 81.8% | 83.4% |
| adversarial | 204 | 84.3% | 100% |
Adversarial set hardened by 3 rounds of intent-trigger patching. No model fine-tuning involved.
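For readers unfamiliar with the column headers, TOP-1 and TOP-3 can be computed as in the sketch below. The function name and the toy rankings are illustrative assumptions (`image-resize` and `pdf-merge` are made-up tool names), and this is not the project's evaluation harness.

```python
from typing import Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Fraction of queries whose gold CLI appears among the first k candidates."""
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for candidates, g in zip(ranked, gold) if g in candidates[:k])
    return hits / len(gold)


# Toy data, made up for illustration: one ranked candidate list per query.
ranked = [
    ["video-to-gif", "image-resize", "pdf-merge"],
    ["pdf-merge", "sql-runner", "video-to-gif"],
    ["image-resize", "pdf-merge", "sql-runner"],
    ["sql-runner", "pdf-merge", "image-resize"],
]
gold = ["video-to-gif", "sql-runner", "video-to-gif", "sql-runner"]

top1 = top_k_accuracy(ranked, gold, k=1)
top3 = top_k_accuracy(ranked, gold, k=3)
print(f"top-1 {top1:.1%}, top-3 {top3:.1%}")
```

TOP-3 can never be lower than TOP-1 on the same set, since a hit at rank 1 is also a hit within the first three.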
HOW IT WORKS
Stage 1 retrieves candidate CLIs via FAISS. Stage 2 matches the closest example template. ~80% of queries skip the LLM entirely.
Stage 1: each CLI's intent_triggers are embedded and mean-pooled into a single vector. FAISS returns the top-3 candidates, which are then re-ranked using MaxSim over the trigger_index.
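The retrieval step above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not the router's code: a brute-force dot product stands in for the FAISS index, MaxSim is read here as the highest cosine similarity between the query and any single trigger of a candidate, the 3-dimensional vectors are hand-written stand-ins for real embeddings, and all function names (plus the tools `image-resize` and `pdf-merge`) are hypothetical.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def build_cli_vectors(triggers: dict[str, np.ndarray]) -> tuple[list[str], np.ndarray]:
    """Mean-pool each CLI's trigger embeddings into one unit vector per CLI."""
    names = sorted(triggers)
    pooled = np.stack([l2_normalize(triggers[n]).mean(axis=0) for n in names])
    return names, l2_normalize(pooled)


def retrieve_and_rerank(query, names, cli_matrix, triggers, k=3):
    """Coarse top-k over pooled vectors, then re-rank by best single trigger."""
    q = l2_normalize(np.asarray(query, dtype=float))
    coarse = cli_matrix @ q               # stand-in for a FAISS inner-product search
    top = np.argsort(-coarse)[:k]         # indices of the k highest coarse scores
    reranked = [
        (names[i], float((l2_normalize(triggers[names[i]]) @ q).max()))
        for i in top
    ]
    return sorted(reranked, key=lambda pair: pair[1], reverse=True)


# Toy trigger embeddings (hypothetical values, one row per trigger phrase).
triggers = {
    "video-to-gif": np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]),
    "sql-runner": np.array([[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]),
    "pdf-merge": np.array([[0.0, 0.0, 1.0], [0.1, 0.0, 0.9]]),
    "image-resize": np.array([[0.6, 0.0, 0.8]]),
}
names, cli_matrix = build_cli_vectors(triggers)
ranked = retrieve_and_rerank([0.95, 0.05, 0.0], names, cli_matrix, triggers)
print(ranked)
```

Pooling keeps the coarse index at one vector per CLI, while the re-rank lets a single closely matching trigger win even when a CLI's other triggers are phrased very differently.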
Stage 2: find the closest example.query within the winning CLI. If similarity ≥ 0.85, fill the template (Path A, no LLM). Otherwise, ask a small LLM to extract params (Path B).
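The Path A / Path B split can be sketched as follows. Only the 0.85 threshold comes from the text; the example record layout (`vec`, `params`), the function names, and the `llm_extract` callback are assumptions made for illustration. Slot filling from the query text is omitted, so Path A simply copies the matched example's params.

```python
from typing import Callable

import numpy as np

PATH_A_THRESHOLD = 0.85  # similarity cut-off quoted in the text


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def extract_params(query_vec: np.ndarray, examples: list[dict],
                   llm_extract: Callable[[], dict]) -> dict:
    """Use the closest example's template when similar enough, else fall back."""
    best = max(examples, key=lambda ex: cosine(query_vec, ex["vec"]))
    similarity = cosine(query_vec, best["vec"])
    if similarity >= PATH_A_THRESHOLD:
        # Path A: no LLM call; reuse the matched example's params.
        return {"path": "A", "params": dict(best["params"]), "similarity": similarity}
    # Path B: hand the query to a (hypothetical) small-LLM extractor.
    return {"path": "B", "params": llm_extract(), "similarity": similarity}


# Toy 2-d vectors standing in for real query embeddings.
examples = [
    {"vec": np.array([1.0, 0.0]), "params": {"input": "demo.mp4", "fps": 10}},
    {"vec": np.array([0.0, 1.0]), "params": {"input": "demo.mp4", "fps": 15}},
]


def stub_llm() -> dict:
    return {"input": "clip.mov", "fps": 12}


near = extract_params(np.array([0.99, 0.05]), examples, stub_llm)
far = extract_params(np.array([1.0, 1.0]), examples, stub_llm)
print(near["path"], far["path"])
```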
If top-1 confidence is low and the gap to top-2 is small, return all three candidates and let the agent ask the user. Reduces silent mis-routes to near zero.
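A minimal sketch of that clarification rule is shown below. The `action`, `choices`, `cli`, and `confidence` keys mirror the snippets earlier on this page; the `"route"` action value and both thresholds (`min_conf`, `min_gap`) are hypothetical, since the text gives no numbers for them.

```python
def decide(candidates: list[tuple[str, float]],
           min_conf: float = 0.60, min_gap: float = 0.05) -> dict:
    """Route directly, or return up to three choices when the winner is unclear."""
    if not candidates:
        raise ValueError("need at least one candidate")
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    top_name, top_score = ranked[0]
    # With a single candidate there is no runner-up, so the gap is unbounded.
    gap = top_score - ranked[1][1] if len(ranked) > 1 else float("inf")
    if top_score < min_conf and gap < min_gap:
        return {"action": "clarify", "choices": [name for name, _ in ranked[:3]]}
    return {"action": "route", "cli": top_name, "confidence": top_score}


unsure = decide([("pdf-merge", 0.53), ("video-to-gif", 0.55), ("sql-runner", 0.20)])
sure = decide([("video-to-gif", 0.94), ("sql-runner", 0.40)])
print(unsure)
print(sure)
```

Requiring both conditions means a low but clearly separated winner is still routed, and a close race between two high-confidence candidates is too; tightening either threshold trades more questions to the user for fewer silent mis-routes.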
AVAILABLE NOW
Every tool is pure offline — no LLM dependency, no cloud credentials. Each ships with manifest, implementation, README, and pinned requirements.
Browse the registry on GitHub: clone the repo, run clibrary-build-index --manifest-dir ./manifests, and the router picks them up.
EXTEND · USER-AUTHORED TOOLS
Other tool routers ship a fixed catalog. clibrary-hub ships a format. Anyone can describe a CLI in 10 minutes and the router picks it up.
Write intent_triggers in any language your users speak
Validate with clibrary_hub.validator, no PR needed
{
"name": "video-to-gif",
"category": "media",
"description": "Convert a video file to an animated GIF",
"intent_triggers": [
"convert mp4 to gif",
"turn this video into a gif",
"export the first 5 seconds as a gif",
"把影片轉成 gif",
"幫我做一個動圖"
],
"input_schema": {
"input": { "type": "string", "required": true },
"fps": { "type": "integer", "default": 15 }
},
"examples": [...] // 3-5 templates
}
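To show how the input_schema fields in a manifest like this one might be consumed, here is a small sketch that checks required params and fills in defaults. It is an assumption about usage, not the behavior of clibrary_hub: the `apply_schema` helper is hypothetical, and the manifest is trimmed to the fields the helper reads.

```python
import json

# Trimmed copy of the manifest fields used here (examples omitted).
MANIFEST = json.loads("""
{
  "name": "video-to-gif",
  "input_schema": {
    "input": { "type": "string", "required": true },
    "fps": { "type": "integer", "default": 15 }
  }
}
""")


def apply_schema(params: dict, schema: dict) -> dict:
    """Reject missing required params, then fill in declared defaults."""
    missing = [key for key, spec in schema.items()
               if spec.get("required") and key not in params]
    if missing:
        raise ValueError(f"missing required params: {missing}")
    filled = {key: spec["default"] for key, spec in schema.items() if "default" in spec}
    filled.update(params)  # caller-supplied values override defaults
    return filled


with_default = apply_schema({"input": "demo.mp4"}, MANIFEST["input_schema"])
overridden = apply_schema({"input": "demo.mp4", "fps": 10}, MANIFEST["input_schema"])
print(with_default, overridden)
```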
10 categories · 504 manifests so far · MIT-licensed registry