Zerostack – A Unix-inspired coding agent written in pure Rust
565 points • 2 days ago
zerostack is a minimalistic coding agent written in Rust, designed for low memory usage and high performance. It supports multiple AI providers, including OpenRouter, OpenAI, Anthropic, Gemini, and Ollama, as well as custom providers. The tool offers file operations, bash execution with permission controls, session management, and a terminal UI with Markdown rendering. The codebase is around 7,000 lines; it uses roughly 8 MB of RAM for an empty session and about 12 MB during work, and the binary is 8.9 MB.
The permission system includes four modes ranging from restrictive to permissive, with configurable per-tool patterns and session allowlists. Doom-loop detection stops the agent from repeating identical tool calls. The agent features a prompts system with built-in modes such as code, plan, review, debug, ask, brainstorm, frontend-design, review-security, and simplify, which can be switched at runtime. Custom prompts can be added via Markdown files, and the agent automatically loads AGENTS.md or CLAUDE.md from the project directory.
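To make the doom-loop idea concrete, here is a minimal sketch of one way such a check could work. It is not taken from zerostack's source: the `ToolCall` and `DoomLoopDetector` names, the string-based argument comparison, and the "N identical calls in a row" rule are all assumptions made for illustration.

```rust
use std::collections::VecDeque;

/// A tool call reduced to the fields compared for repetition (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolCall {
    pub tool: String,
    /// Serialized arguments, e.g. a JSON string.
    pub args: String,
}

/// Flags a "doom loop" when the last `threshold` calls are all identical.
/// This rule is an assumption; a real agent may use a different heuristic.
pub struct DoomLoopDetector {
    recent: VecDeque<ToolCall>,
    threshold: usize,
}

impl DoomLoopDetector {
    pub fn new(threshold: usize) -> Self {
        assert!(threshold >= 2, "threshold must be at least 2");
        Self {
            recent: VecDeque::with_capacity(threshold),
            threshold,
        }
    }

    /// Records a call and returns true if it completes a run of
    /// `threshold` consecutive identical calls.
    pub fn record(&mut self, call: ToolCall) -> bool {
        // Keep only the most recent `threshold` calls.
        if self.recent.len() == self.threshold {
            self.recent.pop_front();
        }
        self.recent.push_back(call);
        self.recent.len() == self.threshold
            && self.recent.iter().all(|c| c == &self.recent[0])
    }
}
```

With a threshold of 3, the third identical `bash` call in a row would be flagged, while any different call in between resets the run. What the agent does on detection (abort the turn, ask the user, inject a hint) is a separate design choice not covered here.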
zerostack includes an experimental loop system for long-horizon tasks, in which the agent iteratively works through a plan until completion. It also integrates git worktrees for branch-per-task workflows, letting users create, work in, merge, and exit worktrees from the chat UI. The tool supports MCP servers for extended tooling and includes integrated Exa search for web search and fetching. Installation requires Cargo and git; an optional sandbox mode uses bubblewrap to isolate command execution.
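The "iterate through a plan until completion" behavior can be sketched as a bounded loop. The code below is an illustration only, not zerostack's implementation: `run_plan_loop`, `LoopOutcome`, the iteration budget, and the `attempt` callback (standing in for one agent turn) are hypothetical names and assumptions.

```rust
/// Outcome of a bounded plan loop (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum LoopOutcome {
    /// Every step finished after this many agent turns.
    Completed { iterations: usize },
    /// The budget ran out with this many steps still unfinished.
    BudgetExhausted { remaining: usize },
}

/// Repeatedly works on the first unfinished step until the plan is done
/// or `max_iterations` turns have been used. `attempt` stands in for one
/// agent turn and returns true when the step it was given is finished.
pub fn run_plan_loop<F>(steps: &[&str], max_iterations: usize, mut attempt: F) -> LoopOutcome
where
    F: FnMut(&str) -> bool,
{
    let mut done = vec![false; steps.len()];
    for iteration in 0..max_iterations {
        // Find the first step that is not finished yet.
        let next = match done.iter().position(|d| !*d) {
            Some(index) => index,
            None => return LoopOutcome::Completed { iterations: iteration },
        };
        if attempt(steps[next]) {
            done[next] = true;
        }
    }
    let remaining = done.iter().filter(|d| !**d).count();
    if remaining == 0 {
        LoopOutcome::Completed { iterations: max_iterations }
    } else {
        LoopOutcome::BudgetExhausted { remaining }
    }
}
```

The iteration budget is the important design choice in this sketch: without an upper bound, a step the agent can never finish would keep the loop running indefinitely.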
307 comments
• Building a custom agent in Rust offers a deep learning experience and performance benefits, though challenges remain around enabling self-mutating tools without arbitrary code execution, leading some to experiment with embedding Deno or defining binary APIs for tool execution.
• Rust's compiled nature limits runtime scriptability compared to TypeScript-based agents, so alternatives like Rhai (a lightweight, embeddable scripting language) are suggested for safe, LLM-friendly customization without heavy runtimes.
• Zerostack achieves a minimal memory footprint (~8–12MB) through Rust's efficiency, stack-allocated data structures (`smallvec`, `compactstring`), LTO optimizations, and a single-threaded async runtime, not by limiting context window size.
• Context window size does not directly affect local RAM usage when the model is hosted remotely, since the model and its KV cache then reside on the server; local memory is dominated by application logic, not token storage.
• The choice of panic strategy (`abort` vs. `unwind`) involves trade-offs: `abort` reduces binary size and avoids complex unwinding states but sacrifices debuggability, while `unwind` provides stack traces at the cost of binaries that are about 50KB larger.
• Lightweight agents like Zerostack and nanoin (under 200 lines) demonstrate that effective coding agents can be minimal, fast-starting, and low-memory, in sharp contrast to bloated tools like Claude Code that consume gigabytes of memory.
• Agent design philosophies vary: some favor integrated features like git worktrees and Ralph Wiggum loops for UX, while others argue orchestration should remain separate from the executor layer.
• Skills and prompt templates offer extensibility, but Zerostack opts for a simpler prompt library system that replaces entire system prompts via `.md` files, reducing complexity while preserving flexibility.
• There's growing demand for open-source, company-independent, minimalistic coding agents that run locally, with features like Ollama support, permission modes, and efficient resource use being key differentiators.
• JetBrains is seen as uniquely positioned to build deeply integrated IDE agents with access to rich code indexes, yet has been slow to act on this opportunity despite clear user interest.
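The prompt-library point in the list above (a `.md` file replaces the entire system prompt) can be sketched as a simple lookup. This is an assumption-laden illustration rather than zerostack's actual code: the `<mode>.md` naming scheme, the `resolve_system_prompt` and `builtin_prompt` functions, and the placeholder prompt texts are all hypothetical.

```rust
use std::fs;
use std::path::Path;

/// Built-in prompts keyed by mode name. The texts are placeholders,
/// not the prompts shipped with any real tool.
fn builtin_prompt(mode: &str) -> Option<&'static str> {
    match mode {
        "code" => Some("You are a coding assistant."),
        "plan" => Some("You produce step-by-step plans."),
        _ => None,
    }
}

/// Resolves the system prompt for `mode`. A `<mode>.md` file in
/// `prompt_dir` replaces the built-in prompt entirely (no merging);
/// otherwise the built-in prompt is used, if one exists.
pub fn resolve_system_prompt(prompt_dir: &Path, mode: &str) -> Option<String> {
    let custom = prompt_dir.join(format!("{mode}.md"));
    match fs::read_to_string(&custom) {
        Ok(text) => Some(text),
        Err(_) => builtin_prompt(mode).map(str::to_owned),
    }
}
```

Whole-file replacement keeps the lookup trivial: there is no template syntax or merge order to reason about, at the cost of users having to copy any built-in text they want to keep.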
The discussion reflects a strong community drive toward lightweight, transparent, and user-controlled coding agents, with Rust emerging as the preferred language for performance and safety. Memory efficiency, minimal dependencies, and avoidance of corporate control are recurring priorities, especially as users grow frustrated with the bloat and opacity of mainstream tools like Claude Code. Design trade-offs, such as embedded scripting vs. compilation, panic handling, and feature integration, are debated pragmatically, often grounded in real-world constraints like old hardware or security concerns. While no consensus exists on architecture, there's broad agreement that agent harnesses should empower users rather than abstract control away from them, and that simplicity, auditability, and local execution are critical for trust and usability.