Bun Rust rewrite: "codebase fails basic miri checks, allows for UB in safe rust"
484 points
• 3 days ago
A user named AwesomeQubic opened an issue on the Bun runtime's GitHub repository, alleging that the entire Rust codebase fails even the most basic Miri checks and allows for undefined behavior (UB) in safe Rust. The reporter provided a specific code example involving `PathString::init`, where a dangling reference is created because the function takes a `&[u8]` with an implicit lifetime but erases it, returning a `Self` that is effectively `'static`. This allows for use-after-free scenarios, as demonstrated by creating a `Box`, initializing a `PathString` from it, dropping the `Box`, and then attempting to print the slice, which Miri flags as UB due to a lack of provenance.
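The shape of the reported hole can be sketched in a few lines. This is a minimal illustration, not Bun's actual code: `ErasedPath` and `BorrowedPath` are hypothetical names, and the pointer-plus-length layout is an assumption made for the example. The first type drops the lifetime of its input in a safe constructor; the second keeps it, so the borrow checker can reject a use after the buffer is freed.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the reported pattern: a pointer and a length,
/// with no lifetime tying the value to the buffer it points into.
pub struct ErasedPath {
    ptr: *const u8,
    len: usize,
}

impl ErasedPath {
    /// Unsound as a safe fn: the borrow of `bytes` ends when this returns,
    /// but the raw pointer stored in `Self` lives on with no lifetime attached.
    pub fn init(bytes: &[u8]) -> Self {
        ErasedPath { ptr: bytes.as_ptr(), len: bytes.len() }
    }

    pub fn slice(&self) -> &[u8] {
        // Nothing in this type proves the buffer is still alive here.
        // Safe callers can drop the buffer first and then call `slice`,
        // which is the use-after-free described in the report.
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

/// One sound alternative: carry the input lifetime in the type.
pub struct BorrowedPath<'a> {
    ptr: *const u8,
    len: usize,
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> BorrowedPath<'a> {
    pub fn init(bytes: &'a [u8]) -> Self {
        BorrowedPath { ptr: bytes.as_ptr(), len: bytes.len(), _borrow: PhantomData }
    }

    pub fn slice(&self) -> &'a [u8] {
        // SAFETY: `ptr` and `len` came from a `&'a [u8]`, and the type
        // cannot outlive `'a`, so the memory is still valid and borrowed.
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}
```

With `ErasedPath`, code that drops the `Box` and then calls `slice()` compiles without any `unsafe` at the call site. With `BorrowedPath`, the same sequence is a compile error because the buffer is still borrowed.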
The issue sparked significant community reaction, with many expressing frustration over the presence of such fundamental memory safety bugs in a project that leverages Rust for its performance and safety guarantees. Commenters like JavaDerg elaborated on the severity, noting that Rust's safety model relies on strong assumptions, and UB can cause unpredictable issues in unexpected places, effectively nullifying the advantages of using Rust. The discussion also touched on the role of AI coding assistants, with the original reporter suggesting that "vibe coding" with AI leads to these mistakes and recommending the hiring of experienced Rust developers.
In response, a collaborator named robobun confirmed that the bug reproduces and linked a pull request (#30728) intended to fix it. The fix marks `PathString::init` and a similar hole in `dir_iterator::next()` as `unsafe fn` with documented outlives contracts, audits approximately 70 in-tree call sites and adds a SAFETY comment at each one, and adds a regression test. robobun noted that while the diff itself was green, CI was flaky on unrelated lanes due to pre-existing WebKit/GC issues.
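The style of fix described above, an `unsafe fn` with a documented contract plus a SAFETY comment at each call site, might look roughly like the following. This is a sketch under assumptions, not the contents of the pull request: `PathRef` and `file_name_len` are hypothetical names invented for the example.

```rust
/// Hypothetical lifetime-erased path handle (not Bun's real type).
pub struct PathRef {
    ptr: *const u8,
    len: usize,
}

impl PathRef {
    /// # Safety
    ///
    /// The memory behind `bytes` must stay valid and unmodified for as long
    /// as the returned value, or any slice obtained from it, is in use.
    pub unsafe fn init(bytes: &[u8]) -> Self {
        PathRef { ptr: bytes.as_ptr(), len: bytes.len() }
    }

    pub fn slice(&self) -> &[u8] {
        // SAFETY: whoever called `init` promised the buffer outlives `self`.
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

/// Example call site: the obligation is discharged locally and written down.
pub fn file_name_len(path: &[u8]) -> usize {
    // SAFETY: `p` is dropped at the end of this function, while `path` is
    // borrowed for the whole call, so the buffer outlives `p`.
    let p = unsafe { PathRef::init(path) };
    p.slice()
        .rsplit(|&b| b == b'/')
        .next()
        .map_or(0, |name| name.len())
}
```

The design trade-off is that the type stays lifetime-free, which keeps a 1:1 port simple, but every caller now has to state why the buffer lives long enough instead of the compiler checking it.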
Several other pull requests aimed at reducing unsafe usage across the codebase also appeared during the conversation, such as replacing unsafe blocks in `ArrayHashMap` with safe equivalents and rewriting `DynamicBitSet` on top of `Box<[usize]>`. However, the thread became increasingly off-topic and contentious, with some users debating the merits of Zig versus Rust and others criticizing the project's reliance on AI-generated code. One user ran a grep command to highlight the scale of the problem, finding over 13,000 instances of `unsafe` in Rust files. Eventually, the repository maintainers locked the issue as off-topic and limited further conversation to collaborators.
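To give a sense of what a bit set built on `Box<[usize]>` involves, here is a minimal sketch that needs no `unsafe` at all. It is an assumption-laden illustration, not the code from those pull requests: the `BitSet` name and its methods are hypothetical.

```rust
/// Number of bits stored in each word.
const BITS: usize = usize::BITS as usize;

/// Hypothetical fixed-capacity bit set backed by a boxed slice of words.
pub struct BitSet {
    words: Box<[usize]>,
    len: usize,
}

impl BitSet {
    /// Creates a set that can hold `len` bits, all initially clear.
    pub fn new(len: usize) -> Self {
        let word_count = (len + BITS - 1) / BITS;
        BitSet { words: vec![0usize; word_count].into_boxed_slice(), len }
    }

    /// Sets bit `i`; panics if `i` is out of range.
    pub fn set(&mut self, i: usize) {
        assert!(i < self.len, "bit index out of range");
        self.words[i / BITS] |= 1usize << (i % BITS);
    }

    /// Returns whether bit `i` is set; out-of-range indices read as clear.
    pub fn is_set(&self, i: usize) -> bool {
        i < self.len && (self.words[i / BITS] & (1usize << (i % BITS))) != 0
    }

    /// Counts the bits that are set.
    pub fn count(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }
}
```

Bounds-checked indexing replaces raw pointer arithmetic here; whether the extra checks matter for performance would depend on the real call sites.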
344 comments
• Using an LLM to translate Zig to unsafe Rust is questioned when deterministic tools like a Zig-to-C-to-Rust pipeline could have produced a more reliable, auditable result. The AI-generated code introduces new risks, as it is both memory-unsafe and unreviewed, making it less trustworthy than the original hand-written but unsafe code.
• Automated translation tools like c2rust produce semantically identical but highly unidiomatic and verbose Rust code that relies on unsafe blocks to emulate C pointer semantics. While this provides a functionally equivalent baseline, it offers no safety improvements and is difficult for humans to work with, similar to editing compiler-generated assembly.
• The Bun team's approach of a mostly 1:1 translation to unsafe Rust is seen as a necessary first step to enable incremental safety improvements. This method allows for easier review by comparing it to the original codebase and catching AI hallucinations, though it means the initial port retains all the soundness issues of the original Zig code.
• A key criticism is that the port introduced new undefined behavior (UB) not present in the original Zig code, specifically by marking unsafe functions as safe in the Rust API. This violates Rust's core promise that safe code cannot cause UB, undermining the primary benefit of migrating to Rust in the first place.
• The decision to merge a million lines of largely unreviewed, AI-generated code into the main branch is widely criticized as irresponsible, especially for a high-profile project like Bun. This bypasses standard code review processes and treats the community's trust carelessly, regardless of whether it's intended as a starting point.
• There is a significant asymmetry between the flashy announcement of an AI-driven rewrite and the subsequent, less-visible corrections and criticisms. This dynamic is exploited in marketing, where the initial claim of "memory-safe Rust" spreads widely, while the reality of a mostly unsafe, bug-ridden port receives far less attention.
• The Zig project's strict no-AI policy for contributions is defended as a practical necessity to maintain code quality and sustainable maintainer workload. Reviewing AI-generated PRs often takes more human effort than the contribution itself, especially when most are low-quality, making a blanket rejection policy reasonable for a small team.
• Some argue that the intense backlash is disproportionate and overlooks the fact that this is an early-stage port. The expectation of immediate perfection is unrealistic, and the Bun team has been clear that this is the first step in a longer process of incremental safety improvements.
• The Bun rewrite is viewed by some as a marketing stunt by Anthropic to showcase AI capabilities, rather than a genuine engineering effort. This perception is fueled by the timing after Anthropic's acquisition of Bun and the lack of a detailed blog post explaining the long-term plan, leading to accusations of a "rug pull" on users.
• The incident raises broader questions about the value of software engineering labor if AI can port a million-line codebase in a week. It challenges the industry to reconsider what is truly economically valuable and whether the hype around AI coding aligns with ground-truth utility and maintainability.
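The comment in the list above about transpiler-style output can be made concrete with a small contrast. Both functions below are hand-written for illustration: `sum_raw` imitates the general style of mechanically translated C (it is not actual c2rust output), and `sum_slice` is an idiomatic equivalent. The function names are hypothetical.

```rust
/// Transpiler-style: raw pointer, explicit length, manual index loop.
///
/// # Safety
///
/// `data` must point to at least `len` readable, initialized `i32` values.
pub unsafe fn sum_raw(data: *const i32, len: usize) -> i32 {
    let mut total: i32 = 0;
    let mut i: usize = 0;
    while i < len {
        // SAFETY: `i < len`, and the caller guarantees `len` valid elements.
        total = total.wrapping_add(unsafe { *data.add(i) });
        i += 1;
    }
    total
}

/// Idiomatic: the slice carries its own length and lifetime, so no `unsafe`.
pub fn sum_slice(data: &[i32]) -> i32 {
    data.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}
```

The two compute the same result, but only the second lets the compiler check bounds and lifetimes, which is why a mechanical translation alone does not add safety.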
The discussion reveals a deep divide between those who see the Bun rewrite as a reckless, marketing-driven stunt that disrespects users and undermines Rust's safety guarantees, and those who view it as a pragmatic, if messy, first step in a long-term migration strategy enabled by AI. Critics emphasize the irresponsibility of merging unreviewed code and the introduction of new undefined behavior, while supporters argue that a 1:1 translation is a necessary baseline for future improvements and that the backlash is disproportionate for a work in progress. Underlying the debate are broader tensions about AI's role in software development, the sustainability of open-source maintenance, and the fear that successful AI-assisted ports could devalue traditional engineering expertise. The Zig project's no-AI policy is highlighted as a case study in prioritizing code quality and maintainer bandwidth over accepting potentially harmful contributions, regardless of their origin.