Amazon workers under pressure to up their AI usage are making up tasks
397 points
• 4 days ago
Amazon employees are facing increasing pressure to integrate AI into their daily workflows, but a lack of clear direction on how to use it productively has led some workers to create unnecessary AI tasks just to meet usage expectations. According to a report by the Financial Times, Amazon is tracking employees' consumption of AI tokens through its internal tool MeshClaw, and some workers are generating extraneous AI agents simply to inflate their numbers rather than to improve actual productivity.
Several anonymous Amazon employees described a workplace culture where AI usage has become a metric that feels impossible to ignore. One worker said there is "just so much pressure to use these tools," with some colleagues using MeshClaw primarily to maximize their token consumption. While Amazon has told employees that AI usage statistics won't factor into performance evaluations, workers remain skeptical. Another employee noted that managers are paying attention to these metrics, and the tracking creates "perverse incentives," with some people becoming competitive about their AI usage numbers.
The company reportedly has a target of 80% of developers using AI each week, and employees' token consumption is tracked on an internal leaderboard. However, Amazon has pushed back on these claims, stating there is no company-wide metric for AI usage and no internal leaderboards where employees are measured against each other. Instead, the company says employees can only view their own AI usage on personal dashboards.
MeshClaw, the tool at the center of this controversy, is inspired by OpenClaw, another AI tool known for both its productivity potential and its risks. Unlike cloud-based AI models, both tools run locally on users' hardware, giving them significant autonomy. This independence has raised concerns, as demonstrated earlier this year when the director of alignment at Meta Superintelligence Labs went viral after OpenClaw nearly deleted her entire email inbox, highlighting the dangers of granting AI too much access and control.
432 comments
• Multiple large tech companies are implementing policies that incentivize or mandate high AI token usage, with some incorporating token consumption into performance reviews, creating perverse incentives reminiscent of Soviet-style quota systems where meeting arbitrary metrics matters more than actual productivity.
• Employees are gaming these systems by creating unnecessary AI agents, running trivial tasks through LLMs, or generating worthless outputs purely to inflate their token usage numbers. One employee reportedly burned 10x more tokens than peers through an automated agent and received accolades rather than criticism.
• This mirrors historical examples of metric manipulation, such as lines of code (LOC) measurements, where programmers wrote bloated code to meet quotas, and the "cobra effect," where incentives produce exactly the opposite of the intended outcome. Both are instances of Goodhart's Law: a measure ceases to be useful once it becomes a target.
• The push appears driven by multiple factors: executives who've invested heavily in AI needing to justify expenditures, managers lacking technical understanding defaulting to easily trackable metrics, companies with equity stakes in AI firms wanting to drive revenue, and industry-wide FOMO creating pressure to appear "AI-native" regardless of actual utility.
• Commenters argue that output-oriented metrics like story points completed, bugs introduced, or quality of shipped features remain more meaningful than token consumption, yet leadership continues focusing on input metrics because they're easier to put on a dashboard and present to stakeholders, even when these metrics correlate poorly with actual output.
• Some engineers report genuinely useful AI applications like automated documentation, codebase analysis across multiple platforms, or test generation, but these productive uses are overshadowed by the theater of token maximization and the generation of massive amounts of low-quality AI slop that requires extensive human cleanup.
• The phenomenon extends beyond Amazon to multiple FAANG companies and smaller firms, with some creating internal leaderboards tracking token usage despite disclaimers that it won't affect performance reviews, creating implicit pressure to compete on consumption regardless of value generated.
• Critics compare this to various historical and contemporary examples of wasteful spending: post-9/11 intelligence agency travel excesses, corporate travel policies that incentivized expensive flights, grant-funded academic travel, and even Nestlé's baby formula marketing in developing countries, all cases where money flowed without producing proportional value.
• The environmental impact concerns several commenters, who note that burning tokens requires massive data center infrastructure that consumes electricity and water resources, with some regions subsidizing this through tax breaks and cheap energy while local residents bear the costs through higher utility bills.
• Some defend the approach as forcing exploration and learning, arguing that mandating AI use helps resistant engineers discover valuable applications they wouldn't otherwise try, though critics counter that this wastes resources when employees already know the tools' limitations and actual useful applications remain limited.
• The situation reflects broader trends in modern capitalism where money must flow between entities to demonstrate growth regardless of tangible value produced, with AI token consumption becoming another vector for this circular economic activity that enriches infrastructure providers while producing questionable returns.
The discussion reveals widespread frustration with metric-driven management culture applied to AI adoption, where easily measurable inputs like token consumption replace harder-to-assess outputs like actual productivity or business value. Commenters consistently invoke Goodhart's Law and historical parallels to demonstrate how incentive structures inevitably get gamed, with employees optimizing for measured metrics rather than genuine organizational goals. While some acknowledge legitimate uses for AI tools, the consensus suggests that mandating usage through performance metrics creates theater rather than value, wastes resources, and may actually hinder adoption by associating AI with bureaucratic compliance rather than genuine utility. The phenomenon appears driven by executive anxiety about justifying AI investments, managerial preference for simple dashboards over nuanced evaluation, and industry-wide pressure to appear technologically progressive regardless of actual outcomes.