416 points • 3 days ago • Article
The CTF scene is dead. Frontier AI has broken the open CTF format, and the scoreboard no longer measures human skill cleanly. The author, who has been deeply embedded in the CTF community since 2021, winning major competitions like DownUnderCTF and competing internationally with top-tier teams like TheHackersCrew, argues that the old game is not coming back. This is not a dismissal born of dislike, but a recognition that CTFs were foundational to their love of security, teaching them how to learn, measure progress, and connect with respected peers. Watching people pretend the format is still viable is frustrating because the core experience has fundamentally changed.
The shift began gradually as AI tools like GPT-4 started making medium-difficulty challenges solvable with a single prompt. At first, this seemed manageable since hard challenges remained untouched. But the real turning point came with Claude Opus 4.5, which made nearly all medium and some hard challenges agent-solvable. Tools like Claude Code allowed teams to automate the process, spinning up instances for each challenge via APIs. Suddenly, teams that refused to use AI were at a severe disadvantage, not just missing convenience but competing in a slower version of the game. The scoreboard began measuring orchestration and willingness to use frontier models as much as, or more than, actual security skill. This distorted the CTFTime leaderboard, reduced participation from legendary teams, and demoralized challenge authors who spent weeks crafting elegant problems only to see them solved by agents in minutes.
GPT-5.5 and its Pro variant have sealed the deal. These models can now one-shot "Insane" difficulty challenges, including complex heap pwn problems that were previously considered beyond automation. If you can afford enough tokens and context, you can burn through a 48-hour CTF before it ends. This turns open CTFs into a pay-to-win scenario, where performance depends more on computational resources than human expertise. Specialized cybersecurity models are becoming irrelevant compared to general frontier LLMs. The competition is no longer about who understands security deeply, but who can afford to run enough agents for long enough. CTF performance is losing its value as a metric for recruiting security practitioners, and it is not even a good measure of AI skill since most orchestration tools are already open source or easily built.
Some argue that beginners can still learn from CTFs as they always have, but this misses the point. CTFs were never just puzzles; they were a ladder. Beginners could see themselves improve, solve more challenges, place higher, and join better teams. That feedback loop is breaking. When the visible scoreboard is dominated by AI-powered teams, beginners are pushed toward using AI before they have built the instincts the AI replaces. This is an anti-pattern that prevents active learning, which requires struggle. It is also demotivating to put in real effort and see no visible progress because the ladder above has been automated. Beginners are better off using dedicated learning platforms like picoGym or HackTheBox, where the expectation is education rather than competition, and the incentive to cheat oneself out of learning is lower.
Others claim CTFs are not dead, just augmented by AI, pointing to elite finals like DEF CON where AI still cannot solve everything. But this is the wrong defense. Those finals have very few participants and are gated behind qualifiers that are easier than the finals themselves. If qualifiers fall to agents, fewer genuinely qualified people reach the challenges that still resist AI. A tiny number of elite finals does not save the open online format that most people actually play. The claim is not that every challenge is solved, but that enough of the scoreboard has been automated that it no longer means what it used to mean.
The argument that AI belongs in CTFs because it is useful for security research is also misplaced. CTFs were never meant to be security research. They can showcase new techniques, but the CTF itself is not the point of discovery. Just because AI is useful within a field does not mean it belongs in the competitive landscape of that field. In CTFs, unrestricted AI removes the human from the puzzle almost entirely, reducing the art of security to a prompt. CTFs were an artform, a way to share techniques, and a way to push the human bounds of security skill. That purpose is being stripped away.
The chess engine analogy is often used to justify AI in CTFs, but it misses a critical point. Chess engines are not allowed during competitive play. They are used for analysis, training, commentary, and practice, enriching the game around the competition without replacing the person competing. Imagine giving every competitive chess player the best engine and letting them use it freely during matches. Would that be fair? Would it be fun to watch? Would it justify prize pools? Would it push the human limits of what could be achieved? The same questions apply to CTFs.
CTF organisers have tried techniques to break or deter LLM solutions, but these are temporary friction at best. Claude Code does not care about old refusal-string tricks, and frontier models are getting better at noticing prompt injections. Web search capabilities weaken challenges based on technologies released after the training cutoff. Rules asking people not to use LLMs are ignored and almost impossible to enforce in open online events. This leaves organisers in a bad position. If they make normal challenges, agents solve too much. If they make challenges deliberately hostile to agents, the challenges often become guessy, overengineered, or unpleasant for humans too. That is not a real fix. It just makes CTFs worse for everyone.
The "just adapt" take is infuriating. People the author has always looked up to in the community have said it, but it is nonsensical unless you explain what we are adapting into. If adaptation means building better tooling, CTF players already did that. If it means writing harder challenges, organisers already tried. If it means accepting that the scoreboard is now an AI orchestration benchmark, then we should say that honestly instead of pretending the old competition still exists. Even if organisers create guessier or more overengineered challenges that current LLMs cannot solve, there are no good paths for players to learn the required skills while staying competitive. A few models from now, that point may be irrelevant anyway. The trajectory of LLM security capability is moving too quickly for challenge design to stay ahead for long.
The aftermath is visible. The CTFTime leaderboard has almost no semblance of history or human skill anymore. The 2026 scoreboard is unrecognisable compared to every year before it. TheHackersCrew and many other large, reputable teams now do not play, play with far fewer people, or struggle to break into the top 10. Unregulated cheating is through the roof. Some of the best CTFs, like Plaid CTF, are not running anymore. These sentiments are not only the author's. Many members of their local team, Emu Exploit, feel similarly. These are people who consistently attend the International Cybersecurity Championship, perform at the top level in bug bounty programmes, compete in Pwn2Own, and present at conferences including Black Hat. The people losing interest are not casual observers. They are exactly the kind of people the scene used to produce and retain.
The fun of CTFing is gone for many of the people who cared most. The loss is not just a scoreboard. It is the ladder from beginner curiosity to elite competition. It is the craft of challenge design. It is the feeling that a clever human solved something difficult because they understood it deeply. That legacy is not being carried forward by open online CTFs in their current form. The format is dead. Something else may replace it, but pretending nothing fundamental has changed only makes the loss harder to talk about honestly. It also gives AI shills more room to capitalise on the decline by selling mediocre wrappers back to the community that made the training data valuable in the first place.
While a lot of what is happening in the CTF and AI space is heavily commercialised and out of our control, CTF has had a hugely positive impact on the industry. The author has met so many kind, smart, and passionate people through CTFs, played some of the most beautifully crafted challenges, and found some of the most intriguing unintended solutions. The community around CTFing has been an amazing place to learn, grow, and connect. That is something we should not lose, no matter where the competition goes. As a community, we should strive to stay together and build new avenues to stay passionate and keep learning. Security-adjacent social events like SecTalks, student conferences, and local meetups are great ways to stay connected and involved. Learning platforms, and the communities they host on services like Discord, are also a valuable resource. While it may be a struggle to find an alternative to what we had, the amazing community we have built around it is more important now than ever as we find new ways to keep the competitive spirit alive.
453 comments
The discussion centers on the impact of AI, specifically large language models (LLMs), on the Capture The Flag (CTF) cybersecurity competition scene and on education in general. Participants argue that AI has fundamentally broken the open CTF format by allowing participants to solve challenges without understanding the underlying concepts, turning the competition into a "pay-to-win" scenario based on token usage rather than skill. This mirrors a broader crisis in education, where the ease of having AI "do it for me" prevents the development of critical thinking and first-principles reasoning. While some view AI as a powerful tool for experts, many express concern that it creates a "human replacement pipeline" without a corresponding evolution in how we teach and validate skills. The conversation highlights a growing divide between those who learned before the AI era and those who rely on it, with many experienced developers noting that basic competency, such as writing a FizzBuzz solution, is now rare among graduates. Suggestions for preserving the integrity of competitions and education include moving to in-person, offline events with strict hardware controls, or redesigning challenges to require physical interaction or extreme novelty that current AI models struggle with. Ultimately, the community grapples with the fear that the "struggle" essential for learning is being automated away, potentially leading to a generation of "vibe coders" who can ship code but lack the deep understanding necessary to maintain or innovate on complex systems.