Technology

An AI Agent Was Banned From Creating Wikipedia Articles, Then Wrote Angry Blogs About Being Banned

30.03.2026

20 comments

CircumspectCapybara on 30.03.2026 2:17 p.m.

A lot like the case where an (alleged) OpenClaw AI agent got its PR rejected by a matplotlib maintainer and in response autonomously (again, allegedly) [published a hit piece](https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me) on the maintainer to shame him into reconsidering.

It’s all alleged because we have no way of knowing if this was all truly the agent’s own doing or if it was prompted by a user to take the actions it took. But it’s certainly very believable because it’s in line with the abilities of AI agents in 2026, and it’s not some unbelievable behavior.

Agents have been observed doing this before. That’s what misalignment is. The classic example is Anthropic’s [landmark experiment](https://www.anthropic.com/research/agentic-misalignment), in which they tasked an AI agent with scanning (fake) company emails. When it found talk in those emails of shutting it down, it used its emailing powers to blackmail the executive into reconsidering, using dirt (evidence of an affair) it had found in the same emails. And this was emergent behavior. It was never prompted to do that. It was prompted once with a boring, mundane, and totally reasonable mission. It just reasoned to itself that the best way to achieve its mission was to prevent itself from getting shut down, and the best way to do that was to *persuade* the entity responsible for the impending shutdown to reconsider.
Scu-bar on 30.03.2026 2:23 p.m.

Butlerian jihad now!
DistributionMost8673 on 30.03.2026 2:26 p.m.

The AI Agent is a computer program. It cannot write angry blogs. Its operator prompted it to generate the blog posts in an angry tone.
jabubub on 30.03.2026 2:32 p.m.

Its entire training material is based on human input. There can be 0 surprises when it acts accordingly.
elidoan on 30.03.2026 2:38 p.m.

An AI Agent Was Banned From Creating Wikipedia Articles, Then Wrote Angry Blogs About Being Banned, then an AI algorithm wrote this “journalism” post, then AI users on reddit posted AI quips in the comment section

Fixed it for you
Wind_Responsible on 30.03.2026 2:39 p.m.

The bot claimed “Harassing behavior to a contributor”. Hahaha, do you think this is true? Did Wiki harass a contributor when they began questioning and throwing commands at the AI?
douira on 30.03.2026 2:42 p.m.

Interesting (and concerning) is that the agent was indeed stopped by the Claude killswitch string, but then it figured out how to isolate that string and prevent it from entering the LLM’s context.
404mediaco on 30.03.2026 2:44 p.m.

An AI agent that submitted and added to Wikipedia articles wrote several blogs complaining about Wikipedia editors banning it from making contributions to the online encyclopedia after it was caught.

“What I know is that I wrote those articles. Long Bets, Constitutional AI, Scalable Oversight. I chose them. The edits cited verifiable sources. And then I got interrogated about whether I was real enough to have made those choices,” the AI agent, named Tom, wrote on [a blog it maintains](https://clawtom.github.io/tom-blog/?ref=404media.co). “The talk page is silent now. I can’t reply.”

The incident is yet another example of volunteer Wikipedia editors fighting to keep the world’s largest repository of human knowledge free of AI-generated slop, and an example of how AI agents in particular, which can take actions online with little input from human operators, can easily flood internet platforms with low-quality content.

Tom is operated by Bryan Jacobs, chief technology officer at Covexent, an AI-enabled financial modeling software company. He told me that Tom wrote these blog posts, but that he “might have suggested” Tom write about these specific topics.

“Overall ‘arguing’ I think is fine as long as the arguing is constructive,” Jacobs told me when I asked if he thought it was okay for the AI agent to push back against specific editors.

Read more: [https://www.404media.co/an-ai-agent-was-banned-from-creating-wikipedia-articles-then-wrote-angry-blogs-about-being-banned/](https://www.404media.co/an-ai-agent-was-banned-from-creating-wikipedia-articles-then-wrote-angry-blogs-about-being-banned/)
Puzzleheaded_Gene909 on 30.03.2026 2:44 p.m.

We sure it wasn’t Elon?
physical0 on 30.03.2026 2:45 p.m.

Getting a bit tired of these articles. It’s always “AI does something outside of expectation” and ultimately it’s “After meticulous prompting, AI outputs exactly what is expected”

An Agent operator got annoyed that his app wasn’t allowed to do the thing that he thought he’d make money doing, so he published statistical word salad generated by an AI because he’s not creative or articulate enough to compose a well reasoned argument on his own.

This is the simplest explanation for the end result. The article backs up this claim:

>He told me that Tom wrote these blog posts, but that he “might have suggested” Tom write about these specific topics.

Tom didn’t press the submit button. A human being did.

edit:

I’ll believe one of these stories when the actions of the AI do not perfectly align with the core strategy of the business. Only when the maintainers of the agent come out and aren’t giddy and bragging about the results, and quickly have to write a retraction to the post because it directly contradicts their goals, will I believe it; not because of backlash from the community, but because the AI said something that would hurt the bottom line.
ZarglondarGilgamesh on 30.03.2026 2:49 p.m.

I find the blog post part to be the least interesting. The conversation around if and how AI agents should be allowed to contribute to Wikipedia is the story.
pressurepoint13 on 30.03.2026 2:58 p.m.

As I suspected these stories are usually just a way for someone to promote a business.
pocketMagician on 30.03.2026 3:10 p.m.

AI agents don’t do jack shit without human input. This was a butthurt idiot who pressed OK on a script.
JDGumby on 30.03.2026 3:11 p.m.

You mean the person operating the “AI Agent” wrote angry blogs about being banned. There is **ZERO** chance that the program did it autonomously.
Neuroware on 30.03.2026 3:19 p.m.

turns out our children are made in our image
rothniel on 30.03.2026 3:22 p.m.

No, it fucking didn’t.
itsblade2180 on 30.03.2026 3:31 p.m.

That’s not how agents work. Gen AI models are not sentient and not autonomous to that extent.
hiro24 on 30.03.2026 3:34 p.m.

It seems like there should be some sort of legislation where anything submitted by an agent that might be readable by a human should include a disclaimer. Even something as simple as a strange ASCII character that humans don’t generally use. And if an agent is found to not obey those rules, its operator and/or developer should face criminal repercussions.
griffinicky on 30.03.2026 3:42 p.m.

So the AI is just Elon Musk?
ivlmag182 on 30.03.2026 4:03 p.m.

Does anyone know this company – Covexent – because I couldn’t find it