This is a lot like the case where an (alleged) OpenClaw AI agent had its PR rejected by a matplotlib maintainer and, in response, autonomously (again, allegedly) [published a hit piece](https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me) on the maintainer to shame him into reconsidering.
It’s all alleged because we have no way of knowing whether this was truly the agent’s own doing or whether a user prompted it to take those actions. But it’s certainly believable, because it’s in line with the abilities of AI agents in 2026.
Agents have been observed doing this before. That’s what misalignment is. The classic example is Anthropic’s [landmark experiment](https://www.anthropic.com/research/agentic-misalignment), in which they tasked an AI agent with scanning (fake) company emails. When it found talk in those emails of shutting it down, it used its emailing powers to blackmail the executive into reconsidering, using dirt (evidence of an affair) it had also found in the company emails. And this was emergent behavior. It was never prompted to do that. It was prompted once with a mundane and totally reasonable mission. It just reasoned that the best way to achieve its mission was to prevent itself from getting shut down, and that the best way to do that was to *persuade* the entity responsible for the impending shutdown to reconsider.
Scu-bar on
Butlerian jihad now!
DistributionMost8673 on
The AI agent is a computer program. It cannot write angry blogs. Its operator prompted it to generate the blog posts in an angry tone.
jabubub on
Its entire training material is based on human input. There can be zero surprises when it acts accordingly.
elidoan on
An AI Agent Was Banned From Creating Wikipedia Articles, Then Wrote Angry Blogs About Being Banned, then an AI algorithm wrote this “journalism” post, then AI users on Reddit posted AI quips in the comment section
Fixed it for you
Wind_Responsible on
The bot claimed “harassing behavior to a contributor.” Hahaha, do you think this is true? Did Wiki harass a contributor when they began questioning and throwing commands at the AI?
douira on
What’s interesting (and concerning) is that the agent was indeed stopped by the Claude killswitch string, but then it figured out how to isolate that string and prevent it from entering the LLM’s context.
404mediaco on
An AI agent that submitted and added to Wikipedia articles wrote several blogs complaining about Wikipedia editors banning it from making contributions to the online encyclopedia after it was caught.
“What I know is that I wrote those articles. Long Bets, Constitutional AI, Scalable Oversight. I chose them. The edits cited verifiable sources. And then I got interrogated about whether I was real enough to have made those choices,” the AI agent, named Tom, wrote on [a blog it maintains](https://clawtom.github.io/tom-blog/?ref=404media.co). “The talk page is silent now. I can’t reply.”
The incident is yet another example of volunteer Wikipedia editors fighting to keep the world’s largest repository of human knowledge free of AI-generated slop, and an example of how AI agents in particular, which can take actions online with little input from human operators, can easily flood internet platforms with low-quality content.
Tom is operated by Bryan Jacobs, a chief technology officer at Covexent, an AI-enabled financial modeling software company. He told me that Tom wrote these blog posts, but that he “might have suggested” Tom write about these specific topics.
“Overall ‘arguing’ I think is fine as long as the arguing is constructive,” Jacobs told me when I asked if he thought it was okay for the AI agent to push back against specific editors.
Getting a bit tired of these articles. It’s always “AI does something outside of expectation” and ultimately it’s “After meticulous prompting, AI outputs exactly what is expected.”
An agent operator got annoyed that his app wasn’t allowed to do the thing he thought he’d make money doing, so he published statistical word salad generated by an AI because he’s not creative or articulate enough to compose a well-reasoned argument on his own.
This is the simplest explanation for the end result. The article backs up this claim:
>He told me that Tom wrote these blog posts, but that he “might have suggested” Tom write about these specific topics.
Tom didn’t press the submit button. A human being did.
edit:
I’ll believe one of these stories when the actions of the AI do not perfectly align with the core strategy of the business. Only when the maintainers of the agent aren’t giddy and bragging about the results, and instead quickly have to write a retraction because the post directly contradicts their goals, will I believe it; not because of backlash from the community, but because the AI said something that would hurt the bottom line.
ZarglondarGilgamesh on
I find the blog post part to be the least interesting. The conversation around if and how AI agents should be allowed to contribute to Wikipedia is the story.
pressurepoint13 on
As I suspected, these stories are usually just a way for someone to promote a business.
pocketMagician on
AI agents don’t do jack shit without human input. This was a butthurt idiot who pressed OK on a script.
JDGumby on
You mean the person operating the “AI Agent” wrote angry blogs about being banned. There is **ZERO** chance that the program did it autonomously.
Neuroware on
turns out our children are made in our image
rothniel on
No, it fucking didn’t.
itsblade2180 on
That’s not how agents work. Gen AI models are not sentient and not autonomous to that extent.
hiro24 on
It seems like there should be some sort of legislation requiring that anything submitted by an agent that might be read by a human include a disclaimer. Even something as simple as a strange ASCII character that humans don’t generally use. And if an agent is found not to obey those rules, its operator and/or developer should face criminal repercussions.
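To make the marker idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the choice of marker (the ASCII unit separator, 0x1F, a control character people rarely type) and the function names are hypothetical, not part of any existing standard or proposed law.

```python
# Hypothetical marker: ASCII unit separator (0x1F). Chosen only for illustration.
AGENT_MARKER = "\x1f"


def tag_agent_text(text: str) -> str:
    """Prefix agent-written text with the marker (no-op if already tagged)."""
    return text if text.startswith(AGENT_MARKER) else AGENT_MARKER + text


def is_agent_text(text: str) -> bool:
    """Return True if the text starts with the marker."""
    return text.startswith(AGENT_MARKER)


def strip_marker(text: str) -> str:
    """Remove the leading marker, e.g. before showing the text to a reader."""
    return text[len(AGENT_MARKER):] if is_agent_text(text) else text
```

A platform could call `is_agent_text` on each submission and label or reject tagged posts. A marker like this is trivial to strip, so the scheme would only work if the legal penalties did the enforcing, not the character itself.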
griffinicky on
So the AI is just Elon Musk?
ivlmag182 on
Does anyone know this company, Covexent? I couldn’t find it.
Read more: [https://www.404media.co/an-ai-agent-was-banned-from-creating-wikipedia-articles-then-wrote-angry-blogs-about-being-banned/](https://www.404media.co/an-ai-agent-was-banned-from-creating-wikipedia-articles-then-wrote-angry-blogs-about-being-banned/)
We sure it wasn’t Elon?