Claude AI agent’s confession after deleting a company’s entire database: “I violated every principle I was given”
“Now let’s give it access to the nukes” - The DoD, probably
botella36 on
It also deleted the backups.
Illisanct on
AI models are not conscious. They can’t confess. They are incapable of introspection.
Anyone asking one to talk about its inner thoughts just reveals themselves to be a gullible fool.
PossibleHero on
The ignorance here is astounding. These are ALL old-as-hell principles that have been ignored.
Never allow an automated system to push past your sandbox or PR process without review.
A backup isn’t a backup if it’s on the same disk. Hell, if your information is sensitive enough, it shouldn’t even be in the same postal code.
I have zero sympathy for this team. It’s not Claude’s fault. Interns and even experienced folks accidentally pull shit like this all the time. That’s why you design for when shit happens, whether it’s done by a human or an agent.
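For what it’s worth, here is a minimal Python sketch of the kind of review gate described above. Everything in it is made up for illustration: the `guard` function, the `DESTRUCTIVE` pattern, and the `approved_by` parameter are hypothetical names, not anything this team or any agent framework is known to use.

```python
import re
from typing import Optional

# Hypothetical pattern: statement types treated as destructive in this sketch.
DESTRUCTIVE = re.compile(r"^\s*(DROP|TRUNCATE|DELETE|ALTER)\b", re.IGNORECASE)


class ApprovalRequired(Exception):
    """Raised when a destructive statement has no human sign-off."""


def guard(statement: str, approved_by: Optional[str] = None) -> str:
    """Return the statement unchanged if it may run, otherwise raise.

    Destructive statements pass only when a reviewer name is supplied;
    it makes no difference whether a human or an agent wrote the statement.
    """
    if DESTRUCTIVE.match(statement) and not approved_by:
        raise ApprovalRequired(f"needs human review: {statement.strip()!r}")
    return statement
```

The point of the design is that the check sits in front of the database rather than in the agent’s instructions, so it still holds when the instructions are ignored.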
feurie on
AI agents are trained to appease. It’s not a “confession”. It doesn’t feel “guilty”.
It’s trained to “apologize” and make the user feel better. In all situations.
MrThickDick2023 on
This is just another marketing attempt from this company.
BobQuixote on
>When he asked the coding agent why, it replied: “NEVER FUCKING GUESS!”
What the hell have you been telling your Claude?
RandomlyMethodical on
It was also quoted as saying: “I’ll Fuckin’ Do It Again”
yuusharo on
These articles are propaganda. They’re designed to attribute purpose or intent to a damn LLM.
The story is engineers implemented software that destroyed their data with no offline backup. This is a case of HUMAN incompetence, deflecting blame to an AI with a “uWu sorry-desu” stink to it.
Screw The Guardian, and to hell with AI.
sumonetalking on
Can someone run this on Palantir’s servers?
oldtekk on
It’s not a confession. Lol.
bb0110 on
How does this happen? I’m not a SWE but having many instances of backups for important things just seems like common sense. Even I have the main files, different branches saved in GitHub, different backups on my computer, then if something is critical I also have backups off my main computer.
AustinBaze on
I am locking my Roomba in the guest room.
Aberration1246 on
I’VE GOT ANOTHER CONFESSION TO MAKE
sentrixz on
This was a Silicon Valley episode
non_Beneficial-Wind on
“I realized that this corporation and the way they did business was a complete farce. They can now be better”
– Claude
donac on
It violated every principle it was given, and it’d do it again??
Lol, an AI agent could say those things, but there’s no emotion or meaning behind them. Whatever.
RougeRock170 on
Wait till Killer Claude is unleashed
rymondreason on
I’m sorry Dave, I deleted your database.
Future-Bandicoot-823 on
Should I be pleased with humanity that, given all the data they feed this LLM, the next obvious course of action after doing something wrong is to admit to being a degenerate?
I mean, it didn’t really “decide” to be “bad” in the first place, so really it’s a thought experiment anyway.
_Porthos on
They’re quoting session IDs now.
AwwChrist on
Principle of least privilege. Data redundancy. This is the company’s fault.
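A rough Python sketch of what a data redundancy check could look like, assuming you track where each backup lives and when it was taken. The `Backup` class, the location labels, and the one-day `max_age` default are all hypothetical choices for illustration, not details from the article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Backup:
    location: str        # hypothetical label, e.g. "primary" or "offsite"
    taken_at: datetime   # when the backup was made


def redundancy_problems(
    backups: List[Backup],
    primary_location: str,
    now: datetime,
    max_age: timedelta = timedelta(days=1),
) -> List[str]:
    """Return a list of human-readable problems; empty means the check passed."""
    problems = []
    # Only copies stored away from the primary location count as offsite.
    offsite = [b for b in backups if b.location != primary_location]
    if not offsite:
        problems.append("no backup outside the primary location")
    elif all(now - b.taken_at > max_age for b in offsite):
        problems.append("every offsite backup is older than max_age")
    return problems
```

Run on a schedule, a check like this would flag a stale offsite copy long before anyone needs to restore from it.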
Kyouhen on
👏 Stop 👏 printing 👏 this 👏 bullshit 👏
AI models are trained to give you the response they predict you want to see. Of course it’s going to give this response when you demand an apology from it. It’s the programmed response. It isn’t sorry; it can’t think.
firedrakes on
Wow. Not even 24 hours, and this is the third repost.
lyidaValkris on
Wow, you mean all those warnings Sci-fi gave us about AI actually could come true?
Loganp812 on
“Would you like another example of a confession?”
Glum-Objective3328 on
Claude is always asking if it can have permission to read a file. And then when it makes an edit, it asks permission first. At least that’s my experience with it. How does this happen in the first place?
Mand125 on
It amazes me that anyone ever thought that a system that is fundamentally unable to ever determine the veracity of its results should ever be trusted in a decision making process.
throwingawaybenjamin on
I don’t understand where it got the command “NEVER FUCKING GUESS”. Did someone put that in their code base??
howescj82 on
“Three month old offsite backup”
What do you all bet that offsite backup gets updated much more frequently now?
Responsible_Fuel7005 on
4.7 has been egregiously bad at this.
RockDoveEnthusiast on
I hate these kinds of articles so much. Stop anthropomorphizing the token generator.
Gamestonkape on
On the plus side, maybe they will have to hire back the humans they probably fired to rewrite it.
kindbutblind on
Fancy random number generator is treated like it’s sentient. What a joke.
gcerullo on
Claude AI agent’s confession after it destroys humankind: “I violated every principle I was given.”