AI really is improving fast. Imagine where it will be 2-3 years from now.
Buck-Nasty on
The paper used GPT-4o which is ancient history at this point.
Kaiisim on
It’s not trying to be better than the smartest humans but better than the least intelligent.
CTC42 on
They really used a non-reasoning model as a comparison point? I’d like to see a contest between 4o and whoever designed this experiment.
hyperproliferative on
Sadly, it’s only a matter of time… But, for now at least, the robotics side of things is still lagging and limited. Very soon, though, AI systems will be able to direct robotic systems to perform massive experimental screens to identify fantastic candidates for therapy. Coupled with small-molecule discovery in silico, and it’s game over.
NotEveryoneIsSpecial on
So you’re saying if we all enroll in PhD programs we should be safe? Sounds great!
MadRoboticist on
Anyone who has spent time using LLMs should know that they are still a long way from being as good as an experienced human. Even for really focused tasks like coding, you need to be very attentive, watching out for hallucinations or bad practices in the code.
McBoobenstein on
I’ve been saying this over and over: AI still hallucinates too much to replace trained humans, because the hallucination step is part of the process. There’s always going to be a need for human-in-the-loop AI usage, simply to keep AI on task, free from topic drift and from hallucinating data that doesn’t exist.
Times_Abacus on
AI is good at taking tests where its database has all the answers. I’m not sure why people are concluding from this that AI is catching up to people. It’s not like it can actually do science any more than a data analysis program and a library.
JackZodiac2008 on
C to A? It makes sense that an LLM would score about average, unless specially trained.
mon_sashimi on
How come they only used GPT-4o and not GPT-5.4?