‘AI scientists’ are making progress, but what are the basic parameters?

Karin Verspoor of RMIT University explores how AI is impacting research in STEM.
Many of the most exciting scientific discoveries involve highly specialized knowledge and making connections between distant areas. Scientists must combine critical analysis with broad, creative thinking.
As with many information-rich tasks, researchers are looking to artificial intelligence (AI) systems to speed up their work. AI tools may be able to support key steps such as generating ideas, reviewing existing work and analyzing data.
Recent systems use large language models (LLMs) to let scientists interact naturally and directly with the large volumes of information contained in the text of the scientific literature.
But as two new systems described in papers recently published in Nature show, when it comes to science, language alone can only go so far.
What AI is doing in science
A number of organizations, such as Let’s talk AI, are trying to automate the entire scientific process. So far, these efforts have focused mainly on computer science, where ‘experiments’ mainly involve designing and writing code.
However, the Agents4Science conference held at Stanford last October showcased a wide range of AI-generated papers. They covered topics ranging from mechanical engineering and protein synthesis to a system called BadScientist, which deliberately produced “convincing but nonsensical” research.
I have previously raised concerns about the implications of AI scientists for the scientific ecosystem. Recent work confirms these concerns, pointing to increasing quantity but lower quality in both papers and peer review, fictional references to published works, fabricated and misleading images, and more.
What scientists are doing with AI
AI systems clearly cannot be trusted to run the full scientific process on their own. But what about using AI to help scientists do more faster?
This is the goal of two new programs described in Nature: Robin, made by the non-profit Future House, and Co-Scientist, from Google DeepMind.
Both programs aim to accelerate scientific discovery, working in partnership with scientists. Both are also ‘multi-agent’ AI systems, meaning they are built as a collection of specialized agents each targeting specific steps in the scientific discovery process, joined by a ‘manager’ agent.
The agents that make up Co-Scientist aim to mirror abstract cognitive functions, such as a ‘reflection agent’ that acts as a critical scientific peer reviewer, assessing the quality of a hypothesis. A ‘ranking agent’ runs debates over research ideas in ‘tournaments’, using multiple interacting LLMs to simulate a discussion about the relative merits of two ideas.
Robin’s agents, on the other hand, are tailored to specific tasks related to drug repurposing, which aims to identify existing drugs that could treat a specific disease. One agent focuses on selecting diagnostic tests, while another analyzes complex biomedical data.
How do the results stack up?
Co-Scientist can assess the quality of its generated proposals using a method called Elo rating, best known for ranking chess players. Co-Scientist’s ratings of the novelty and impact of its outputs correspond well with human experts’ preferences and with the judgments of other LLM programs.
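To make the idea of Elo rating concrete, here is a minimal sketch of the standard chess-style update applied to two competing hypotheses. It is an illustration only, not Co-Scientist’s actual implementation: the function names, the starting rating of 1200, the K-factor of 32 and the 400-point scale are conventional assumptions, and how a ‘winner’ is decided between two ideas is left outside the sketch.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one pairwise comparison.

    score_a is 1.0 if A wins, 0.0 if B wins, 0.5 for a draw.
    k (assumed 32 here) controls how far a single result moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's change mirrors A's, so the total rating is conserved.
    new_b = rating_b - k * (score_a - exp_a)
    return new_a, new_b


# Two hypothetical ideas start level (1200 is an assumed starting value);
# idea A wins the comparison.
idea_a, idea_b = update_elo(1200.0, 1200.0, score_a=1.0)
print(idea_a, idea_b)  # 1216.0 1184.0
```

Repeating this update over many pairwise comparisons produces a ranking in which an idea that beats highly rated rivals gains more than one that beats weak ones.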
In a drug repurposing study, Co-Scientist selected 30 candidates as promising drugs for a type of cancer called acute myeloid leukemia. Expert oncologists (humans) refined the list, and five drugs were tested in the lab. Of these, three have shown positive results and one seems to show some promise.
Other experiments have demonstrated the ability of Co-Scientist to test combinations of multiple drugs.
Notably, Co-Scientist’s predictions were not compared with the many supervised computational and machine learning methods for drug repurposing that have been developed over decades of computational biology research. This means we don’t know whether the new general-purpose tool is more effective than these more specialized AI methods.
Both systems stop short of directly verifying their ideas, which would involve actual physical testing. Both also rely heavily on human judgment to define an important scientific question, to check ideas against intuition and to prioritize predictions for further investigation.
Co-Scientist is more focused on generating ideas, with a detailed reasoning agent, leaving verification and interpretation to later steps. Robin also uses an agent to analyze data generated from real-world testing.
Robin was used to shortlist 30 candidate drugs for a condition called age-related macular degeneration. The top five were selected for testing.
Robin also suggested tests to run, a few of which were carried out by human scientists. Through several rounds of brainstorming and analysis, two drugs were identified as promising.
Tests of Robin’s agents showed that those drawing on previous research performed better than general-purpose LLMs. The analytical agent performed less well on questions about statistics and bioinformatics, and relied more on information provided by humans.
The limits of language alone
AI can help scientists navigate through thousands of years of written information. The use of computation to find patterns in large data sets, to synthesize scattered information and to derive new findings from the existing literature has already contributed to scientific progress for decades.
New systems like Robin and Co-Scientist represent a shift towards working directly in the language of science, rather than with raw data. This allows for more natural interactions between scientist and machine, through language-based ‘conversations’.
However, more natural does not mean more effective. Communication based on language can be vague and ambiguous, whereas science must be precise.
Models that combine the best of both worlds are on the horizon. These aim to link structured quantitative data to the concepts and relationships that define the underlying facts.
Such models ground scientific reasoning in structured information. They allow scientific evidence ranging from genomic sequences and protein structures to cellular imaging to be linked.
Words are how science is communicated. AI tools help make sense of the information hidden in all those important words. But the complexity of the natural world means that AI (co-)scientists will only really work if they can go beyond linking words together, to represent the full complexity of the systems that those words describe.
By Karin Verspoor
Karin Verspoor is Head of the School of Computing Technologies at RMIT University. She works at the intersection of science and technology, using artificial intelligence methods to analyze and interpret biological and clinical data, with a particular focus on natural language processing of unstructured textual data. She is a fellow of the Australian Academy of Technological Sciences and Engineering and the Australasian Institute of Digital Health. She is also the founder and Victoria node lead of the Australian Alliance for Artificial Intelligence in Healthcare.

