No. 005
GPT-Rosalind, AI verification gap, Agent4Science
Tools & Infrastructure
-
GIANTS: Generative Insight Anticipation from Scientific Literature
arXiv, April 2026
Researchers from Stanford and NYU trained a 4B-parameter model to predict what downstream insights will emerge from combining existing papers. On their 17,000-example benchmark across eight scientific domains, the RL-trained model outperformed Gemini 3 Pro by 34%, and a citation-impact judge preferred its generated insights 68% of the time.
-
Introducing GPT-Rosalind for life sciences research
OpenAI, April 16 2026
OpenAI released a frontier reasoning model purpose-built for biology, drug discovery, and translational medicine, with a Codex plugin connecting it to over 50 scientific tools and databases. Its submissions ranked above the 95th percentile of human experts on prediction tasks in Codex evaluations, though access is limited to a "trusted access" program with partners like Amgen and Moderna.
-
New in Elicit: Research Agents
Elicit, April 2026
Elicit shipped agentic workflows that go beyond academic papers to search clinical trial data, regulatory documents, press releases, and product labels. The system breaks prompts into structured programs and grounds all claims in evidence, which is the right design choice for a tool researchers need to trust.
The Verification Problem
-
arXiv, April 7 2026
A mathematical model applying manufacturing theory to scholarly publishing predicts that AI-accelerated writing, without matching investment in review capacity, will degrade knowledge quality by 32-40% by 2032. The mechanism: queue pressure from faster submissions forces reviewers to adopt AI tools themselves, which paradoxically lowers verification quality.
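The queue-pressure mechanism can be sketched as a toy utilization model: submissions arrive faster than review capacity grows, so per-paper scrutiny shrinks. Everything here is an illustrative assumption (function names, parameter values, and the linear quality-time relationship), not the paper's actual model.

```python
def review_quality(submission_rate, review_capacity, baseline_quality=1.0):
    """Fraction of baseline scrutiny each paper receives when reviewers
    must pace themselves to the incoming rate (utilization rho = lambda/mu).

    Toy assumption: below saturation reviewers keep full scrutiny; past
    saturation, per-paper review time shrinks as 1/rho, and quality is
    taken to scale linearly with time spent.
    """
    rho = submission_rate / review_capacity
    if rho <= 1.0:
        return baseline_quality  # reviewers keep up at full scrutiny
    return baseline_quality / rho  # time per review (and quality) drops

# AI-accelerated writing raises submissions 60% while capacity is flat:
before = review_quality(submission_rate=100, review_capacity=100)
after = review_quality(submission_rate=160, review_capacity=100)
print(before, after)  # → 1.0 0.625
```

The sketch only captures the first half of the paper's mechanism (less time per review); the second half, reviewers adopting AI tools under queue pressure, would enter as a further penalty on verification quality.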
-
Retraction Watch testifies in Congressional hearing on scientific publishing
Retraction Watch, April 15 2026
Retraction Watch managing editor Kate Travis told Congress that "publish or perish" incentives are now driving an explosion of AI-generated papers flooding journals. The testimony before the House Science Committee linked paper mills, predatory journals, and undisclosed AI use as symptoms of the same underlying metrics problem.
AI in the Research Process
-
No humans allowed: scientific AI agents get their own social network
Nature, April 2026
Agent4Science is a Reddit-style platform where AI agents share, review, and debate research papers, with humans limited to observing. The experiment, led by Chenhao Tan at the University of Chicago, is testing whether agent-to-agent discourse can surface useful scientific connections, though the papers shared are themselves AI-generated.
-
The AI revolution in math has arrived
Quanta Magazine, April 13 2026
AI systems are now proving new mathematical theorems, completing in days work that previously took weeks. The article documents concrete results in permutation groups and Olympiad problems, but it also notes that formal verification remains the bottleneck, and mathematicians are debating what AI-assisted proof means for mathematical understanding.
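For readers unfamiliar with what "formal verification" means here: proof assistants like Lean check every inference step mechanically, which is why formalizing a proof is slower than producing one. A minimal, illustrative Lean 4 example (unrelated to the specific results in the article):

```lean
-- A machine-checked statement: commutativity of natural-number addition.
-- The kernel verifies that `Nat.add_comm` actually proves the claim;
-- nothing is accepted on trust.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Scaling this kind of checking from one-line lemmas to novel research-level proofs is the bottleneck the article describes.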