Links

Real science projects, researcher workflows, and tools that use agentic AI. Each link goes to the original source.

Research done with AI agents

A full physics paper, computed by an agent

Matt Schwartz (Harvard Physics) ran a Sudakov-shoulder QCD calculation end-to-end in Claude Code (102 tasks, 270 sessions, ~36M tokens) and published the result on arXiv. The Anthropic research post walks through the architecture: a tree of markdown files for long-horizon state, with Claude solving each task in a separate session.
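The "tree of markdown files" idea can be pictured with a small sketch: each task is a node, each node is written to its own markdown file, and a fresh session starts by finding the next unfinished leaf. The `Task` class, the status values, and the `README.md` layout below are assumptions for illustration, not the structure used in the actual project.

```python
from __future__ import annotations

import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Task:
    """One node in a hypothetical task tree; all names are illustrative."""
    name: str
    status: str = "todo"  # "todo" or "done"
    notes: str = ""
    children: list[Task] = field(default_factory=list)

    def to_markdown(self) -> str:
        # Each file records its own state plus links to its subtasks.
        lines = [f"# {self.name}", "", f"Status: {self.status}", ""]
        if self.notes:
            lines += [self.notes, ""]
        for child in self.children:
            lines.append(f"- [{child.name}]({child.name}/README.md)")
        return "\n".join(lines) + "\n"


def write_tree(task: Task, root: Path) -> None:
    """Write one README.md per task, mirroring the tree as directories."""
    folder = root / task.name
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "README.md").write_text(task.to_markdown(), encoding="utf-8")
    for child in task.children:
        write_tree(child, folder)


def next_open_task(task: Task) -> Task | None:
    """Depth-first search for the first leaf that is still 'todo'."""
    if not task.children:
        return task if task.status == "todo" else None
    for child in task.children:
        found = next_open_task(child)
        if found is not None:
            return found
    return None


plan = Task("paper", children=[Task("setup", status="done"), Task("integrals")])
with tempfile.TemporaryDirectory() as tmp:
    write_tree(plan, Path(tmp))
    written = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("README.md"))
print(written)
print(next_open_task(plan).name)
```

Keeping state on disk rather than in the conversation is what lets each task run in a separate session: the session only needs to read the files for the node it is working on.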

Long-running Claude for scientific computing

Siddharth Mishra-Sharma (Anthropic, formerly astrophysics) demonstrated Claude Opus building a differentiable cosmological Boltzmann solver over several days of autonomous work: code that predicts CMB statistical properties with sub-percent agreement against the reference CLASS implementation.

The AI Scientist: fully automated research (Nature, 2026)

Sakana AI built a system that autonomously generates research ideas, searches literature, designs and runs experiments, writes papers in LaTeX, and performs its own peer review. A manuscript generated by the system passed peer review at a top-tier ML conference workshop. Published in Nature.

AI agents running quantum computing experiments (Patterns)

Shuxiang Cao (NVIDIA/Oxford), Zijian Zhang (Toronto) et al. introduced "k-agents," LLM-based agents that autonomously plan and execute experiments on a real superconducting quantum processor. Agents ran for hours, producing and characterizing entangled quantum states at human-level performance.

CellAtria: agentic AI for single-cell RNA-seq (npj AI)

Researchers at AstraZeneca published an agentic framework that automates the full lifecycle of single-cell RNA-seq data reuse through a chatbot interface. Uses a graph-based multi-actor architecture integrating an LLM with tool execution. Code is open-source.

Biomni: a general-purpose biomedical AI agent (Stanford)

Built from 150 tools, 105 software packages, and 59 databases mined from tens of thousands of publications across 25 subfields. Handles causal gene prioritization, drug repurposing, rare disease diagnosis, and molecular cloning without task-specific tuning. Used by 7,000+ labs. Won $1M for Alzheimer's research (Biomni-AD).

MozzareLLM: interpreting CRISPR knockouts at MIT

Iain Cheeseman's lab (Whitehead Institute / MIT) built a Claude-powered tool that interprets large-scale CRISPR gene knockout experiments. It takes gene clusters, identifies shared biological processes, and flags understudied genes. In one case Claude correctly identified an RNA modification pathway that other models dismissed as noise. Code on GitHub.

Claude for hypothesis generation in cell biology (Stanford)

Emma Lundberg's lab at Stanford uses Claude to change how gene targets are selected for experiments. Instead of the traditional "educated guessing game," Claude analyzes molecular property maps to predict which genes should be involved in a specific process. They are benchmarking this against human researchers for a study on primary cilia.

MCPmed: MCP for bioinformatics discovery (Briefings in Bioinformatics)

A peer-reviewed paper and community project adapting the Model Context Protocol to bioinformatics web service backends. Enables LLMs to autonomously discover, invoke, and verify tools like GEO, STRING, and UCSC Cell Browser. Their GEOmcp prototype significantly outperforms manual keyword-based data discovery.

Claude vs ChatGPT for literature searches (Galaxy Project)

Anton Nekrutenko's group at Penn State tested Claude, ChatGPT, and GEO database searches for a literature review on Candida auris RNA-seq. They found zero overlap between Claude and ChatGPT results despite identical queries on the same day. The combined approach uncovered 100% more papers than any single method.
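Overlap and combined gain of this kind are straightforward to measure with set arithmetic. The sketch below uses invented paper identifiers (not the study's data) and a hypothetical `overlap_report` helper to show the calculation.

```python
def overlap_report(results: dict[str, set[str]]) -> dict[str, float]:
    """Summarize how much the union of sources adds over the best single one."""
    union = set().union(*results.values())
    best_single = max(len(ids) for ids in results.values())
    shared = set.intersection(*results.values())
    return {
        "union_size": len(union),
        "best_single_size": best_single,
        "shared_by_all": len(shared),
        # Percent more papers in the union than in the largest single source.
        "percent_gain": 100.0 * (len(union) - best_single) / best_single,
    }


# Placeholder identifiers, invented for illustration only.
hits = {
    "claude": {"p1", "p2", "p3"},
    "chatgpt": {"p4", "p5", "p6"},
    "geo": {"p1", "p4"},
}
pairwise = hits["claude"] & hits["chatgpt"]  # papers both assistants returned
report = overlap_report(hits)
print(pairwise, report)
```

With disjoint result sets, the union is as large as the sources added together, which is why running every method matters more than picking the best one.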

AlphaEvolve: autonomous algorithm discovery (DeepMind)

A Gemini-powered agent that proposes, tests, and refines code-based hypotheses through evolutionary search. Improved best-known solutions on 20% of 50+ open math problems. Discovered a new 48-multiplication algorithm for 4x4 complex matrix multiplication, beating a 56-year-old record.

Researcher writeups and tutorials

Claude Code for computational biology

Brian Naughton extracted a protein scoring function from one repository, re-implemented it inside another, and validated by diffing against the reference until every value matched. A clean demonstration of the "extract, re-implement, test against reference" pattern.
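The pattern can be sketched in a few lines: run the original and the port on the same inputs and list every disagreement. Both scoring functions and their weights below are stand-ins invented for this example; they are not the functions from either repository.

```python
import math


def reference_score(residues: str) -> float:
    """Stand-in for the original repository's scoring function (hypothetical)."""
    weights = {"A": 1.8, "L": 3.8, "K": -3.9}  # placeholder weights
    return sum(weights.get(r, 0.0) for r in residues) / len(residues)


def reimplemented_score(residues: str) -> float:
    """Stand-in for the port living in the second repository (hypothetical)."""
    weights = {"A": 1.8, "L": 3.8, "K": -3.9}
    total = 0.0
    for r in residues:
        total += weights.get(r, 0.0)
    return total / len(residues)


def diff_against_reference(inputs, ref, new, rel_tol=1e-9):
    """Return (input, expected, got) for every case where the port disagrees."""
    mismatches = []
    for item in inputs:
        expected, got = ref(item), new(item)
        if not math.isclose(expected, got, rel_tol=rel_tol, abs_tol=1e-12):
            mismatches.append((item, expected, got))
    return mismatches


cases = ["ALK", "AAAA", "KKLL", "G"]
mismatches = diff_against_reference(cases, reference_score, reimplemented_score)
print(mismatches)  # an empty list means the port matches on every case
```

Giving the agent a check like this turns "does the port work?" into a loop it can run on its own until the mismatch list is empty.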

Claude Code for scientists

Patrick Mineault's guide to structuring a research project for Claude Code. Covers the data/raw, data/processed, data/generated folder layout, mamba + uv tooling conventions, and the CLAUDE.md patterns that keep the agent on track across sessions.

Claude Code for applied economists (Yale / Princeton)

Paul Goldsmith-Pinkham's seven-part series hosted through Princeton's Markus' Academy. Covers data cleaning, web scraping SEC filings, large dataset handling with DuckDB/Parquet, spawning sub-agents, and using Skills to develop revision plans from referee reports. Probably the most comprehensive researcher-facing Claude Code tutorial for social science.

Claude Code for causal inference (Baylor)

Scott Cunningham's Substack regularly covers practical workflows for economists using coding agents. In one example he and Caitlin Myers used Claude Code to gather and process data for a study on travel distance to abortion clinics and its effect on marriages in Texas.

Creating your own research assistant with MCP

Aaron Tay chained a PubMed MCP server with Scite.ai to find papers that should have been cited by a given paper but were not. Shows how a general-purpose agent can compose tools nobody explicitly scripted it to combine.

Organizing your cloud storage with Claude Code

Chris Blattman (UChicago) catalogued 630,000 Dropbox files, freed 480 GB of disk space, and wrote an onboarding guide for new lab members, all in a 4–6 hour session. The guide walks through seven phases from filesystem audit to Google Workspace cleanup.

A running diary of switching to AI agents for social science

Thomas Manandhar-Richardson's weekly-updated blog documenting his switch to Claude Code and Codex for all research work since October 2025. Key warning: "silent mistreatment of missing data" where agents drop null rows or impute zeros without notification.
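One way to guard against that failure mode is to make every drop of missing data loud. The sketch below is an assumption about what such a guard could look like, not code from the blog; the helper names, the 10% threshold, and the sample rows are all hypothetical.

```python
def drop_missing(rows: list[dict], column: str) -> tuple[list[dict], int]:
    """Drop rows where `column` is None and report how many were removed."""
    kept = [row for row in rows if row.get(column) is not None]
    return kept, len(rows) - len(kept)


def mean_with_audit(rows: list[dict], column: str, max_missing_frac: float = 0.1) -> float:
    """Compute a mean, refusing to proceed quietly when too much is missing."""
    kept, n_dropped = drop_missing(rows, column)
    frac = n_dropped / len(rows)
    print(f"{column}: dropped {n_dropped} of {len(rows)} rows ({frac:.0%} missing)")
    if frac > max_missing_frac:
        raise ValueError(f"{frac:.0%} of '{column}' is missing; decide how to handle it")
    return sum(row[column] for row in kept) / len(kept)


# Invented sample data: one of four income values is missing.
survey = [{"income": 40}, {"income": None}, {"income": 60}, {"income": 50}]

try:
    mean_with_audit(survey, "income")
except ValueError as err:
    print("stopped:", err)

# Explicitly accepting the missingness gives the mean of the observed rows.
observed_mean = mean_with_audit(survey, "income", max_missing_frac=0.5)
print(observed_mean)
```

In this sample, silently imputing zero for the missing value would give 37.5 instead of the observed-row mean of 50.0, which is the kind of shift that goes unnoticed without an explicit report.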

How do scientists use Claude Code? (quantitative analysis)

Charles Yang used ORCID-linked GitHub profiles to measure Claude Code adoption among scientists: ~2.1% as of February 2026, with a U-shaped curve by career stage (early-career and senior scientists adopt most). Names specific power users at Cambridge, Lawrence Berkeley, UW-Madison, and UCSF.

Tools and repos

research30: weekly literature sweeps across five databases

Scott Handley's skill fires parallel queries at OpenAlex, PubMed, arXiv, Semantic Scholar, and HuggingFace Hub, scores results on relevance and recency, deduplicates, and writes a ranked markdown report to disk.
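The score, deduplicate, and rank steps can be sketched as below. This is an illustration under stated assumptions, not the skill's actual code: the `Hit` record, the 0.7/0.3 weights, and the 30-day half-life are invented, and the parallel database queries are left out so the example stays self-contained.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Hit:
    title: str
    source: str
    published: date
    relevance: float  # 0..1, assumed to come from the query step


def dedupe(hits: list[Hit]) -> list[Hit]:
    """Keep the most relevant copy of each title, ignoring case and spacing."""
    best: dict[str, Hit] = {}
    for hit in hits:
        key = " ".join(hit.title.lower().split())
        if key not in best or hit.relevance > best[key].relevance:
            best[key] = hit
    return list(best.values())


def score(hit: Hit, today: date, half_life_days: float = 30.0) -> float:
    """Blend relevance with a recency term that halves every `half_life_days`."""
    age_days = max((today - hit.published).days, 0)
    recency = 0.5 ** (age_days / half_life_days)
    return 0.7 * hit.relevance + 0.3 * recency  # weights are arbitrary


def ranked_report(hits: list[Hit], today: date) -> str:
    """Render deduplicated hits as a numbered markdown list, best first."""
    ranked = sorted(dedupe(hits), key=lambda h: score(h, today), reverse=True)
    lines = ["# Weekly literature sweep", ""]
    for i, hit in enumerate(ranked, start=1):
        lines.append(f"{i}. {hit.title} ({hit.source}, {hit.published.isoformat()})")
    return "\n".join(lines) + "\n"


# Invented example records.
today = date(2026, 3, 1)
hits = [
    Hit("Agents for genomics", "arXiv", date(2026, 2, 27), 0.8),
    Hit("agents for  genomics", "OpenAlex", date(2026, 2, 27), 0.6),
    Hit("Older survey", "PubMed", date(2025, 3, 1), 0.9),
]
report = ranked_report(hits, today)
print(report)
```

The half-life form keeps the recency term between 0 and 1, so a very relevant but year-old paper can still be outranked by a slightly less relevant one from this week.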

claude-for-bioinformatics: SOPs and guided tutorials

Also by Scott Handley: standard operating procedures, best practices, and a full guided RNA-seq tutorial (FastQC/MultiQC through alignment and differential expression) designed to be done conversationally inside Claude Code.

bioSkills: 438 bioinformatics skills for AI coding agents

SKILL.md files that guide Claude Code, Codex, or Gemini through bioinformatics tasks. Covers single-cell preprocessing, spatial transcriptomics, proteomics (LC-MS), flow cytometry, population genetics, variant calling, and more.

Lobster: multi-agent bioinformatics through natural language

22 specialist agents across 10 installable packages for analyzing multi-omics data. Runs locally (patient data never leaves your machine), tracks provenance as reproducible Jupyter notebooks, and integrates with Claude Code as a Skill.

Paper2Agent: turning papers into interactive MCP servers

A Stanford framework that converts research papers into MCP servers with interactive tools, allowing natural-language interaction with a paper's methods. Applied to AlphaGenome, TISSUE, and Scanpy to let non-expert programmers test ideas and validate results. Code on GitHub.

paper-search-mcp: academic search across 20+ sources

An MCP server and Claude Code skill that searches and downloads papers from arXiv, PubMed, bioRxiv, medRxiv, Google Scholar, Semantic Scholar, Crossref, OpenAlex, SSRN, Zenodo, and more. Uses a free-first strategy prioritizing open data sources.

Institutional

Claude in life sciences

Anthropic's overview of how the Broad Institute, 10x Genomics, Schrödinger, FutureHouse, Novo Nordisk, and Sanofi are using Claude in genomics, drug discovery, and autonomous research.

10x Genomics + Anthropic: natural language single-cell analysis

10x Genomics integrated its Cloud Analysis pipelines with Claude via MCP, enabling researchers to perform single-cell and spatial biology analysis through natural language instead of code.