RL-Trained Recursive Language Models
The first models trained with reinforcement learning to develop autonomous recursive decomposition and cost-aware behavior — shipped as lightweight LoRA adapters that run on commodity hardware.
We build AI systems that accelerate scientific discovery — not by replacing the scientist, but by removing the architectural ceilings that prevent any single mind from holding an entire field at once.
Science now produces more knowledge than any mind — human or artificial — can fully hold. The frontier is not intelligence. It is memory. We are building the first AI systems that scale with the world's knowledge rather than being bounded by it.
MSc Computer Science, University of Twente. Visiting Researcher at ETH Zurich's Agentic Systems Lab. Previously Solutions Architect at AWS and Research Assistant at the University of Twente AI & IoT Lab.
My work focuses on post-training methods for large language models — reinforcement learning, parameter-efficient adaptation, and agentic systems. Anadromi Labs is where this research meets the real world.
Current AI tools for processing scientific literature all share the same fundamental limitation: a fixed-size context window. Feed them too much and they truncate, summarize away nuance, or hallucinate synthesis. The bottleneck is not intelligence — it is memory architecture.
Recursive Language Models offer a way out. Instead of forcing everything into a single window, RLMs treat the input as an external, programmable environment — recursively delegating sub-tasks to copies of themselves. The architecture was introduced in recent research, but only studied with prompting and distillation.
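The control flow described above can be sketched in a few lines. This is an illustrative toy, not our implementation: `llm` is a hypothetical stand-in for a model call, and the fixed split-in-half strategy is exactly the kind of hand-written heuristic that RL training replaces with learned decomposition.

```python
# Minimal sketch of recursive decomposition, assuming a hypothetical
# `llm(prompt)` call. A trained RLM would choose how and when to split;
# this toy always halves the corpus when it exceeds the window.

MAX_CHARS = 2000  # stand-in for the model's context window


def llm(prompt: str) -> str:
    # Placeholder for a real model call; returns a dummy answer.
    return f"<answer over {len(prompt)} chars>"


def rlm(task: str, corpus: str) -> str:
    """Answer `task` over `corpus`, recursing when the corpus won't fit."""
    if len(corpus) <= MAX_CHARS:
        return llm(f"{task}\n\n{corpus}")  # base case: answer directly
    mid = len(corpus) // 2
    left = rlm(task, corpus[:mid])    # delegate each half to a copy of itself
    right = rlm(task, corpus[mid:])
    # Synthesize the sub-answers back into one response.
    return llm(f"{task}\n\nSynthesize:\n{left}\n{right}")
```

Because the input lives outside the window as a programmable environment, the recursion depth grows with the corpus, not the model's context size.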
Our contribution is the first application of reinforcement learning to train RLMs. RL training unlocks two capabilities that prior work could not achieve: models that learn from their own decomposition mistakes, and cost-aware behavior where the system learns to balance synthesis quality against computational expenditure. The result is a system that processes corpora of any size without information loss, and that keeps improving through deployment.
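The cost-aware objective can be made concrete with a toy reward. Everything here is illustrative: `quality`, `num_calls`, and the tradeoff coefficient `lam` are assumed episode statistics and knobs, not published hyperparameters from our training setup.

```python
# Hypothetical cost-aware reward: synthesis quality minus a penalty
# per delegated model call. RL on this signal teaches the policy to
# decompose only when the extra calls actually buy quality.

def cost_aware_reward(quality: float, num_calls: int, lam: float = 0.01) -> float:
    """Reward the final synthesis, penalizing computational expenditure."""
    return quality - lam * num_calls


# Two rollouts with equal synthesis quality: the cheaper decomposition
# earns the higher reward, so the gradient favors frugal recursion.
shallow = cost_aware_reward(quality=0.9, num_calls=4)
deep = cost_aware_reward(quality=0.9, num_calls=40)
```

Under this kind of objective, over-decomposing a corpus that fits in one window is penalized just like under-decomposing one that does not.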
Read the full research article →
Our first product: a system that can process an entire research field — millions of papers — without truncation, without sampling, without information loss. Auditable, reproducible, and improving with every run.
We're working with research institutions, clinical teams, and life sciences organisations to deploy AI-powered systematic review. If your work is bottlenecked by the scale of existing literature, we want to hear from you.