/research

Insights and blogs from the AI Group building Fin

What Does It Mean for a Model to ‘Think’? Reasoning, Recursion, and the Operator Design Space

Looped Transformers, thoughts on how LLMs reason, and an overview of a recent paper we wrote.

Articles (23)

Structured, Agentic RAG for Ecommerce

2026.05.07

Low-Rank Key Value Attention: Reducing KV Cache Memory and Maintaining Head Diversity

2026.04.09

Unsupervised Learning Meets Generative AI: Topic Modelling for Real-World Dialogue

Mariia Matskevichus

2026.03.30

Podcast EP2: Shipping reliable AI actions

2025.09.19

Podcast EP1: Closing the loop

2025.09.18

How We Built a World-Class Reranker for Fin

2025.09.11

Using LLMs as a Reranker for RAG: A Practical Guide

Ramil Yarullin, Fedor Parfenov

2025.09.11

Finetuning Retrieval for Fin

2025.09.11

David vs Goliath: are small LLMs any good?

Sagar Joglekar, Ramil Yarullin

2025.09.11

Building out Intercom’s AI infra

2025.09.11

“Was that helpful?” Understanding User Feedback in Customer Support AI Agents

Vinicius Ribeiro

2025.09.11

To escalate, or not to escalate, that is the question

2025.09.11

Building a Better Language Detection Model for Fin

2025.09.11

Cost of Serving LLMs

Stefan Ivanovici

2025.09.10

We Don’t Need Higher Peak Intelligence, Only More Intelligence Density

2025.08.20

Fin: Running a Reliable Service over Unreliable Parts

2025.08.08

Think Fast: Reasoning at 3ms a Token

2025.07.18

A Causal Inference Approach to Measuring the Impact of Improved RAG Content

2025.07.07

Generating Knowledge Center Content from Customer Service Conversations

2025.07.07

Do you really need a Vector Search Database?

2025.04.29

The Agency, Control, Reliability (ACR) Tradeoff for Agents

2025.04.11

An Actor-Critic Approach to Reduce Hallucinations

2025.04.10

Slower Feels Smarter? Experimenting with AI Agent Latency

2025.04.10