Enterprise API
The AI engine behind the #1 AI agent for customer service.
Now available as an API.

Purpose-built models trained specifically for customer service. From full agent orchestration to granular retrieval and reranking endpoints — integrate Fin's AI infrastructure into your own product.

85%
Resolution rate
+20pp
Retrieval precision vs. baselines
+17.5%
Reranker MAP vs. Cohere v3.5
98%+
Routing accuracy
up to 80%
Cost reduction vs. general LLMs
API Products

Choose the level to integrate at — turnkey agent to bare-metal.

HIGH LEVEL
Fin Agent API
Full AI agent orchestration. Send a conversation, get a resolution. Fin handles retrieval, reasoning, response generation, and escalation decisions end to end.
  • End-to-end conversation resolution

  • Multi-turn context management

  • Intelligent escalation routing

  • Brand voice & policy enforcement

  • Source citation & accuracy validation

POST /v1/agent/resolve
MID LEVEL
Fin RAG Pipeline API
The full retrieval-augmented generation pipeline as a single call. Retrieve, rerank, and generate, using Fin's purpose-built models at each stage.
  • Semantic retrieval across your knowledge base

  • Cross-encoder reranking for precision

  • Grounded answer generation with citations

  • Configurable retrieval depth (K=5 to K=40)

  • Multilingual query support

POST /v1/rag/query
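
As a rough illustration of the configurable retrieval depth, the sketch below builds a request body for the RAG endpoint and rejects a depth outside the K=5 to K=40 range listed above. The field names (`query`, `k`, `language`) and the default depth are assumptions made for this example; the actual request schema is not described on this page.

```python
import json
from typing import Optional

# Depth bounds taken from the "K=5 to K=40" bullet above.
MIN_K, MAX_K = 5, 40


def build_rag_query(query: str, k: int = 10, language: Optional[str] = None) -> str:
    """Serialize a hypothetical request body for POST /v1/rag/query.

    The field names are illustrative assumptions, not the documented schema.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    if not MIN_K <= k <= MAX_K:
        raise ValueError(f"k must be between {MIN_K} and {MAX_K}, got {k}")
    body = {"query": query, "k": k}
    if language is not None:
        body["language"] = language  # optional hint for multilingual queries
    return json.dumps(body)


if __name__ == "__main__":
    print(build_rag_query("How do I reset my password?", k=20))
```

Validating the depth on the client keeps an out-of-range request from ever leaving your service.
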
COMPONENT LEVEL
Fin Component APIs
Individual endpoints for each stage of the pipeline. Embed Fin's proprietary models directly into your own architecture, exactly where you need them.
  • Retrieval: /v1/retrieve
  • Reranking: /v1/rerank
  • Embeddings: /v1/embed
  • Escalation Routing: /v1/route
  • Answer Validation: /v1/validate
MIX & MATCH TO FIT YOUR STACK
Under the hood

Purpose-built models, trained on real customer service data.

Every component in the Fin AI Engine is a custom model fine-tuned on hundreds of millions of real support interactions. Not general-purpose. Not off-the-shelf. Built for this domain.

Phase 1

Query Refinement

Incoming queries are optimized before they hit the retrieval pipeline. The full conversation history, user metadata, and timestamp are distilled into a semantically rich search query, maximizing retrieval precision from the first hop.

Phase 2

Semantic Retrieval

Our proprietary fin-cx-retrieval model — an embedding model trained on signal extracted from millions of Fin retrievals — scans your knowledge base and returns the top 40 candidate documents by semantic similarity.
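
To make "top 40 candidates by semantic similarity" concrete, here is a minimal sketch that ranks documents by cosine similarity between a query embedding and document embeddings. It assumes the embeddings already exist as plain lists of floats and uses a brute-force scan; the function names are hypothetical, and a production system would more likely use a vector index.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: List[float],
          doc_vecs: Dict[str, List[float]],
          k: int = 40) -> List[Tuple[str, float]]:
    """Return the k most similar (doc_id, score) pairs, best first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    docs = {"refunds": [1.0, 0.0], "shipping": [0.0, 1.0], "returns": [1.0, 1.0]}
    print(top_k([1.0, 0.0], docs, k=2))
```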

Training uses InfoNCE contrastive loss with hard positives and hard negatives mined from production query logs — teaching the model the nuances of customer support language across industries and languages.
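
For readers unfamiliar with InfoNCE, the sketch below computes the loss for a single query from similarity scores: one score for the positive passage and a list of scores for the hard negatives. It is a generic textbook formulation, not Fin's training code, and the temperature value is an assumption (none is stated here).

```python
import math
from typing import List


def info_nce(pos_score: float, neg_scores: List[float], temperature: float = 0.05) -> float:
    """InfoNCE loss for one query.

    loss = -log( exp(s_pos / t) / (exp(s_pos / t) + sum_j exp(s_neg_j / t)) )
    The temperature default is an illustrative assumption.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


if __name__ == "__main__":
    # A well-separated positive yields a lower loss than a poorly separated one.
    print(info_nce(0.9, [0.1, 0.2]))
    print(info_nce(0.3, [0.1, 0.2]))
```

Hard negatives matter because a negative that scores close to the positive contributes the most to the denominator, and therefore to the gradient.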

fin-cx-retrieval performance
Precision: 72.3%
Recall@5: 96.5%
vs. Voyage-Large-3: +20pp precision
Phase 3

Cross-Encoder Reranking

The fin-cx-reranker is built on ModernBERT-large and has an 8,192-token context window. It reorders the 40 candidates by true relevance and was trained with RankNet loss on tens of millions of passage pairs.
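
RankNet is a pairwise loss: for a passage that should rank above another, it penalizes the model when the score gap is small or inverted. The sketch below shows the standard per-pair formulation on raw scores. It is a generic illustration rather than the reranker's actual training code, and the function name is hypothetical.

```python
import math


def ranknet_loss(score_relevant: float, score_irrelevant: float) -> float:
    """Pairwise RankNet loss for one (more relevant, less relevant) passage pair.

    P(relevant ranked first) = sigmoid(s_rel - s_irr); loss = -log P,
    which equals softplus(-(s_rel - s_irr)).
    """
    diff = score_relevant - score_irrelevant
    # Numerically stable softplus(-diff).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


if __name__ == "__main__":
    print(ranknet_loss(2.0, 0.0))  # correctly ordered pair: small loss
    print(ranknet_loss(0.0, 2.0))  # inverted pair: larger loss
```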

Replaces other commercial reranking services with lower latency, higher throughput, and significantly higher quality on customer service content.

fin-cx-reranker vs. Cohere v3.5
MAP: +17.5%
NDCG@10: +16.7%
Recall@10: +13.1%
P50 Latency: 150ms
Phase 4

Escalation Routing

A multi-task ModernBERT classifier making three simultaneous predictions: escalation decision (3-way classification), reason categorization, and guideline citation (multi-label). Trained on millions of multilingual examples.
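
One way to picture the three simultaneous predictions is as three output heads decoded differently: a softmax over three escalation classes, a softmax over reason categories, and independent sigmoids for the multi-label guideline citations. The sketch below decodes raw logits under those assumptions. The label names, the single-label treatment of reasons, and the 0.5 threshold are all hypothetical.

```python
import math
from typing import Dict, List

# Hypothetical class names; the three actual escalation classes are not listed here.
ESCALATION_LABELS = ["no_escalation", "escalate", "offer_escalation"]


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    # Branch on sign so math.exp never overflows.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def argmax(values: List[float]) -> int:
    return max(range(len(values)), key=values.__getitem__)


def decode(escalation_logits: List[float],
           reason_logits: List[float],
           guideline_logits: List[float],
           reason_labels: List[str],
           guideline_labels: List[str],
           threshold: float = 0.5) -> Dict[str, object]:
    """Turn the three heads' logits into one routing decision."""
    escalation = ESCALATION_LABELS[argmax(softmax(escalation_logits))]
    reason = reason_labels[argmax(softmax(reason_logits))]
    # Multi-label head: every guideline above the threshold is cited.
    guidelines = [label for label, logit in zip(guideline_labels, guideline_logits)
                  if sigmoid(logit) >= threshold]
    return {"escalation": escalation, "reason": reason, "guidelines": guidelines}


if __name__ == "__main__":
    print(decode([0.1, 2.0, -1.0], [0.0, 3.0], [2.0, -2.0, 0.5],
                 ["billing", "technical"], ["g1", "g2", "g3"]))
```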

Achieves >98% accuracy at a competitive cost and a fraction of the latency of larger models often used for this task.

Routing Model
Accuracy: >98%
Training examples: 4M
Languages: Multilingual
Phase 5

Generation & Validation

The top-ranked documents, full conversation context, user metadata, and brand guidelines are synthesized into a precise, cited response. A dedicated validation layer then checks for accuracy, hallucination, tone alignment, and policy compliance before the response is returned. Every answer is grounded in your content — never fabricated.

Benchmarks

Industry-leading performance for customer experience. Measured.

Retrieval: fin-cx-retrieval vs. baselines
fin-cx-retrieval: 72.3%
Voyage-Large-3: 54.6%
Stella 1.5B: 51.2%
Snowflake Arctic-2 (base): 44.8%
BGE-Large (prev. production): 36.2%
Reranking: fin-cx-reranker vs. Cohere v3.5
Metric        fin-cx-reranker   Cohere v3.5   Delta
MAP           0.612             0.521         +17.5%
NDCG@10       0.665             0.570         +16.7%
Recall@10     0.720             0.636         +13.1%
Kendall tau   0.400             0.326         +22.7%
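
For reference, two of the metrics in the table can be computed from a ranked list of relevance labels as sketched below. This uses the common textbook definitions with binary or graded labels; the exact evaluation setup behind the numbers above is not described here, so treat it as an illustration of the metrics rather than a reproduction of the benchmark. MAP is the mean of average precision across queries.

```python
import math
from typing import List


def average_precision(relevance: List[int]) -> float:
    """Average precision for one query; `relevance` holds 0/1 flags in ranked order.

    Assumes every relevant document appears somewhere in the ranked list.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def ndcg_at_k(relevance: List[float], k: int = 10) -> float:
    """NDCG@k with linear gain and a log2 position discount."""
    def dcg(rels: List[float]) -> float:
        return sum(r / math.log2(i + 1) for i, r in enumerate(rels, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal else 0.0


def mean_average_precision(per_query_relevance: List[List[int]]) -> float:
    """MAP: mean of average precision over a non-empty list of queries."""
    return sum(average_precision(r) for r in per_query_relevance) / len(per_query_relevance)


if __name__ == "__main__":
    print(average_precision([1, 0, 1, 0]))
    print(ndcg_at_k([0, 1]))
```
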
Validated in production
Every model undergoes a three-stage evaluation: static benchmarks on held-out datasets, production evaluation on out-of-distribution conversations, and live A/B testing across millions of real conversations. All improvements are validated at p<0.01 before deployment. We ship models that win in the real world, not just on leaderboards.
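
As an illustration of what a p<0.01 gate on an A/B test can look like, the sketch below runs a two-sided two-proportion z-test on resolution counts from a control arm and a candidate arm. The choice of test is an assumption; the statistical method actually used is not stated here, and the counts in the example are made up.

```python
import math


def two_proportion_p_value(success_a: int, total_a: int,
                           success_b: int, total_b: int) -> float:
    """Two-sided p-value for the difference between two proportions (pooled z-test)."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / total_a + 1.0 / total_b))
    if se == 0.0:
        return 1.0  # no variation in either arm, so nothing to distinguish
    z = (p_b - p_a) / se
    # Two-sided tail of the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def passes_gate(p_value: float, alpha: float = 0.01) -> bool:
    return p_value < alpha


if __name__ == "__main__":
    # Made-up counts: control resolves 5,000 of 10,000, candidate 5,300 of 10,000.
    p = two_proportion_p_value(5000, 10000, 5300, 10000)
    print(p, passes_gate(p))
```
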
Enterprise grade

Built for trust and isolation.

Data Isolation

Per-workspace document isolation. No cross-tenant access. Query vectors discarded immediately after retrieval.

Secure Infrastructure

All fine-tuning on secure AWS infrastructure. No third-party data exposure. Per-passage independent embedding.

Audit & Transparency

Every routing decision is logged with reason categorization and guideline citation. Full explainability for compliance.

Research

We publish our work. Read the technical details.

Enterprise Inquiry

Integrate the Fin AI Engine into your product.

The Fin API is available to qualified enterprise partners building AI-powered customer experiences. Tell us about your use case and we'll be in touch.

  • Custom pricing based on volume and endpoints

  • Dedicated onboarding and integration support

  • SLA-backed uptime guarantees

  • Enterprise security review available

We'll respond within 2 business days. Enterprise buyers only.