Open Source RAG Framework

The RAG framework for grounded AI in production

Larkup RAG makes it easy to create fully functional RAG applications from ingestion to deployment. Connect any model to any source with a single typed pipeline: fully observable, framework-agnostic, and open source.

Get Started

Plug into every model & vector store

AWS
Azure
Chroma
Cohere
DeepSeek
DigitalOcean
Docker
GCP
Gemini
GitHub
Google
Groq
Hetzner
Jina
LanceDB
Milvus
Mistral
Nomic
OpenAI
pgvector
Pinecone
Qdrant
Qwen
Supabase
Vercel
Voyage
Weaviate
xAI

Everything in the box

A complete retrieval engine, not just a wrapper

Larkup RAG ships the primitives you actually need to take RAG from prototype to production — typed, composable, and observable end to end.

Typed retrieval pipeline

Compose ingestion, chunking, embedding, retrieval, and reranking as type-safe steps. Swap any stage without touching the rest.

pipeline.ts
// `db` and `question` are assumed to be defined elsewhere.
const pipeline = createPipeline({
  embed: openai("text-embedding-3"),
  store: pgvector(db),
  indexType: "hybrid", // dense + sparse retrieval
})

await pipeline.query(question)
42 ms

Median (p50) retrieval latency, fully cached

Hybrid search

Dense + sparse fusion with built-in reranking for higher recall.
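
The page does not say how the dense and sparse result lists are fused. One common option is reciprocal rank fusion (RRF), sketched below as an illustration only. The function name `reciprocalRankFusion`, the constant `k = 60`, and the document ids are assumptions for this example, not part of Larkup RAG's API, and the sketch leaves out the reranking step.

```typescript
// Reciprocal rank fusion (RRF): an assumed fusion method for illustration,
// not confirmed as the one Larkup RAG uses.
type Ranked = string[]; // document ids, best match first

function reciprocalRankFusion(rankings: Ranked[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Ranks are 1-based, so the top hit contributes 1 / (k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical result lists from a dense (vector) and a sparse (keyword) search.
const dense = ["doc-a", "doc-b", "doc-c"];
const sparse = ["doc-b", "doc-d", "doc-a"];
const fused = reciprocalRankFusion([dense, sparse]);
console.log(fused);
```

Documents that rank well in both lists rise to the top, which is why fusion tends to improve recall over either search alone.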

10+ vector stores

Pinecone, pgvector, Qdrant, LanceDB, Weaviate, Chroma and more — swap with one line.

Deploy anywhere

Ship a standalone Node server from the CLI — includes Dockerfile, docker-compose, and vercel.json.

MIT Licensed

Fully open source, self-host on your own infra. No paywalls, no feature limits.

How it works

From raw documents to grounded answers

Four typed stages, one pipeline. Pick a step to see what Larkup RAG does under the hood.
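
As a rough illustration of what "typed stages" can mean, the sketch below composes two stages so that the compiler checks each stage's output against the next stage's input. The names `Stage`, `pipe`, `chunkByWords`, and `toyEmbed` are hypothetical and are not taken from Larkup RAG; the embedding is a toy stand-in for a real model call.

```typescript
// Hypothetical types for illustration; not Larkup RAG's actual API.
type Stage<In, Out> = (input: In) => Out;

// Compose two stages. The compiler rejects the call if the first stage's
// output type does not match the second stage's input type.
function pipe<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
  return (input) => second(first(input));
}

// Chunking stage: split text into chunks of `size` words.
const chunkByWords = (size: number): Stage<string, string[]> => (text) => {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += size) {
    chunks.push(words.slice(i, i + size).join(" "));
  }
  return chunks;
};

// Embedding stage: a toy one-dimensional "embedding" (the chunk's length),
// standing in for a real embedding model.
const toyEmbed: Stage<string[], number[][]> = (chunks) =>
  chunks.map((chunk) => [chunk.length]);

const ingest = pipe(chunkByWords(3), toyEmbed);
const vectors = ingest("retrieval augmented generation grounds answers in documents");
console.log(vectors);
```

Replacing `toyEmbed` with another `Stage<string[], number[][]>` leaves `chunkByWords` and the `pipe` call unchanged, which is the kind of stage swap a typed pipeline is meant to make safe.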

Configure: Embedding & Vector Store
Embedding model: OpenAI (1536 dims · 8k tokens), active
Vector store: Pinecone, connection verified
Index type: Lexical, Semantic, or Hybrid

FAQ

Frequently asked questions

Everything you need to know about Larkup RAG. Can't find an answer? Talk to our team.

Ready to build your RAG app?

Whether you're wiring up a custom knowledge-base bot, an agentic research assistant, or a fully observable production pipeline — Larkup RAG gets you there, fast.