The RAG framework for grounded AI in production
Larkup RAG makes it easy to build fully functional RAG applications, from ingestion to deployment. Connect any model to any source with a single typed pipeline: fully observable, framework-agnostic, and open source.
Plug into every model & vector store
Everything in the box
A complete retrieval engine, not just a wrapper
Larkup RAG ships the primitives you actually need to take RAG from prototype to production — typed, composable, and observable end to end.
Typed retrieval pipeline
Compose ingestion, chunking, embedding, retrieval, and reranking as type-safe steps. Swap any stage without touching the rest.
const pipeline = createPipeline({
  embed: openai("text-embedding-3"),
  store: pgvector(db),
  indexType: "hybrid",
})
await pipeline.query(question)
Median retrieval latency (p50), fully cached
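To make "swap any stage without touching the rest" concrete, here is a minimal sketch of type-safe stage composition. It is not Larkup RAG's implementation: `Stage`, `pipe`, `chunkBySize`, and `toyEmbed` are hypothetical names invented for illustration, and the stages are synchronous to keep the example short.

```typescript
// Hypothetical sketch: none of these names come from Larkup RAG's API.
type Stage<In, Out> = (input: In) => Out;

// Chain two stages. The call only type-checks when the first stage's
// output type matches the second stage's input type.
function pipe<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
  return (input) => second(first(input));
}

interface Chunk {
  text: string;
}

interface Embedded extends Chunk {
  vector: number[];
}

// Split a document into fixed-size character chunks.
const chunkBySize = (size: number): Stage<string, Chunk[]> => (doc) => {
  const chunks: Chunk[] = [];
  for (let i = 0; i < doc.length; i += size) {
    chunks.push({ text: doc.slice(i, i + size) });
  }
  return chunks;
};

// Toy embedder: a one-dimensional "vector" holding the chunk length.
// A real embedder would call a model instead.
const toyEmbed: Stage<Chunk[], Embedded[]> = (chunks) =>
  chunks.map((c) => ({ ...c, vector: [c.text.length] }));

// Replacing chunkBySize(4) with any other Stage<string, Chunk[]>
// leaves toyEmbed untouched.
const ingest = pipe(chunkBySize(4), toyEmbed);
const embedded = ingest("abcdefghij");
```

Because each stage declares its input and output types, a mismatched replacement fails at compile time rather than at query time.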
Hybrid search
Dense + sparse fusion with built-in reranking for higher recall.
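The page does not say which fusion method Larkup RAG uses. One common way to merge a dense ranking with a sparse one is reciprocal rank fusion (RRF), sketched below as an illustration only. The function name, the document IDs, and the constant `k = 60` are assumptions, and the reranking step is not shown.

```typescript
// Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank(d)),
// where rank starts at 1. This is an illustrative technique, not
// necessarily the one Larkup RAG applies.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical result lists from a dense (embedding) search and a
// sparse (keyword) search over the same corpus.
const denseHits = ["doc-a", "doc-b", "doc-c"];
const sparseHits = ["doc-b", "doc-d", "doc-a"];
const fused = reciprocalRankFusion([denseHits, sparseHits]);
```

Documents that appear in both lists accumulate score from each, so `doc-b` and `doc-a` rank ahead of documents found by only one retriever.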
10+ vector stores
Pinecone, pgvector, Qdrant, LanceDB, Weaviate, Chroma and more — swap with one line.
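Swapping stores with one line generally relies on every backend implementing a shared interface. The sketch below shows that pattern with a hypothetical `VectorStore` interface and an in-memory implementation. The interface shape is an assumption rather than Larkup RAG's actual adapter API, and a real adapter would be asynchronous.

```typescript
// Hypothetical adapter interface; Larkup RAG's real store API may differ.
interface VectorStore {
  upsert(id: string, vector: number[]): void;
  search(query: number[], topK: number): string[];
}

const dot = (a: number[], b: number[]): number =>
  a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a) * dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

// Brute-force in-memory store, useful for tests and small corpora.
class InMemoryStore implements VectorStore {
  private readonly vectors = new Map<string, number[]>();

  upsert(id: string, vector: number[]): void {
    this.vectors.set(id, vector);
  }

  search(query: number[], topK: number): string[] {
    return Array.from(this.vectors.entries())
      .map(([id, vector]) => ({ id, score: cosine(query, vector) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK)
      .map((hit) => hit.id);
  }
}

// Application code depends only on the interface, so changing
// backends means changing this one line.
const store: VectorStore = new InMemoryStore();
store.upsert("cats", [1, 0]);
store.upsert("dogs", [0, 1]);
store.upsert("pets", [1, 1]);
const nearest = store.search([1, 0.1], 2);
```

Any class that satisfies `VectorStore` can replace `InMemoryStore` without changes to the calling code.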
Deploy anywhere
Ship a standalone Node server from the CLI — includes Dockerfile, docker-compose, and vercel.json.
MIT Licensed
Fully open source. Self-host on your own infrastructure, with no paywalls and no feature limits.
How it works
From raw documents to grounded answers
Four typed stages, one pipeline.
FAQ
Frequently asked questions
Everything you need to know about Larkup RAG. Can't find an answer? Talk to our team.
Ready to build your RAG app?
Whether you're wiring up a custom knowledge-base bot, an agentic research assistant, or a fully observable production pipeline — Larkup RAG gets you there, fast.
