Integration · Qdrant
Self-hosted-or-managed vector DB with Rust performance — parsr chunks + Qdrant + payload filters.
Qdrant (~22K GitHub stars) is the leading Rust-built vector database, optimised for production-scale recall. It comes in two flavours: Qdrant Cloud (managed, with EU regions including Frankfurt) and self-hosted Qdrant via Docker. The pull for parsr customers is twofold: payload filters (payload is Qdrant's term for per-point metadata) compose powerfully with HNSW search via the `must`/`should`/`must_not` query language, and self-hosting gives EU compliance teams a path to vector RAG with zero third-party data residency. Qdrant's quantization modes (scalar, product, binary) are the fastest path to fitting >10M finance-document chunks on a single box.
Install
One command
pip install parsr-sdk qdrant-client openai
Working sample
from parsr_sdk import AsyncParsr
from qdrant_client import AsyncQdrantClient
parsr = AsyncParsr(api_key="sk_eu_live_...")
qd = AsyncQdrantClient(host="localhost", port=6333)
What you get
Highlights
- Rust core — sub-5ms p95 retrieval at 10M+ vectors on a single box
- Self-hosted via Docker keeps data in your VPC — strongest EU compliance story
- Payload filters compose with HNSW (must/should/must_not) for complex finance queries
- Quantization (scalar/product/binary) trades a few % recall for 4–32x storage savings
- Qdrant Cloud Frankfurt + parsr EU = managed end-to-end EU residency
Architecture
How the pieces fit
One Qdrant collection per doc_type (or a single collection with a doc_type payload field). `parsr.parse(..., include_chunks=True)` → embed each chunk → upsert into the collection with the chunk's text + metadata as `payload`. Query via `query_points` with the question embedding plus a payload filter on org_id and doc_type.
Quickstart
End-to-end example
Parse a document with `include_chunks=True`, embed each chunk, upsert into Qdrant, query.
import asyncio
import os

from openai import AsyncOpenAI
from parsr_sdk import AsyncParsr
from qdrant_client import AsyncQdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct, Filter, FieldCondition, MatchValue,
)

COLLECTION = "parsr-invoices"
DIM = 1536


async def main() -> None:
    parsr = AsyncParsr(api_key=os.environ["PARSR_API_KEY"])
    qd = AsyncQdrantClient(
        url=os.environ["QDRANT_URL"], api_key=os.environ.get("QDRANT_API_KEY")
    )
    openai = AsyncOpenAI()

    # 1. Collection setup (run once).
    if not await qd.collection_exists(COLLECTION):
        await qd.create_collection(
            collection_name=COLLECTION,
            vectors_config=VectorParams(size=DIM, distance=Distance.COSINE),
        )

    # 2. Parse + chunks.
    result = await parsr.parse_invoice(
        document_url="https://files.example.com/invoice.pdf",
        include_chunks=True,
        chunking={"strategy": "block"},
    )

    # 3. Embed + upsert. UUIDs from chunk.id keep upserts idempotent.
    texts = [c.text for c in result.chunks]
    embeds = await openai.embeddings.create(
        model="text-embedding-3-small", input=texts
    )
    points = [
        PointStruct(
            id=c.id,
            vector=e.embedding,
            payload={
                "text": c.text,
                "org_id": "org_acme",
                "doc_type": c.metadata.get("doc_type", "invoice"),
                "page_numbers": c.page_numbers,
                "section": c.metadata.get("section", ""),
            },
        )
        for c, e in zip(result.chunks, embeds.data)
    ]
    await qd.upsert(collection_name=COLLECTION, points=points)

    # 4. Filtered semantic query.
    question = "Largest line item"
    qe = await openai.embeddings.create(
        model="text-embedding-3-small", input=[question]
    )
    hits = await qd.query_points(
        collection_name=COLLECTION,
        query=qe.data[0].embedding,
        query_filter=Filter(
            must=[
                FieldCondition(key="org_id", match=MatchValue(value="org_acme")),
                FieldCondition(key="doc_type", match=MatchValue(value="invoice")),
            ]
        ),
        limit=3,
    )
    for h in hits.points:
        print(h.score, h.payload["section"], h.payload["page_numbers"])


asyncio.run(main())
Cost
What you'll actually pay
Qdrant Cloud's free tier covers up to 1 GB of storage (a few hundred thousand vectors, depending on dimension) for development. Production EU clusters start around €25/mo for 4 GB. Self-hosted on a small ARM box (e.g. Hetzner's CAX21 at ~€7/mo with 8 GB RAM) holds roughly 1M unquantized 1536-dim vectors in RAM; quantization or on-disk vector storage stretches that considerably further. parsr cost is unchanged. Qdrant is the cheapest path to >10M-vector production deployments thanks to quantization: binary quantization can drop vector storage 32x, typically with <2% recall loss on finance docs.
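The sizing claims above come from straightforward arithmetic: raw float32 vectors cost `n_vectors × dim × 4` bytes, and binary quantization stores 1 bit per dimension (a 32x reduction). A back-of-envelope helper, ignoring HNSW graph links and payload storage, which add real overhead on top:

```python
def vector_ram_gb(n_vectors: int, dim: int = 1536, bytes_per_dim: float = 4.0) -> float:
    """Rough RAM footprint of raw vectors only (no index graph, no payloads)."""
    return n_vectors * dim * bytes_per_dim / 1024**3

full = vector_ram_gb(10_000_000)                          # float32
binary = vector_ram_gb(10_000_000, bytes_per_dim=1 / 8)   # 1 bit per dim
print(f"{full:.1f} GB raw, {binary:.1f} GB binary-quantized")  # 57.2 GB raw, 1.8 GB binary-quantized
```

This is why 10M unquantized 1536-dim chunks need a large-memory box, while the binary-quantized equivalent fits on very modest hardware.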
Performance
Tuning tips
- Enable scalar (or product, or binary) quantization once you cross ~1M vectors — recall typically stays within 1–2% on finance docs while RAM drops 4–32x
- Index the payload fields you filter on (`create_payload_index` for org_id, doc_type) — without indexes, large filters degrade to full scans
- Raise `hnsw_config` above the defaults (m=16, ef_construct=100) — e.g. m=32, ef_construct=256 — when recall matters more than index build time and RAM
- Use named vectors when you mix embedding models (e.g. a `dense` and a `sparse` vector per chunk for hybrid retrieval)
A few lines and your parsr chunks are searchable in Qdrant.
Start building