parsr.

Integration · Pinecone

Index parsr-extracted financial documents in Pinecone for sub-100ms RAG retrieval.

Pinecone is the most widely deployed managed vector database for production RAG (>10K paying customers as of 2026). The combination is a natural fit: parsr extracts structured fields from a bank statement or invoice with `include_chunks=true`, you embed each chunk with OpenAI / Voyage / Cohere, upsert into a Pinecone index keyed by your customer's org, and query by semantic similarity at request time. Pinecone Serverless reached general availability in 2024; it's the right starting point for fintechs and AI bookkeeping agents that don't want to manage capacity. parsr's chunk metadata (doc_type, page_numbers, section) maps directly onto Pinecone metadata filters: query for `doc_type=invoice AND month=2026-04` and get back just the relevant transactions. EU-residency note: Pinecone Serverless hosts in `eu-west-1` (AWS Ireland); pair it with parsr EU keys to keep the whole pipeline EU-only.
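The `doc_type=invoice AND month=2026-04` filter above translates into Pinecone's filter syntax, which uses MongoDB-style operators such as `$eq`. A minimal sketch; the `month` field is an assumption about how you tag chunks at upsert time, not something parsr emits by default:

```python
# Pinecone filter equivalent to "doc_type=invoice AND month=2026-04".
# "month" is a hypothetical field you attach to chunk metadata yourself.
invoice_filter = {
    "doc_type": {"$eq": "invoice"},
    "month": {"$eq": "2026-04"},
}

# Passed through unchanged at query time:
# index.query(vector=question_embedding, top_k=5, filter=invoice_filter)
```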

Install

One command

pip install parsr-sdk pinecone openai

Code

Working sample

Pinecone integration
from parsr_sdk import AsyncParsr
from pinecone import Pinecone

parsr = AsyncParsr(api_key="sk_eu_live_...")
pc = Pinecone(api_key="...")
index = pc.Index("invoices")

# Inside an async context:
result = await parsr.parse_invoice(
    document_url="https://files.example.com/invoice.pdf",
    include_chunks=True,
)

What you get

Highlights

  • Per-chunk metadata (doc_type, page_numbers, section) maps to Pinecone metadata filters
  • Pinecone Serverless EU region + parsr EU keys = end-to-end EU residency
  • Sub-100ms p95 retrieval at production scale
  • Hybrid search (sparse + dense) supported via Pinecone Inference
  • Idempotent upserts via parsr chunk.id keys for safe retries
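The last bullet works because Pinecone upserts overwrite by vector ID: re-sending the same (chunk.id, embedding, metadata) tuple replaces the existing record rather than duplicating it, so a blind retry loop is safe. A minimal sketch, assuming any index-like object with an `upsert` method; the helper name and backoff numbers are illustrative:

```python
import time

def upsert_with_retry(index, vectors, namespace, attempts=3, base_delay=0.5):
    """Retry Pinecone upserts; safe because same-ID upserts overwrite, not duplicate."""
    for attempt in range(attempts):
        try:
            return index.upsert(vectors=vectors, namespace=namespace)
        except Exception:
            if attempt == attempts - 1:
                raise  # exhausted retries, surface the error
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
```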

Architecture

How the pieces fit

  • parsr.parse(*, include_chunks=true) returns list[Chunk], each with text + metadata
  • Embed each chunk's text with your embedding provider (OpenAI text-embedding-3-small is the default starting point)
  • index.upsert(vectors=[(chunk.id, embedding, metadata)]) writes the batch into Pinecone
  • index.query(vector=question_embedding, top_k=5, filter={'doc_type': 'invoice'}) retrieves at request time

Each chunk carries doc_type + page_numbers + section, so retrieved hits can cite the source page back to the customer.
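As a reference for the shapes flowing through that pipeline, here is a minimal sketch of the chunk record the steps assume. The field names mirror the metadata described above, but the dataclass itself is illustrative, not the parsr SDK's actual type:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """Illustrative shape of one parsr chunk as used in this pipeline."""
    id: str                # stable key -> Pinecone vector ID (idempotent upserts)
    text: str              # the string that gets embedded
    doc_type: str          # e.g. "invoice"; used in metadata filters
    page_numbers: list     # source pages, for citing hits back to the customer
    section: str = ""      # e.g. "line_items"

    def pinecone_metadata(self) -> dict:
        # Pinecone metadata lists must hold strings, so stringify page numbers.
        return {
            "doc_type": self.doc_type,
            "page_numbers": [str(p) for p in self.page_numbers],
            "section": self.section,
        }
```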

Quickstart

End-to-end example

Parse a document with `include_chunks=true`, embed each chunk, upsert into Pinecone, query.

parsr → embed → Pinecone → query
import os
from parsr_sdk import AsyncParsr
from pinecone import Pinecone, ServerlessSpec
from openai import AsyncOpenAI

parsr = AsyncParsr(api_key=os.environ["PARSR_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
openai = AsyncOpenAI()

INDEX = "parsr-invoices"
DIM = 1536  # text-embedding-3-small

if INDEX not in pc.list_indexes().names():
    pc.create_index(
        name=INDEX,
        dimension=DIM,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="eu-west-1"),
    )
index = pc.Index(INDEX)

# 1. Parse with chunks for RAG ingestion.
result = await parsr.parse_invoice(
    document_url="https://files.example.com/invoice.pdf",
    include_chunks=True,
    chunking={"strategy": "block"},
)
if result.status != "succeeded":
    raise RuntimeError(f"parse failed: {result.status}")

# 2. Embed every chunk in one batch.
texts = [c.text for c in result.chunks]
embeds = await openai.embeddings.create(model="text-embedding-3-small", input=texts)
vectors = [
    # Pinecone metadata lists may only hold strings, so stringify page numbers.
    (c.id, e.embedding, {**c.metadata, "page_numbers": [str(p) for p in c.page_numbers]})
    for c, e in zip(result.chunks, embeds.data)
]

# 3. Upsert into Pinecone, namespaced by customer.
index.upsert(vectors=vectors, namespace=f"org_{result.result['org_id']}")

# 4. Semantic query at request time.
question = "What's the largest line item on this invoice?"
qe = await openai.embeddings.create(
    model="text-embedding-3-small", input=[question]
)
hits = index.query(
    vector=qe.data[0].embedding,
    top_k=3,
    namespace=f"org_{result.result['org_id']}",
    filter={"doc_type": "invoice"},
    include_metadata=True,
)
for hit in hits["matches"]:
    print(hit["score"], hit["metadata"]["section"], "page", hit["metadata"]["page_numbers"])

Cost

What you'll actually pay

Pinecone Serverless bills per write, per query, and per stored vector. A typical EU fintech indexing 10K invoices (~5 chunks each = 50K vectors) at a sustained 1 query/sec sees ~$50/mo at the time of writing; check pinecone.io/pricing for current rates. parsr's cost for the 10K-invoice ingest on the Growth plan is €99 (5K pages included) plus €30 overage for the 5K extra invoice-pages, so ingestion comes to ≈€130 one-off. The combined pipeline stays under €200/mo for most production fintechs.
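The parsr side of that estimate is simple arithmetic. A sketch using the plan numbers quoted above; the €0.006/page overage rate is inferred from the "€30 on 5K extra pages" figure, so treat it as an assumption and check current pricing:

```python
def parsr_ingest_cost_eur(pages, base=99.0, included=5000, overage_per_page=0.006):
    """One-off parsr ingestion cost in EUR, using the Growth-plan numbers above.

    overage_per_page is inferred from "EUR 30 overage on 5K extra pages";
    verify against current pricing before relying on it.
    """
    extra = max(0, pages - included)
    return base + extra * overage_per_page

# 10K single-page invoices:
print(parsr_ingest_cost_eur(10_000))  # → 129.0
```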

Performance

Tuning tips

  • Use chunking strategy 'block' for invoices/receipts — one chunk per line item gives the cleanest retrieval boundaries
  • Namespace by customer org_id so multi-tenant queries don't fan out across all customers' data
  • Pin embeddings to text-embedding-3-small (1536 dims) before scaling to text-embedding-3-large; the small variant is 5x cheaper and rarely the bottleneck for finance docs
  • Batch parsr-chunk embeddings into single OpenAI requests (up to 2048 inputs / call) to amortise round-trip latency
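The batching tip can be sketched as a small helper that slices chunk texts into request-sized groups; the 2048 ceiling matches the OpenAI limit cited above, and `embed_all` with its `client` argument is illustrative glue, not a parsr API:

```python
def batched(items, size=2048):
    """Yield successive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

async def embed_all(client, texts, model="text-embedding-3-small"):
    """Embed texts in as few OpenAI requests as possible (<=2048 inputs each)."""
    vectors = []
    for batch in batched(texts):
        resp = await client.embeddings.create(model=model, input=batch)
        vectors.extend(e.embedding for e in resp.data)
    return vectors
```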

A few lines and you're serving parsr-extracted chunks out of Pinecone.

Start building