Integration · Pinecone
Index parsr-extracted financial documents in Pinecone for sub-100ms RAG retrieval.
Pinecone is the most widely deployed managed vector database for production RAG (>10K paying customers as of 2026), and the combination is a common one: parsr extracts structured fields from a bank statement or invoice with `include_chunks=True`, you embed each chunk with OpenAI, Voyage, or Cohere, upsert into a Pinecone index keyed by your customer's org, and query with semantic similarity at request time. Pinecone Serverless reached general availability in early 2025; it's the right starting point for fintechs and AI bookkeeping agents who don't want to manage capacity. parsr's chunk metadata (doc_type, page_numbers, section) maps directly onto Pinecone metadata filters: query for `doc_type=invoice AND month=2026-04` and get back just the relevant transactions. EU-residency note: Pinecone Serverless offers an AWS `eu-west-1` (Ireland) region; pair it with parsr EU keys to keep the whole pipeline EU-only.
Install
One command
pip install parsr-sdk pinecone openai
Working sample
from parsr_sdk import AsyncParsr
from pinecone import Pinecone
parsr = AsyncParsr(api_key="sk_eu_live_...")
pc = Pinecone(api_key="...")
index = pc.Index("invoices")
result = await parsr.parse_invoice(
    document_url="https://files.example.com/invoice.pdf",
    include_chunks=True,
)
What you get
Highlights
- Per-chunk metadata (doc_type, page_numbers, section) maps to Pinecone metadata filters
- Pinecone Serverless EU region + parsr EU keys = end-to-end EU residency
- Sub-100ms p95 retrieval at production scale
- Hybrid search (sparse + dense) supported via Pinecone Inference
- Idempotent upserts via parsr chunk.id keys for safe retries
Architecture
How the pieces fit
parsr.parse(*, include_chunks=True) → list[Chunk] with text + metadata → embed each chunk's text via your embedding provider (OpenAI text-embedding-3-small is the default starting point) → pinecone.upsert(vectors=[(chunk.id, embedding, metadata)]) → query with index.query(vector=question_embedding, top_k=5, filter={'doc_type': 'invoice'}). Each chunk carries doc_type + page_numbers + section, so retrieved hits can cite the source page back to the customer.
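The upsert step can be hardened for retries. Because each vector ID is a stable parsr chunk.id, re-sending the same batch overwrites rather than duplicates, so a naive retry is safe. A sketch, where `upsert_fn` is any callable accepting `vectors=` (e.g. `lambda vectors: index.upsert(vectors=vectors, namespace=ns)`):

```python
import time


def upsert_with_retry(upsert_fn, vectors, retries: int = 3, base_delay: float = 0.5):
    """Retry an upsert with exponential backoff. Safe because vector IDs
    are stable parsr chunk.ids: a repeated batch overwrites, not duplicates."""
    for attempt in range(retries):
        try:
            return upsert_fn(vectors=vectors)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * 2 ** attempt)
```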
Quickstart
End-to-end example
Parse a document with `include_chunks=True`, embed each chunk, upsert into Pinecone, query.
import os

from parsr_sdk import AsyncParsr
from pinecone import Pinecone, ServerlessSpec
from openai import AsyncOpenAI

parsr = AsyncParsr(api_key=os.environ["PARSR_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
openai = AsyncOpenAI()

INDEX = "parsr-invoices"
DIM = 1536  # text-embedding-3-small

if INDEX not in pc.list_indexes().names():
    pc.create_index(
        name=INDEX,
        dimension=DIM,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="eu-west-1"),
    )
index = pc.Index(INDEX)

# 1. Parse with chunks for RAG ingestion.
result = await parsr.parse_invoice(
    document_url="https://files.example.com/invoice.pdf",
    include_chunks=True,
    chunking={"strategy": "block"},
)
if result.status != "succeeded":
    raise RuntimeError(f"parse failed with status {result.status}")

# 2. Embed every chunk in one batch.
texts = [c.text for c in result.chunks]
embeds = await openai.embeddings.create(model="text-embedding-3-small", input=texts)
vectors = [
    # Pinecone metadata values must be strings, numbers, booleans, or
    # lists of strings, so stringify the page numbers.
    (c.id, e.embedding, {**c.metadata, "page_numbers": [str(p) for p in c.page_numbers]})
    for c, e in zip(result.chunks, embeds.data)
]

# 3. Upsert into Pinecone, namespaced by customer.
namespace = f"org_{result.result['org_id']}"
index.upsert(vectors=vectors, namespace=namespace)

# 4. Semantic query at request time.
question = "What's the largest line item on this invoice?"
qe = await openai.embeddings.create(
    model="text-embedding-3-small", input=[question]
)
hits = index.query(
    vector=qe.data[0].embedding,
    top_k=3,
    namespace=namespace,
    filter={"doc_type": "invoice"},
    include_metadata=True,
)
for hit in hits["matches"]:
    print(hit["score"], hit["metadata"]["section"], "page", hit["metadata"]["page_numbers"])
Cost
What you'll actually pay
Pinecone Serverless bills per write, per query, and per stored vector. A typical EU fintech indexing 10K invoices (~5 chunks each = 50K vectors) at a sustained 1 query/sec sees ~$50/mo at the time of writing; check pinecone.io/pricing for current rates. parsr cost for the 10K-invoice ingest on the Growth plan is €99 (5K pages included) plus €30 of overage on the 5K extra invoice pages, so ingestion is ≈€130 one-off. The combined pipeline stays under €200/mo for most production fintechs.
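The arithmetic above as a quick estimator. The €0.006/page overage rate is an assumption reverse-engineered from the €30-per-5K-pages figure, not a published price; treat both helpers as back-of-envelope only.

```python
def vector_count(invoices: int, chunks_per_invoice: int = 5) -> int:
    """Vectors stored in Pinecone: one per parsr chunk."""
    return invoices * chunks_per_invoice


def parsr_ingest_cost_eur(
    pages: int,
    plan_fee: float = 99.0,        # Growth plan base fee
    included_pages: int = 5_000,   # pages bundled into the plan
    overage_per_page: float = 0.006,  # assumed rate matching €30 / 5K pages
) -> float:
    """One-off parsr ingest cost in EUR under the assumed overage rate."""
    overage = max(0, pages - included_pages) * overage_per_page
    return plan_fee + overage
```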
Performance
Tuning tips
- Use chunking strategy 'block' for invoices/receipts — one chunk per line item gives the cleanest retrieval boundaries
- Namespace by customer org_id so multi-tenant queries don't fan out across all customers' data
- Pin embeddings to text-embedding-3-small (1536 dims) before scaling to text-embedding-3-large; the small variant is 5x cheaper and rarely the bottleneck for finance docs
- Batch parsr-chunk embeddings into single OpenAI requests (up to 2048 inputs / call) to amortise round-trip latency
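The batching tip as code: a minimal splitter that keeps each embeddings request under the 2048-input per-call limit. Loop over the batches, call `openai.embeddings.create` once per batch, and concatenate the returned `data` lists.

```python
def batched(texts: list[str], size: int = 2048) -> list[list[str]]:
    """Split chunk texts into batches that fit in one OpenAI embeddings call."""
    return [texts[i : i + size] for i in range(0, len(texts), size)]
```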
A few lines and you're querying parsr-extracted documents from Pinecone.
Start building