parsr.

Specialist parser

Invoices to structured data, multi-format

Parse PDFs and emailed invoices into structured JSON. Line-item parsing, multi-currency, EU and US tax formats. Validates line-item math and totals reconciliation. EU residency by default.

parse-invoice.sh (bash)
curl -X POST https://eu-api.tryparsr.dev/v1/parse/invoice \
  -H "Authorization: Bearer $PARSR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_url": "https://example.com/inv-2026-04-118.pdf",
    "wait": 60
  }'

Format coverage

SMB cloud, ERP, DACH, FR, Benelux

Languages

30+

Avg latency

~2.6s p50

Field accuracy

94%+ on structured PDFs

What we extract

Every field, with confidence and citations

Every field comes back with a confidence score in [0,1] and a normalized bounding box on the source page. Line items are extracted as an array — quantity, unit price, tax, line total — and the response includes a math-validated totals block your agent can trust.
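
To highlight an extracted line on a rendered page, the normalized box has to be scaled to pixels. The sketch below is a minimal illustration, assuming that x and y mark the top-left corner and that all four values are fractions of the page size; the response sample does not state the origin, so check it against the API reference. The names PixelBox and bbox_to_pixels are hypothetical helpers, not part of the parsr API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelBox:
    left: int
    top: int
    right: int
    bottom: int


def bbox_to_pixels(bbox: dict, page_width: int, page_height: int) -> PixelBox:
    """Scale a normalized bbox to pixel coordinates on a rendered page.

    Assumption: x/y are the top-left corner and w/h the extent, all
    expressed as fractions of the page width and height.
    """
    left = round(bbox["x"] * page_width)
    top = round(bbox["y"] * page_height)
    right = round((bbox["x"] + bbox["w"]) * page_width)
    bottom = round((bbox["y"] + bbox["h"]) * page_height)
    return PixelBox(left, top, right, bottom)


# First line item of the sample response, on a page rendered at 1000 x 1400 px.
first_line = {"page": 1, "x": 0.06, "y": 0.41, "w": 0.88, "h": 0.022}
box = bbox_to_pixels(first_line, page_width=1000, page_height=1400)
print(box)
```

The page index stays in the original dict, so a caller can pick the right rendered page before drawing the box.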

Input

Anonymized April 2026 reverse-charge invoice — German vendor billing a French customer in EUR (B2B intra-community supply, VAT 0%)

response.json (json)
{
  "schema_version": "invoice.v1",
  "result": {
    "invoice_number": "RE-2026-04-118",
    "issue_date": "2026-04-12",
    "due_date":   "2026-05-12",
    "vendor": {
      "name": "Müller Industriebedarf GmbH",
      "address": "Schillerstraße 14, 70173 Stuttgart, Germany",
      "vat_number": "DE123456789"
    },
    "customer": {
      "name": "Atelier Dupont SARL",
      "address": "12 rue de Rivoli, 75001 Paris, France",
      "vat_number": "FR40123456824"
    },
    "currency": "EUR",
    "line_items": [
      {
        "description": "CNC spindle bearings, 6204-2RS",
        "quantity": "40",
        "unit_price":  { "amount": "12.50", "currency": "EUR" },
        "tax_rate":    "0.00",
        "line_total":  { "amount": "500.00", "currency": "EUR" },
        "confidence": 0.96,
        "bbox": { "page": 1, "x": 0.06, "y": 0.41, "w": 0.88, "h": 0.022 }
      },
      {
        "description": "Express shipping (DHL Economy Select)",
        "quantity": "1",
        "unit_price":  { "amount": "48.00", "currency": "EUR" },
        "tax_rate":    "0.00",
        "line_total":  { "amount": "48.00", "currency": "EUR" },
        "confidence": 0.94,
        "bbox": { "page": 1, "x": 0.06, "y": 0.46, "w": 0.88, "h": 0.022 }
      }
    ],
    "subtotal": { "amount": "548.00", "currency": "EUR" },
    "tax_breakdown": [
      { "rate": "0.00", "amount": { "amount": "0.00", "currency": "EUR" }, "note": "Reverse-charge — Art. 196 VAT Directive 2006/112/EC" }
    ],
    "total": { "amount": "548.00", "currency": "EUR" },
    "payment_terms": "Net 30 — IBAN DE89370400440532013000",
    "validation": {
      "line_item_sum_match": {
        "valid": true,
        "computed_subtotal": "548.00",
        "declared_subtotal": "548.00",
        "diff": "0.00",
        "tolerance": "0.01"
      },
      "totals_reconcile": {
        "valid": true,
        "computed_total": "548.00",
        "declared_total": "548.00",
        "diff": "0.00",
        "tolerance": "0.01"
      }
    }
  },
  "field_metadata": {
    "vendor.vat_number":   { "confidence": 0.99, "format_valid": true },
    "customer.vat_number": { "confidence": 0.98, "format_valid": true },
    "total.amount":        { "confidence": 0.99 }
  }
}
| Field | Type | Description | Typical confidence |
| --- | --- | --- | --- |
| invoice_number | string | Vendor-issued invoice identifier as printed. Not normalized; preserved exactly to match vendor records. | 98% |
| issue_date / due_date | date (ISO 8601) | Issue and due dates normalized to ISO. due_date is null when payment terms are stated only as text (e.g. 'on receipt'). | 97% |
| vendor | object { name, address, vat_number } | Issuing party. vat_number is format-checked (country prefix + length + checksum where defined). Format failures are surfaced in field_metadata. | 97% |
| customer | object { name, address, vat_number? } | Bill-to party. vat_number is optional: present on B2B invoices, omitted on B2C. | 96% |
| currency | string (ISO 4217) | Invoice currency. Multi-currency invoices (rare) return per-line currency and surface fx_rate when present. | 99% |
| line_items[] | array of LineItem | Per-line description, quantity, unit_price, tax_rate, line_total, confidence, bbox. Quantity is a string (preserves '1.5 kg' style units). | 95% |
| subtotal / total | money { amount, currency } | Net subtotal (pre-tax) and gross total (post-tax). Both feed the validation block. | 98% |
| tax_breakdown[] | array of { rate, amount, note? } | Per-rate breakdown. Handles split VAT (e.g. 21% + 6% on a Belgian invoice) and reverse-charge (rate 0% with note). | 96% |
| payment_terms | string | Free-text payment instructions as printed. IBAN, BIC, and 'Net N' patterns are surfaced in field_metadata when detected. | 93% |
| validation.line_item_sum_match | object | Σ line_items.line_total = subtotal, within 1-cent tolerance. valid=false flags missing or duplicated lines. | 100% |
| validation.totals_reconcile | object | subtotal + Σ tax_breakdown.amount = total, within 1-cent tolerance. valid=false flags tax-math errors or hidden surcharges. | 100% |
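
Amounts and quantities arrive as strings, so a client has to decide how to turn them into numbers. The sketch below is one way to do it, assuming quantities use a dot as the decimal separator and put any unit after the number (as in '1.5 kg'); comma decimals or unit-first strings would need extra handling. parse_quantity and money are hypothetical helper names.

```python
import re
from decimal import Decimal

# A leading number followed by an optional free-text unit ("40", "1.5 kg").
_QUANTITY = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*(.*?)\s*$")


def parse_quantity(raw: str) -> tuple[Decimal, str]:
    """Split a quantity string into (number, unit); unit is '' when absent."""
    match = _QUANTITY.match(raw)
    if match is None:
        raise ValueError(f"unrecognized quantity: {raw!r}")
    return Decimal(match.group(1)), match.group(2)


def money(obj: dict) -> Decimal:
    """Read a money object's amount as an exact decimal rather than a float."""
    return Decimal(obj["amount"])


# Values taken from the first line item of the sample response.
quantity, unit = parse_quantity("40")
unit_price = money({"amount": "12.50", "currency": "EUR"})
print(quantity * unit_price)     # 500.00
print(parse_quantity("1.5 kg"))  # (Decimal('1.5'), 'kg')
```

Decimal keeps cent values exact, which matters once the results feed a 1-cent tolerance check.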

Domain-specific validation

What makes this a specialist

Line-item sum matches subtotal

Σ(line_items.line_total) = declared subtotal, within a 1-cent tolerance for rounding. Catches missing lines, duplicated lines, and OCR drops on long multi-page invoices — the failure modes that silently corrupt accounts payable pipelines.

validation.line_item_sum_match.valid

Example: Invoice declares subtotal 548.00 EUR; the sum computed from extracted line items is 500.00 EUR. diff: 48.00, because the model missed the shipping line on page 2. AP automation rejects the invoice and queues it for review.
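
The same check is easy to reproduce client-side, for instance as a second gate before posting to a ledger. This is a minimal sketch that follows the field names in the sample response; whether the service treats a diff of exactly 0.01 as valid is an assumption here (the sketch uses an inclusive comparison), and line_sum_check is a hypothetical helper name.

```python
from decimal import Decimal

TOLERANCE = Decimal("0.01")  # the 1-cent tolerance described above


def line_sum_check(result: dict) -> dict:
    """Sum the extracted line totals and compare them to the declared subtotal."""
    computed = sum(
        (Decimal(item["line_total"]["amount"]) for item in result["line_items"]),
        Decimal("0.00"),
    )
    declared = Decimal(result["subtotal"]["amount"])
    diff = abs(declared - computed)
    return {
        "valid": diff <= TOLERANCE,  # assumption: tolerance is inclusive
        "computed_subtotal": str(computed),
        "declared_subtotal": str(declared),
        "diff": str(diff),
    }


# The failure case from the example: the 48.00 shipping line was not extracted.
partial = {
    "line_items": [{"line_total": {"amount": "500.00", "currency": "EUR"}}],
    "subtotal": {"amount": "548.00", "currency": "EUR"},
}
check = line_sum_check(partial)
print(check["valid"], check["diff"])  # False 48.00
```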

Totals reconcile (subtotal + tax = total)

subtotal + Σ(tax_breakdown.amount) = declared total, within a 1-cent tolerance. Catches tax-math errors, hidden surcharges, and the classic 'rounded subtotal but unrounded total' inconsistency that still ships from older ERP exports.

validation.totals_reconcile.valid

Example: Subtotal 1000.00 + VAT 21% (210.00) = computed 1210.00, but the declared total is 1215.00 EUR. diff: 5.00, likely an unprinted handling fee. AP flags the invoice before it posts to the ledger.
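
The reconciliation rule can be sketched the same way. The snippet below assumes the response shape from the sample (money objects with string amounts) and an inclusive tolerance; the rate strings in the test data are illustrative only, since the sample shows just the 0% case, and they are not used in the arithmetic. totals_check and eur are hypothetical helper names.

```python
from decimal import Decimal


def totals_check(result: dict, tolerance: Decimal = Decimal("0.01")) -> dict:
    """Recompute subtotal + sum of tax amounts and compare to the declared total."""
    tax = sum(
        (Decimal(row["amount"]["amount"]) for row in result["tax_breakdown"]),
        Decimal("0.00"),
    )
    computed = Decimal(result["subtotal"]["amount"]) + tax
    diff = abs(Decimal(result["total"]["amount"]) - computed)
    return {"valid": diff <= tolerance, "computed_total": str(computed), "diff": str(diff)}


def eur(amount: str) -> dict:
    """Build a money object in the shape used by the sample response."""
    return {"amount": amount, "currency": "EUR"}


# The example above: 1000.00 + 210.00 VAT, but the invoice declares 1215.00.
surcharge = {
    "subtotal": eur("1000.00"),
    "tax_breakdown": [{"rate": "0.21", "amount": eur("210.00")}],
    "total": eur("1215.00"),
}
# A split-VAT invoice (21% on 80.00, 6% on 20.00) that reconciles.
split_vat = {
    "subtotal": eur("100.00"),
    "tax_breakdown": [
        {"rate": "0.21", "amount": eur("16.80")},
        {"rate": "0.06", "amount": eur("1.20")},
    ],
    "total": eur("118.00"),
}
print(totals_check(surcharge))
print(totals_check(split_vat))
```

Because the comparison uses the absolute difference, the same function works for documents with negative amounts.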

VAT number format check

Extracted VAT numbers are checked against the per-country format spec (country prefix + length + checksum where defined: BE mod-97, NL mod-11, DE/FR length-only). Format failures are surfaced in field_metadata, not silently dropped. We do NOT validate against VIES — that's a runtime EU service call.

field_metadata.vendor.vat_number.format_valid

Example: Extracted 'BE 0123.456.788'. Country code valid, length valid, but the mod-97 checksum fails. Returned as format_valid=false; the downstream pipeline schedules a VIES check before payment.
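
For readers who want to see what a mod-97 check involves, here is a minimal sketch of the commonly documented Belgian rule: ten digits, where the last two equal 97 minus the first eight taken modulo 97. It illustrates the technique only and is not parsr's validator; be_vat_format_valid is a hypothetical name, and the stripping of spaces, dots, and hyphens is an assumption about how printed numbers are normalized.

```python
import re


def be_vat_format_valid(raw: str) -> bool:
    """Format-check a Belgian VAT number: 'BE' + 10 digits with mod-97 check digits."""
    # Drop the separators that commonly appear in printed numbers.
    compact = re.sub(r"[\s.\-]", "", raw).upper()
    match = re.fullmatch(r"BE(\d{10})", compact)
    if match is None:
        return False  # wrong prefix or wrong length
    digits = match.group(1)
    base, check = int(digits[:8]), int(digits[8:])
    return 97 - (base % 97) == check


print(be_vat_format_valid("BE 0123.456.788"))  # False: check digits should be 49
print(be_vat_format_valid("BE 0123.456.749"))  # True
```

A passing format check says nothing about whether the number is registered; that still needs a live VIES lookup.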

Format coverage

Tested across the invoice formats your customers actually send

Six format families · 30+ accounting/invoicing systems · structured PDFs and email-attached scans

SMB cloud accounting

  • Xero
  • QuickBooks
  • FreshBooks
  • Wave
  • Zoho Books
  • Stripe Invoicing
  • GoCardless

ERP

  • SAP
  • NetSuite
  • Sage Intacct
  • Microsoft Dynamics 365
  • Odoo
  • Oracle Fusion

DACH

  • DATEV
  • lexoffice
  • sevDesk
  • BillBee
  • FastBill
  • Buchhaltungsbutler

France

  • Sage Compta / Paie
  • Cegid Quadra
  • EBP
  • Ciel
  • Pennylane
  • Tiime

Benelux

  • Yuki
  • Exact Online
  • e-Boekhouden.nl
  • Octopus
  • Teamleader
  • Billit

Custom / template-based

  • LaTeX invoice templates
  • Word / Pages exports
  • Notion / Coda PDFs
  • Hand-built HTML→PDF
  • Apple Numbers
  • Google Docs

Code recipes

From document to JSON in five lines

parse.sh (bash)
curl -X POST https://eu-api.tryparsr.dev/v1/parse/invoice \
  -H "Authorization: Bearer $PARSR_API_KEY" \
  -H "Idempotency-Key: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d '{
    "document_url": "https://example.com/inv-2026-04-118.pdf",
    "wait": 60
  }'
parse_invoice.py (python)
import os, uuid, httpx

resp = httpx.post(
    "https://eu-api.tryparsr.dev/v1/parse/invoice",
    headers={
        "Authorization": f"Bearer {os.environ['PARSR_API_KEY']}",
        "Idempotency-Key": str(uuid.uuid4()),
    },
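    json={
        "document_url": "https://example.com/inv-2026-04-118.pdf",
        "wait": 60,
    },
    timeout=70,  # client timeout, set above the "wait" value
)
resp.raise_for_status()  # fail on 4xx/5xx before reading the body
result = resp.json()["result"]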

sums = result["validation"]["line_item_sum_match"]
totals = result["validation"]["totals_reconcile"]
if not (sums["valid"] and totals["valid"]):
    raise ValueError(
        f"Invoice math failed — line-sum diff {sums['diff']}, totals diff {totals['diff']}"
    )

for li in result["line_items"]:
    print(li["quantity"], li["description"], li["line_total"]["amount"])
print("TOTAL", result["total"]["amount"], result["currency"])
parseInvoice.ts (typescript)
const resp = await fetch("https://eu-api.tryparsr.dev/v1/parse/invoice", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.PARSR_API_KEY}`,
    "Content-Type": "application/json",
    "Idempotency-Key": crypto.randomUUID(),
  },
  body: JSON.stringify({
    document_url: "https://example.com/inv-2026-04-118.pdf",
    wait: 60,
  }),
});
if (!resp.ok) {
  throw new Error(`parse request failed with HTTP ${resp.status}`);
}
const { result } = await resp.json();

const { line_item_sum_match, totals_reconcile } = result.validation;
if (!line_item_sum_match.valid || !totals_reconcile.valid) {
  throw new Error(
    `invoice math failed — line-sum diff ${line_item_sum_match.diff}, totals diff ${totals_reconcile.diff}`,
  );
}

for (const li of result.line_items) {
  console.log(li.quantity, li.description, li.line_total.amount);
}
agent.py (python)
from langchain_parsr import ParsrToolkit
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

tools = ParsrToolkit.from_env().get_tools()
agent = create_react_agent(ChatOpenAI(model="gpt-4o"), tools)

# Synchronous call so the script runs as-is; inside an async function, use `await agent.ainvoke(...)`.
result = agent.invoke({
    "messages": [(
        "user",
        "Parse this supplier invoice, confirm the line-item math reconciles, "
        "and return a single sentence summary plus the total: "
        "https://example.com/inv-2026-04-118.pdf"
    )]
})
print(result["messages"][-1].content)
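
The recipes above send an Idempotency-Key header but generate a fresh key on every run. If the goal is to retry a timed-out request without parsing the same document twice, the key has to be created once and reused across attempts. The sketch below shows that pattern with the HTTP call injected as a function, so it runs without network access. It assumes the service deduplicates on the key, which the page implies but does not spell out; post_with_retry and flaky_send are hypothetical names.

```python
import time
import uuid
from typing import Callable


def post_with_retry(send: Callable[[dict, dict], dict], payload: dict,
                    attempts: int = 3, backoff_s: float = 0.5) -> dict:
    """Call send(headers, payload), retrying transient failures with one shared key."""
    headers = {"Idempotency-Key": str(uuid.uuid4())}  # created once, reused on every retry
    for attempt in range(1, attempts + 1):
        try:
            return send(headers, payload)
        except (ConnectionError, TimeoutError):
            if attempt == attempts:
                raise
            time.sleep(backoff_s * 2 ** (attempt - 1))  # exponential backoff
    raise ValueError("attempts must be at least 1")


# Stand-in for the real HTTP call: fails once, then succeeds.
seen_keys: list[str] = []


def flaky_send(headers: dict, payload: dict) -> dict:
    seen_keys.append(headers["Idempotency-Key"])
    if len(seen_keys) == 1:
        raise TimeoutError("simulated timeout")
    return {"result": {"invoice_number": "RE-2026-04-118"}}


response = post_with_retry(
    flaky_send,
    {"document_url": "https://example.com/inv-2026-04-118.pdf", "wait": 60},
    backoff_s=0.0,
)
print(len(seen_keys), len(set(seen_keys)))  # 2 1: two attempts, one key
```

In real use, send would wrap the httpx.post call from the Python recipe and merge the Authorization header into the headers it receives.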

Compared

How parsr's invoice parsing compares

| Vendor | Pricing per page | EU residency | Confidence + bbox | Line-item math validation | Multi-format coverage |
| --- | --- | --- | --- | --- | --- |
| parsr | from €0.022 | Default (eu-api region-bound key) | Per field, per line item | Yes: line-sum + totals reconcile | SMB cloud, ERP, DACH, FR, Benelux, custom |
| Mindee | ~$0.10 (Pro tier) | Pro tier and up only (€179/mo entry) | Yes | No, extraction only | Published 90%+ across 50+ countries (their flagship) |
| Veryfi | $500/mo minimum | No | Yes | No | US-focused; weaker EU/DACH |
| Affinda | Quote-based | Region-selectable, paid tier | Yes | No | Strong on resumes; invoice templates editable |
| Reducto | Custom | Growth tier (custom) | Partial | No | General-purpose document parser |

Vendor changes their template? 48 hours.

We don't train models — we curate prompts, schemas, validators, and fixture tests. A new accounting tool, a Q1 layout change from your top supplier, or a freshly mandated e-invoicing format goes live in two business days. Mindee's pre-trained models take months to add new formats. Email a sample (anonymized fine) and we'll confirm within 24 hours.

Request a format →

FAQ

Common questions

  • Invoice vs receipt — when do I use which?

    Use doc_type='invoice' for B2B billing documents — issue date, due date, vendor + customer party blocks, line items with tax breakdown. Use doc_type='receipt' for point-of-sale tender records — merchant, transaction time, payment method, no due date. The shape of the response is different; pick the wrong one and you'll fight the schema. When in doubt, the rule is: if there's a 'pay by' date and a customer party block, it's an invoice.

  • How does multi-currency work?

    Each line item carries its own currency in line_items[].line_total.currency. The top-level result.currency is the invoice's dominant currency. When an invoice prints an explicit FX rate (e.g. 'invoiced in USD, paid in EUR at 1.0850'), it's surfaced as result.fx_rate with both currencies. We don't fetch FX rates from external services; we report what the invoice actually states.

  • Can I send the email body, or just the PDF attachment?

    Send the PDF attachment. Most B2B invoices are PDFs attached to a forwarding email; the body is usually a courtesy note or auto-text from the sender's billing system. If you have a forwarding inbox, extract the application/pdf attachment and POST that to /v1/parse/invoice. If your invoice is HTML inline in the email body, render it to PDF first (Puppeteer or wkhtmltopdf) — we don't accept .eml directly.

  • Do you validate VAT numbers against VIES?

    No, and that's deliberate. We extract VAT numbers and format-check them (country prefix + length + checksum where defined: BE mod-97, NL mod-11, DE/FR length-only). VIES validation is a runtime EU service call — it depends on the live state of a tax registry that goes down regularly, which means it doesn't belong in a parsing API. Use field_metadata.vendor.vat_number.format_valid as your first gate, then call VIES yourself before payment if you need a live check.

  • Credit notes and refunds with negative amounts?

    Supported. Credit notes parse with the same schema as invoices — line_items[].line_total can be negative, subtotal and total can be negative, tax_breakdown[].amount can be negative. validation.line_item_sum_match and validation.totals_reconcile both honor sign. If the document explicitly types itself as a credit note (DE: 'Gutschrift', FR: 'avoir', NL: 'creditnota'), result.document_subtype is set to 'credit_note'.

  • What about e-invoicing standards — PEPPOL, Factur-X, ZUGFeRD?

    Two paths. (1) If the document is a hybrid PDF/A-3 with embedded structured XML (Factur-X / ZUGFeRD), we extract the XML payload directly — that's the deterministic path, near-100% accuracy on the structured fields. (2) If it's a pure PDF with no embedded XML (the vast majority of invoices in flight today), we fall back to the visual layer with the line-item math validators on top. PEPPOL UBL XML hitting the API endpoint is also accepted directly — no PDF round-trip needed.
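
For the forwarding-inbox case in the attachment question above, pulling the PDF out of a raw message needs only the Python standard library. This is a minimal sketch: pdf_attachments is a hypothetical helper, the fallback filename is an arbitrary choice, and the demo message is built in memory rather than read from a mailbox. How the bytes are then handed to the API (the recipes on this page only show document_url) is left out on purpose.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def pdf_attachments(raw_email: bytes) -> list[tuple[str, bytes]]:
    """Return (filename, content) for every application/pdf attachment."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    found = []
    for part in msg.iter_attachments():
        if part.get_content_type() == "application/pdf":
            # get_content() decodes the transfer encoding and returns bytes.
            found.append((part.get_filename() or "invoice.pdf", part.get_content()))
    return found


# Build a small forwarded-invoice email in memory to exercise the helper.
demo = EmailMessage()
demo["Subject"] = "Invoice RE-2026-04-118"
demo.set_content("Please find our invoice attached.")
demo.add_attachment(
    b"%PDF-1.7 demo",
    maintype="application",
    subtype="pdf",
    filename="inv-2026-04-118.pdf",
)

attachments = pdf_attachments(bytes(demo))
print([(name, len(data)) for name, data in attachments])
```

The courtesy-note body is skipped automatically because iter_attachments only yields the parts that are not the message body.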

200 free pages. No credit card. No sales call.

Drop invoice parsing into your stack in an afternoon. If it doesn't earn its keep, walk away. No lock-in.

Get an API key