Language support
Documents in any major language
Vision LLMs handle 100+ languages out of the box. Here's where we've validated and where we're still calibrating. Tell us yours; we'll confirm within 48 hours.
How we tier
Three honesty tiers
Tier 1 is production-validated against our own fixtures, with 95%+ measured accuracy. Tier 2 is beta: we've confirmed 90%+ accuracy on representative samples, but edge cases are still possible. Tier 3 is experimental: accuracy reflects the current capabilities of the underlying foundation models (Gemini 2.0 Flash + Claude Sonnet 4), but we haven't run our own validation yet, so use the generic /v1/extract endpoint and validate the output yourself. We move languages between tiers as customers request validation.
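The tiering above can be sketched as a small routing helper. This is a minimal illustration, not part of any parsr SDK: the `Tier` enum, the `tier_for` and `endpoint_for` helpers, and the language table (a subset of the lists on this page) are hypothetical names. It also assumes that Tier 2 languages go through the typed extractors and that unlisted languages are treated as Tier 3; the page states neither explicitly.

```python
from enum import Enum


class Tier(Enum):
    PRODUCTION = 1    # 95%+ measured accuracy
    BETA = 2          # 90%+ on representative samples
    EXPERIMENTAL = 3  # no in-house validation yet


# Hypothetical lookup table; only a subset of the languages listed on this page.
TIER_BY_LANGUAGE = {
    "english": Tier.PRODUCTION,
    "french": Tier.PRODUCTION,
    "german": Tier.PRODUCTION,
    "dutch": Tier.PRODUCTION,
    "polish": Tier.BETA,
    "japanese": Tier.BETA,
    "arabic": Tier.BETA,
    "hindi": Tier.EXPERIMENTAL,
    "hebrew": Tier.EXPERIMENTAL,
    "russian": Tier.EXPERIMENTAL,
}

# Endpoint paths as named on this page.
TYPED_ENDPOINTS = {
    "bank-statement": "/v1/parse/bank-statement",
    "payslip": "/v1/parse/payslip",
}
GENERIC_ENDPOINT = "/v1/extract"


def tier_for(language: str) -> Tier:
    """Return the tier for a language; unlisted languages fall back to Tier 3 (assumption)."""
    return TIER_BY_LANGUAGE.get(language.strip().lower(), Tier.EXPERIMENTAL)


def endpoint_for(language: str, doc_type: str) -> str:
    """Pick an endpoint path: generic for Tier 3, typed extractor otherwise when one exists."""
    if tier_for(language) is Tier.EXPERIMENTAL:
        return GENERIC_ENDPOINT
    # Assumption: Tier 2 languages may use the typed extractors as well.
    return TYPED_ENDPOINTS.get(doc_type, GENERIC_ENDPOINT)


if __name__ == "__main__":
    print(endpoint_for("French", "payslip"))
    print(endpoint_for("Hindi", "payslip"))
```

Keeping the table in one place means a promotion between tiers is a one-line change on the caller's side.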
Tier 1 · Production validated · 95%+ accuracy
Validated against real fixtures
These languages have a fixture library, measured accuracy, and regression tests. Use the typed extractors — /v1/parse/bank-statement, /v1/parse/payslip, and friends — with confidence.
- English: US · UK · AU
- French: FR · BE · CH · CA
- German: DE · AT · CH
- Dutch: NL · BE
- Italian: IT · CH
- Spanish: ES · MX · AR
- Portuguese: PT · BR
Tier 2 · Beta · 90%+ accuracy
Confirmed on samples, edge cases possible
We've tested representative documents and confirmed 90%+ field-extraction accuracy. The fixture library is smaller than Tier 1; if you ship one of these in production, send us a sample and we'll move it to Tier 1.
Central + Eastern European
- Polish
- Czech
- Hungarian
- Romanian
- Greek
CJK
- Japanese
- Korean
- Simplified Chinese
- Traditional Chinese
Other
- Turkish
- Arabic
Nordic
- Swedish
- Norwegian
- Danish
- Finnish
Tier 3 · Experimental
Use generic /v1/extract, validate yourself
Foundation models read these languages: Gemini 2.0 Flash and Claude Sonnet 4 both have demonstrated capability. We haven't run our own validation yet, so we don't claim an accuracy number. Use the generic /v1/extract endpoint with a JSON schema and validate accuracy on your side. Send us a sample to move it to Tier 2.
Latin-script
- All other Latin-script languages
South + Southeast Asian
- Hindi
- Bengali
- Vietnamese
- Thai
Middle Eastern
- Hebrew
- Persian
Cyrillic
- Russian
- Ukrainian
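For Tier 3 languages, "validate accuracy on your side" can start as a field-by-field comparison against a few hand-labelled documents. The sketch below is one possible way to do that, not parsr's own validation method: `field_accuracy` and `normalize` are hypothetical helpers, the field names are illustrative, and the sample values are made up. It assumes you already hold the extracted output as plain dictionaries.

```python
def normalize(value) -> str:
    """Collapse whitespace and case so cosmetic differences don't count as errors."""
    return " ".join(str(value).split()).casefold()


def field_accuracy(expected_docs, extracted_docs) -> float:
    """Share of expected fields whose extracted value matches after normalization."""
    if len(expected_docs) != len(extracted_docs):
        raise ValueError("expected and extracted document lists must have the same length")
    total = 0
    correct = 0
    for expected, extracted in zip(expected_docs, extracted_docs):
        for field, want in expected.items():
            total += 1
            got = extracted.get(field)
            # A missing field counts as an error.
            if got is not None and normalize(got) == normalize(want):
                correct += 1
    if total == 0:
        raise ValueError("no expected fields to score")
    return correct / total


if __name__ == "__main__":
    # Made-up ground truth and extraction output, for illustration only.
    expected = [
        {"iban": "DE89 3704 0044 0532 0130 00", "amount": "1250.00"},
        {"iban": "DE89 3704 0044 0532 0130 00", "amount": "99.10"},
    ]
    extracted = [
        {"iban": "de89 3704 0044 0532 0130 00", "amount": "1250.00"},
        {"iban": "DE89 3704 0044 0532 0130 00", "amount": "99.70"},
    ]
    print(f"{field_accuracy(expected, extracted):.2%}")
```

Exact matching after normalization is deliberately strict; amounts and dates may need type-aware comparison depending on how your schema formats them.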
Mixed-language documents
Yes, supported.
Belgian payslips with French + Dutch + English headers parse cleanly. The vision LLM doesn't “switch language” — it just reads the document. The same applies to Swiss invoices (DE + FR + IT), Canadian statements (EN + FR), and any other multilingual layout you're likely to encounter in EU finance.
Special handling
Scripts and reading order
RTL · Arabic · Hebrew · Persian
Tested. Reading order is right-to-left for native scripts; embedded Latin characters (e.g. amounts, IBANs, dates) read left-to-right. Field extraction preserves the directionality the document was written in.
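If you want to check on your side where right-to-left text and embedded left-to-right runs meet in an extracted string, Python's standard `unicodedata` module exposes each character's bidirectional class. The sketch below is a simplified grouping, not the full Unicode Bidirectional Algorithm and not a description of how parsr handles directionality; `direction_runs` is a hypothetical helper, and treating digits as left-to-right is an assumption made for illustration.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}        # Hebrew letters, Arabic letters
LTR_CLASSES = {"L", "EN", "AN"}  # Latin letters; digits treated as LTR here (assumption)


def direction_runs(text: str):
    """Group characters into consecutive runs of 'rtl' or 'ltr' in logical order.

    Neutral characters (spaces, punctuation) join the run that precedes them;
    leading neutrals form a 'neutral' run.
    """
    runs = []  # each entry is [direction, characters]
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in RTL_CLASSES:
            direction = "rtl"
        elif cls in LTR_CLASSES:
            direction = "ltr"
        else:
            direction = runs[-1][0] if runs else "neutral"
        if runs and runs[-1][0] == direction:
            runs[-1][1] += ch
        else:
            runs.append([direction, ch])
    return [(direction, chars) for direction, chars in runs]


if __name__ == "__main__":
    # Hebrew word, then a Latin label with an IBAN prefix, then another Hebrew word.
    sample = "\u05e9\u05dc\u05d5\u05dd IBAN DE89 \u05e1\u05d5\u05e3"
    for direction, chars in direction_runs(sample):
        print(direction, repr(chars))
```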
CJK · Japanese · Korean · Chinese
Tested. Accuracy depends on document quality: printed CJK is very reliable, but accuracy on handwritten content drops by 10–20%. Vertical layouts (rare in finance) are read in their native order.
Non-Latin scripts
Tested. Accuracy reflects the current capability of the foundation models (Gemini 2.0 Flash + Claude Sonnet 4). When the model improves, we get the improvement automatically — no re-training cycle on our end.
Improvement model
Driven by customer requests
We add language validation when a customer requests it. Email support@tryparsr.dev with the language and a sample document; we'll test, confirm accuracy, and move it to Tier 1 within 48 hours if the fixtures pass. This is the same SLA we apply to bank-statement format requests; see how parsr ships.
Try parsr on a language we don't list yet.
Send us a sample document (anonymized is fine). We'll come back within 48 hours with measured accuracy and, if it passes, promotion to Tier 1.