OCR built for statements, not just text
A generic OCR API gives you a blob of text. A bank statement OCR API gives you *reconciled transactions* — because reading the characters is only half the job; the hard part is rebuilding the table so each amount lines up with the right date and description. FlowParse's `POST /api/v1/extract` does both: it recognises the text on a scanned page and reconstructs the statement structure, returning typed JSON instead of raw lines.
It's a hybrid by design. For a digital PDF it reads the embedded text geometry (no OCR error at all); for a scanned page or a photo it runs AI OCR; for a statement that mixes both it applies the appropriate method to each page. You call one endpoint and the engine picks the method per page, so you never have to detect 'is this scanned?' yourself. For digital-first statements see the bank statement API; this page focuses on the scanned path.
The result is the familiar schema: account header, a `transactions` array with signed amounts and balances, and a `raw_table` preserving the original columns — so a photographed statement and a clean PDF produce the same JSON shape and flow into the same downstream calls.
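One way to consume that shape in typed form is sketched below in Python. The field names follow the sample response on this page; the `Transaction` and `Statement` classes and the `parse_statement` helper are illustrative names, not part of any FlowParse SDK, and `raw_table` is left out because its layout is not shown here.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Any


@dataclass(frozen=True)
class Transaction:
    date: str
    description: str
    amount: Decimal   # signed: credits positive, debits negative
    balance: Decimal  # running balance printed on this row


@dataclass(frozen=True)
class Statement:
    bank_name: str
    currency: str
    opening_balance: Decimal
    closing_balance: Decimal
    transactions: list[Transaction]


def to_decimal(value: Any) -> Decimal:
    # Convert via str() so a JSON float such as 64.8 carries no binary noise.
    return Decimal(str(value))


def parse_statement(response: dict[str, Any]) -> Statement:
    # Assumption: the statement sits at response["data"]["data"], as in the sample.
    body = response["data"]["data"]
    rows = [
        Transaction(
            date=row["date"],
            description=row["description"],
            amount=to_decimal(row["amount"]),
            balance=to_decimal(row["balance"]),
        )
        for row in body["transactions"]
    ]
    return Statement(
        bank_name=body["bank_name"],
        currency=body["currency"],
        opening_balance=to_decimal(body["opening_balance"]),
        closing_balance=to_decimal(body["closing_balance"]),
        transactions=rows,
    )
```

Using `Decimal` rather than `float` keeps later balance arithmetic exact to the cent.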
Send a scan, get JSON
Base64-encode the scanned PDF or image (PNG/JPG) and POST it. The `filename` helps the engine detect the type, but the OCR path triggers automatically whenever a page has no text layer. Get your key from the API dashboard.
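The request described above can also be assembled in Python. This is a minimal sketch using only the standard library; `build_extract_request` is an illustrative helper name, and the URL, headers and body fields are taken from the curl example on this page.

```python
import base64
import json
import urllib.request

EXTRACT_URL = "https://flowparse.io/api/v1/extract"


def build_extract_request(file_bytes: bytes, filename: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a scanned PDF or image."""
    payload = {
        # The API expects the file as base64 text inside a JSON body.
        "file": base64.b64encode(file_bytes).decode("ascii"),
        "filename": filename,
    }
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; the sketch stops short of that so the request can be inspected or tested without spending pages.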
curl -X POST https://flowparse.io/api/v1/extract \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{ "file": "iVBORw0KGgoAAA...", "filename": "statement-scan.png" }'
Scanned in, structured out
A scanned page returns exactly the structure a digital PDF does — so your code path is identical regardless of source quality.
{
  "type": "bank_statement",
  "pages": 2,
  "billedPages": 2,
  "data": {
    "type": "bank_statement",
    "data": {
      "bank_name": "First National",
      "currency": "USD",
      "opening_balance": 2010.00,
      "closing_balance": 3145.20,
      "transactions": [
        { "date": "2024-09-04", "description": "ACH DEPOSIT PAYROLL", "amount": 1850.00, "balance": 3860.00 },
        { "date": "2024-09-09", "description": "CARD 1124 GROCERY", "amount": -64.80, "balance": 3795.20 }
      ]
    }
  }
}
How OCR accuracy is protected
OCR introduces a failure mode digital parsing doesn't: a misread digit. FlowParse guards against it with the same deterministic check it uses everywhere — opening balance plus the sum of transactions must equal the closing balance. On a scanned statement that equation is a powerful integrity test: if a single amount was misread, the running balance won't reconcile and the row is flagged.
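The balance equation is simple enough to reproduce on the client side. The sketch below is an illustration of the idea, not FlowParse's own implementation: `find_unreconciled_rows` and `closes` are hypothetical helper names, and the rows are assumed to carry the `amount` and `balance` fields of the extraction schema.

```python
from decimal import Decimal


def find_unreconciled_rows(opening, transactions):
    """Return indexes of rows whose printed balance disagrees with the running total."""
    flagged = []
    running = Decimal(str(opening))
    for index, row in enumerate(transactions):
        expected = running + Decimal(str(row["amount"]))
        printed = Decimal(str(row["balance"]))
        if expected != printed:
            flagged.append(index)
        # Continue from the printed balance so one misread amount
        # flags its own row rather than every row after it.
        running = printed
    return flagged


def closes(opening, closing, transactions):
    """Check that opening balance + sum of signed amounts equals the closing balance."""
    total = sum((Decimal(str(row["amount"])) for row in transactions), Decimal("0"))
    return Decimal(str(opening)) + total == Decimal(str(closing))
```

For example, if a deposit of 1850.00 is read as 1350.00 while its printed balance is read correctly, the first function returns that row's index and the second returns `False`.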
Send any OCR result to `POST /api/v1/validate` to get a 0–100 score plus specific checks, then branch on the grade. Clean scans pass straight through; ambiguous ones (faint print, skewed photos) get queued for a quick human glance. This is how teams run OCR at volume without quietly importing wrong numbers — the validation engine describes every rule.
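Branching on the score can be as small as one function. In this sketch the threshold of 90 is an assumption chosen for illustration, not a documented FlowParse grade boundary, and `Route` and `route_by_score` are hypothetical names; the validation response itself is not modelled because its exact fields are not shown on this page.

```python
from enum import Enum


class Route(Enum):
    AUTO_ACCEPT = "auto_accept"
    HUMAN_REVIEW = "human_review"


def route_by_score(score: int, accept_at: int = 90) -> Route:
    """Decide what to do with a scanned statement given its 0-100 validation score."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    # accept_at is a tunable assumption; start strict and relax it with evidence.
    return Route.AUTO_ACCEPT if score >= accept_at else Route.HUMAN_REVIEW
```

Keeping the threshold as a parameter makes it easy to tighten for lending decisions and loosen for low-risk bookkeeping intake.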
Getting the best OCR results
Resolution
300 DPI or a sharp phone photo reads far better than a low-res fax. Higher resolution directly improves digit accuracy.
Contrast & flatten
High-contrast, deskewed, flattened pages beat shadowed or curled ones. Crop to the page where you can.
Prefer the original PDF
If a digital PDF exists, send it — the coordinate pass is error-free and cheaper to trust than any OCR.
Always validate
Run `/api/v1/validate` on scanned results and gate on the score; the balance check catches most misreads automatically.
What happens to a scanned page
Detect text layer
Each page is checked for embedded text. Digital pages use the coordinate pass; image pages go to OCR.
AI OCR
Image pages are read with AI OCR tuned for financial tables, not just free text.
Rebuild the table
Recognised cells are reassembled into rows so dates, descriptions and amounts align — debits/credits become a signed amount.
Reconcile & validate
The balance equation runs; call `/api/v1/validate` to score and flag any misread row.
Export or store
Use the JSON directly, or call `/api/v1/export` to produce QBO/QFX/Xero/CSV, the same as for any other statement.
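The table-rebuilding step turns separate debit and credit columns into one signed amount. The sketch below illustrates that conversion under stated assumptions: US-style number formatting, credits positive and debits negative (matching the sample response), and hypothetical helper names `parse_money` and `signed_amount`. It is not FlowParse's internal code.

```python
from decimal import Decimal
from typing import Optional


def parse_money(cell: Optional[str]) -> Optional[Decimal]:
    """Parse a printed amount such as '1,850.00'; blank cells return None."""
    if cell is None:
        return None
    text = cell.strip().replace(",", "").replace("$", "")
    if not text:
        return None
    return Decimal(text)


def signed_amount(debit: Optional[str], credit: Optional[str]) -> Decimal:
    """Collapse debit/credit columns into one signed value."""
    debit_value = parse_money(debit)
    credit_value = parse_money(credit)
    # Exactly one of the two columns should be filled on a well-formed row.
    if (debit_value is None) == (credit_value is None):
        raise ValueError("expected exactly one of debit or credit")
    return credit_value if credit_value is not None else -debit_value
```

Rejecting rows where both or neither column is filled surfaces a misaligned table early instead of producing a plausible but wrong amount.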
Scanned statements into accounting files
Once a scan is structured and validated, it's a first-class statement: pass it to `POST /api/v1/export` to produce a QuickBooks/Quicken bank feed (`.QBO`/`.QFX`/`.OFX`, OFX 1.0.2 with FITID de-duplication), a Xero/CSV import, or an XLSX. That means a folder of photographed statements can become clean, importable accounting data with two API calls each (extract, then export), plus a free validation call if you gate on quality. See PDF to QBO and scanned bank statement to Excel for the same flow in the app.
Pricing and limits
Scanned extraction is billed per page, like any extraction, from your page balance; validation and export previews are free. Over-budget requests return `429` with the exact shortfall — no surprise charges. Files up to 20 MB and multi-page scans are supported; for large batches, extract each then merge. Plans and the per-page rate are on the pricing page; the full reference and live examples are in the API docs.
| Operation | Endpoint | Billing |
|---|---|---|
| OCR a scanned statement → JSON | POST /api/v1/extract | Per page |
| Validate / score the OCR result | POST /api/v1/validate | Free |
| Export to QBO/QFX/Xero/CSV | POST /api/v1/export | Per page (preview free) |
| Merge many scans → one Excel | POST /api/v1/merge | Per page (preview free) |
Where a statement OCR API fits
Lending from paper
Underwrite applicants who only have scanned or photographed statements — get clean income and balance data anyway.
Bookkeeping intake
Clients email phone photos of statements; turn them into ledger-ready transactions automatically.
Document digitisation
Back-convert archives of scanned statements into structured, searchable data.
Branch / mailroom capture
Process statements captured at scan stations into validated JSON without manual keying.
Why hybrid beats pure OCR
Most OCR APIs treat every page the same way: rasterise it and recognise characters. That's wasteful and lossy when a page already has a perfect text layer, because OCR can only ever approximate what's printed, while the embedded text is exact. FlowParse takes a hybrid approach — it checks each page for a text layer and reads digital pages directly from their geometry (zero OCR error), reserving AI OCR for the pages that are genuinely images. On a statement that mixes a digital cover page with photographed inserts, you get the best available method per page from a single call.
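Conceptually, the per-page decision looks like the sketch below. It is an illustration of the idea only, not FlowParse's engine: the `Page` type, the `min_chars` threshold and the method labels are all assumptions, and the embedded text is presumed to have been pulled out by whatever PDF library you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    number: int
    embedded_text: str  # text layer of the page; empty for pure image pages


def choose_method(page: Page, min_chars: int = 20) -> str:
    """Pick 'text_layer' when the page has a usable text layer, otherwise 'ocr'."""
    # min_chars is an assumed guard against stray characters on scanned pages.
    has_text = len(page.embedded_text.strip()) >= min_chars
    return "text_layer" if has_text else "ocr"


def plan_methods(pages):
    """Map each page number to the method that would be used for it."""
    return {page.number: choose_method(page) for page in pages}
```

A mixed statement therefore produces a mixed plan, with OCR applied only where no text layer exists.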
This matters most for accuracy on the numbers that count. An `8` misread as a `3` in an amount is the kind of error that quietly corrupts a ledger, so the fewer characters you OCR, the fewer chances there are to get one wrong. By OCR-ing only what truly needs it and validating the result against the balance equation, the hybrid pipeline keeps scanned-statement accuracy far closer to digital than a brute-force OCR step could. When a clean digital PDF exists, sending it instead of a photo is always the most accurate option, and the cheapest to trust.
Status codes and limits
OCR adds one error path the digital flow doesn't: an unreadable scan. If no text can be recovered, the API returns `422` and you aren't billed for the file, so you can safely prompt the user to re-scan at higher quality. Otherwise the codes are the same across the API: `200` for success, `400` for a malformed request, `401` for a bad key, `429` when the page budget is exhausted (with the exact shortfall), and `503` for a transient issue to retry with backoff. No call ever returns unbilled data.
Scanned PDFs and images up to 20 MB are supported, multi-page included. OCR is billed per page like any extraction, from your page balance; validation and previews are free, so you can prove out scan quality and the validation gate before spending anything. For big archives, extract each scan — parallelised across workers with a concurrency cap — then consolidate with Smart Merge. Plans and the per-page rate are on the pricing page.
| Code | Meaning | Action |
|---|---|---|
| 200 | Scan read, JSON returned | Validate, then process |
| 400 | Malformed request | Fix the request body, then resend |
| 401 | Invalid or missing key | Check Authorization / rotate key |
| 422 | Unreadable scan (not billed) | Re-scan at higher quality or send the PDF |
| 429 | Page budget exhausted | Top up or upgrade, then retry |
| 503 | Temporarily unavailable | Retry with backoff |
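The status codes map naturally onto a small dispatch table plus a backoff schedule for `503`. The sketch below is one possible client-side shape: the action labels, the `investigate` fallback for unlisted codes, and the backoff parameters are assumptions, not values prescribed by the API.

```python
import random

# Action labels are illustrative; wire them to your own handlers.
ACTIONS = {
    200: "validate",
    400: "fix_request",
    401: "check_key",
    422: "rescan",
    429: "top_up",
    503: "retry",
}


def next_action(status: int) -> str:
    """Return the follow-up for a status code, or 'investigate' if it is unlisted."""
    return ACTIONS.get(status, "investigate")


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff with full jitter: one delay in seconds per retry attempt."""
    return [random.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]
```

Only `503` is retried automatically here; `429` waits for a top-up and `422` goes back to the user, since repeating either request unchanged would not help.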
Integrating scanned-document capture
Scanned input usually arrives through channels you don't fully control — a customer emails a phone photo, a branch scans a page, a mailroom digitises a batch — so the integration needs to be forgiving. The call itself is the same plain-JSON POST as any extraction: base64-encode the image or scanned PDF, send it to `/api/v1/extract`, read the structured `data`. What changes is the surrounding flow: because scan quality varies, you should always pair extraction with a validation gate and a clear path for the cases that need a human, rather than assuming every scan is perfect.
At volume, run capture behind a queue with worker processes that call extract, then validate, then either auto-accept on a good grade or route the document to a small review queue. Keep jobs idempotent on a file hash so a re-sent photo doesn't create a duplicate, cap worker concurrency so a big import can't drain your page budget at once, and handle a `429` by pausing and resuming after a top-up. This is the same robust shape as the digital flow — the only addition is that you expect a slightly higher review rate for genuinely poor scans, and you design the UI so a reviewer can fix a flagged row in seconds.
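Idempotency on a file hash and a concurrency cap can both be expressed in a few lines. This is a sketch under stated assumptions: `extract` is any callable you supply that performs the API call, `seen` is an in-memory set standing in for a persistent store, and `max_workers=4` is an arbitrary cap.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def file_key(file_bytes: bytes) -> str:
    """Stable identity for a file: the same bytes always give the same key."""
    return hashlib.sha256(file_bytes).hexdigest()


def process_batch(files, extract, seen, max_workers=4):
    """Extract each new file once, with at most max_workers calls in flight."""
    pending = {}
    for data in files:
        key = file_key(data)
        # Skip files handled in an earlier run or repeated within this batch.
        if key in seen or key in pending:
            continue
        pending[key] = data
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so keys and results line up.
        results = dict(zip(pending, pool.map(extract, pending.values())))
    seen.update(results)
    return results
```

Keys are added to `seen` only after the whole batch succeeds, so a batch that fails midway is retried in full; a production queue would more likely record each job individually.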
Because the output schema is identical to a digital statement, none of your downstream code needs to know whether a document was scanned or digital. The same rows feed your ledger, the same reconciliation engine matches payments, and the same export produces accounting files. Scanned capture becomes just another source into one consistent pipeline.
Best practices for OCR at volume
Three practices separate a reliable OCR integration from a fragile one. First, capture quality matters more than any setting: nudge users toward 300 DPI scans or sharp, well-lit, deskewed photos, and crop to the page where you can — every bit of clarity directly reduces digit errors. Second, prefer the original digital PDF whenever one exists; OCR is for when there is genuinely no text layer, not a default. Third, never skip validation on scanned output — the balance-reconciliation check is your single best defence against a silent misread, so gate auto-acceptance on the score and review the rest.
Operationally, treat scanned extraction like any billed call: it's per page, so cap concurrency for predictable spend, log `billedPages` and the validation grade with every document for a clean audit trail, and build and test the whole flow for free against validation and previews before enabling billed extraction. For an archive of historical scans, extract each then consolidate with Smart Merge into one reconciled workbook. These habits let you process photographed and faxed statements at scale with the same confidence as clean PDFs.
Monitoring usage and controlling cost
Running extraction at volume means watching two things: spend and quality. Every API key tracks its own request count, page total and cost, visible in the dashboard, so you can see exactly what each integration or customer is consuming and spot anomalies — a sudden spike usually means a retry loop or a malformed batch. Setting a budget that matches your plan, and capping worker concurrency so a single run can't exhaust it, keeps cost predictable rather than surprising.
On the quality side, log the validation grade and `billedPages` with every document and chart the auto-accept rate over time. A falling auto-accept rate is an early signal that input quality has dropped — a new scanner, a worse photo flow, a new bank format — and lets you act before bad data reaches the books. Together these two habits turn the API from a black box into an observable, controllable part of your pipeline; the per-page rate and plan limits are on the pricing page.
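Charting the auto-accept rate needs little more than a daily aggregate of your own logs. In this sketch the record fields `day` and `auto_accepted`, the three-day window and the ten-point tolerance are all assumptions made for illustration.

```python
from collections import defaultdict


def auto_accept_rate_by_day(records):
    """records: iterable of dicts with 'day' (YYYY-MM-DD) and 'auto_accepted' (bool)."""
    totals = defaultdict(lambda: [0, 0])  # day -> [accepted, total]
    for record in records:
        counts = totals[record["day"]]
        counts[0] += 1 if record["auto_accepted"] else 0
        counts[1] += 1
    # ISO dates sort chronologically as plain strings.
    return {day: accepted / total for day, (accepted, total) in sorted(totals.items())}


def is_dropping(rates, window=3, tolerance=0.10):
    """True when the latest rate sits more than `tolerance` below the prior window's mean."""
    values = list(rates.values())
    if len(values) <= window:
        return False  # not enough history to compare against
    baseline = sum(values[-window - 1:-1]) / window
    return values[-1] < baseline - tolerance
```

An alert on `is_dropping` gives an early prompt to check for a new scanner, a changed photo flow or an unfamiliar bank format.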
Build it end to end
Start with a key from the dashboard, send one scanned page to `/api/v1/extract`, and confirm the JSON and the validation score. For the full integration — batching, error handling, exporting and reconciliation — follow the guide to parsing bank statements with an API. For digital documents and the generic schema, see the bank statement API, PDF to JSON API and document extraction API.
Turn scans into clean transactions
POST a scanned statement to /api/v1/extract and get validated, structured JSON back — ready to export to QBO, Xero or CSV.
