Developer API · June 20, 2026 · 12 min read

Bank Statement API — Parse Statement PDFs to JSON

The FlowParse bank statement API turns a PDF, scanned image, XLSX or CSV statement into clean, structured JSON in a single REST call: every transaction as a typed row with date, description, signed amount and running balance, plus the account header and a `raw_table` that preserves every original column 1:1. One key, one endpoint, any bank. Then validate, reconcile, Smart-merge or export to QuickBooks, QFX, Xero and more, all over the same authenticated API.

FlowParse
flowparse.io

A REST API for bank statement data

A bank statement API is an HTTP endpoint that accepts a statement file and returns structured data, with no templates, no manual mapping and no screen-scraping. FlowParse exposes exactly that: `POST /api/v1/extract` accepts a base64-encoded statement (PDF, scanned image, XLSX or CSV) and responds with typed JSON. It runs the same multi-model pipeline the FlowParse workspace uses, callable from your own backend.

It is built for teams that process statements at volume: lenders pulling income and balance signals for underwriting, fintechs onboarding business accounts, accounting platforms importing client books, and bookkeeping tools that need a reliable bank statement converter behind an API rather than a web upload. Authentication is a single bearer key, billing is per page, and the response schema is stable and documented.

Every call returns the account context (bank name, account holder, masked number, currency, opening and closing balance) and a `transactions` array where each row is normalised: ISO dates, a single signed `amount` (credits positive, debits negative) and the running `balance`. The original layout is never lost — a `raw_table` carries every source column exactly as printed, so reference codes, transaction types and card last-fours survive into your database.

Your first request in 60 seconds

Create a key in the API dashboard, base64-encode a statement, and POST it. The endpoint is `https://flowparse.io/api/v1/extract`; authenticate with `Authorization: Bearer pf_live_…` (an `X-API-Key` header also works). Base64 keeps the request a plain JSON body, so it works from any language and any HTTP client with zero multipart handling.

POST /api/v1/extract
# "file" is the base64 of october.pdf (truncated here)
curl -X POST https://flowparse.io/api/v1/extract \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "file": "JVBERi0xLjcKJ...",
    "filename": "october.pdf"
  }'
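The same request can be built from Python with only the standard library. This is a minimal sketch: the endpoint, the `file` and `filename` fields and the bearer header come from the curl example, while the helper name `build_extract_request` is hypothetical and chosen purely for illustration.

```python
import base64
import json
import urllib.request

EXTRACT_URL = "https://flowparse.io/api/v1/extract"  # endpoint as documented in this article


def build_extract_request(file_bytes: bytes, filename: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for one statement file.

    build_extract_request is a hypothetical helper name, not part of any SDK.
    """
    body = json.dumps({
        "file": base64.b64encode(file_bytes).decode("ascii"),
        "filename": filename,
    }).encode("utf-8")
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the request to `urllib.request.urlopen` and read the body with `json.load`. Keeping construction separate from sending makes the payload easy to unit-test without touching the network.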

The structured response

The response wraps the account header and a clean `transactions` array. `pages` and `billedPages` tell you exactly what the call cost. Because every amount is already signed and every date is ISO-8601, you can insert the rows straight into a ledger table without post-processing.

200 OK
{
  "type": "bank_statement",
  "pages": 4,
  "billedPages": 4,
  "data": {
    "type": "bank_statement",
    "data": {
      "bank_name": "Sterling Bank",
      "account_holder": "ACME TRADING LTD",
      "account_number": "•••• 9471",
      "currency": "GBP",
      "opening_balance": 4120.55,
      "closing_balance": 6134.80,
      "transactions": [
        { "date": "2024-10-03", "description": "STRIPE PAYMENTS UK LTD", "amount": 2480.00, "balance": 6600.55 },
        { "date": "2024-10-05", "description": "AWS EMEA",               "amount": -312.40, "balance": 6288.15 }
      ],
      "raw_table": {
        "columns": ["Date", "Description", "Money Out", "Money In", "Balance"],
        "rows": [ ["03 Oct", "STRIPE PAYMENTS UK LTD", "", "2,480.00", "6,600.55"] ]
      }
    }
  }
}
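As a rough sketch of how those rows might be read on the client side, the snippet below walks the nested `data.data.transactions` path from the sample response and yields typed tuples. The function name `ledger_rows` is hypothetical. Converting to `Decimal` is an optional choice for ledgers that want exact decimal arithmetic, not something the API requires.

```python
import json
from datetime import date
from decimal import Decimal


def ledger_rows(response_text: str) -> list[tuple[date, str, Decimal, Decimal]]:
    """Turn an /extract response body into (date, description, amount, balance) rows."""
    # parse_float=Decimal keeps money values exact instead of using binary floats.
    response = json.loads(response_text, parse_float=Decimal)
    statement = response["data"]["data"]  # nested as in the sample response
    return [
        (
            date.fromisoformat(tx["date"]),
            tx["description"],
            Decimal(tx["amount"]),   # also handles integer amounts such as 2480
            Decimal(tx["balance"]),
        )
        for tx in statement["transactions"]
    ]
```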

Every field the API returns

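The statement payload (the inner `data` object) is snake_case and identical to the payload the validation, export and reconcile endpoints accept, so the output of one call is the input of the next with no reshaping. The envelope fields `pages` and `billedPages` sit outside it, as the sample response shows.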

| Field | Type | Description |
| --- | --- | --- |
| `bank_name` | string | Issuing bank or neobank as printed on the statement |
| `account_holder` | string | Name on the account |
| `account_number` | string | Masked account / card number |
| `currency` | string | ISO currency code (GBP, USD, EUR…) |
| `opening_balance` / `closing_balance` | number | Period balances used for the reconciliation check |
| `transactions[]` | array | One object per line: date, description, signed amount, balance |
| `raw_table` | object | Every original column and row, preserved 1:1 for audit |

Why the data is trustworthy

Statement parsing fails silently when a converter drops rows or mis-signs a debit, so accuracy is the whole game. FlowParse runs a coordinate pass first (reading the PDF's own text geometry), falls back to AI OCR only for scanned pages, then applies multi-model field extraction and a deterministic check: opening balance plus the sum of transactions must equal the closing balance. When it doesn't, the discrepancy is flagged rather than hidden.
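The balance rule described here is easy to reproduce on your own side as a second line of defence. The following is a minimal sketch of that arithmetic only, not FlowParse's internal implementation. It assumes the snake_case field names from the response schema, and `balance_discrepancy` is a hypothetical helper name.

```python
from decimal import Decimal


def balance_discrepancy(statement: dict) -> Decimal:
    """Return closing_balance - (opening_balance + sum of amounts); zero means it reconciles."""

    def money(value) -> Decimal:
        # Go through str() so a float such as 6288.15 becomes Decimal("6288.15").
        return Decimal(str(value))

    total = sum((money(tx["amount"]) for tx in statement["transactions"]), Decimal("0"))
    return money(statement["closing_balance"]) - (money(statement["opening_balance"]) + total)
```

A non-zero result means rows are missing, duplicated or mis-signed, or that you are looking at a partial listing rather than a full statement.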

You can gate every result programmatically. Pipe the returned `data` straight into `POST /api/v1/validate` to get a 0–100 quality score, a letter grade and a list of specific checks (balance reconciliation, duplicate detection, date monotonicity, low-confidence fields). That lets you auto-accept clean statements and route only the genuinely ambiguous ones to a human — see the validation engine for the rule set.
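A gate of that kind could look roughly like the sketch below. The response field names are assumptions: this article states that the validate endpoint returns a score, a grade and a list of checks, but not what the keys are called, so `score`, `checks`, `name` and `passed` are placeholders to be replaced with the names in the API docs. The threshold of 90 is equally arbitrary.

```python
def route_statement(validation: dict, min_score: int = 90) -> str:
    """Return "accept" or "review" for one validation result.

    ASSUMPTION: the keys "score" and "checks" (each check having "name" and
    "passed") are placeholders; consult the API docs for the real field names.
    """
    failed = [c["name"] for c in validation.get("checks", []) if not c.get("passed", False)]
    if validation.get("score", 0) >= min_score and not failed:
        return "accept"
    # Anything below the threshold, or with a failed check, goes to a human.
    return "review"
```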

From upload to ledger-ready rows

1. Send the file. Base64-encode the statement and POST it to `/api/v1/extract` with your bearer key. PDFs, scans, XLSX and CSV are all accepted.
2. Coordinate + OCR read. The engine reads the PDF text layer for pixel-accurate tables and only runs OCR when a page is scanned, preserving every row on long, multi-page statements.
3. Field extraction. Multi-model extraction maps debit/credit columns to a single signed amount, normalises dates, and reconstructs the account header.
4. Validate. Call `/api/v1/validate` to score the result and surface balance breaks or duplicates before anything reaches your database.
5. Export or reconcile. Hand the same JSON to `/api/v1/export` (QBO, QFX, OFX, Xero, CSV, XLSX) or `/api/v1/reconcile` to match payments to invoices.

Turn the JSON into accounting files

Most teams don't want JSON as the final artefact — they want it inside QuickBooks, Quicken or Xero. The same key calls `POST /api/v1/export` to turn any extracted statement into a real bank-feed file: `.QBO`/`.QFX`/`.OFX` (OFX 1.0.2, with FITID de-duplication so re-imports never double-post), or a clean CSV/XLSX. File exports are billed per page; accounting previews are free so you can inspect the mapping first.

POST /api/v1/export — bank feed
curl -X POST https://flowparse.io/api/v1/export \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{ "format": "quickbooks", "type": "bank_statement", "data": { ... } }'
# → { "format":"qbo", "filename":"acme-oct.qbo", "encoding":"base64", "content":"T0ZYSER..." }
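Since the export arrives as base64 inside JSON, the client has to decode it before handing the file to QuickBooks or Quicken. A minimal sketch, using the `filename`, `encoding` and `content` fields shown in the sample response; `save_export` is a hypothetical helper name.

```python
import base64
from pathlib import Path


def save_export(response: dict, directory: str) -> Path:
    """Decode the base64 "content" of an /export response and write it to disk."""
    if response.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {response.get('encoding')!r}")
    # Path(...).name strips any directory part so the file cannot escape `directory`.
    target = Path(directory) / Path(response["filename"]).name
    target.write_bytes(base64.b64decode(response["content"]))
    return target
```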

Many statements, one call

For a year of statements or a whole portfolio, `POST /api/v1/merge` consolidates up to 100 already-extracted documents into a single reconciled Excel — unified columns across banks, duplicate rows removed, per-row source tracking — exactly like Smart Merge in the app. Pass `preview: true` to get the summary and sheet previews for free before you spend pages on the file.

A typical batch pipeline is: loop your PDFs through `/extract`, optionally `/validate` each, then `/merge` the collection. Because extraction is per-document you can parallelise it across workers, and because billing is per page you only ever pay for what you actually convert.
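The parallel extraction step might be sketched as follows. The `extract` callable is supplied by you and is hypothetical here; it stands for whatever function POSTs one file to `/api/v1/extract` and returns the parsed JSON. `max_workers` doubles as the concurrency cap that keeps page spend predictable.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def extract_batch(
    files: dict[str, bytes],
    extract: Callable[[str, bytes], dict],
    max_workers: int = 4,
) -> dict[str, dict]:
    """Run `extract` for every file on a bounded thread pool.

    `extract` is a caller-supplied function (hypothetical here) that sends one
    file to /api/v1/extract and returns the parsed JSON response.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(extract, name, data) for name, data in files.items()}
        # .result() re-raises any exception from the worker, so failures are not swallowed.
        return {name: future.result() for name, future in futures.items()}
```

The collected results can then be validated one by one and passed on to `/merge`. Threads suit this workload because each call spends its time waiting on the network.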

Pricing, keys and rate limits

Billing is simple and per page. Extraction and file exports draw from the same page balance as the app (your monthly plan allowance first, then any top-up pages); validation and previews are free. When a request would exceed your balance the API returns `429` with a clear message, so you never get a surprise bill — top up or upgrade on the pricing page. Each key tracks its own request and page totals, visible in the dashboard.

Keys come in seconds from the API dashboard: create, reveal once, and revoke at any time. Use a separate key per environment or per customer so you can rotate without downtime. Full request/response reference, every format and live examples live in the API docs; you can also try calls in the browser with the API playground.

| Operation | Endpoint | Billing |
| --- | --- | --- |
| Extract PDF → JSON | `POST /api/v1/extract` | Per page |
| Validate / quality score | `POST /api/v1/validate` | Free |
| Export (QBO/QFX/OFX/Xero/CSV/XLSX) | `POST /api/v1/export` | Per page (preview free) |
| Reconcile invoices ↔ payments | `POST /api/v1/reconcile` | Free |
| Merge many → one Excel | `POST /api/v1/merge` | Per page (preview free) |

Security & data handling

Statements are sensitive, so the API is built to hold as little as possible. Calls run over HTTPS, authenticate with a hashed key, and the uploaded file is processed in memory to produce the JSON response — it is not retained as a downloadable document on your behalf. Each key is scoped to your account, request-counted, and instantly revocable, and every call is logged with the document label and page cost for a clean audit trail.

Because you control the request, you also control retention: store only the fields you need, drop the `raw_table` if you don't, and keep PII out of logs. FlowParse never uses your documents to train models. For the platform's wider posture see the security page.
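One way to apply that on the client is to whitelist fields before anything is stored. This is an illustrative sketch only: which fields count as necessary depends on your use case, and `minimise` and `KEEP_TX_FIELDS` are hypothetical names.

```python
KEEP_TX_FIELDS = ("date", "amount", "balance")  # illustrative choice; keep what you need


def minimise(statement: dict) -> dict:
    """Return a copy without raw_table, holder details or transaction descriptions."""
    return {
        "currency": statement.get("currency"),
        "opening_balance": statement.get("opening_balance"),
        "closing_balance": statement.get("closing_balance"),
        "transactions": [
            {field: tx.get(field) for field in KEEP_TX_FIELDS}
            for tx in statement.get("transactions", [])
        ],
    }
```

A whitelist is safer than deleting known-sensitive keys, because a field you did not anticipate is dropped by default rather than stored by accident.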

What teams build on it

Lending & underwriting

Pull clean income, expense and balance signals from applicant statements as structured input for your own risk model.

Fintech onboarding

Let business customers connect by uploading statements when an open-banking feed isn't available.

Accounting & bookkeeping apps

Embed statement import: convert client PDFs to transactions and push straight to the ledger.

Internal finance automation

Wire statement parsing into month-end close, expense reconciliation and cash-flow reporting.

How it handles every bank format

The reason a template-based parser keeps breaking is that there is no single 'bank statement format'. One bank prints separate Money In and Money Out columns; another uses a single amount column with a Dr/Cr suffix; a third puts the running balance on the left, uses `DD/MM/YYYY` dates and a comma as the decimal separator. A US statement might show `MM/DD/YYYY` and `$`, a European one `1.234,56 €`. A naive parser needs a new rule for each of these; the FlowParse engine recognises the *meaning* of each column and normalises them all to one shape — ISO dates, a single signed amount, a numeric balance — so your code never branches on the bank.
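To make the problem concrete, here is roughly what the hand-written rules look like for just the variations named in this section. It is not how the FlowParse engine works, which this article describes as recognising column meaning. It only illustrates the per-format knowledge a naive parser has to be given up front, and `parse_amount` and `to_iso` are hypothetical helpers.

```python
from datetime import datetime
from decimal import Decimal


def parse_amount(text: str, decimal_sep: str = ".") -> Decimal:
    """Parse a printed amount such as "1.234,56 €" or "312.40 Dr" into a signed Decimal."""
    cleaned = text.strip()
    negative = cleaned.startswith("-")
    upper = cleaned.upper()
    if upper.endswith("DR"):      # debit suffix: money out
        negative = True
        cleaned = cleaned[:-2]
    elif upper.endswith("CR"):    # credit suffix: money in
        cleaned = cleaned[:-2]
    # Keep only digits and the decimal separator; this drops currency symbols,
    # spaces and thousands separators in one pass.
    digits = "".join(ch for ch in cleaned if ch in "0123456789" or ch == decimal_sep)
    value = Decimal(digits.replace(decimal_sep, "."))
    return -value if negative else value


def to_iso(text: str, fmt: str) -> str:
    """Convert a printed date to ISO-8601; the caller must already know the format."""
    return datetime.strptime(text, fmt).date().isoformat()
```

Both helpers need the caller to say which convention applies (`decimal_sep`, `fmt`). A string like `03/10/2024` is 3 October under `%d/%m/%Y` and 10 March under `%m/%d/%Y`, so a positional parser cannot choose without knowing the bank.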

Multi-currency statements, foreign-exchange lines, reference numbers, transaction-type codes and card last-fours all survive too. The normalised `transactions` array gives you the clean, sortable data for ledgers and analytics, while `raw_table` keeps a faithful copy of the original columns for audit and for any bank-specific field your workflow happens to need. That combination — clean by default, lossless when you need it — is what lets one endpoint serve high-street banks, neobanks, business accounts and credit-card statements without per-issuer code.

Because extraction is contextual rather than positional, onboarding a new bank is not an engineering task. A statement from an institution FlowParse has never seen still returns the same schema on the first call, which is exactly what you want when your own customers upload whatever bank they happen to use. For the full list of banks already covered in the product, browse the bank statement converter hub.

Designing a robust integration

The extract call is synchronous: you POST a statement and the structured JSON comes back on the same request, typically in a few seconds but longer for big multi-page scans. For a handful of documents that's all you need. At volume, put extraction behind a queue and a pool of workers so a slow document never blocks a user-facing request — your worker calls `/extract`, then `/validate`, then writes the result and notifies your own system. That gives you a clean, webhook-style flow built from your own infrastructure, with full control over retries and ordering.

Make calls idempotent on your side by keying each job on a content hash of the file, so a retried upload doesn't create duplicate records. For accounting exports, the OFX `FITID` on each transaction is a stable identifier you can use to de-duplicate across re-imports. Handle `429` by pausing the worker and resuming after a top-up rather than hammering the endpoint, and treat `503` with exponential backoff. Cap your worker concurrency so a large batch can't burn your whole page budget in a single burst — predictable spend is a feature, not an afterthought.
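Two of those habits, the content-hash job key and backoff on `503`, might be sketched like this. `ApiError` is a hypothetical exception type standing in for whatever your HTTP client raises, and the delays are illustrative defaults rather than values recommended by FlowParse.

```python
import hashlib
import time
from typing import Callable


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def job_key(file_bytes: bytes) -> str:
    """Content hash used as the idempotency key for one statement file."""
    return hashlib.sha256(file_bytes).hexdigest()


def call_with_backoff(
    call: Callable[[], dict],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Retry `call` on 503 with exponential backoff; let every other error propagate."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ApiError as err:
            # 429 (balance exhausted) is deliberately not retried: pause and top up instead.
            if err.status != 503 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # base_delay, then 2x, 4x, ...
    raise ValueError("max_attempts must be at least 1")
```

Store `job_key(...)` under a unique constraint in your jobs table so a retried upload maps to the existing record. The `sleep` parameter is injectable so the retry logic can be tested without waiting.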

Keep keys server-side and rotate them per environment and per customer; a leaked key is just revoked and replaced with no downtime. Log the `billedPages` and the validation grade with every job so you have a complete, queryable audit trail of what was processed, what it cost and what was auto-accepted versus reviewed. These few habits turn a single API call into a dependable pipeline you can run unattended.

API vs manual entry vs open banking

There are three common ways to get transaction data into a system, and they solve different problems. Manual entry (or outsourced keying) is flexible but slow, expensive and error-prone — and it doesn't scale past a few statements a day. Open-banking aggregation is excellent for live, ongoing feeds, but it requires the customer to connect their bank, only covers institutions the aggregator supports, and can't reach historical periods or accounts that were closed. A bank statement API fills the gap: it works from the PDF the customer already has, for any bank, for any period, including business accounts and back-years that no feed can provide.

In practice the strongest products use both: an open-banking feed for the live relationship and a statement API for onboarding, historical analysis and the long tail of banks without a feed. Lenders, in particular, almost always need the statement path because underwriting looks backwards over months of history that a just-connected feed doesn't hold. Pair the reconciliation engine with extracted statements and you can match payments to invoices automatically, closing the loop between what was billed and what actually arrived.

| Approach | Best for | Limitation |
| --- | --- | --- |
| Manual / outsourced entry | Tiny volumes, odd one-offs | Slow, costly, error-prone, doesn't scale |
| Open-banking feed | Live, ongoing transaction data | Needs a live connection; no history; limited bank coverage |
| Bank statement API | Onboarding, history, any bank, business accounts | Works from the document the customer provides |

Start building today

The fastest path is: grab a key, send one statement to `/api/v1/extract`, and read the JSON. From there add `/validate` for a quality gate and `/export` or `/merge` for the output your users actually need. The guide to parsing bank statements with an API walks through a complete, production-ready integration with error handling and batching, including the queue-and-worker pattern above.

Looking for a specific format or document type? See PDF to JSON API for the generic schema view, the bank statement OCR API for scanned documents, and the document extraction API for invoices and receipts alongside statements. When you're ready to wire output into accounting software, PDF to QBO and bank statement to Xero cover the import side end to end.

Ship statement parsing this week

Create a key, POST one statement to /api/v1/extract, and get clean transaction JSON back. Validate, export to QBO/Xero and reconcile over the same API.
