What is a bank statement API?

It's a REST endpoint you send a statement file to and receive structured data back. FlowParse's `POST /api/v1/extract` accepts a PDF, scanned image, XLSX or CSV (base64) and returns JSON with the account header and a transactions array — every row normalised with an ISO date, a signed amount and a running balance — plus a raw_table preserving the original columns.

Which file types can I send?

Text-based PDFs, scanned PDFs and images (PNG/JPG, processed with OCR), XLSX and CSV. Send the file as base64 in the `file` field with an optional `filename` so the type is detected correctly.

How is the response structured?

As snake_case JSON: bank_name, account_holder, account_number, currency, opening_balance, closing_balance, a transactions[] array, and a raw_table with every original column and row. It's the same schema the validate, export and reconcile endpoints accept.

How do I authenticate?

With an API key as a bearer token: `Authorization: Bearer pf_live_…`. An `X-API-Key` header is also accepted. Create and revoke keys in the API dashboard at /get-api-key.

How accurate is the extraction?

The engine reads the PDF's own text geometry to rebuild tables from exact character positions, and uses OCR only for scanned pages. It then checks that the opening balance plus the sum of the transactions equals the closing balance. You can score any result with /api/v1/validate and route low-confidence statements to a human.

Are debits and credits signed?

Yes. Separate debit/credit columns are normalised to a single signed amount — credits positive, debits negative — so you can sum a column directly. The original columns remain available in raw_table.

How much does it cost?

Billing is per page, drawn from the same page balance as the app (monthly allowance first, then top-up pages). Validation and previews are free. The exact per-page rate and plans are on the pricing page.

What happens if I run out of pages?

The API returns HTTP 429 with a message telling you how many pages the request needed and how many were available. No partial or unbilled data is returned — top up or upgrade and retry.

Can I export straight to QuickBooks or Xero?

Yes. Pass the extracted JSON to /api/v1/export with format quickbooks, qfx, ofx, xero, csv or xlsx. QBO/QFX/OFX files use OFX 1.0.2 with FITID de-duplication so re-imports never double-post.

Can I process many statements at once?

Extract each document, then call /api/v1/merge to consolidate up to 100 into one reconciled Excel with unified columns and duplicate removal. Use preview:true to see the result for free before generating the file.

Does it handle multi-page statements?

Yes. The coordinate pass keeps every row across long, multi-page statements, which is exactly where naive converters tend to drop transactions.

Is there a sandbox or free way to test?

Yes — validation and export/merge previews are free, so you can build and test the full flow without spending pages, then switch on billed extraction and exports when you're ready.

How is my data handled?

The uploaded file is processed to produce the JSON response and is not retained as a downloadable document. Calls run over HTTPS with a hashed key, every request is logged for your audit trail, and documents are never used to train models.

Which languages and SDKs are supported?

Any language with an HTTP client — the API is plain JSON over REST. The API docs include copy-paste curl, and the API playground lets you try requests in the browser.

How is this different from open banking?

Open-banking feeds need the customer's live bank connection. A statement API works from the PDF the customer already has, which is essential for historical periods, business accounts and banks without an aggregator feed.

Where are the full API docs?

At /api-docs — every endpoint, the full schema, all export formats, rate-limit behaviour and live examples. You can also experiment in the /api-playground.

Bank Statement API — Convert Statement PDFs to JSON | FlowParse

A REST API for bank statement data

A bank statement API is an HTTP endpoint you send a statement file to and get structured data back — no templates, no manual mapping, no screen-scraping. FlowParse exposes exactly that: `POST /api/v1/extract` accepts a statement as base64 (PDF, scanned image, XLSX or CSV) and responds with typed JSON — the same multi-model pipeline the FlowParse workspace uses, now callable from your own backend.

It is built for teams that process statements at volume: lenders pulling income and balance signals for underwriting, fintechs onboarding business accounts, accounting platforms importing client books, and bookkeeping tools that need a reliable bank statement converter behind an API rather than a web upload. Authentication is a single bearer key, billing is per page, and the response schema is stable and documented.

Every call returns the account context (bank name, account holder, masked number, currency, opening and closing balance) and a `transactions` array where each row is normalised: ISO dates, a single signed `amount` (credits positive, debits negative) and the running `balance`. The original layout is never lost — a `raw_table` carries every source column exactly as printed, so reference codes, transaction types and card last-fours survive into your database.

flowparse.io

Your first request in 60 seconds

Create a key in the API dashboard, base64-encode a statement, and POST it. The endpoint is `https://flowparse.io/api/v1/extract`; authenticate with `Authorization: Bearer pf_live_…` (an `X-API-Key` header also works). Base64 keeps the request a plain JSON body, so it works from any language and any HTTP client with zero multipart handling.

POST /api/v1/extract

# "file" is the base64 of october.pdf (truncated here)
curl -X POST https://flowparse.io/api/v1/extract \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "file": "JVBERi0xLjcKJ...",
    "filename": "october.pdf"
  }'
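The same request can be assembled from Python's standard library. The sketch below only builds the request and does not send it; the function name `build_extract_request`, the placeholder bytes and the placeholder key are assumptions for illustration.

```python
import base64
import json
import urllib.request

EXTRACT_URL = "https://flowparse.io/api/v1/extract"

def build_extract_request(pdf_bytes: bytes, filename: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for one statement."""
    body = json.dumps({
        "file": base64.b64encode(pdf_bytes).decode("ascii"),
        "filename": filename,
    }).encode("utf-8")
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Placeholder bytes and key, for illustration only.
req = build_extract_request(b"%PDF-1.7\n", "october.pdf", "pf_live_xxx")
payload = json.loads(req.data)
print(req.get_method(), req.full_url)
print(payload["filename"], payload["file"])
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

Keeping the build step separate from the send step makes the payload easy to unit-test without touching the network or spending pages.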

The structured response

The response wraps the account header and a clean `transactions` array. `pages` and `billedPages` tell you exactly what the call cost. Because every amount is already signed and every date is ISO-8601, you can insert the rows straight into a ledger table without post-processing.

200 OK

{
  "type": "bank_statement",
  "pages": 4,
  "billedPages": 4,
  "data": {
    "type": "bank_statement",
    "data": {
      "bank_name": "Sterling Bank",
      "account_holder": "ACME TRADING LTD",
      "account_number": "•••• 9471",
      "currency": "GBP",
      "opening_balance": 4120.55,
      "closing_balance": 6134.80,
      "transactions": [
        { "date": "2024-10-03", "description": "STRIPE PAYMENTS UK LTD", "amount": 2480.00, "balance": 6600.55 },
        { "date": "2024-10-05", "description": "AWS EMEA",               "amount": -312.40, "balance": 6288.15 }
      ],
      "raw_table": {
        "columns": ["Date", "Description", "Money Out", "Money In", "Balance"],
        "rows": [ ["03 Oct", "STRIPE PAYMENTS UK LTD", "", "2,480.00", "6,600.55"] ]
      }
    }
  }
}
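Because the account fields sit two levels down (`data.data`), a small unwrapping step is usually the first thing client code does. The following is a sketch using a trimmed copy of the response above; the function name `ledger_rows` and the tuple layout are assumptions, and parsing amounts as `Decimal` is a design choice of this example rather than something the API requires.

```python
import json
from decimal import Decimal

response_text = """
{
  "type": "bank_statement", "pages": 4, "billedPages": 4,
  "data": {"type": "bank_statement", "data": {
    "currency": "GBP",
    "transactions": [
      {"date": "2024-10-03", "description": "STRIPE PAYMENTS UK LTD", "amount": 2480.00, "balance": 6600.55},
      {"date": "2024-10-05", "description": "AWS EMEA", "amount": -312.40, "balance": 6288.15}
    ]}}
}
"""

def ledger_rows(response: dict) -> list:
    """Flatten the nested response into (date, description, amount, currency) tuples."""
    statement = response["data"]["data"]  # inner object holds the account fields
    return [
        (t["date"], t["description"], t["amount"], statement["currency"])
        for t in statement["transactions"]
    ]

# parse_float=Decimal keeps amounts exact instead of converting to binary floats.
response = json.loads(response_text, parse_float=Decimal)
rows = ledger_rows(response)
net = sum(amount for _, _, amount, _ in rows)
print(rows[0])
print(net, response["billedPages"])
```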

Every field the API returns

The schema is snake_case and identical to the payload the validation, export and reconcile endpoints accept — so the output of one call is the input of the next with no reshaping.

Field	Type	Description
bank_name	string	Issuing bank or neobank as printed on the statement
account_holder	string	Name on the account
account_number	string	Masked account / card number
currency	string	ISO currency code (GBP, USD, EUR…)
opening_balance / closing_balance	number	Period balances used for the reconciliation check
transactions[]	array	One object per line: date, description, signed amount, balance
raw_table	object	Every original column and row, preserved 1:1 for audit

Why the data is trustworthy

Statement parsing fails silently when a converter drops rows or mis-signs a debit, so accuracy is the whole game. FlowParse runs a coordinate pass first (reading the PDF's own text geometry), falls back to AI OCR only for scanned pages, then applies multi-model field extraction and a deterministic check: opening balance plus the sum of transactions must equal the closing balance. When it doesn't, the discrepancy is flagged rather than hidden.
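The deterministic check is simple enough to reproduce on your side. Below is a minimal sketch of the arithmetic, using figures from the sample response; the function name `reconciles` and the `tolerance` parameter are assumptions of this example.

```python
from decimal import Decimal

def reconciles(opening, amounts, closing, tolerance=Decimal("0.00")):
    """Return (ok, discrepancy) for the rule: opening + sum(amounts) == closing."""
    computed = opening + sum(amounts, Decimal("0"))
    discrepancy = closing - computed
    return abs(discrepancy) <= tolerance, discrepancy

opening = Decimal("4120.55")
amounts = [Decimal("2480.00"), Decimal("-312.40")]

# Closing balance equal to the running balance after the two rows: reconciles.
print(reconciles(opening, amounts, Decimal("6288.15")))
# A closing balance that the listed rows do not explain: flagged with the gap.
print(reconciles(opening, amounts, Decimal("6134.80")))
```

The second call uses the sample's closing balance against only the two rows shown there, so it reports a gap of -153.35. That is the kind of discrepancy the check is meant to surface rather than hide.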

You can gate every result programmatically. Pipe the returned `data` straight into `POST /api/v1/validate` to get a 0–100 quality score, a letter grade and a list of specific checks (balance reconciliation, duplicate detection, date monotonicity, low-confidence fields). That lets you auto-accept clean statements and route only the genuinely ambiguous ones to a human — see the validation engine for the rule set.
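A gate built on that result might look like the sketch below. The validation response shape is not shown on this page, so the field names `score`, `grade`, `checks`, `name` and `passed`, and the threshold of 90, are all assumptions; check the API docs for the real schema before relying on them.

```python
def route(validation: dict, min_score: int = 90) -> str:
    """Return 'accept' or 'review' for one validated statement.

    Field names here are hypothetical placeholders, not the documented schema.
    """
    failed = [c["name"] for c in validation.get("checks", []) if not c.get("passed", False)]
    if validation.get("score", 0) >= min_score and not failed:
        return "accept"
    return "review"

clean = {"score": 97, "grade": "A", "checks": [{"name": "balance_reconciliation", "passed": True}]}
broken = {"score": 95, "grade": "A", "checks": [{"name": "balance_reconciliation", "passed": False}]}

print(route(clean))   # accept
print(route(broken))  # review
```

Requiring both a minimum score and no failed checks is deliberately conservative: a high score with a failed balance check still goes to a human.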

From upload to ledger-ready rows

Send the file

Base64-encode the statement and POST it to `/api/v1/extract` with your bearer key. PDFs, scans, XLSX and CSV are all accepted.

Coordinate + OCR read

The engine reads the PDF text layer for pixel-accurate tables and only runs OCR when a page is scanned — preserving every row on long, multi-page statements.

Field extraction

Multi-model extraction maps debit/credit columns to a single signed amount, normalises dates, and reconstructs the account header.

Validate

Call `/api/v1/validate` to score the result and surface balance breaks or duplicates before anything reaches your database.

Export or reconcile

Hand the same JSON to `/api/v1/export` (QBO, QFX, OFX, Xero, CSV, XLSX) or `/api/v1/reconcile` to match payments to invoices.

Turn the JSON into accounting files

Most teams don't want JSON as the final artefact — they want it inside QuickBooks, Quicken or Xero. The same key calls `POST /api/v1/export` to turn any extracted statement into a real bank-feed file: `.QBO`/`.QFX`/`.OFX` (OFX 1.0.2, with FITID de-duplication so re-imports never double-post), or a clean CSV/XLSX. File exports are billed per page; accounting previews are free so you can inspect the mapping first.

POST /api/v1/export — bank feed

curl -X POST https://flowparse.io/api/v1/export \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{ "format": "quickbooks", "type": "bank_statement", "data": { ... } }'
# → { "format":"qbo", "filename":"acme-oct.qbo", "encoding":"base64", "content":"T0ZYSER..." }
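The export response carries the file as base64, so the client has to decode it before saving. Here is a small sketch based on the response fields shown above (`filename`, `encoding`, `content`); the function name `decode_export` is an assumption, and the sample content is a stand-in, not a real export.

```python
import base64

def decode_export(response: dict) -> tuple:
    """Return (filename, file_bytes) from an export response."""
    if response.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {response.get('encoding')!r}")
    return response["filename"], base64.b64decode(response["content"])

# Stand-in response: the content is just base64 of b"OFXHEADER:100".
sample = {
    "format": "qbo",
    "filename": "acme-oct.qbo",
    "encoding": "base64",
    "content": base64.b64encode(b"OFXHEADER:100").decode("ascii"),
}
name, data = decode_export(sample)
print(name, data[:9])
# To save it: open(name, "wb").write(data)
```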

Many statements, one call

For a year of statements or a whole portfolio, `POST /api/v1/merge` consolidates up to 100 already-extracted documents into a single reconciled Excel — unified columns across banks, duplicate rows removed, per-row source tracking — exactly like Smart Merge in the app. Pass `preview: true` to get the summary and sheet previews for free before you spend pages on the file.
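If a portfolio holds more than 100 documents, the collection has to be split before merging. The sketch below chunks a list into request bodies of at most 100 each. The `documents` field name is an assumption (the merge request schema is not shown on this page); only `preview` and the limit of 100 come from the text above.

```python
def merge_payloads(documents: list, preview: bool = True, limit: int = 100) -> list:
    """Split documents into merge request bodies of at most `limit` each."""
    return [
        # "documents" is an assumed field name; confirm it in the API docs.
        {"documents": documents[i:i + limit], "preview": preview}
        for i in range(0, len(documents), limit)
    ]

docs = [{"type": "bank_statement", "data": {}} for _ in range(230)]
payloads = merge_payloads(docs)
print([len(p["documents"]) for p in payloads])  # [100, 100, 30]
```

Defaulting `preview` to true means an accidental call inspects the result for free instead of spending pages.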

A typical batch pipeline is: loop your PDFs through `/extract`, optionally `/validate` each, then `/merge` the collection. Because extraction is per-document you can parallelise it across workers, and because billing is per page you only ever pay for what you actually convert.
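One way to sketch the parallel part of that pipeline is a thread pool with the HTTP calls injected as functions. Everything named here (`run_batch`, `fake_extract`, `fake_validate`) is hypothetical; the stand-in functions replace real calls to `/extract` and `/validate` so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

def run_batch(files, extract, validate=None, max_workers=4):
    """Extract (and optionally validate) each file in parallel, preserving input order."""
    def process(name):
        result = {"file": name, "statement": extract(name)}
        if validate is not None:
            result["validation"] = validate(result["statement"])
        return result

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process, files))

# Stand-ins for real HTTP calls, so the sketch runs without a key.
def fake_extract(name):
    return {"source": name, "transactions": []}

def fake_validate(statement):
    return {"score": 100}

results = run_batch(["jan.pdf", "feb.pdf", "mar.pdf"], fake_extract, fake_validate)
print([r["file"] for r in results])
```

`max_workers` doubles as a spend limiter: a small pool bounds how many pages can be in flight at once.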

Pricing, keys and rate limits

Billing is simple and per page. Extraction and file exports draw from the same page balance as the app (your monthly plan allowance first, then any top-up pages); validation and previews are free. When a request would exceed your balance the API returns `429` with a clear message, so you never get a surprise bill — top up or upgrade on the pricing page. Each key tracks its own request and page totals, visible in the dashboard.

Keys come in seconds from the API dashboard: create, reveal once, and revoke at any time. Use a separate key per environment or per customer so you can rotate without downtime. Full request/response reference, every format and live examples live in the API docs; you can also try calls in the browser with the API playground.

Operation	Endpoint	Billing
Extract PDF → JSON	POST /api/v1/extract	Per page
Validate / quality score	POST /api/v1/validate	Free
Export (QBO/QFX/OFX/Xero/CSV/XLSX)	POST /api/v1/export	Per page (preview free)
Reconcile invoices ↔ payments	POST /api/v1/reconcile	Free
Merge many → one Excel	POST /api/v1/merge	Per page (preview free)

Security & data handling

Statements are sensitive, so the API is built to hold as little as possible. Calls run over HTTPS, authenticate with a hashed key, and the uploaded file is processed in memory to produce the JSON response — it is not retained as a downloadable document on your behalf. Each key is scoped to your account, request-counted, and instantly revocable, and every call is logged with the document label and page cost for a clean audit trail.

Because you control the request, you also control retention: store only the fields you need, drop the `raw_table` if you don't, and keep PII out of logs. FlowParse never uses your documents to train models. For the platform's wider posture see the security page.
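Controlling retention can be as simple as whitelisting fields before anything is written to storage. The sketch below assumes a helper named `minimise` and an arbitrary choice of fields to keep; adjust the whitelist to what your own workflow needs.

```python
def minimise(statement: dict,
             keep=("currency", "opening_balance", "closing_balance", "transactions")) -> dict:
    """Return a copy containing only the whitelisted top-level fields."""
    return {k: v for k, v in statement.items() if k in keep}

statement = {
    "bank_name": "Sterling Bank",
    "account_holder": "ACME TRADING LTD",
    "account_number": "•••• 9471",
    "currency": "GBP",
    "opening_balance": 4120.55,
    "closing_balance": 6134.80,
    "transactions": [],
    "raw_table": {"columns": [], "rows": []},
}
stored = minimise(statement)
print(sorted(stored))
```

A whitelist is safer than a blacklist here: a field you did not anticipate is dropped by default rather than stored by accident.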

What teams build on it

Lending & underwriting

Pull clean income, expense and balance signals from applicant statements as structured input for your own risk model.

Fintech onboarding

Let business customers connect by uploading statements when an open-banking feed isn't available.

Accounting & bookkeeping apps

Embed statement import: convert client PDFs to transactions and push straight to the ledger.

Internal finance automation

Wire statement parsing into month-end close, expense reconciliation and cash-flow reporting.

How it handles every bank format

The reason a template-based parser keeps breaking is that there is no single 'bank statement format'. One bank prints separate Money In and Money Out columns; another uses a single amount column with a Dr/Cr suffix; a third puts the running balance on the left, uses `DD/MM/YYYY` dates and a comma as the decimal separator. A US statement might show `MM/DD/YYYY` and `$`, a European one `1.234,56 €`. A naive parser needs a new rule for each of these; the FlowParse engine recognises the *meaning* of each column and normalises them all to one shape — ISO dates, a single signed amount, a numeric balance — so your code never branches on the bank.
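To make the problem concrete, here is what a hand-written normaliser looks like: it only works when the caller tells it each bank's date format and decimal separator, which is exactly the per-bank rule the paragraph above describes. This is an illustrative sketch with hypothetical helper names, not how FlowParse does it.

```python
from datetime import datetime
from decimal import Decimal

def to_iso(date_text: str, fmt: str) -> str:
    """Convert a printed date to ISO-8601 given its strptime format."""
    return datetime.strptime(date_text, fmt).date().isoformat()

def to_decimal(amount_text: str, decimal_sep: str = ".") -> Decimal:
    """Parse a printed amount, dropping currency symbols and group separators."""
    group_sep = "," if decimal_sep == "." else "."
    digits = "".join(ch for ch in amount_text if ch.isdigit() or ch in "-.,")
    digits = digits.replace(group_sep, "").replace(decimal_sep, ".")
    return Decimal(digits)

print(to_iso("05/10/2024", "%d/%m/%Y"))           # 2024-10-05
print(to_iso("10/05/2024", "%m/%d/%Y"))           # 2024-10-05
print(to_decimal("1.234,56 €", decimal_sep=","))  # 1234.56
print(to_decimal("$1,234.56"))                    # 1234.56
```

Note that `05/10/2024` and `10/05/2024` are only distinguishable by knowing the source convention. That is why the normalised output uses ISO dates and plain numeric amounts.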

Multi-currency statements, foreign-exchange lines, reference numbers, transaction-type codes and card last-fours all survive too. The normalised `transactions` array gives you the clean, sortable data for ledgers and analytics, while `raw_table` keeps a faithful copy of the original columns for audit and for any bank-specific field your workflow happens to need. That combination — clean by default, lossless when you need it — is what lets one endpoint serve high-street banks, neobanks, business accounts and credit-card statements without per-issuer code.

Because extraction is contextual rather than positional, onboarding a new bank is not an engineering task. A statement from an institution FlowParse has never seen still returns the same schema on the first call, which is exactly what you want when your own customers upload whatever bank they happen to use. For the full list of banks already covered in the product, browse the bank statement converter hub.

Designing a robust integration

The extract call is synchronous: you POST a statement and the structured JSON comes back on the same request, typically in a few seconds but longer for big multi-page scans. For a handful of documents that's all you need. At volume, put extraction behind a queue and a pool of workers so a slow document never blocks a user-facing request — your worker calls `/extract`, then `/validate`, then writes the result and notifies your own system. That gives you a clean, webhook-style flow built from your own infrastructure, with full control over retries and ordering.

Make calls idempotent on your side by keying each job on a content hash of the file, so a retried upload doesn't create duplicate records. For accounting exports, the OFX `FITID` on each transaction is a stable identifier you can use to de-duplicate across re-imports. Handle `429` by pausing the worker and resuming after a top-up rather than hammering the endpoint, and treat `503` with exponential backoff. Cap your worker concurrency so a large batch can't burn your whole page budget in a single burst — predictable spend is a feature, not an afterthought.
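Two of those habits fit in a few lines: a content-hash job key and a status-to-action rule. The sketch below assumes SHA-256 as the hash and uses hypothetical names (`job_key`, `next_step`) and arbitrary delay values; a production version would typically add jitter to the backoff.

```python
import hashlib

def job_key(file_bytes: bytes) -> str:
    """Content hash used as the idempotency key for one uploaded file."""
    return hashlib.sha256(file_bytes).hexdigest()

def next_step(status: int, attempt: int, base_delay: float = 1.0, max_delay: float = 60.0):
    """Map an HTTP status to ('done' | 'pause' | 'retry' | 'fail', delay_seconds)."""
    if 200 <= status < 300:
        return "done", 0.0
    if status == 429:   # out of pages: wait for a top-up instead of retrying
        return "pause", 0.0
    if status == 503:   # transient: exponential backoff, capped at max_delay
        return "retry", min(max_delay, base_delay * 2 ** attempt)
    return "fail", 0.0

seen = set()
for upload in (b"statement-a", b"statement-a", b"statement-b"):
    key = job_key(upload)
    if key in seen:
        continue  # retried upload: skip instead of creating a duplicate job
    seen.add(key)

print(len(seen))                                  # 2
print([next_step(503, n)[1] for n in range(4)])   # [1.0, 2.0, 4.0, 8.0]
print(next_step(429, 0))                          # ('pause', 0.0)
```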

Keep keys server-side and rotate them per environment and per customer; a leaked key is just revoked and replaced with no downtime. Log the `billedPages` and the validation grade with every job so you have a complete, queryable audit trail of what was processed, what it cost and what was auto-accepted versus reviewed. These few habits turn a single API call into a dependable pipeline you can run unattended.

API vs manual entry vs open banking

There are three common ways to get transaction data into a system, and they solve different problems. Manual entry (or outsourced keying) is flexible but slow, expensive and error-prone — and it doesn't scale past a few statements a day. Open-banking aggregation is excellent for live, ongoing feeds, but it requires the customer to connect their bank, only covers institutions the aggregator supports, and can't reach historical periods or accounts that were closed. A bank statement API fills the gap: it works from the PDF the customer already has, for any bank, for any period, including business accounts and back-years that no feed can provide.

In practice the strongest products use both: an open-banking feed for the live relationship and a statement API for onboarding, historical analysis and the long tail of banks without a feed. Lenders, in particular, almost always need the statement path because underwriting looks backwards over months of history that a just-connected feed doesn't hold. Pair the reconciliation engine with extracted statements and you can match payments to invoices automatically, closing the loop between what was billed and what actually arrived.

Approach	Best for	Limitation
Manual / outsourced entry	Tiny volumes, odd one-offs	Slow, costly, error-prone, doesn't scale
Open-banking feed	Live, ongoing transaction data	Needs a live connection; no history; limited bank coverage
Bank statement API	Onboarding, history, any bank, business accounts	Depends on the customer providing the document; not a live feed

Start building today

The fastest path is: grab a key, send one statement to `/api/v1/extract`, and read the JSON. From there add `/validate` for a quality gate and `/export` or `/merge` for the output your users actually need. The guide to parsing bank statements with an API walks through a complete, production-ready integration with error handling and batching, including the queue-and-worker pattern above.

Looking for a specific format or document type? See PDF to JSON API for the generic schema view, the bank statement OCR API for scanned documents, and the document extraction API for invoices and receipts alongside statements. When you're ready to wire output into accounting software, PDF to QBO and bank statement to Xero cover the import side end to end.

Ship statement parsing this week

Create a key, POST one statement to /api/v1/extract, and get clean transaction JSON back. Validate, export to QBO/Xero and reconcile over the same API.
