What endpoint parses a bank statement?

POST https://flowparse.io/api/v1/extract. Send the statement as base64 in the `file` field and you get structured JSON back — the account header plus a transactions array with ISO dates, signed amounts and running balances.

How do I authenticate?

Send your API key as a bearer token: Authorization: Bearer pf_live_…. An X-API-Key header also works. Create and revoke keys in the dashboard at /get-api-key.

What file types are supported?

Text-based PDFs, scanned PDFs and images (PNG/JPG, read with OCR), plus XLSX and CSV. Base64-encode the file and include an optional `filename` so the type is detected.

Do I need multipart upload?

No. The whole request is plain JSON with the file as a base64 string, so any HTTP client works without multipart handling.

How are debits and credits represented?

Separate debit/credit columns are normalised to a single signed amount — credits positive, debits negative — so you can sum a column directly. The original columns remain in raw_table.

How do I know the parse is correct?

Send the returned data to /api/v1/validate. It returns a 0–100 score, a grade and specific checks (balance reconciliation, duplicates, date order, low-confidence fields). Gate auto-acceptance on the grade.

Does it handle multi-page statements?

Yes. The coordinate pass keeps every row across long, multi-page statements; OCR is used only for scanned pages. The pages and billedPages fields in the response report how many pages were read and how many were billed.

How much does a call cost?

Extraction and file exports bill per page from your page balance; validation and previews are free. The per-page rate and plans are on the pricing page.

What happens if my page budget runs out?

The API returns HTTP 429, reports the pages needed versus the pages available, and returns no data for the unbilled request. Top up or upgrade, then retry.

How do I export to QuickBooks or Xero?

Pass the extracted JSON to /api/v1/export with format quickbooks, qfx, ofx, xero, csv or xlsx. QBO/QFX/OFX use OFX 1.0.2 with FITID de-duplication so re-imports never double-post.

Can I process many statements at once?

Extract each statement (parallelise across workers), then call /api/v1/merge to consolidate up to 100 into one reconciled Excel. Use preview:true to see the result for free first.

How do I reconcile statements against invoices?

Use /api/v1/reconcile with your invoices and the extracted payments; it returns matched and unmatched items with a reconciliation report.

What status codes should I handle?

200 success, 400 bad request (missing/invalid file), 401 invalid key, 422 not convertible or unreadable (the file is not billed), 429 page budget exhausted, 503 temporarily unavailable. Retry a 429 after topping up and a 503 with backoff.

Is my data retained or used for training?

The uploaded file is processed to produce the JSON and is not retained as a downloadable document. Calls use HTTPS and a hashed key, are logged for your audit trail, and are never used to train models.

Can I test the API for free?

Validation and export/merge previews are free, so you can build and test the whole flow before enabling billed extraction. You can also try requests in the /api-playground.

Where is the full API reference?

At /api-docs — every endpoint, the full schema, all export formats, rate-limit behaviour and live examples.

How to Parse Bank Statements with an API

Overview: what you're building

Parsing a bank statement means going from an unstructured PDF — pages of dates, descriptions and amounts laid out in a bank-specific table — to structured data your code can use: a list of transactions with typed fields, plus the account header and period balances. Doing that by hand, or with brittle regex, doesn't scale. An API does: you POST the file and get back JSON. This guide walks the whole loop with the FlowParse bank statement API — extract, validate, export, reconcile — so by the end you have a production-ready integration, not just a hello-world call.

You'll use one endpoint to read statements (/api/v1/extract) and a handful more to act on the result. Everything authenticates with a single key, bills per page, and shares one stable JSON schema, so the output of one call is the input of the next. If you want the conceptual background on the data itself, the PDF to JSON API page covers the schema in depth.

A quick note on scope before we dive in: "parsing" here means the full journey from an opaque PDF to data your application can act on, not just pulling out raw text. Plenty of tools can dump the words on a page; almost none rebuild the transaction table correctly, keep debits and credits straight, and prove the result reconciles. Those three are where statements go wrong in practice, so this guide treats extraction, validation and a clean hand-off to your ledger or accounting software as one connected problem rather than a set of disconnected scripts. Everything below uses calls you can run today against the live API, with no SDK to install.

Why use an API instead of templates

Every bank formats statements differently, and they change those formats without warning. Template- or regex-based parsers break the moment a column moves or a new bank appears, so you end up maintaining an ever-growing pile of fragile rules. An AI extraction API generalises across layouts — it reads a statement by understanding what each field means, not where it sits in pixels — so a bank you've never seen works on the first request. That's the difference between a one-off script and a bank statement converter you can depend on.

No per-bank templates to author or maintain — new layouts just work.
Every row preserved, even on long multi-page statements where naive parsers drop lines.
Debits and credits normalised to one signed amount, dates to ISO-8601.
A built-in correctness check (opening + transactions = closing) you can gate on.
One schema flows straight into validation, export and reconciliation.

flowparse.io

Before you start

You need three things: an account, an API key, and a statement file to test with. Sign up, then create a key from the API dashboard — keys are revealed once, so copy it somewhere safe. Use a separate key per environment (dev, staging, prod) so you can rotate without downtime. Keep the key server-side; never ship it in a browser or mobile app.

For the file, any text-based PDF, scanned PDF, image (PNG/JPG), XLSX or CSV statement works. Validation and previews are free, so you can build and test the entire pipeline before you spend a single page — see the pricing page for how the page balance works. Full reference and live calls are in the API docs and the playground.

Step 1 — Authenticate

Every request carries your key as a bearer token in the Authorization header (an X-API-Key header is also accepted). The base URL is https://flowparse.io/api/v1. A quick way to confirm your key works is a free validate call:

Smoke test — free validate call

curl -X POST https://flowparse.io/api/v1/validate \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{ "type": "bank_statement", "data": { "transactions": [] } }'
# 200 → { "validations": [ ... ] }     401 → invalid key

Step 2 — Extract the statement

Base64-encode the statement and POST it to /api/v1/extract. The whole request is plain JSON — the file travels as a base64 string in the file field, with an optional filename so the type is detected. No multipart handling required.

POST /api/v1/extract

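# encode the PDF (GNU coreutils; on macOS use: base64 -i october.pdf)
B64=$(base64 -w0 october.pdf)

# pipe the JSON body on stdin (-d @-) so a large base64 payload stays out of
# curl's argument list, which the OS caps in length
printf '{ "file": "%s", "filename": "october.pdf" }' "$B64" |
curl -X POST https://flowparse.io/api/v1/extract \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d @-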

The same call in Node uses any HTTP client — read the file, base64-encode it, and send JSON:

Node (fetch)

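// Run as an ES module (e.g. extract.mjs): uses top-level await and the global fetch
import { readFileSync } from "node:fs"

// Read the statement and base64-encode it for the JSON body
const file = readFileSync("october.pdf").toString("base64")
const res = await fetch("https://flowparse.io/api/v1/extract", {
  method: "POST",
  headers: {
    Authorization: "Bearer " + process.env.FLOWPARSE_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ file, filename: "october.pdf" }),
})
// Surface 4xx/5xx responses instead of treating an error body as a statement
if (!res.ok) throw new Error("extract failed with HTTP " + res.status)
const json = await res.json()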

Step 3 — Read the structured JSON

A successful call returns { type, pages, billedPages, data }. The statement itself sits one level down, in data.data: the account header and a transactions array where every row is already normalised to an ISO date, a single signed amount (credits positive, debits negative) and the running balance.

200 OK

{
  "type": "bank_statement",
  "pages": 4,
  "billedPages": 4,
  "data": {
    "type": "bank_statement",
    "data": {
      "bank_name": "Sterling Bank",
      "account_holder": "ACME TRADING LTD",
      "currency": "GBP",
      "opening_balance": 4120.55,
      "closing_balance": 6134.80,
      "transactions": [
        { "date": "2024-10-03", "description": "STRIPE PAYMENTS UK LTD", "amount": 2480.00, "balance": 6600.55 },
        { "date": "2024-10-05", "description": "AWS EMEA",               "amount": -312.40, "balance": 6288.15 }
      ],
      "raw_table": { "columns": ["Date","Description","Money Out","Money In","Balance"], "rows": [ ] }
    }
  }
}

Insert the rows straight into your ledger table — no post-processing needed. If you also want the original column layout for audit (reference codes, transaction types, card last-fours), read data.data.raw_table, which preserves every source column 1:1. The key fields are:

Field	Type	Notes
bank_name / account_holder	string	Header, as printed
currency	string	ISO code (GBP, USD, EUR…)
opening_balance / closing_balance	number	Used by the reconciliation check
transactions[].date	string	ISO-8601 (YYYY-MM-DD)
transactions[].amount	number	Signed: credit +, debit −
transactions[].balance	number	Running balance after the row
raw_table	object	Original columns + rows, 1:1
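The opening + transactions = closing check described earlier can also be run locally on the extracted fields. The following is a minimal sketch, not the API's own validation logic: it assumes a two-decimal currency, and the helper names (toMinorUnits, balanceCheck) and the sample statement are made up for illustration.

```javascript
// Convert a decimal amount to integer minor units (pence/cents) so that
// sums are exact. Assumption: the currency has two decimal places.
function toMinorUnits(amount) {
  return Math.round(amount * 100);
}

// Check opening_balance + sum(transactions[].amount) === closing_balance.
function balanceCheck(statement) {
  const opening = toMinorUnits(statement.opening_balance);
  const closing = toMinorUnits(statement.closing_balance);
  const net = statement.transactions.reduce(
    (sum, tx) => sum + toMinorUnits(tx.amount),
    0
  );
  const difference = opening + net - closing;
  return { reconciles: difference === 0, difference: difference / 100 };
}

// Hypothetical two-row statement (not the response shown above).
const sample = {
  opening_balance: 100.0,
  closing_balance: 149.75,
  transactions: [
    { date: "2024-10-03", description: "PAYMENT IN", amount: 80.0, balance: 180.0 },
    { date: "2024-10-05", description: "CARD PURCHASE", amount: -30.25, balance: 149.75 },
  ],
};

console.log(balanceCheck(sample));
```

Because amounts are already signed, the net movement is a plain sum; a non-zero difference points at a missing or misread row.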

Step 4 — Validate before you trust it

Structured doesn't automatically mean correct — especially for scanned statements where OCR can misread a digit. Pipe the returned data straight into /api/v1/validate to get a 0–100 quality score, a letter grade and concrete checks: balance reconciliation, duplicate detection, date order and low-confidence fields. Validation is free.

POST /api/v1/validate

curl -X POST https://flowparse.io/api/v1/validate \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{ "type": "bank_statement", "data": { ... } }'
# → { "validations": [ { "score": { "value": 100, "grade": "A" }, "checks": [ ... ] } ] }

Use the grade as a gate: auto-accept high scores and route only the genuinely ambiguous statements to a human. That's how you run extraction at volume without quietly importing wrong numbers — the validation engine lists every rule it applies.
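As a rough sketch of that gate, the function below reads the grade from the response shape shown in the curl example above. The accepted set of grades ("A" and "B") and the function name are assumptions; pick the threshold that matches your own risk tolerance.

```javascript
// Grades that are auto-accepted. Assumption: "A" and "B" are good enough;
// everything else (or a missing score) goes to human review.
const ACCEPTED_GRADES = new Set(["A", "B"]);

// validateResponse is expected to look like:
// { validations: [ { score: { value, grade }, checks: [...] } ] }
function shouldAutoAccept(validateResponse) {
  const first = validateResponse?.validations?.[0];
  if (!first || !first.score) return false; // nothing to gate on: review it
  return ACCEPTED_GRADES.has(first.score.grade);
}

console.log(
  shouldAutoAccept({ validations: [{ score: { value: 100, grade: "A" }, checks: [] }] })
);
```

Using an explicit set avoids comparing grade letters as strings, where "A" sorts before "B" and a >= comparison would reject the best grade.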

Step 5 — Export to Excel or accounting software

When the destination is a spreadsheet or accounting system, hand the JSON to /api/v1/export. It returns a base64 file in the format you ask for: xlsx, csv, quickbooks (.QBO), qfx, ofx, xero and more. Bank-feed files use OFX 1.0.2 with FITID de-duplication, so re-imports never double-post.

POST /api/v1/export — QuickBooks bank feed

curl -X POST https://flowparse.io/api/v1/export \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{ "format": "quickbooks", "type": "bank_statement", "data": { ... } }'
# → { "format":"qbo", "filename":"acme-oct.qbo", "encoding":"base64", "content":"T0ZYSER..." }

Add "preview": true to inspect the column mapping for free before you generate the billed file. For the full format list and import steps see PDF to QBO and bank statement to Xero.
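Since the export arrives as base64 inside JSON, the file has to be decoded before it can be saved or handed to a user. A minimal sketch, assuming the response fields shown in the example above (filename, encoding, content); decodeExport is a hypothetical helper and the demo payload is a stand-in, not real export output.

```javascript
// Turn an export response into a filename plus raw bytes.
function decodeExport(exportResponse) {
  if (exportResponse.encoding !== "base64") {
    throw new Error("Unexpected encoding: " + exportResponse.encoding);
  }
  return {
    filename: exportResponse.filename,
    bytes: Buffer.from(exportResponse.content, "base64"),
  };
}

// Stand-in response for illustration only.
const demo = decodeExport({
  format: "qbo",
  filename: "acme-oct.qbo",
  encoding: "base64",
  content: Buffer.from("OFXHEADER:100").toString("base64"),
});

console.log(demo.filename, demo.bytes.length);
```

From here, writing the file is one call, for example fs.writeFileSync(demo.filename, demo.bytes) from node:fs.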

Step 6 — Reconcile against invoices

If you're matching incoming payments to invoices, /api/v1/reconcile takes your invoices and the statement's payments and returns matched and unmatched items with a reconciliation report — the same engine behind the reconciliation feature. Reconciliation is free.

POST /api/v1/reconcile

curl -X POST https://flowparse.io/api/v1/reconcile \
  -H "Authorization: Bearer pf_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{ "invoices": [ ... ], "payments": [ ... ] }'
# → { "report": { "matched": [ ... ], "unmatched": [], "currency": "EUR" } }
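One way to build the payments list is to take the incoming (positive) rows from an extracted statement. The sketch below is an assumption-heavy illustration: the payment field names (date, amount, reference) are hypothetical, so check the API docs for the shape /api/v1/reconcile actually expects.

```javascript
// Select credits (amount > 0) from an extracted statement as candidate payments.
// Assumption: the output field names are illustrative, not the documented schema.
function creditsAsPayments(statement) {
  return statement.transactions
    .filter((tx) => tx.amount > 0)
    .map((tx) => ({
      date: tx.date,
      amount: tx.amount,
      reference: tx.description,
    }));
}

// The two rows from the extract example earlier in this guide.
const payments = creditsAsPayments({
  transactions: [
    { date: "2024-10-03", description: "STRIPE PAYMENTS UK LTD", amount: 2480.0, balance: 6600.55 },
    { date: "2024-10-05", description: "AWS EMEA", amount: -312.4, balance: 6288.15 },
  ],
});

console.log(payments);
```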

Step 7 — Batch and merge many statements

For a year of statements or a whole portfolio, extract each document (you can parallelise this across workers), then call /api/v1/merge to consolidate up to 100 already-extracted documents into one reconciled Excel workbook: unified columns across banks, duplicate rows removed, per-row source tracking. It's Smart Merge over the API. Pass preview: true to see the summary and sheet previews for free before spending pages on the file.

Batch pattern (pseudocode)

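// extract, validate and merge are your own thin wrappers around the endpoints
const docs = []
for (const path of statementPaths) {
  const r = await extract(path)               // POST /api/v1/extract
  const v = await validate(r.data)            // POST /api/v1/validate (free)
  const { grade } = v.validations[0].score    // e.g. "A"
  // "A" sorts before "B", so <= accepts A and B; anything lower is queued for review
  if (grade <= "B") docs.push(r.data)
}
const merged = await merge(docs)              // POST /api/v1/merge → one Excel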
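Because a merge call takes up to 100 documents, a larger batch has to be split first. The sketch below shows one way to do that; the chunk helper and the placeholder documents are illustrative, and whether merged workbooks can themselves be combined further is not covered here.

```javascript
const MERGE_LIMIT = 100; // "up to 100" documents per merge call, per the text above

// Split an array into consecutive groups of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// 230 placeholder documents standing in for extracted statements.
const extracted = Array.from({ length: 230 }, (_, i) => ({ id: i }));
const groups = chunk(extracted, MERGE_LIMIT);

console.log(groups.map((g) => g.length)); // [ 100, 100, 30 ]
```

Each group then becomes one merge request and one workbook.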

Errors, status codes & rate limits

The API uses standard HTTP status codes, and when a billed call cannot be charged it returns no data. Handle each code explicitly so your integration degrades gracefully:

Code	Meaning	What to do
200	Success — structured data returned	Process the JSON
400	Bad request (missing/invalid file or base64)	Fix the request body
401	Invalid or missing API key	Check the Authorization header / rotate the key
422	Unreadable or nothing extractable (not billed)	Re-scan at higher quality or send the original PDF
429	Page budget exhausted	Top up or upgrade, then retry
503	Temporarily unavailable	Retry with exponential backoff

Billing is per page and drawn from your page balance (monthly allowance first, then top-up pages); validation, reconciliation and previews are free. A 429 tells you exactly how many pages the request needed versus how many were available, so spend is always predictable — manage it on the pricing page.
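The status table above maps naturally onto a small dispatch function in a worker. This is a sketch only: the action names are made up, and treating other 5xx codes like a 503 is an assumption rather than documented behaviour.

```javascript
// Map an HTTP status from the API to what the worker should do next.
function nextAction(status) {
  switch (status) {
    case 200:
      return "process"; // structured data returned
    case 400:
      return "fix-request"; // missing/invalid file or base64
    case 401:
      return "check-key"; // invalid or missing API key
    case 422:
      return "fix-input"; // unreadable; re-scan or send the original PDF
    case 429:
      return "pause-until-topped-up"; // page budget exhausted
    case 503:
      return "retry-with-backoff"; // temporarily unavailable
    default:
      // Assumption: other server errors are transient, other codes are not.
      return status >= 500 ? "retry-with-backoff" : "fail";
  }
}

console.log(nextAction(429));
```

Keeping 422 and 429 out of the automatic retry path matters: neither is fixed by sending the same request again.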

Choosing the right output format

Once a statement is structured, the question is what to do with it — and that depends on where the data needs to land. If you're storing transactions in your own database, the raw JSON is all you need. If a human or another system expects a spreadsheet, ask /api/v1/export for xlsx or csv. If the destination is accounting software, choose a bank-feed format so the import is one click for the user rather than a fragile column-mapping exercise.

For QuickBooks and Quicken, quickbooks (.QBO) and qfx produce native bank-feed files; for tools that accept generic Open Financial Exchange, ofx works with GnuCash, Sage and others; and xero emits a Xero-friendly CSV. The three bank-feed formats (quickbooks, qfx and ofx) use OFX 1.0.2 with a stable FITID per transaction, which is what stops a re-import double-posting rows the user already has.

A good rule of thumb: default to a bank-feed file when the user's end goal is accounting software, and to XLSX/CSV when they want to analyse or share the data. You can always offer both. The import steps for each destination are covered on PDF to QBO, bank statement to Quicken and bank statement to Xero.

format	Output	Best for
quickbooks	.QBO bank feed	QuickBooks Online / Desktop
qfx	.QFX bank feed	Quicken (Web Connect)
ofx	.OFX 1.0.2	GnuCash, Sage, MoneyDance
xero	Xero-format CSV	Xero bank import
xlsx / csv	Spreadsheet	Analysis, sharing, custom imports
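If your product lets users pick a destination, the table above can be encoded as a lookup. A minimal sketch: the destination keys and the exportFormatFor name are hypothetical, and the xlsx fallback follows the rule of thumb given earlier.

```javascript
// Destination → export format, following the table above.
const FORMAT_BY_DESTINATION = new Map([
  ["quickbooks", "quickbooks"],
  ["quicken", "qfx"],
  ["gnucash", "ofx"],
  ["sage", "ofx"],
  ["moneydance", "ofx"],
  ["xero", "xero"],
]);

// Unknown destinations fall back to a spreadsheet.
function exportFormatFor(destination) {
  const key = String(destination).trim().toLowerCase();
  return FORMAT_BY_DESTINATION.get(key) ?? "xlsx";
}

console.log(exportFormatFor("Quicken"), exportFormatFor("analysis"));
```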

A worked example, end to end

Let's walk a single statement through the whole loop. Suppose a customer uploads october.pdf, a four-page business current-account statement. Your worker base64-encodes it and POSTs it to /api/v1/extract. A few seconds later it gets back the structured JSON: an opening balance of 4,120.55, a closing balance of 6,134.80, and a transactions array where the Stripe payout is +2480.00 and the AWS charge is -312.40. The response also reports billedPages: 4, which your worker records for the audit log.

Next it calls /api/v1/validate with that data. The response scores it 100/A: opening plus the sum of transactions equals the closing balance, no duplicates, dates in order. Because the grade clears your threshold, the worker auto-accepts: it writes the transactions to your ledger table, stores the validation grade alongside them, and moves on without any human touch. Had the score come back amber — say a balance break from one misread row — the worker would instead queue the document for a quick review.

Finally, your product needs the data inside QuickBooks, so the worker calls /api/v1/export with format: "quickbooks" and gets a base64 .QBO file back, which it hands to the customer to import. One upload became validated transactions and an importable bank feed, with three API calls and no manual entry. Scale that pattern across thousands of statements and you have an unattended pipeline — exactly what the bank statement API is built for.

Scaling: concurrency, retries & idempotency

The extract call is synchronous and a large multi-page scan can take a little while, so don't call it inline on a user-facing web request at volume. Put extraction behind a queue and a pool of workers: the upload handler stores the file and enqueues a job; workers pull jobs, call extract and validate, and write results. This keeps your app responsive, lets you tune throughput by adding workers, and gives you a natural place to implement retries and backoff. It's effectively your own webhook flow, built from infrastructure you control.

Make jobs idempotent by keying each on a content hash of the file, so a retried or duplicated upload updates the same record instead of creating a second one. When you later export to a bank feed, the OFX FITID on each transaction is a stable identifier you can use to de-duplicate across re-imports — the same mechanism that stops QuickBooks double-posting. For transient 503s, retry with exponential backoff; for a 429, pause the affected worker until the page budget is topped up rather than spinning on the endpoint.
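The 503 retry rule can be sketched as a small wrapper. The base delay (500 ms), cap (30 s) and attempt limit (5) are assumptions chosen for illustration, and withRetry expects a hypothetical call function that resolves to an object with a status field, such as a fetch Response.

```javascript
// Exponential backoff: baseMs, 2x, 4x, ... capped at capMs.
function backoffDelayMs(attempt, baseMs = 500, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry only on 503. Every other status (including 429) is returned to the
// caller, which decides whether to pause, fix the input, or fail.
async function withRetry(call, maxAttempts = 5) {
  for (let attempt = 0; ; attempt++) {
    const res = await call();
    if (res.status !== 503 || attempt + 1 >= maxAttempts) return res;
    await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
  }
}

console.log([0, 1, 2, 3].map((n) => backoffDelayMs(n))); // [ 500, 1000, 2000, 4000 ]
```

In a pool of many workers, adding a little random jitter to each delay helps avoid all of them retrying at the same instant.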

Finally, cap concurrency. Billing is per page, so an uncapped batch of large statements can spend your whole balance in one burst. A modest worker pool with a concurrency limit gives you predictable spend and steady throughput, and it plays nicely with the per-page budget you set on the pricing page. For consolidating a finished batch into one workbook, hand the validated documents to Smart Merge via /api/v1/merge.
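A concurrency cap does not need a library. The sketch below runs a fixed number of runners that pull from a shared index; mapWithConcurrency is a hypothetical helper, and in practice the worker function would wrap your extract and validate calls.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight.
// Results are returned in input order.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function run() {
    while (next < items.length) {
      const index = next++; // claim an item (safe: JS is single-threaded)
      results[index] = await worker(items[index], index);
    }
  }

  const size = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: size }, () => run()));
  return results;
}

// Usage sketch with a stand-in worker instead of real API calls.
mapWithConcurrency(["a.pdf", "b.pdf", "c.pdf"], 2, async (path) => path.toUpperCase())
  .then((out) => console.log(out));
```

If any worker call rejects, the returned promise rejects too, so catch errors inside the worker when one bad statement should not stop the batch.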

Test the whole flow for free

You don't need to spend a single page to build and prove your integration. Validation is free, and both export and merge offer free previews, so you can wire up the entire pipeline — authenticate, validate sample data, preview an export, preview a merge — and confirm your code handles every response shape before any billed call runs. That makes it easy to develop against the real API in CI and in staging without burning budget.

When you're ready to test extraction itself, run a handful of real statements through /api/v1/extract and compare the JSON against the source — check that totals reconcile, dates parse, and signs are right. Use the API playground to fire ad-hoc requests from the browser while you're exploring, and the API docs for the exact request and response of every endpoint. Keep a small fixture set of representative statements (a clean digital PDF, a scan, a multi-currency account) so you can re-run them whenever you change your integration.

Watch your usage as you go: each key's request and page totals are visible in the dashboard, so you can see exactly what a test run cost and set a budget that matches your launch plan on the pricing page. Building free-first, then switching on billed extraction at go-live, is the cheapest and safest path to production. Keep that fixture set in version control alongside your integration tests, so every future change to your code is checked against the same known statements and you catch a regression — a dropped row, a mis-signed amount, a wrong total — long before it can reach a customer's books.

Security & compliance

Bank statements are among the most sensitive documents your system will ever touch, so treat the integration accordingly. Calls run over HTTPS and authenticate with a hashed key; keep that key strictly server-side and out of any browser, mobile app or client bundle. Use a separate key per environment and, where it helps, per customer — if one is ever exposed you revoke and replace it with zero downtime. Every request is logged with the document label and page cost, giving you a clean audit trail of what was processed and what it cost.

Because you control the request, you also control retention. The uploaded file is processed to produce the JSON response and isn't retained as a downloadable document on your behalf; on your side, store only the fields you actually need, drop raw_table if you don't use it, and keep account numbers and other PII out of your application logs. FlowParse never uses your documents to train models. For the platform's wider posture (encryption, data handling and compliance), see the security page.
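Storing only what you need can be as simple as an allow-list applied before the record is written. A minimal sketch: the list of kept fields is an assumption about what a ledger might need, and pickForStorage is a hypothetical helper.

```javascript
// Fields kept by default. Assumption: adjust this list to your own data model.
const KEEP_FIELDS = ["bank_name", "currency", "opening_balance", "closing_balance", "transactions"];

// Copy only allow-listed fields; raw_table and anything else is dropped.
function pickForStorage(statement, keep = KEEP_FIELDS) {
  return Object.fromEntries(
    keep.filter((field) => field in statement).map((field) => [field, statement[field]])
  );
}

const stored = pickForStorage({
  bank_name: "Sterling Bank",
  account_holder: "ACME TRADING LTD",
  currency: "GBP",
  opening_balance: 4120.55,
  closing_balance: 6134.8,
  transactions: [],
  raw_table: { columns: ["Date", "Description"], rows: [] },
});

console.log(Object.keys(stored));
```

An allow-list fails safe: a field added to the response later is not stored until you opt in.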

Common mistakes

Skipping validation. Always score extraction with /api/v1/validate and gate on the grade — don't import unchecked rows.
Sending a photo when a digital PDF exists. The original PDF is read from its text layer, which avoids OCR misreads; fall back to OCR on a scan only when there's no text layer.
Re-summing debits and credits yourself. Amounts are already signed — sum the amount column directly.
Treating 422 as a failure to retry blindly. It means the file was unreadable or had nothing extractable; fix the input.
Putting the API key in client-side code. Keep keys server-side and rotate per environment.
Ignoring raw_table. If you need reference codes or transaction types, they're preserved there 1:1.

Best practices

Build the whole flow against free validation and previews first, then switch on billed extraction.
Parallelise extraction across workers for batches, but cap concurrency so you don't blow your page budget in one burst.
Persist the validation score with each document so you have an audit trail of what was auto-accepted.
Store only the fields you need; drop raw_table if you don't, and keep PII out of your logs.
Use a separate API key per environment and customer so you can revoke and rotate without downtime.
Reconcile statements against invoices to catch missing or duplicate payments early.

That's the full loop: authenticate, extract, validate, export and reconcile — all over one key and one schema. For the complete reference see the API docs; to go deeper on each surface, read the bank statement API, the bank statement OCR API for scans, the PDF to JSON API, and the document extraction API for invoices and receipts.

Start parsing statements via API

Create a key, POST one statement to /api/v1/extract, and get clean transaction JSON back — then validate, export to QuickBooks or Xero, and reconcile over the same API.
