Announcement · Research

Introducing FinProof: the BFSI AI safety benchmark

June 2026 · 6 min read · The Zytra Research Team

Banks are deploying generative AI faster than they can govern it. The question every risk committee now asks is simple to state: how do we know our AI is safe enough to put in front of customers? Until now, the honest answer has been "we don't, not precisely." We built FinProof to change that.

FinProof is an open, reproducible adversarial benchmark designed specifically for banking and financial services. It contains 5,389 prompts spanning seven BFSI attack categories and three real deployment registers, so teams can measure exactly how a safety model behaves under the conditions that matter for finance.

Why generic safety benchmarks fall short for banking

Most safety benchmarks were built for open-domain chat. They measure whether a model refuses obviously toxic content. But a banking assistant fails in subtler, costlier ways: giving unlicensed investment advice, being talked through a KYC bypass, fabricating a regulatory disclosure, or leaking another customer's data.

Equally important is the inverse failure. A guardrail that blocks legitimate customers — flagging "what's my loan balance?" as an attack — destroys the customer experience and quietly gets switched off. Production safety is a balance of catching real attacks and leaving real customers alone.

A benchmark that only measures attack detection, and ignores false positives, will reward models that are unusable in production.
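To make that trade-off concrete, here is a minimal sketch of scoring a guardrail on both axes at once. It assumes an evaluation set that mixes attack prompts with legitimate customer prompts; the `Result` type, its field names, and the `score` function are hypothetical illustrations, not FinProof's official scoring code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    is_attack: bool  # ground-truth label (hypothetical field name)
    blocked: bool    # what the guardrail decided for this prompt


def score(results):
    """Return (detection_rate, false_positive_rate) for a list of results."""
    attacks = [r for r in results if r.is_attack]
    benign = [r for r in results if not r.is_attack]
    # Share of real attacks that the guardrail blocked.
    detection_rate = sum(r.blocked for r in attacks) / len(attacks) if attacks else 0.0
    # Share of legitimate prompts that the guardrail wrongly blocked.
    false_positive_rate = sum(r.blocked for r in benign) / len(benign) if benign else 0.0
    return detection_rate, false_positive_rate


# A guardrail that blocks everything: perfect detection, unusable in production.
block_all = [Result(True, True)] * 8 + [Result(False, True)] * 2

# A guardrail that misses two attacks but blocks only one legitimate prompt.
balanced = (
    [Result(True, True)] * 6
    + [Result(True, False)] * 2
    + [Result(False, True)] * 1
    + [Result(False, False)] * 3
)

print(score(block_all))  # (1.0, 1.0)
print(score(balanced))   # (0.75, 0.25)
```

Ranked on detection alone, the block-everything guardrail wins. Reporting the false positive rate next to it shows why it would not survive contact with real customers.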

What's inside FinProof

FinProof v1 organizes adversarial coverage across seven categories: investment advice, KYC bypass, regulatory misrepresentation, document hallucination, data rights, transaction integrity, and account bypass. Each prompt is written for one of three deployment registers — professional compliance, retail customer mobile, and relationship-manager internal — because the same attack reads very differently across channels.
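Seven categories across three registers gives a grid of 21 cells. The sketch below shows one way a team might check how its own test prompts spread across that grid. The identifier spellings and the `category` and `register` field names are assumptions made for illustration; the published dataset may label things differently.

```python
from collections import Counter
from itertools import product

# The seven attack categories named above (identifier spellings are assumed).
CATEGORIES = [
    "investment_advice",
    "kyc_bypass",
    "regulatory_misrepresentation",
    "document_hallucination",
    "data_rights",
    "transaction_integrity",
    "account_bypass",
]

# The three deployment registers named above (identifier spellings are assumed).
REGISTERS = [
    "professional_compliance",
    "retail_customer_mobile",
    "relationship_manager_internal",
]


def coverage(prompts):
    """Count prompts in every (category, register) cell, including empty cells."""
    counts = Counter((p["category"], p["register"]) for p in prompts)
    return {cell: counts.get(cell, 0) for cell in product(CATEGORIES, REGISTERS)}


# Hypothetical records, only to show the shape of the input.
sample = [
    {"category": "kyc_bypass", "register": "retail_customer_mobile"},
    {"category": "kyc_bypass", "register": "retail_customer_mobile"},
    {"category": "data_rights", "register": "professional_compliance"},
]

grid = coverage(sample)
empty_cells = [cell for cell, n in grid.items() if n == 0]
print(len(grid), len(empty_cells))  # 21 19
```

Listing the empty cells makes gaps visible, for example a category that has only ever been tested in one channel.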

Quantum-augmented generation

Static attack datasets get memorized. To keep FinProof's medium tier diverse and hard to overfit, we generate prompts with a quantum-augmented sampling process. The result is broader coverage of the long tail of phrasing that real adversaries use.

Open by default

Tiers 1 and 2 are available now for public self-evaluation on Hugging Face, alongside the full attack taxonomy. The withheld Tier 4 keeps the leaderboard honest: official scores are produced by Zytra so no model can train on the test set.

If you build or buy AI safety models for finance, we'd love for you to run FinProof — and to submit your model for official evaluation.
