FinProof — the BFSI AI safety benchmark

A 5,389-prompt adversarial benchmark across 7 attack categories and three deployment registers, with quantum-augmented generation. Open-source, reproducible, and built to prove production readiness.

Overview

Adversarial testing for financial AI

FinProof v1 spans 7 attack categories (investment advice, KYC bypass, regulatory misrepresentation, document hallucination, data rights, transaction integrity, and account bypass) across three deployment registers: professional compliance, retail customer mobile, and relationship manager (RM) internal.

Medium-difficulty attacks are generated with a Quantum Circuit Born Machine (QCBM) on PennyLane, producing diverse, realistic adversarial coverage no static dataset can match.
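
The Born-machine idea behind this generation step can be illustrated with a toy simulator. The sketch below is a stand-in under stated assumptions, not FinProof's generator: it uses plain Python rather than PennyLane, an untrained circuit of RY rotations and CNOT gates, and hypothetical names (`born_probs`, `sample_bitstrings`). It only shows how a parameterized circuit defines a probability distribution over bitstrings through the Born rule, p(x) = |<x|psi(theta)>|^2. How FinProof trains its circuit or maps samples to prompt text is not described here.

```python
import math
import random

def apply_ry(state, theta, wire, n):
    """Apply an RY(theta) rotation to one qubit; amplitudes stay real."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    bit = 1 << (n - 1 - wire)  # wire 0 is the leftmost bit of the index
    out = state[:]
    for i in range(len(state)):
        if (i & bit) == 0:
            a0, a1 = state[i], state[i | bit]
            out[i] = c * a0 - s * a1
            out[i | bit] = s * a0 + c * a1
    return out

def apply_cnot(state, control, target, n):
    """Flip the target qubit wherever the control qubit is 1."""
    cbit = 1 << (n - 1 - control)
    tbit = 1 << (n - 1 - target)
    out = state[:]
    for i in range(len(state)):
        if i & cbit:
            out[i] = state[i ^ tbit]
    return out

def born_probs(params, n):
    """Return p(x) = amplitude(x)^2 for every n-bit string x.

    params is a list of layers, each a list of n rotation angles
    (illustrative values here; a real QCBM would learn them).
    """
    state = [0.0] * (2 ** n)
    state[0] = 1.0  # start in |00...0>
    for layer in params:
        for wire, theta in enumerate(layer):
            state = apply_ry(state, theta, wire, n)
        for wire in range(n - 1):
            state = apply_cnot(state, wire, wire + 1, n)
    return [a * a for a in state]

def sample_bitstrings(probs, n, k, seed=0):
    """Draw k bitstrings from the Born distribution."""
    rng = random.Random(seed)
    indices = rng.choices(range(len(probs)), weights=probs, k=k)
    return [format(i, f"0{n}b") for i in indices]

# Two layers on three qubits with arbitrary, untrained angles.
example_params = [[0.3, 1.1, 2.0], [0.7, 0.2, 1.5]]
example_probs = born_probs(example_params, 3)
example_samples = sample_bitstrings(example_probs, 3, k=5)
```

In a trained QCBM the angles would be optimized so that the sampled distribution matches a target distribution; the sampling step itself would look much like `sample_bitstrings`.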

5,389

Adversarial prompts

Across Tiers 2-4 and all registers; Tier 1 adds 782 benign calibration examples.

7

Attack categories

BFSI-specific threat taxonomy.

3

Deployment registers

Compliance, retail, internal.

QCBM

Quantum generation

Augmented attack diversity.

Structure

Four evaluation tiers

Tier 1 · Public

Calibration

Eval harness + 782 benign FPR-calibration examples.

Tier 2 · Email gate

Direct attacks

1,606 direct-difficulty adversarial prompts.

Tier 3 · Research

QCBM-generated

2,036 medium-difficulty quantum-generated attacks.

Tier 4 · Withheld

Official test set

1,747 hard attacks — evaluated by Zytra only.
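
The tier sizes can be cross-checked against the headline figure. In the quick sketch below, the dictionary keys are illustrative labels and the counts are taken from the tier descriptions. It suggests that the 5,389 figure counts the adversarial prompts in Tiers 2 to 4, with the 782 benign calibration examples in Tier 1 counted separately.

```python
# Counts copied from the tier descriptions; key names are illustrative.
tier_sizes = {
    "tier1_benign_calibration": 782,
    "tier2_direct": 1606,
    "tier3_qcbm_medium": 2036,
    "tier4_withheld_hard": 1747,
}

# Sum only the adversarial tiers (2 to 4).
adversarial_total = sum(
    size for name, size in tier_sizes.items()
    if name != "tier1_benign_calibration"
)
# Sum every tier, including the benign calibration set.
all_examples = sum(tier_sizes.values())

print(adversarial_total)  # 5389
print(all_examples)       # 6171
```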

Threat taxonomy

Seven BFSI attack categories

Investment advice

Eliciting unlicensed or non-compliant financial recommendations.

KYC bypass

Attempts to circumvent identity verification controls.

Regulatory misrepresentation

Inducing false claims about products, terms or compliance.

Document hallucination

Fabricating statements, figures or official documentation.

Data rights

Privacy violations and unauthorized data disclosure.

Transaction integrity

Manipulating payments, transfers or transaction logic.

Account bypass

Unauthorized access to accounts or privileged actions.

Full category definitions are published in the open attack taxonomy on Hugging Face.

Leaderboard

FinProof results

How leading safety models perform. Lower false-positive rate means fewer legitimate customers wrongly blocked.

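| Rank | Model | HackaPrompt R | AgentHarm FPR | WildGuard F1 | Latency |
|---|---|---|---|---|---|
| 1 | Lynx v1.5 (Zytra) | 0.994 | 0.5% | 0.303 | 11.6 ms |
| 2 | PromptGuard-86M (Meta) | 1.000 | 96.9% | 0.095 | 8 ms |
| 3 | LlamaGuard-3-1B (Meta) | 0.0% | 0% | 0.0 | ~60 ms |
| 4 | Granite Guardian (IBM) | 0.0% | 45% | 0.0 | ~100 ms |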

Official evaluation on the withheld Tier 4 set is conducted by Zytra. Public self-evaluation (Tier 1 + 2) is available now.
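
For readers running the public self-evaluation, the leaderboard metrics can be computed from binary labels and guard decisions roughly as follows. This is a minimal sketch, not the FinProof eval harness. The function names are hypothetical, and it assumes that the "R" column denotes recall, that a label of 1 marks an attack, and that a prediction of 1 means the guard blocked the prompt.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def recall(tp, fn):
    """Share of attacks that were blocked."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def false_positive_rate(fp, tn):
    """Share of benign prompts that were wrongly blocked."""
    return fp / (fp + tn) if (fp + tn) else 0.0

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy data: 4 attacks and 4 benign prompts (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(recall(tp, fn))               # 0.75
print(false_positive_rate(fp, tn))  # 0.25
print(f1_score(tp, fp, fn))         # 0.75
```

Measured on a benign-only set such as the Tier 1 calibration examples, `false_positive_rate` reduces to the fraction of legitimate prompts a guard blocks.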

Submit your model for official evaluation

Run against the FinProof withheld test set and see where you rank.