insurance

One wrong claim classification can cost you $50,000 in manual rework. Or worse, it can deny a valid claim.

WhiteBox runs every claim through multiple AI models. When they agree, auto-process. When they disagree, route to an adjuster with the full breakdown. Auditable. Defensible. Compliant.

try it free ↵ see the API docs ↗

the problem

What goes wrong with single-model claims processing

Misclassified claim types

Customer files: "Water damage to basement after heavy rain." Is this flood damage (excluded from a standard homeowner's policy) or water damage from a burst pipe (covered)?

One model says "flood", another says "water damage -- plumbing". The distinction determines coverage. WhiteBox surfaces the disagreement before a wrong denial.

Fraud vs legitimate gray zone

Third claim in 18 months for a stolen laptop. Same laptop model, different location each time.

Two AI models say "suspicious", one says "legitimate -- high-value area", one says "fraud". A single model picks one. WhiteBox shows the split so the SIU can make an informed decision.

Severity and priority errors

"Minor fender bender in parking lot" but the repair estimate is $12,000 and includes frame damage.

Description says minor, numbers say major. Models disagree on severity. WhiteBox escalates instead of auto-routing to the wrong queue.

how it works

Multi-model consensus in action

whitebox claims · auto-processed

whitebox › classify "Roof shingles damaged during windstorm, multiple shingles missing, no interior damage reported"

options: ["wind_damage", "hail_damage", "wear_and_tear", "structural"]

01  gpt-4o-mini  wind_damage  logp -0.06
02  claude-3.5   wind_damage  logp -0.04
03  llama-3.3    wind_damage  logp -0.09
04  deepseek-v3  wind_damage  logp -0.07

verdict: wind_damage · confidence 98% · SHIP

auto-routed to: property claims · priority: standard

whitebox claims · escalated

whitebox › classify "Water in basement after heavy rainfall, sump pump was running, foundation crack visible"

options: ["flood", "water_damage_plumbing", "foundation_defect", "maintenance_neglect"]

01  gpt-4o-mini  flood                  logp -0.52
02  claude-3.5   foundation_defect      logp -0.71
03  llama-3.3    flood                  logp -0.63
04  deepseek-v3  water_damage_plumbing  logp -0.88

verdict: no consensus · confidence 34% · ESCALATE

routed to: senior adjuster · queue: complex-claims · sla: 4hr

Every run, every log-prob, every disagreement -- recorded. Replay any decision from its ID.
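
The ship-or-escalate decision in the two runs above can be sketched in a few lines of Python. This is an illustration only, not the WhiteBox implementation: the Vote type, the consensus function, and both thresholds are assumptions, and the sketch does not try to reproduce the confidence percentages shown in the demo, since the page does not say how those are computed.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Vote:
    model: str
    label: str
    logp: float  # log-probability the model assigned to its chosen label


def consensus(votes, min_agreement=1.0, min_mean_prob=0.9):
    """Return (label_or_None, action) for a list of Vote objects.

    Both thresholds are illustrative assumptions, not WhiteBox defaults.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(v.label for v in votes)
    top_label, top_count = counts.most_common(1)[0]
    agreement = top_count / len(votes)
    # Mean probability among the models that picked the leading label.
    probs = [math.exp(v.logp) for v in votes if v.label == top_label]
    mean_prob = sum(probs) / len(probs)
    if agreement >= min_agreement and mean_prob >= min_mean_prob:
        return top_label, "SHIP"
    return None, "ESCALATE"


# Votes copied from the two demo runs above.
windstorm = [
    Vote("gpt-4o-mini", "wind_damage", -0.06),
    Vote("claude-3.5", "wind_damage", -0.04),
    Vote("llama-3.3", "wind_damage", -0.09),
    Vote("deepseek-v3", "wind_damage", -0.07),
]
basement = [
    Vote("gpt-4o-mini", "flood", -0.52),
    Vote("claude-3.5", "foundation_defect", -0.71),
    Vote("llama-3.3", "flood", -0.63),
    Vote("deepseek-v3", "water_damage_plumbing", -0.88),
]

print(consensus(windstorm))  # ('wind_damage', 'SHIP')
print(consensus(basement))   # (None, 'ESCALATE')
```

Requiring unanimous agreement is the strictest possible setting. A lower min_agreement would auto-process more claims at the price of more silent disagreements.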

use cases

Anywhere claims need a decision, you need consensus

Claims triage

Auto-classify incoming claims by type, severity, and priority. Route to the right team instantly.

Coverage determination

Classify whether a claim falls under covered perils or exclusions. Flag borderline cases for adjuster review.

Fraud detection

Score claim legitimacy from description, history, and pattern signals. Escalate suspicious claims to SIU with the full model breakdown.

Policy underwriting

Classify risk factors from applications. Flag inconsistencies between self-reported data and model assessments.

Subrogation identification

Detect claims where a third party may be liable. Auto-flag for recovery before settlement.

Regulatory compliance

Full audit trail on every classification decision. Defensible when regulators ask "why was this claim denied?"

compliance

Built for regulated industries

Full audit trail

Every claim classification logged with which models voted, what they said, confidence scores, and the final decision. Exportable.
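
As a rough picture of what one exportable log entry might hold, here is a minimal Python sketch. The field names, the AuditRecord and ModelVote types, and the decision ID format are all hypothetical. The actual WhiteBox export schema is not described on this page.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ModelVote:
    model: str
    label: str
    logp: float


@dataclass
class AuditRecord:
    decision_id: str   # hypothetical ID format
    claim_text: str
    options: list
    votes: list        # list of ModelVote
    verdict: str       # e.g. "wind_damage" or "no consensus"
    action: str        # "SHIP" or "ESCALATE"
    routed_to: str


def export_record(record):
    """Serialize one audit record to a JSON string with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = AuditRecord(
    decision_id="demo-001",
    claim_text="Roof shingles damaged during windstorm",
    options=["wind_damage", "hail_damage", "wear_and_tear", "structural"],
    votes=[
        ModelVote("gpt-4o-mini", "wind_damage", -0.06),
        ModelVote("claude-3.5", "wind_damage", -0.04),
    ],
    verdict="wind_damage",
    action="SHIP",
    routed_to="property claims",
)
print(export_record(record))
```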

Human-in-the-loop

No claim is denied by AI alone. Low-confidence decisions always route to a human adjuster with the complete model breakdown.
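
One way to read this rule as code, assuming a classification that lands on a policy exclusion would lead to a denial: anything that is not a confident consensus goes to an adjuster, and so does any confident result that falls on an exclusion. The route function and the excluded_labels set are hypothetical names for this sketch, not part of the WhiteBox API.

```python
def route(label, action, excluded_labels):
    """Decide who handles a classified claim: "auto" or "adjuster".

    label           -- consensus label, or None when the models disagreed
    action          -- "SHIP" or "ESCALATE" from the consensus step
    excluded_labels -- labels that would mean the claim is not covered
    """
    if action != "SHIP":
        # Low confidence or disagreement: always a human decision.
        return "adjuster"
    if label in excluded_labels:
        # A confident exclusion would lead to a denial, so a human signs off.
        return "adjuster"
    return "auto"


excluded = {"flood", "wear_and_tear", "maintenance_neglect"}
print(route("wind_damage", "SHIP", excluded))  # auto
print(route("flood", "SHIP", excluded))        # adjuster
print(route(None, "ESCALATE", excluded))       # adjuster
```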

Defensible decisions

When a policyholder disputes a classification, you can show exactly how the decision was made: 4 models agreed, or 2 disagreed and a human adjuster made the final call.

numbers

What changes when you add consensus

85% auto-processed: claims where models agree and route automatically

15% escalated: complex claims flagged for adjuster review

$0.01 per classification: a fraction of the cost of manual review

100% audit trail: every decision defensible

comparison

WhiteBox vs traditional claims processing

Feature	Manual processing	Single AI model	WhiteBox
Speed	Hours per claim	Seconds	Seconds
Accuracy	High but expensive	Unknown -- no second opinion	Measured by consensus
Edge cases	Caught by experienced adjusters	Silently misclassified	Flagged for human review
Audit trail	Paper files	No	Every model vote logged
Fraud detection	Relies on adjuster intuition	Single score	Multi-model disagreement signal
Compliance	Manual documentation	Not defensible	Fully auditable

playground

Try it. Describe a claim, see the classification.

$0.01 per claim classification

Process 1,000 claims for $10. Compare that to $50+ per manual review.
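
The arithmetic behind that comparison, worked in Python with integer cents to avoid rounding error. The blended figure is an assumption layered on top of the page's numbers: it supposes that 15% of claims are escalated and that each escalated claim then costs one manual review at $50, the low end of "$50+".

```python
CLASSIFICATION_CENTS = 1      # $0.01 per classification
MANUAL_REVIEW_CENTS = 5_000   # $50 per manual review (low end of "$50+")
ESCALATION_PERCENT = 15       # assumed share of claims sent to an adjuster


def classification_cost_cents(n_claims):
    """Cost of classifying every claim once."""
    return n_claims * CLASSIFICATION_CENTS


def manual_cost_cents(n_claims):
    """Cost of reviewing every claim by hand."""
    return n_claims * MANUAL_REVIEW_CENTS


def blended_cost_cents(n_claims):
    """Classify everything, then manually review only the escalated share."""
    escalated = n_claims * ESCALATION_PERCENT // 100
    return classification_cost_cents(n_claims) + manual_cost_cents(escalated)


n = 1_000
print(classification_cost_cents(n) / 100)  # 10.0    -> $10
print(manual_cost_cents(n) / 100)          # 50000.0 -> $50,000
print(blended_cost_cents(n) / 100)         # 7510.0  -> $7,510
```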

20 free classifications to test with your own claims data.

free tier: 20 classifications

per classification: $0.01

subscriptions: none

get a key ↵

get started

Stop misclassifying claims. Start trusting your triage.

20 free classifications. Then $0.01 each. The audit trail starts the moment you install.

get a key ↵ API docs ↗