Adil Islam
Benchmarks

Two command centers, side by side.

Live evaluation dashboards. StoryBench measures narrative and reasoning quality; DharmaBench measures alignment and values. Each runs in its own panel below.

StoryBench

Narrative · Reasoning

DharmaBench

Alignment · Values