Benchmarks
Two command centers, side by side.
Live evaluation dashboards. StoryBench measures narrative and reasoning quality; DharmaBench measures alignment and values. Each runs in its own panel below.