Knowledge Index of Noah's Ark

A High-Density Benchmark Systematically Mapping 261 Disciplines

Overview

KINA (Knowledge Index of Noah's Ark) is a high-density knowledge benchmark spanning 261 fine-grained disciplines, and the first to incorporate disciplinary representativeness as a core metric. Its reusable, game-theoretic data collection pipeline mitigates annotation vulnerabilities.

261 Disciplines
899 Questions
10 Options

Benchmark Comparison

Figure: benchmark comparison. Bubble size = question count  ·  Lower score = more challenging for SOTA models  ·  Legend: KINA (Ours), Other Benchmarks

Leaderboard

We evaluate 42 models from 13 major AI labs on KINA. Scores are reported as avg@4 accuracy.

Columns: Rank, Model, Type (Closed-Source or Open-Source), ALL, Agr., Econ., Edu., Eng., Hist., Law, Arts, Mgt., Med., Phil., Sci., Soc.
Legend: ranks 1, 2, and 3 are marked Gold, Silver, and Bronze. Bold = best in column.
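
The avg@4 metric can be illustrated with a short sketch. It assumes avg@4 means accuracy averaged over four independent runs of the same model on the same questions, which is the common reading but is not spelled out on this page. The function names and the toy data are hypothetical and are not KINA results.

```python
def accuracy(correct):
    """Fraction of questions answered correctly in a single run."""
    return sum(correct) / len(correct)


def avg_at_k(runs):
    """Mean of per-run accuracy over k runs (assumed meaning of avg@k).

    runs[i][j] is True if run i answered question j correctly.
    """
    return sum(accuracy(run) for run in runs) / len(runs)


# Toy data: 4 runs over 5 questions (illustrative only, not KINA results).
toy_runs = [
    [True, False, True, False, False],   # 2/5 correct
    [True, False, False, False, False],  # 1/5 correct
    [True, True, True, False, False],    # 3/5 correct
    [False, False, True, False, False],  # 1/5 correct
]

score = avg_at_k(toy_runs)
print(f"avg@{len(toy_runs)} accuracy: {score:.2%}")
```

Because every run covers the same questions, averaging per-run accuracies gives the same value as averaging each question's correctness rate across runs.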

Data Sample

Data Collection Pipeline

Score Distribution

Figure: violin plot of the score distribution for each model.

Discipline Coverage

We curate a hierarchical taxonomy of disciplines grounded in the U.S. Classification of Instructional Programs (CIP).
The finalized dataset comprises 899 instances, distributed across 12 disciplines, 70 fields, and 261 fine-grained subfields.
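
The three-level hierarchy described above can be pictured as one label triple per question. The sketch below is only an illustration: the `Question` record, the `coverage` helper, and the labels in the toy list are all hypothetical placeholders, not the actual KINA schema or taxonomy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    discipline: str  # Level 1 (12 in the full dataset)
    field: str       # Level 2 (70 in the full dataset)
    subfield: str    # Level 3 (261 in the full dataset)


def coverage(questions):
    """Count distinct nodes at each taxonomy level, plus total questions."""
    return {
        "disciplines": len({q.discipline for q in questions}),
        "fields": len({(q.discipline, q.field) for q in questions}),
        "subfields": len(
            {(q.discipline, q.field, q.subfield) for q in questions}
        ),
        "questions": len(questions),
    }


# Placeholder labels for illustration; not taken from the KINA taxonomy.
toy_questions = [
    Question("Science", "Physics", "Optics"),
    Question("Science", "Physics", "Acoustics"),
    Question("Science", "Chemistry", "Organic Chemistry"),
    Question("Law", "Legal Studies", "Tax Law"),
]

print(coverage(toy_questions))
```

Fields and subfields are keyed by their full path so that two identically named nodes under different parents are counted separately. On the full dataset, the stated figures of 899 questions over 261 subfields work out to roughly 3.4 questions per subfield on average.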

Figure: discipline coverage chart (12 Disciplines · 70 Fields · 899 Questions). Each block breaks down into its Level-3 sub-disciplines.

Model Scores Over Time

Figure: each model's score plotted against its release date.

Inference Cost Distribution

Figure panels: Qwen3, Qwen3.5.

Token Length vs. Performance

BibTeX

If you find KINA useful in your research, please cite our paper:

@misc{anonymous2026kina,
  title  = {KINA: Knowledge Index of Noah's Ark},
  author = {Anonymous Authors},
  year   = {2026},
  note   = {Under review}
}