MMLU-Pro
- 2025-08-02
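An LLM benchmark. An improved version of MMLU: it adds harder questions that require reasoning, increases the number of answer choices from 4 to 10, and removes questions that are too easy or unhelpful (noisy).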
An enhanced dataset designed to extend the mostly knowledge-driven MMLU benchmark by integrating more challenging, reasoning-focused questions and expanding the choice set from four to ten options. Additionally, MMLU-Pro eliminates the trivial and noisy questions in MMLU. —MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
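One consequence of going from four to ten options is a lower random-guess baseline. The sketch below works through that arithmetic. It assumes every question has the full number of options and that guesses are uniform; the function name `random_guess_accuracy` is made up for this note and does not come from the paper or any evaluation library.

```python
from fractions import Fraction


def random_guess_accuracy(num_options: int) -> Fraction:
    """Expected accuracy of uniform random guessing on a multiple-choice question.

    Assumes exactly one correct answer among `num_options` choices.
    """
    if num_options < 1:
        raise ValueError("num_options must be at least 1")
    return Fraction(1, num_options)


# Option counts taken from the note above: 4 for MMLU, 10 for MMLU-Pro.
mmlu_baseline = random_guess_accuracy(4)
mmlu_pro_baseline = random_guess_accuracy(10)

print(f"MMLU random-guess baseline:     {float(mmlu_baseline):.0%}")
print(f"MMLU-Pro random-guess baseline: {float(mmlu_pro_baseline):.0%}")
```

Under these assumptions the chance-level score drops from 1/4 to 1/10, so lucky guessing accounts for less of a given score.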