🇮🇷 MIZAN: A Persian LLM Leaderboard

MIZAN: A Persian LLM Leaderboard is a comprehensive benchmark for evaluating Large Language Models (LLMs) in Persian. It combines existing datasets, translated benchmarks, and new Persian-specific data to assess LLM capabilities in understanding, generation, reasoning, and knowledge relevant to the Persian language and culture. MIZAN provides a standardized tool for researchers and developers to measure Persian LLM performance.

๐Ÿ† Overall Benchmark

10  Unknown  71.27  88.11  87.33  78.08  91.18  71.47  20.11