🇮🇷 MIZAN: A Persian LLM Leaderboard

MIZAN is a comprehensive benchmark for evaluating Large Language Models (LLMs) in Persian. It combines existing datasets, translated benchmarks, and new Persian-specific data to assess LLM capabilities in understanding, generation, reasoning, and knowledge relevant to the Persian language and culture. It gives researchers and developers a standardized tool for measuring Persian LLM performance.

🏆 Overall Benchmark

1 🥇
✔️

Unknown

73.32

91.04

91.08

82.17

92.18

71.79

19.47