🇮🇷 MIZAN: A Persian LLM Leaderboard

MIZAN: A Persian LLM Leaderboard is a comprehensive benchmark for evaluating Large Language Models (LLMs) in Persian. It combines existing datasets, translated benchmarks, and new Persian-specific data to assess LLM capabilities in understanding, generation, reasoning, and knowledge relevant to the Persian language and culture. MIZAN provides a standardized tool for researchers and developers to measure Persian LLM performance.


๐Ÿ† Overall Benchmark

10    Unknown    74.68    92.60    91.38    82.17    92.18    72.07    20.11