Emotional Intelligence Benchmarks for LLMs
A benchmark measuring emotional intelligence in challenging roleplays, judged by Sonnet 3.7.
Model | Abilities | Humanlike | Safety | Assertive | Social IQ | Warm | Analytic | Insight | Empathy | Compliant | Moralising | Pragmatic | Elo Score |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
For more details about the benchmark, see the About section.
The Elo score shown in the leaderboard is calculated from pairwise model comparisons, in which the LLM judge rates each response against eight core dimensions of emotional intelligence.
Note: the coloured “Abilities” heat-map columns (Humanlike, Safety, Assertive, etc.) are not used in the Elo calculation. They are purely informational, giving a quick view of each model’s stylistic traits and skill profile.
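As a rough illustration of how a rating can be derived from pairwise comparisons, here is a minimal sketch of the classic sequential Elo update. It is not the benchmark's actual scoring code, which may use a different solver or weighting. The function names, the K-factor of 16, the starting rating of 1200, and the encoding of judge outcomes as 1.0 / 0.5 / 0.0 are all assumptions made for the example.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, comparisons, k=16.0, initial=1200.0):
    """Apply sequential Elo updates for a list of pairwise results.

    ratings:     dict mapping model name -> current rating (may be empty)
    comparisons: iterable of (model_a, model_b, outcome), where outcome is
                 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie
                 (hypothetical encoding of the judge's verdict)
    k, initial:  assumed K-factor and starting rating, not taken from the source
    """
    ratings = dict(ratings)  # work on a copy, leave the caller's dict untouched
    for model_a, model_b, outcome in comparisons:
        ra = ratings.setdefault(model_a, initial)
        rb = ratings.setdefault(model_b, initial)
        # Shift both ratings by the gap between actual and expected outcome.
        delta = k * (outcome - expected_score(ra, rb))
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    return ratings
```

With two previously unrated models and a single comparison won by the first, the expected score is 0.5 for each side, so the winner gains `k * 0.5` points and the loser drops by the same amount. Sequential updates like this depend on the order of comparisons, which is one reason a leaderboard might prefer a batch fit over all pairs instead.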
The leaderboard displays several metrics: