Evalry
Benchmarks
Recent LLM benchmarks comparing model performance across prompts.