Particle.news

Benchmarking

Performance Metrics, GSM8K, AIME and MATH Tests, MATH-500, MMLU, Reasoning Performance, User Feedback, Transparency Issues, Human Evaluation