AI evaluation documentation platform
A comprehensive platform for documenting and sharing AI system evaluations with transparency and rigor.
Support structured evaluation across 20+ categories covering capabilities and risks
Promote open documentation of AI system capabilities, limitations, and risks
Facilitate adoption of consistent evaluation practices across organizations
Our evaluation framework builds upon established research and industry standards:
OECD framework for AI capabilities assessment and evaluation
oecd.org/ai-capability-indicators
NIST framework for identifying and managing AI risks
nvlpubs.nist.gov/NIST.AI.600-1.pdf
1. Start a new evaluation for your AI system
2. Answer questions across relevant categories
3. Analyze results and share with stakeholders (a minimal example record is sketched below)
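As an illustration of the data these steps produce, here is a minimal sketch of a completed evaluation record being checked against a validation schema using the jsonschema library. The field names, category identifier, and schema fragment are assumptions made for illustration, not the platform's actual format; the real validation schemas are published with the framework.

```python
# A minimal sketch: validating a hypothetical evaluation record against a
# hypothetical JSON Schema. Field names and categories are illustrative only;
# the platform's actual schemas live in the framework repository.
import jsonschema  # pip install jsonschema

# Hypothetical schema fragment for a single-category evaluation.
EVALUATION_SCHEMA = {
    "type": "object",
    "required": ["system_name", "category", "answers"],
    "properties": {
        "system_name": {"type": "string"},
        "category": {"type": "string"},
        "answers": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["question_id", "response"],
                "properties": {
                    "question_id": {"type": "string"},
                    "response": {"type": "string"},
                },
            },
        },
    },
}

# A hypothetical completed evaluation record for one category.
record = {
    "system_name": "example-llm-v1",
    "category": "dangerous-capabilities",
    "answers": [
        {
            "question_id": "Q1",
            "response": "Evaluated with an internal red-team suite; no critical findings.",
        },
    ],
}

# Raises jsonschema.ValidationError if the record does not conform.
jsonschema.validate(instance=record, schema=EVALUATION_SCHEMA)
print("Record is valid.")
```

Validating each record before sharing it keeps evaluations consistent across organizations, which is what makes aggregation and comparison of results meaningful.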
Access our complete evaluation framework, including category definitions, question sets, and validation schemas.
View on Hugging Face
Explore completed evaluation examples from leading AI systems to understand the evaluation process.
View on Hugging Face
Our evaluation framework is open for community contributions. Use these schemas and examples to create your own evaluations, or contribute improvements to help advance AI transparency and accountability standards.
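To start from the published resources, one option is fetching the schemas and examples programmatically with the huggingface_hub library. This is a sketch only: the repo_id, filenames, and repo_type below are placeholders, so substitute the actual values from the "View on Hugging Face" links above.

```python
# A minimal sketch of fetching a framework schema from Hugging Face.
# repo_id and filename are placeholders, not the project's real layout.
import json

from huggingface_hub import hf_hub_download  # pip install huggingface_hub

schema_path = hf_hub_download(
    repo_id="example-org/ai-evaluation-framework",  # placeholder repo_id
    filename="schemas/evaluation.schema.json",      # placeholder path
    repo_type="dataset",                            # placeholder repo type
)

with open(schema_path) as f:
    schema = json.load(f)

# Inspect the top-level fields the schema expects.
print(sorted(schema.get("properties", {})))
```

From there, the validation sketch shown earlier can check any evaluation you author against the downloaded schema before you share or contribute it.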