Start Here ↓
Evaluating a model requires a snapshot of your dataset, so create one before you run an evaluation.
Each evaluation reports the following results:
- Average Score: The model's average score across the evaluation.
- Model Name: The name of the model the evaluation was run on.
- System Prompt: The system prompt used for the evaluation.
- User Message: The user message from the dataset.
- Original Assistant Message: The original assistant message from the dataset.
- Predicted Assistant Message: The predicted assistant message from the model.
- Model Score: The score of the model chosen for the evaluation.
- Score Reason: The reasoning behind the score.
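
To make the shape of these results concrete, here is a minimal sketch of how they might be modeled in Python. The class and field names (`EvaluationRow`, `EvaluationResult`, `model_score`, and so on) are hypothetical and not taken from any real API. The sketch also assumes that each dataset example gets its own score and that the average score is the arithmetic mean of those scores; the scoring scale and the sample values are made up for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class EvaluationRow:
    """One evaluated dataset example (hypothetical field names)."""
    system_prompt: str
    user_message: str
    original_assistant_message: str   # the message stored in the dataset
    predicted_assistant_message: str  # the message the model produced
    model_score: float                # assumed to be a per-example score
    score_reason: str                 # the reasoning behind the score


@dataclass
class EvaluationResult:
    """All rows for one model, plus a derived average (hypothetical)."""
    model_name: str
    rows: List[EvaluationRow]

    @property
    def average_score(self) -> float:
        # Assumption: the average is the plain mean of per-row scores.
        return mean(row.model_score for row in self.rows)


# Illustrative values only; none of this is real evaluation output.
result = EvaluationResult(
    model_name="example-model",
    rows=[
        EvaluationRow(
            system_prompt="You are a helpful assistant.",
            user_message="What is 2 + 2?",
            original_assistant_message="4",
            predicted_assistant_message="2 + 2 equals 4.",
            model_score=4.0,
            score_reason="Correct answer, slightly more verbose.",
        ),
        EvaluationRow(
            system_prompt="You are a helpful assistant.",
            user_message="Name the capital of France.",
            original_assistant_message="Paris",
            predicted_assistant_message="Lyon",
            model_score=2.0,
            score_reason="Incorrect city.",
        ),
    ],
)

print(result.model_name, result.average_score)
```

Keeping the per-example fields together in one row makes it easy to compare the original and predicted assistant messages side by side when reading a score reason.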





