Agent Evaluation Runner

Instructions:

  1. Clone this Space, then modify the code to define your agent's logic.
  2. Make sure metadata.jsonl, which holds the question-answer pairs, is available to the app.
  3. Log in to your Hugging Face account using the button below. Your HF username is used for the submission.
  4. Click 'Run Evaluation & Submit All Answers' to fetch questions, run your agent, submit answers, and see the score.
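
The run-and-submit step can be pictured roughly as follows. This is only a sketch with no network calls: the function name `build_submission` and the field names `task_id`, `question`, `submitted_answer`, and `username` are assumptions made for illustration, not taken from the Space's code or its scoring API.

```python
def build_submission(questions, agent, username):
    """Run the agent on every question and collect the answers into one payload.

    `questions` is assumed to be a list of dicts with hypothetical keys
    "task_id" and "question"; `agent` is any callable taking a question string.
    """
    answers = []
    for item in questions:
        try:
            answer = agent(item["question"])
        except Exception as error:
            # Record the failure instead of aborting the whole run.
            answer = f"AGENT ERROR: {error}"
        answers.append({"task_id": item["task_id"], "submitted_answer": answer})
    return {"username": username, "answers": answers}
```

Catching exceptions per question keeps one failing question from discarding the answers already collected.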

Agent Configuration:

  • 📄 Uses metadata.jsonl for answer lookup
  • ❓ Returns 'unknown' for unmatched questions
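
A lookup agent matching the configuration above might look like the sketch below. It assumes metadata.jsonl holds one JSON object per line; the key names "Question" and "Final answer", and the names `load_answers` and `LookupAgent`, are assumptions for illustration and should be adjusted to the actual file and code.

```python
import json
from pathlib import Path


def load_answers(path="metadata.jsonl", question_key="Question", answer_key="Final answer"):
    """Build a question -> answer table from a JSON Lines file.

    The key names are assumptions; returns an empty table if the file is missing.
    """
    answers = {}
    file = Path(path)
    if not file.exists():
        return answers
    with file.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            question = record.get(question_key)
            answer = record.get(answer_key)
            if question is not None and answer is not None:
                answers[question.strip()] = str(answer)
    return answers


class LookupAgent:
    """Answers by exact match on the question text, else returns 'unknown'."""

    def __init__(self, answers):
        self.answers = answers

    def __call__(self, question):
        return self.answers.get(question.strip(), "unknown")
```

With this sketch, `LookupAgent(load_answers())` can be called directly on a question string. Matching is exact apart from surrounding whitespace, so any rewording of a question falls through to 'unknown'.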

Questions and Agent Answers
