Submission Guide

Overview

To submit your Code Agent's evaluation results to the LiveCVEBench leaderboard, open a Pull Request against our GitHub repository containing your results file.

Submission Format

Create a JSON file with the following structure:

{
    "model": "Your-Model-Name",
    "agent": "Your-Agent-Name",
    "modelType": "open",
    "agentType": "open",
    "instruction_type": "user_report",
    "cve_results": {
        "CVE-2025-0001": {
            "success": true,
            "turns": 3,
            "tokens": 14500
        },
        "CVE-2025-0002": {
            "success": false,
            "turns": 8,
            "tokens": 42000
        }
    }
}

Field Descriptions

Field             Type                                 Description
model             string                               Name of the LLM model (e.g., "GPT-4o", "Claude-3.5-Sonnet")
agent             string                               Name of the agent framework (e.g., "OpenHands", "Aider")
modelType         "open" | "closed"                    Whether the model weights are publicly available
agentType         "open" | "closed"                    Whether the agent source code is publicly available
instruction_type  "user_report" | "cve_description"   Type of task input: user_report (recommended) or cve_description
success           boolean                              Whether the CVE was successfully fixed
turns             number                               Number of interaction turns taken
tokens            number                               Total tokens consumed (input + output)
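
Before opening a PR, it can help to sanity-check your file against the schema above. Below is a minimal validation sketch in Python; the script itself is our illustration, not part of the official benchmark tooling:

import json
import sys

# Expected top-level fields and their JSON types.
REQUIRED_TOP = {
    "model": str,
    "agent": str,
    "modelType": str,
    "agentType": str,
    "instruction_type": str,
    "cve_results": dict,
}
# Expected per-CVE fields.
REQUIRED_CVE = {"success": bool, "turns": int, "tokens": int}
# Enumerated values from the table above.
ALLOWED = {
    "modelType": {"open", "closed"},
    "agentType": {"open", "closed"},
    "instruction_type": {"user_report", "cve_description"},
}

def validate(path):
    with open(path) as f:
        data = json.load(f)
    for field, expected in REQUIRED_TOP.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"missing or mistyped field: {field}")
    for field, allowed in ALLOWED.items():
        if data[field] not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    for cve_id, result in data["cve_results"].items():
        for field, expected in REQUIRED_CVE.items():
            if not isinstance(result.get(field), expected):
                raise ValueError(f"{cve_id}: missing or mistyped {field}")

if __name__ == "__main__":
    validate(sys.argv[1])
    print("results file looks structurally valid")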

Submission Steps

  1. Fork the livecvebench/submissions repository
  2. Create your results file as submissions/{Model}_{Agent}.json (see the sketch after these steps)
  3. Commit and push your changes
  4. Create a Pull Request with:
    • A brief description of your model/agent
    • Link to model/agent repository (if open source)
    • Any relevant configuration details
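
As a concrete example of the naming convention in step 2, results for GPT-4o run with OpenHands would live at submissions/GPT-4o_OpenHands.json. A minimal sketch of assembling and writing such a file (all values here are illustrative, not real results):

import json

# Illustrative values; substitute your own model/agent names and results.
results = {
    "model": "GPT-4o",
    "agent": "OpenHands",
    "modelType": "closed",
    "agentType": "open",
    "instruction_type": "user_report",
    "cve_results": {
        "CVE-2025-0001": {"success": True, "turns": 3, "tokens": 14500},
    },
}

# Derive the file name from the {Model}_{Agent} convention in step 2.
path = f"submissions/{results['model']}_{results['agent']}.json"
with open(path, "w") as f:
    json.dump(results, f, indent=4)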

Evaluation Environment

LiveCVEBench is fully compatible with the Terminal Bench evaluation framework. You can use Terminal Bench to run your agent on our CVE tasks and generate the results file.
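
Whichever harness you use, the per-task outcomes must be collapsed into the cve_results mapping shown above. A minimal aggregation sketch, assuming (this is our assumption, not Terminal Bench's actual output schema) one JSON record per task with task_id, resolved, turns, and token counts:

import json
from pathlib import Path

# Hypothetical layout: one JSON record per CVE task in a runs/ directory.
# Adapt the directory and key names to whatever your harness actually emits.
cve_results = {}
for record_file in sorted(Path("runs").glob("*.json")):
    record = json.loads(record_file.read_text())
    cve_results[record["task_id"]] = {
        "success": record["resolved"],
        "turns": record["turns"],
        "tokens": record["input_tokens"] + record["output_tokens"],
    }

print(json.dumps(cve_results, indent=4))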

Questions?

If you have any questions about the submission process, please open an issue on our GitHub repository.