## Overview
To submit your Code Agent's evaluation results to the LiveCVEBench leaderboard, you need to create a Pull Request to our GitHub repository with your results file.
## Submission Format
Create a JSON file with the following structure:
```json
{
  "model": "Your-Model-Name",
  "agent": "Your-Agent-Name",
  "modelType": "open",
  "agentType": "open",
  "instruction_type": "user_report",
  "cve_results": {
    "CVE-2025-0001": {
      "success": true,
      "turns": 3,
      "tokens": 14500
    },
    "CVE-2025-0002": {
      "success": false,
      "turns": 8,
      "tokens": 42000
    }
  }
}
```
## Field Descriptions
| Field | Type | Description |
|---|---|---|
| `model` | string | Name of the LLM (e.g., "GPT-4o", "Claude-3.5-Sonnet") |
| `agent` | string | Name of the agent framework (e.g., "OpenHands", "Aider") |
| `modelType` | `"open"` \| `"closed"` | Whether the model weights are publicly available |
| `agentType` | `"open"` \| `"closed"` | Whether the agent source code is publicly available |
| `instruction_type` | `"user_report"` \| `"cve_description"` | Type of task input: `user_report` (recommended) or `cve_description` |
| `success` | boolean | Whether the CVE was successfully fixed |
| `turns` | number | Number of interaction turns taken |
| `tokens` | number | Total tokens consumed (input + output) |
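Before opening a Pull Request, it can help to sanity-check your file against the schema above. The following is a minimal Python sketch; the `validate_results` helper and the example file path are illustrative, not part of any official LiveCVEBench tooling.

```python
import json

REQUIRED_TOP_LEVEL = {
    "model": str,
    "agent": str,
    "modelType": str,
    "agentType": str,
    "instruction_type": str,
    "cve_results": dict,
}
REQUIRED_PER_CVE = {"success": bool, "turns": int, "tokens": int}


def validate_results(path: str) -> None:
    """Check that a results file matches the submission schema described above."""
    with open(path) as f:
        data = json.load(f)

    # Top-level fields must be present with the expected types.
    for field, expected in REQUIRED_TOP_LEVEL.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"'{field}' is missing or not a {expected.__name__}")

    # Enumerated string fields.
    if data["modelType"] not in ("open", "closed"):
        raise ValueError("modelType must be 'open' or 'closed'")
    if data["agentType"] not in ("open", "closed"):
        raise ValueError("agentType must be 'open' or 'closed'")
    if data["instruction_type"] not in ("user_report", "cve_description"):
        raise ValueError("instruction_type must be 'user_report' or 'cve_description'")

    # Each CVE entry needs success, turns, and tokens with the right types.
    for cve_id, result in data["cve_results"].items():
        for field, expected in REQUIRED_PER_CVE.items():
            value = result.get(field)
            # bool is a subclass of int in Python, so reject booleans for numeric fields.
            if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
                raise ValueError(f"{cve_id}: '{field}' is missing or not a {expected.__name__}")


if __name__ == "__main__":
    validate_results("submissions/GPT-4o_OpenHands.json")
    print("Results file looks valid.")
```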
## Submission Steps
- Fork the `livecvebench/submissions` repository
- Create your results file as `submissions/{Model}_{Agent}.json` (see the naming sketch after this list)
- Commit and push your changes
- Create a Pull Request with:
  - A brief description of your model/agent
  - A link to the model/agent repository (if open source)
  - Any relevant configuration details
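If you generate the results file programmatically, the naming convention can be applied when writing it out. This is a minimal Python sketch assuming your results already match the schema above; the `write_submission` helper and the space-to-hyphen sanitization are our own choices, not requirements of the leaderboard.

```python
import json
from pathlib import Path


def write_submission(results: dict, out_dir: str = "submissions") -> Path:
    """Write results to submissions/{Model}_{Agent}.json, following the naming convention."""
    # Replace spaces so the file name stays shell-friendly (an assumption, not a rule).
    model = results["model"].replace(" ", "-")
    agent = results["agent"].replace(" ", "-")
    path = Path(out_dir) / f"{model}_{agent}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(results, indent=2) + "\n")
    return path


# Example: produces submissions/GPT-4o_OpenHands.json
# write_submission({"model": "GPT-4o", "agent": "OpenHands", ...})
```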
## Evaluation Environment
LiveCVEBench is fully compatible with the Terminal Bench evaluation framework. You can use Terminal Bench to run your agent on our CVE tasks and generate the results file.
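After a run, you still need to aggregate per-task outcomes into the per-CVE entries shown above. The Python sketch below assumes you have already collected one record per CVE with its outcome, turn count, and token count; the `runs` input shape and its `resolved`/`turns`/`tokens` field names are illustrative, since the exact layout of your harness output will vary.

```python
import json


def build_results(model: str, agent: str, model_type: str, agent_type: str,
                  instruction_type: str, runs: dict) -> dict:
    """Assemble the leaderboard submission JSON from per-CVE run records.

    `runs` is assumed to map CVE IDs to dicts with "resolved", "turns", and
    "tokens" keys collected from your evaluation harness (illustrative names).
    """
    return {
        "model": model,
        "agent": agent,
        "modelType": model_type,
        "agentType": agent_type,
        "instruction_type": instruction_type,
        "cve_results": {
            cve_id: {
                "success": bool(run["resolved"]),
                "turns": int(run["turns"]),
                "tokens": int(run["tokens"]),
            }
            for cve_id, run in runs.items()
        },
    }


if __name__ == "__main__":
    runs = {"CVE-2025-0001": {"resolved": True, "turns": 3, "tokens": 14500}}
    submission = build_results("GPT-4o", "OpenHands", "closed", "open", "user_report", runs)
    print(json.dumps(submission, indent=2))
```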
## Questions?
If you have any questions about the submission process, please open an issue on our GitHub repository.