Bastet is a comprehensive dataset of common smart contract vulnerabilities in DeFi along with an AI-driven automated detection process to enhance vulnerability detection accuracy and optimize security lifecycle management.
Bastet covers common vulnerabilities in DeFi, including medium- to high-risk vulnerabilities found on-chain and in audit competitions, along with corresponding secure implementations. It aims to help developers and researchers gain deeper insights into vulnerability patterns and best security practices.
In addition, Bastet integrates an AI-driven automated vulnerability detection process. By designing tailored detection workflows, Bastet enhances AI's accuracy in identifying vulnerabilities, with the goal of optimizing security lifecycle management—from development and auditing to ongoing monitoring.
We strive to improve overall security coverage and warmly welcome contributions of additional vulnerability types, datasets, or improved AI detection methodologies. Please refer here to join and contribute to the Bastet dataset. Together, we can drive the industry's security development forward.
To download the dataset, see here.
```
Bastet/
│── cli/                     # Python CLI package
│   │── __init__.py
│   │── main.py              # CLI entry point
│   │── commands/            # CLI commands
│   │   │── <module>/
│   │   │   │── __init__.py  # CLI routing only; logic is defined below
│   │   │   │── <function>.py
│   │── models/              # Interfaces for Python type checking
│   │   │── <SAAS>/
│   │   │   │── __init__.py  # Exports all models in the SaaS
│   │   │   │── <function>.py
│   │   │── audit_report.py  # Main interface of Bastet's output
│── dataset/                 # Dataset location
│   │── reports/             # Unzipped from the dataset.zip provided in Google Drive -> audit reports of the projects
│   │   │── <reports>/
│   │── repos/               # Unzipped from the dataset.zip provided in Google Drive -> codebases of the projects
│   │   │── <repos>/
│   │── dataset.csv          # Dataset sheet providing the ground truth (should be cloned from Google Drive)
│   │── README.MD            # Basic information of the dataset
│── n8n_workflows/           # n8n workflow files
│   │── <file>.json          # Workflows for analyzing the smart contracts
│── docker-compose.yaml
│── README.md
│── poetry.lock
│── pyproject.toml
│── .gitignore
```
- Recursive scanning of `.sol` files in specified directories (see the sketch below)
- Automatic database creation and schema setup
- Integration with n8n workflows via webhooks
- Detailed processing summary and error reporting
- Results stored in PostgreSQL for further analysis
- A dataset for evaluating the prompt
- A CLI interface to trigger the evaluation workflow
- Python file formatter: Black
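For intuition, the recursive scan amounts to a directory walk that collects every `.sol` file. A minimal standalone sketch (the real logic lives in `cli/`; this version is only illustrative):

```python
from pathlib import Path

def collect_solidity_files(root: str) -> list[Path]:
    """Recursively collect all .sol files under `root`."""
    return sorted(Path(root).rglob("*.sol"))

if __name__ == "__main__":
    # Assumes contracts have been placed in dataset/scan_queue
    # (see the scan step later in this README).
    for contract in collect_solidity_files("dataset/scan_queue"):
        print(contract)
```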
Prerequisites
- Python 3.10 or higher
- Docker installed on your machine
- Docker Compose installed on your machine
- Poetry for package management; if you want to follow our instructions, the version should be < 2.0.0
Installation Steps
Video tutorial
- Set up the Python environment:
```bash
# Initialize the virtual environment and install dependencies
poetry shell
poetry install
```
- Configure environment variables in `.env`:
```bash
cp .env.example .env
```
Update the environment variables in the `.env` file if needed.
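For reference, the two variables this guide asks you to fill in later (`N8N_API_KEY` and `N8N_OPENAI_CREDENTIAL_ID`) look something like this in `.env`; the values below are placeholders, and `.env.example` remains the authoritative template:

```
# Placeholders only - obtain the real values from the n8n steps below
N8N_API_KEY=<your-n8n-api-key>
N8N_OPENAI_CREDENTIAL_ID=<your-openai-credential-id>
```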
- Start n8n and the database:
```bash
docker-compose -f ./docker-compose.yaml up -d
```
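(Optional) You can confirm the containers came up with the standard Compose status command:

```bash
docker-compose ps
```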
- Access the n8n dashboard: open your browser and navigate to http://localhost:5678
- (First time only) Set up the owner account and activate the free n8n pro features
- Click the user icon at the bottom left → Settings → click "n8n API" in the sidebar → Create an API key → fill in "Bastet" as the Label → select "No Expiration" for Expiration (or pick an expiration time if you prefer) → copy the API key and paste it into `N8N_API_KEY` in the `.env` file; the API key is not visible after creation, so if you lose it you can only create a new one → click Done.
- Go back to the homepage (http://localhost:5678/home/workflows)
- Click Create Credential via the arrow button next to the Create Workflow button → type "OpenAi" in the input → select the "OpenAi" entry that appears and click Continue → fill in your OpenAI API key and create the OpenAI credential, then copy the value of its ID field and paste it into `N8N_OPENAI_CREDENTIAL_ID` in the `.env` file.
- Import the workflows by executing the command below. Before running it, make sure `N8N_API_KEY` and `N8N_OPENAI_CREDENTIAL_ID` are filled in in the `.env` file.
```bash
poetry run python cli/main.py init
```
You will see all of the workflows we currently provide. They are activated by default; if you want to skip a workflow, deactivate it in n8n (http://localhost:5678/home/workflows).
The main `scan` command will recursively scan all `.sol` files in the specified directory:
```bash
poetry run python cli/main.py scan
```
The script scans all contracts in the `dataset/scan_queue` directory using every workflow you have activated by turning on its switch button.
Use the `--help` flag for detailed information about the available flags.
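For example, to see the options of the scan command:

```bash
poetry run python cli/main.py scan --help
```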
- Go into the workflow you want to use for scanning.
- Click the Chat button at the bottom and input the contract content.
- Import the workflow you want to evaluate.
The output of the workflow needs to follow this JSON schema:
```json
{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "summary": {
        "type": "string",
        "description": "Brief summary of the vulnerability"
      },
      "severity": {
        "type": "string",
        "enum": ["high", "medium", "low"],
        "description": "Severity level of the vulnerability"
      },
      "vulnerability_details": {
        "type": "object",
        "properties": {
          "function_name": {
            "type": "string",
            "description": "Function name where the vulnerability is found"
          },
          "description": {
            "type": "string",
            "description": "Detailed description of the vulnerability"
          }
        },
        "required": ["function_name", "description"]
      },
      "code_snippet": {
        "type": "array",
        "items": {
          "type": "string"
        },
        "description": "Code snippet showing the vulnerability",
        "default": []
      },
      "recommendation": {
        "type": "string",
        "description": "Recommendation to fix the vulnerability"
      }
    },
    "required": [
      "summary",
      "severity",
      "vulnerability_details",
      "code_snippet",
      "recommendation"
    ],
    "additionalProperties": false
  }
}
```
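For illustration, a hypothetical finding that conforms to this schema (the function name, snippet, and wording are invented) might look like:

```json
[
  {
    "summary": "Swap output is not protected against slippage",
    "severity": "high",
    "vulnerability_details": {
      "function_name": "swapTokens",
      "description": "The swap is executed with a minimum output amount of zero, so the transaction can be sandwiched and drained of value."
    },
    "code_snippet": [
      "router.swapExactTokensForTokens(amountIn, 0, path, msg.sender, deadline);"
    ],
    "recommendation": "Compute a minimum acceptable output off-chain and pass it as the minimum-amount-out argument."
  }
]
```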
The trigger point should be a webhook, and the workflow must be activated (by clicking its switch on the n8n home page). You may refer to `n8n_workflows/slippage_min_amount.json` as an example.
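As a rough sketch of how such a webhook-triggered workflow can be called programmatically (the webhook path and payload shape below are assumptions; check your workflow's Webhook trigger node for the real values):

```python
import requests

# Hypothetical webhook path; replace it with the path shown on your
# workflow's Webhook trigger node in n8n.
WEBHOOK_URL = "http://localhost:5678/webhook/<your-webhook-path>"

with open("dataset/scan_queue/Example.sol", "r", encoding="utf-8") as f:
    contract_source = f.read()

# The payload key is an assumption; the workflow's Webhook node defines
# the expected fields. The response should follow the JSON schema above.
response = requests.post(WEBHOOK_URL, json={"contract": contract_source})
response.raise_for_status()
print(response.json())
```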
- Download the latest `dataset.zip` and the `dataset.csv` from here.
- Unzip `dataset.zip` into `./dataset`; the folder structure should look like this:
```
dataset/                 # Dataset location
│── reports/             # Unzipped from the dataset.zip provided in Google Drive -> audit reports of the projects
│   │── <reports>/
│── repos/               # Unzipped from the dataset.zip provided in Google Drive -> codebases of the projects
│   │── <repos>/
│── dataset.csv          # Dataset sheet providing the ground truth (cloned from Google Drive and renamed to `dataset.csv`)
│── README.MD            # Basic information of the dataset
```
- Run the command:
```bash
poetry run python cli/main.py eval
```
Use the `--help` flag for detailed information about the available flags.
- Import `slippage_min_amount.json` into your n8n service.
- Provide the OpenAI credential you just created for the `slippage_min_amount` workflow.
- Make the workflow active.
- Download the latest `dataset.zip` and the `dataset.csv` from here.
- Unzip `dataset.zip` into `./dataset`; the folder structure should match the one shown above.
- Run:
```bash
poetry run python cli/main.py eval
```
You should get the confusion matrix, like this:
```
+----------------+---------+
| Metric         |   Value |
+================+=========+
| True Positive  |      16 |
+----------------+---------+
| True Negative  |      27 |
+----------------+---------+
| False Positive |       2 |
+----------------+---------+
| False Negative |      13 |
+----------------+---------+
```
Note: your numbers may differ, since LLM answers are not stable; the results above were produced with gpt-4o-mini.
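If you want the usual summary metrics, they follow directly from these counts; a quick sketch using the numbers above:

```python
# Confusion-matrix counts from the example run above (gpt-4o-mini).
tp, tn, fp, fn = 16, 27, 2, 13

precision = tp / (tp + fp)                          # 16 / 18 ≈ 0.889
recall = tp / (tp + fn)                             # 16 / 29 ≈ 0.552
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.681
accuracy = (tp + tn) / (tp + tn + fp + fn)          # 43 / 58 ≈ 0.741

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"f1={f1:.3f} accuracy={accuracy:.3f}")
```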
| Date | Conference Name | Topic | Slide |
|---|---|---|---|
| 2025-04-02 | ETH TAIPEI 2025 | Exploring AI’s Role in Smart Contract Security | ETH-TAIPEI-2025 |
| 2025-04-17 | CyberSec 2025 | AI-Driven Smart Contract Vulnerability Detection | CyberSec-2025 |
Bastet is for research and educational purposes only. Anyone who discovers a vulnerability should adhere to the principles of Responsible Disclosure and ensure compliance with applicable laws and regulations. We do not encourage or support any unauthorized testing, attacks, or abusive behavior, and users assume all associated risks.
Apache License 2.0