A tool to detect Type I (Exact Clone), Type II (Renamed Clone), and Type III (Near-Miss Clone) code clones across smart contracts written in the Solidity programming language. CCD can handle complete as well as incomplete code (i.e., code snippets). This repository includes the code, data, tools, and evaluation results from our paper "Analyzing the Impact of Copying-and-Pasting Vulnerable Solidity Code Snippets from Question-and-Answer Websites".
The figure above depicts the overall architecture of CCD. CCD generates fingerprints of Solidity source code snippets using ssdeep as its piecewise hashing function. It then follows a hybrid approach to match similar code fragments: it first retrieves similar fingerprints from an Elasticsearch index based on n-gram similarity, and then computes an order-independent similarity score on the returned records to match code snippets to indexed smart contracts.
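The two-stage matching described above can be illustrated with a minimal Python sketch. It is not CCD's implementation. It assumes a fingerprint is a list of per-fragment hash strings, uses an in-memory dictionary in place of Elasticsearch, and uses `difflib.SequenceMatcher` in place of ssdeep's comparison. All function names and the `min_overlap` threshold are hypothetical.

```python
from difflib import SequenceMatcher


def ngrams(text, n=3):
    """Return the set of character n-grams of a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def fingerprint_ngrams(fingerprint, n=3):
    """Union of the n-grams of every chunk in a fingerprint."""
    grams = set()
    for chunk in fingerprint:
        grams |= ngrams(chunk, n)
    return grams


def order_independent_score(query, candidate):
    """Average, over query chunks, of the best similarity to any candidate chunk.

    Because each query chunk is compared against all candidate chunks,
    reordering the chunks on either side does not change the score.
    """
    if not query or not candidate:
        return 0.0
    best = [
        max(SequenceMatcher(None, q, c).ratio() for c in candidate)
        for q in query
    ]
    return sum(best) / len(best)


def match(query, index, n=3, min_overlap=0.5):
    """Stage 1: keep candidates sharing enough n-grams with the query.
    Stage 2: rank the survivors by the order-independent score."""
    q_grams = fingerprint_ngrams(query, n)
    results = []
    for name, fingerprint in index.items():
        shared = q_grams & fingerprint_ngrams(fingerprint, n)
        overlap = len(shared) / len(q_grams) if q_grams else 0.0
        if overlap >= min_overlap:
            results.append((name, order_independent_score(query, fingerprint)))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical fingerprints: the query holds the same chunks as "a", reordered.
index = {"a": ["abcdef", "ghijkl"], "b": ["zzzzzz", "yyyyyy"]}
print(match(["ghijkl", "abcdef"], index))
```

The cheap n-gram filter narrows the candidate set before the more expensive pairwise comparison runs, which is the motivation for a hybrid design of this kind.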
A container with all the dependencies can be found here.
To run the container, please install Docker and run:
docker pull christoftorres/contract-clone-detector && docker run -it christoftorres/contract-clone-detector
docker build -t contract-clone-detector .
docker run -it contract-clone-detector:latest
brew install ssdeep
sudo apt-get install build-essential libffi-dev python3 python3-dev python3-pip libfuzzy-dev
sudo apt-get install ssdeep
brew install antlr
sudo apt-get install antlr4
brew tap elastic/tap
brew install elastic/tap/elasticsearch-full
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt-get update
sudo apt-get install elasticsearch
cd CCD
python3 -m pip install -r requirements.txt
# Example: generate a fingerprint
python3 CCD.py -g example.sol
# Example: store a fingerprint
service elasticsearch start
python3 CCD.py -s example.sol --elasticsearch-index test
# Example: match a fingerprint
service elasticsearch start
python3 CCD.py -m example.sol --elasticsearch-index test
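Depending on the environment, `service elasticsearch start` may return before the node accepts connections, in which case the store and match commands above can fail on the first attempt. The following Python sketch polls the node until it responds. It assumes Elasticsearch listens on its default HTTP address `http://localhost:9200`. The helper name `wait_for_elasticsearch` and its parameters are hypothetical and not part of CCD.

```python
import time
import urllib.error
import urllib.request


def wait_for_elasticsearch(url="http://localhost:9200", timeout=60.0, interval=2.0):
    """Poll `url` until it answers over HTTP or `timeout` seconds elapse."""
    # Bypass any configured HTTP proxy, since the node is assumed to be local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # Any HTTP status (e.g. 401 with security enabled) means the node is up.
            return True
        except OSError:
            # Connection refused or timed out: the node is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    print("Elasticsearch ready:", wait_for_elasticsearch(timeout=10.0))
```

Calling this helper between starting the service and invoking `CCD.py` avoids racing against the node's startup.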
cd evaluation
# Install SmartEmbed and Python dependencies
docker pull christoftorres/smartembed
python3 -m pip install -r requirements.txt
# Evaluate SmartEmbed
python3 evaluate_smartembed.py
# Evaluate CCD
python3 evaluate_ccd.py
# Compare results
python3 compare_results.py
# Compare parameters
python3 compare_parameters.py
If using this repository for research, please cite as:
@inproceedings{
copypastesolidity,
address={Madrid, Spain},
title={Analyzing the Impact of Copying-and-Pasting Vulnerable Solidity Code Snippets from Question-and-Answer Websites},
ISBN={979-8-4007-0592-2},
DOI={10.1145/3646547.3688437},
booktitle={Proceedings of the 2024 ACM Internet Measurement Conference (IMC '24)},
publisher={Association for Computing Machinery},
author={Weiss, Konrad and Ferreira Torres, Christof and Wendland, Florian},
year={2024}
}