PositionDistributionMatters (PDM)

This repository implements PDM, a graph-based binary function similarity analysis method proposed in the paper Position Distribution Matters: A Graph-based Binary Function Similarity Analysis Method.

This work builds on code from PalmTree and CapsGNN.

You can compile the training and evaluation datasets yourself or obtain them from BinKit.

Introduction

Requirements

  • numpy
  • r2pipe
  • scipy
  • torch
  • torchvision
  • tqdm
  • Python standard-library modules: re, glob, json, multiprocessing, os, shutil, etc.

Usage

Preparation:

~ $ git clone https://github.com/TyeYeah/PositionDistributionMatters.git
~ $ sudo apt install radare2 # or visit `https://github.com/radareorg/radare2/releases` for latest version (recommended).
~ $ conda install numpy ...
~ $ pip install r2pipe ... 
~ $ cd PositionDistributionMatters
~/PositionDistributionMatters $ 

Train the BIRD model and construct the ACFG+ of each function:

~ $ cd PositionDistributionMatters
~/PositionDistributionMatters $ cd BIRD
# prepare binaries in bin_bird/ and bin_pdm/
~/PositionDistributionMatters/BIRD $ python r2exp.py
# see the main function of r2exp.py to run only `bird` model training or only instruction embedding
# training output is written to the `data` dir in `BIRD`
# an `output` dir is generated in `PDM`
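
For orientation, the sketch below shows one way to pull a function's control-flow graph out of a binary with r2pipe, which is the raw material for a graph such as ACFG+. It is not the code in r2exp.py. The helper names `cfg_edges` and `function_cfgs` are hypothetical, and the sketch assumes radare2's `aflj` and `agfj` JSON output, in which each basic block carries an address (key `offset` or `addr`, depending on the radare2 version) plus optional `jump` and `fail` targets.

```python
def _addr(entry):
    """Return the address of a radare2 JSON entry (key name varies by version)."""
    return entry.get("addr", entry.get("offset"))


def cfg_edges(blocks):
    """Return sorted (src, dst) edges between the basic blocks of one function.

    `blocks` is a list of dicts shaped like the "blocks" entries of `agfj`
    output (assumed keys: "offset" or "addr", optional "jump" and "fail").
    """
    known = {_addr(b) for b in blocks}
    edges = set()
    for b in blocks:
        for key in ("jump", "fail"):
            target = b.get(key)
            # Keep only edges that stay inside this function.
            if target is not None and target in known:
                edges.add((_addr(b), target))
    return sorted(edges)


def function_cfgs(binary_path):
    """Yield (function name, edge list) for each function radare2 finds."""
    import r2pipe  # imported lazily so cfg_edges works without radare2

    r2 = r2pipe.open(binary_path)
    try:
        r2.cmd("aaa")  # run radare2's auto-analysis
        for func in r2.cmdj("aflj") or []:
            graphs = r2.cmdj("agfj @ {}".format(_addr(func))) or []
            blocks = graphs[0].get("blocks", []) if graphs else []
            yield func["name"], cfg_edges(blocks)
    finally:
        r2.quit()
```

Keeping `cfg_edges` free of any radare2 dependency makes it possible to check the graph logic on hand-written block lists before running the analysis on real binaries.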

Train and use the function ACFG+ graph embedding model:

~/PositionDistributionMatters/BIRD $ cd ../PDM
~/PositionDistributionMatters/PDM $ python main.py --expmode train_s # or train_t, evaluate_s, evaluate_t, embed_s, embed_t

The expmode option takes one of the following values:

  • train_s: train using siamese loss
  • train_t: train using triplet loss
  • evaluate_s: evaluate the model produced by train_s
  • evaluate_t: evaluate the model produced by train_t
  • embed_s: generate graph embeddings using the model produced by train_s
  • embed_t: generate graph embeddings using the model produced by train_t
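
The two training objectives can be illustrated in a few lines. This is a generic, dependency-free sketch of a siamese (pairwise) loss and a triplet loss over embedding vectors, not the formulation used in main.py. The cosine-based pairwise loss, the margin value, and all function names are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def siamese_loss(emb_a, emb_b, label):
    """Squared error between the pair's cosine similarity and its label.

    label is +1 for a similar pair and -1 for a dissimilar pair.
    """
    return (cosine(emb_a, emb_b) - label) ** 2


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss that is zero once the anchor is more similar to the
    positive than to the negative by at least `margin`."""
    return max(0.0, cosine(anchor, negative) - cosine(anchor, positive) + margin)
```

The siamese variant needs labeled pairs, while the triplet variant needs an anchor function together with one similar and one dissimilar function, so the two modes typically differ in how training samples are assembled as well as in the loss itself.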

The corresponding embed_s and embed_t functions in main.py need to be customized by users.
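
As a starting point for that customization, the sketch below shows one possible shape for an embedding export step. It is an assumption about what such a function might do, not the repository's code: `dump_embeddings` and `embed_fn` are hypothetical names, and `embed_fn` stands in for whatever call runs the trained model on a single graph.

```python
import json


def dump_embeddings(embed_fn, graphs, out_path):
    """Embed each named graph and write {name: vector} to a JSON file.

    embed_fn is any callable that maps one graph to a sequence of numbers
    (a hypothetical stand-in for the trained model's forward pass).
    graphs maps a function name to its graph object.
    """
    embeddings = {
        name: [float(x) for x in embed_fn(graph)]
        for name, graph in graphs.items()
    }
    with open(out_path, "w") as fh:
        json.dump(embeddings, fh)
    return embeddings
```

Writing plain lists of floats keeps the output independent of the training framework, so the embeddings can be compared or indexed by other tools.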
