GitHub - safe-graph/DGFraud at 4e439c34f5bff17f3deae7e715391945b6af98af


Under construction. The first version is expected to be released in mid-May 2020.

A Deep Graph-based Toolbox for Fraud Detection

Introduction: DGFraud is a Graph Neural Network (GNN) based toolbox for fraud detection. It integrates implementations and comparisons of state-of-the-art GNN-based fraud detection models. It also includes utility functions for graph preprocessing, graph sampling, and performance evaluation. An introduction to the implemented models can be found here.

We welcome contributions that add new fraud detectors or extend the features of the toolbox. Some of the planned features are listed in the TODO list.

If you find this repo useful, please cite the paper below:

@inproceedings{liu2020alleviating,
  title={Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection},
  author={Liu, Zhiwei and Dou, Yingtong and Yu, Philip S. and Deng, Yutong and Peng, Hao},
  booktitle={Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year={2020}
}

Useful Resources

Table of Contents

Installation
User Guide
Implemented Models
Model Comparison
TODO List
How to Contribute

Installation

git clone https://github.com/safe-graph/DGFraud.git
cd DGFraud
python setup.py install

Requirements

tensorflow>=1.14.0,<2.0
numpy>=1.16.4
scipy>=1.2.0

Dataset

User Guide

This section introduces how to run the code from the command line or from an IDE, how to fine-tune a model, how the code is structured, what each directory does, how to load graphs, and how to evaluate the models.

Running the example code

python Player2vec_main.py

You can specify model parameters when running the code.

Running on your dataset

Have a look at the load_data_dblp() function in utils/utils.py for an example.

To use your own data, you have to provide:

adjacency matrices or adjacency lists (for SpamGCN);
a feature matrix;
a label matrix.

Then split the feature matrix and the label matrix into training data and testing data.
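The inputs listed above can be sketched with NumPy as follows. This is a minimal illustration on a toy graph, not the loader used by the toolbox: the function names `build_toy_dataset` and `train_test_split_indices`, the edge list, the one-hot label layout, and the 80/20 split are all assumptions made for the example. See load_data_dblp() in utils/utils.py for the format the models actually expect.

```python
import numpy as np


def build_toy_dataset(num_nodes=6, num_features=4, num_classes=2, seed=0):
    """Build a toy undirected graph with random features and one-hot labels.

    All names and shapes here are hypothetical; adapt them to your data.
    """
    rng = np.random.RandomState(seed)
    # Hypothetical edge list; replace with edges read from your own files.
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]

    # Dense adjacency matrix; a scipy.sparse matrix is preferable for large graphs.
    adj = np.zeros((num_nodes, num_nodes), dtype=np.float32)
    for u, v in edges:
        adj[u, v] = 1.0
        adj[v, u] = 1.0  # keep the matrix symmetric (undirected graph)

    # Feature matrix: one row per node.
    features = rng.rand(num_nodes, num_features).astype(np.float32)

    # Label matrix: one-hot row per node (assumed layout).
    class_ids = rng.randint(0, num_classes, size=num_nodes)
    labels = np.eye(num_classes, dtype=np.float32)[class_ids]
    return adj, features, labels


def train_test_split_indices(num_nodes, train_ratio=0.8, seed=0):
    """Shuffle node indices and split them into train and test sets."""
    rng = np.random.RandomState(seed)
    perm = rng.permutation(num_nodes)
    n_train = int(train_ratio * num_nodes)
    return perm[:n_train], perm[n_train:]


adj, features, labels = build_toy_dataset()
train_idx, test_idx = train_test_split_indices(adj.shape[0])

# Split the feature and label matrices with the same node indices.
X_train, y_train = features[train_idx], labels[train_idx]
X_test, y_test = features[test_idx], labels[test_idx]
```

The adjacency matrix is not split here: graph models usually need the full graph at both training and test time, and only the nodes that contribute to the loss change.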

You can specify a dataset as follows:

python xx_main.py --dataset your_dataset

or by editing xx_main.py
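A `--dataset` flag like the one above is typically read with Python's standard `argparse` module. The sketch below shows one way such a flag could be parsed; it is not taken from the repository. The default value `dblp` and the `--epochs` flag are assumptions added for illustration, so check the actual xx_main.py for the flags it defines.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line flags for an example script (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Run a fraud detection example (sketch)."
    )
    # --dataset appears in the README; the default value is an assumption.
    parser.add_argument("--dataset", type=str, default="dblp",
                        help="name of the dataset to load")
    # Hypothetical flag, included only to show how model parameters
    # could be exposed on the command line.
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of training epochs")
    return parser.parse_args(argv)


# Equivalent to: python xx_main.py --dataset your_dataset
args = parse_args(["--dataset", "your_dataset"])
print("dataset:", args.dataset, "epochs:", args.epochs)
```

Passing `argv=None` makes `argparse` read `sys.argv`, which is what a real script would do. The explicit list is used here only so that the example runs on its own.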

The structure of code

The repository is organised as follows:

algorithms/ contains the implemented models and the corresponding example code;
base_models/ contains the basic models (GCN);
dataset/ contains the necessary dataset files;
utils/ contains code for:
- loading and splitting the data (data_loader.py);
- various utilities (utils.py);
- preprocessing raw data (process_dzdp.py and process_yelp.py);
- computing the NDCG score and the ranking precision score (cal_ndcg.py).
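For readers unfamiliar with the two ranking metrics mentioned above, here is a minimal sketch of NDCG@k and precision@k for binary fraud labels. It is not the code in cal_ndcg.py, which may use a different gain or tie-breaking rule. The function names and the linear-gain DCG formula are assumptions chosen for the example.

```python
import math


def _rank_by_score(scores):
    """Return item indices sorted by descending score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def dcg_at_k(relevances, k):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(scores, labels, k):
    """DCG of the predicted ranking divided by DCG of the ideal ranking."""
    ranked = [labels[i] for i in _rank_by_score(scores)]
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant items: define NDCG as 0
    return dcg_at_k(ranked, k) / ideal_dcg


def precision_at_k(scores, labels, k):
    """Fraction of the top-k ranked items that are labeled positive."""
    top = _rank_by_score(scores)[:k]
    return sum(labels[i] for i in top) / k


# Toy example: predicted fraud scores and binary ground-truth labels.
scores = [0.9, 0.2, 0.8, 0.1]
labels = [1, 0, 0, 1]
print("NDCG@4:", ndcg_at_k(scores, labels, 4))
print("P@2:", precision_at_k(scores, labels, 2))
```

Both metrics reward placing true fraudsters near the top of the ranking. NDCG also discounts hits that appear lower in the list, while precision@k treats every position within the top k equally.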

Implemented Models

Model	Paper	Venue	Reference
GraphConsis	Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection	SIGIR 2020	BibTex
SemiGNN	A Semi-supervised Graph Attentive Network for Financial Fraud Detection	ICDM 2019	BibTex
Player2Vec	Key Player Identification in Underground Forums over Attributed Heterogeneous Information Network Embedding Framework	CIKM 2019	BibTex
GAS	Spam Review Detection with Graph Convolutional Networks	CIKM 2019	BibTex
FdGars	FdGars: Fraudster Detection via Graph Convolutional Networks in Online App Review System	WWW 2019	BibTex
GeniePath	GeniePath: Graph Neural Networks with Adaptive Receptive Paths	AAAI 2019	BibTex
GEM	Heterogeneous Graph Neural Networks for Malicious Account Detection	CIKM 2018	BibTex

Model Comparison

Model	Application	Graph Type	Base Model
GraphConsis	Opinion Fraud	Homogeneous	GraphSAGE
SemiGNN	Financial Fraud	Heterogeneous	GAT, LINE, DeepWalk
Player2Vec	Cyber Criminal	Heterogeneous	GAT, GCN
GAS	Opinion Fraud	Heterogeneous	GCN, GAT
FdGars	Opinion Fraud	Homogeneous	GCN
GeniePath	Financial Fraud	Homogeneous	GAT
GEM	Financial Fraud	Heterogeneous	GCN
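Several of the models above list GCN as a base model. As background, the sketch below shows a single GCN propagation step in NumPy, following the standard rule H' = ReLU(D^-1/2 (A + I) D^-1/2 H W). It illustrates the general technique only and is not the TensorFlow implementation in base_models/. The function names are assumptions made for the example.

```python
import numpy as np


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization."""
    adj_hat = adj + np.eye(adj.shape[0])   # A + I
    deg = adj_hat.sum(axis=1)              # node degrees, always >= 1
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # D^-1/2 (A + I) D^-1/2, computed by broadcasting over rows and columns
    return adj_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weights):
    """One GCN layer: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(adj_norm @ features @ weights, 0.0)


# Toy example: two connected nodes with one-hot features.
adj = np.array([[0.0, 1.0],
                [1.0, 0.0]])
features = np.eye(2)
weights = np.array([[1.0],
                    [2.0]])  # hypothetical weights, normally learned

hidden = gcn_layer(normalize_adjacency(adj), features, weights)
print(hidden)
```

The self-loops ensure that each node keeps its own features during aggregation, and the symmetric normalization keeps the feature scale stable across nodes with different degrees.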

TODO List

GraphConsis Implementation
Add preprocessed Yelp datasets
The memory-efficient implementation of SemiGNN
The log loss for the GEM model
Time-based sampling for GEM
Add sampling methods
Benchmarking SOTA models
Scalable implementation
TensorFlow 2.0+ implementation
PyTorch version

How to Contribute

You are welcome to contribute to this open-source toolbox. Detailed instructions will be released soon. For now, you can create an issue or send an email to ytongdou@gmail.com with any enquiries.
