This repository contains the source code for detecting different types of malware using a deep learning based feature extraction and wrapper based feature selection technique. A research paper describing how it works is available at https://arxiv.org/abs/1910.10958
We used two major approaches for malware classification:

1. Image representation of the byte file
   - Independent of the platform
   - Requires no domain knowledge, such as assembly instructions
2. Hybrid feature space using both ASM and byte files
   - Platform dependent, but gives better performance than using the byte file alone
   - Requires huge resources and processing time
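As a rough illustration of the image representation approach, the sketch below reshapes the raw bytes of a file into a 2D grid of grayscale pixel values (0 to 255). The function name `bytes_to_image`, the row width of 16, and the zero padding of the last row are assumptions made for this example; the preprocessing used in this repository may differ.

```python
from typing import List


def bytes_to_image(data: bytes, width: int = 16) -> List[List[int]]:
    """Reshape raw bytes into rows of `width` pixel values (0-255).

    Hypothetical helper: the width and the zero padding are assumptions,
    not the repository's confirmed settings.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad with zero bytes so that the last row is complete.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    # Each slice of `width` bytes becomes one image row.
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Ten bytes with width 4 give three rows; the last row is zero padded.
image = bytes_to_image(bytes(range(10)), width=4)
print(image)
```

Because each pixel is just a byte value, this step needs no knowledge of the instruction set, which is why the approach is platform independent.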
The data used in this tutorial can be found in the Hybrid(Final) folder at the following drive link:
https://drive.google.com/drive/folders/1s7EC4s_-hP9q5vEhs-3vAubspcZbBADK?usp=sharing
After downloading the required dataset, run the following files in the hybrid folder in this order to reproduce the results:
- "Creating hybrid dataset"
- "Min-max normalization(hybrid dataset)"
- "ANN-Results"
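For readers unfamiliar with the min-max normalization step, here is a minimal sketch of the general technique: each feature column is rescaled to the range [0, 1] using its own minimum and maximum. The function name `min_max_normalize`, the toy feature values, and the choice to map constant columns to 0.0 are assumptions for illustration and are not taken from the repository's files.

```python
from typing import List


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale each column to [0, 1] via (x - min) / (max - min).

    Hypothetical helper: constant columns are mapped to 0.0 to avoid
    division by zero (an assumption, not the repository's behavior).
    """
    columns = list(zip(*rows))  # transpose rows into columns
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, mins, maxs)
        ]
        for row in rows
    ]


# Toy feature matrix: three samples, three features (made-up values).
features = [
    [1.0, 10.0, 5.0],
    [2.0, 20.0, 5.0],
    [3.0, 40.0, 5.0],
]
scaled = min_max_normalize(features)
print(scaled)
```

Scaling features to a common range keeps features with large raw values from dominating the others when the data is later fed to a neural network.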
The project was done under the guidance of Dr. Asifullah Khan, DCIS, PIEAS.