Detection of domains created by domain generation algorithms
1.0
This model is a convolutional neural network (CNN) trained to classify domains generated by domain generation algorithms (DGAs). DGAs are algorithms found in various malware families that periodically generate a large number of domain names, which serve as rendezvous points with the malware's command and control servers. The large number of potential rendezvous points makes it difficult for law enforcement to shut down botnets effectively, since infected computers attempt to contact some of these domain names every day to receive updates or commands.
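To make the idea of a date-driven rendezvous list concrete, here is a minimal toy sketch of how a DGA might derive domain names. It is not taken from any real malware family or from this repository; the function name `toy_dga`, the seed string, and the hashing scheme are all illustrative assumptions.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, length: int = 12, tld: str = ".com") -> list[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date.

    Hypothetical illustration only: real DGA families use their own schemes.
    """
    domains = []
    for index in range(count):
        material = f"{seed}-{day.isoformat()}-{index}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        # Map each hex digit (0-15) onto a letter a-p so the label is alphabetic.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:length])
        domains.append(label + tld)
    return domains


# Malware and its operator can both compute the same list for a given day.
todays_domains = toy_dga("example-seed", date(2024, 1, 1))
```

Because the list depends only on the seed and the date, both sides can compute it independently, and the candidate domains change every day, which is what makes blocklisting them ahead of time impractical.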
Two models cover this use case: a CNN binary classifier (DGA or benign) and a Siamese network that identifies the specific DGA family a domain belongs to.
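Character-level CNNs usually take a domain name as a fixed-length sequence of integer character indices. The sketch below shows one common way to do that encoding. The source does not describe the preprocessing these models use, so the vocabulary, the padding and unknown indices, and the `max_len` of 75 are assumptions made for illustration.

```python
import string

# Hypothetical vocabulary: index 0 is reserved for padding, 1 for unknown characters.
ALPHABET = string.ascii_lowercase + string.digits + "-._"
CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(ALPHABET)}
PAD, UNKNOWN = 0, 1


def encode_domain(domain: str, max_len: int = 75) -> list[int]:
    """Lowercase, truncate to max_len, map characters to indices, right-pad with PAD."""
    lowered = domain.lower()[:max_len]
    indices = [CHAR_TO_INDEX.get(ch, UNKNOWN) for ch in lowered]
    return indices + [PAD] * (max_len - len(indices))
```

A sequence like this would typically be passed through an embedding layer before the convolutional layers. A Siamese network could reuse the same encoding for both of its inputs.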
To run this example, additional requirements must be installed into your environment. A supplementary requirements file has been provided in this example directory.
pip install -r requirements.txt
The training data consists of 320K domains labelled as DGA, drawn from 17 known DGA families, and 710K domains labelled as non-DGA.
Binary model = 30 epochs
Family classification model = 20 epochs
Binary model = 1000
Family classification model = 500
V100
Binary model precision = 0.9
Binary model accuracy = 0.9
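For reference, precision and accuracy measure different things, and the sketch below shows how each is computed from confusion-matrix counts. The counts in the usage note are made-up numbers chosen to give round results; they are not the evaluation data behind the figures reported here.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of domains flagged as DGA that really are DGA."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all domains (DGA and benign) that were classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0
```

With hypothetical counts of 90 true positives and 10 false positives, `precision(90, 10)` is 0.9.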
To train the models, run the following script from the working directory.
cd ${MORPHEUS_EXPERIMENTAL_ROOT}/appshield-dga-detection/training-tuning
# Run training script and save models
python dga-appshield-cnn-training.py
This saves the trained model files under the ../models directory. The inference script can then load the models for future inference.
Combined with host data from DOCA AppShield, this model can be used to detect DGA malware. A training notebook is also included so that users can update the model as more labeled data is collected.
This model is based on DOCA AppShield. Its input comes from the AppShield URL plugin, which contains the list of URLs connected to host processes.
The binary classifier outputs processes along with their URLs, each classified as DGA or benign. The DGA family classifier outputs the DGA family name.
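The flow from per-process URLs to flagged processes can be sketched as follows. This is only an illustration: the dictionary layout for the URL plugin data, the helper names `extract_domain` and `flag_processes`, and the `placeholder_is_dga` rule are all hypothetical, and the placeholder rule stands in for the trained binary classifier rather than reproducing it.

```python
from urllib.parse import urlparse


def extract_domain(url: str) -> str:
    """Return the lowercase hostname of a URL, or '' if none can be parsed."""
    # urlparse only fills in the hostname when a network-location prefix is present.
    parsed = urlparse(url if "://" in url else "//" + url)
    return (parsed.hostname or "").lower()


def flag_processes(process_urls: dict[str, list[str]], is_dga) -> dict[str, list[str]]:
    """Map each process to the domains that `is_dga` flags; omit clean processes."""
    flagged = {}
    for process, urls in process_urls.items():
        hits = [d for d in map(extract_domain, urls) if d and is_dga(d)]
        if hits:
            flagged[process] = hits
    return flagged


def placeholder_is_dga(domain: str) -> bool:
    """Stand-in for the trained model: flags long, vowel-free first labels. NOT a real detector."""
    label = domain.split(".")[0]
    return len(label) >= 12 and not any(ch in "aeiou" for ch in label)


# Assumed input layout: process name -> URLs observed for that process.
sample = {
    "chrome.exe": ["https://example.com/index.html"],
    "svc.exe": ["http://xkqzrtplmnbv.net/gate", "example.org"],
}
flagged = flag_processes(sample, placeholder_is_dga)
```

In practice `is_dga` would wrap a call to the binary model, and the flagged domains could then be passed to the family classifier.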
N/A
N/A