A Static Malware Analysis Framework (SMAF), based on a Penetration Testing and Ethical Hacking project.

Endless077/SMAAF

🧰 Static Malware Analysis Automation Framework (SMAAF)

The system analyzes malware binaries without execution, identifies indicators of compromise (IOCs), predicts behaviors, and generates actionable reports for security teams.

⚠️ Disclaimer

This framework handles real malware samples.
Run it only in isolated environments (VMs or containers).
Do not obtain samples from untrusted sources; prefer official or reputable repositories and services.

Do not execute samples directly - this framework performs static analysis only.

🔑 Key Features

  • 📦 Sample Collection — Automates the retrieval and organization of malware samples (PE, ELF, APK) from local or external sources.
  • ⚙️ Disassembly Automation — Integrates tools like Ghidra and Radare2 for headless disassembly and architecture detection.
  • 🔍 String & IOC Extraction — Extracts readable and obfuscated strings using FLOSS, and identifies IOCs (URLs, IPs, registry keys).
  • 🧩 Signature Matching — Applies YARA rules to detect known malware patterns and families.
  • 📄 Structured Output — Generates standardized JSON files for integration with external analysis or reporting tools.
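To illustrate the String & IOC Extraction and Structured Output features, here is a minimal sketch that pulls candidate URLs, IPv4 addresses, and registry keys out of already-extracted strings and groups them in a JSON-serializable dictionary. The regular expressions, the `extract_iocs` function, and the output key names are assumptions made for illustration; they are not taken from SMAAF's code.

```python
import json
import re

# Illustrative patterns only; production IOC extraction needs stricter validation.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
REGKEY_RE = re.compile(
    r"\b(?:HKEY_[A-Z_]+|HKLM|HKCU|HKCR|HKU)\\[^\s\"']+", re.IGNORECASE
)


def extract_iocs(strings):
    """Group candidate IOCs found in an iterable of decoded strings."""
    iocs = {"urls": set(), "ips": set(), "registry_keys": set()}
    for s in strings:
        iocs["urls"].update(URL_RE.findall(s))
        iocs["ips"].update(IPV4_RE.findall(s))
        iocs["registry_keys"].update(REGKEY_RE.findall(s))
    # Sorted lists keep the output deterministic and JSON-serializable.
    return {kind: sorted(values) for kind, values in iocs.items()}


if __name__ == "__main__":
    sample_strings = [
        "connect to http://evil.example/payload.bin now",
        "C2 at 192.168.1.10",
    ]
    print(json.dumps(extract_iocs(sample_strings), indent=2))
```

Collecting into sets removes duplicates that commonly appear when the same string occurs several times in a binary.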

🚦 Project Status

| Component | Status | Description |
|---|---|---|
| 🧩 Malware Sample Collector | ✅ Implemented | Collects and tags samples (EXE, APK, ELF, etc.) |
| ⚙️ Disassembler Module | ✅ Implemented | Uses Ghidra, Radare2, or IDA for headless disassembly |
| 🔍 String & Signature Extractor | ✅ Implemented | Extracts strings, IOCs, and matches YARA rules |
| 🤖 Behavioral Predictor | ⚠️ Not Implemented | Planned ML module for behavioral prediction |
| 🧾 Reporting Engine | ⚠️ Not Implemented | Planned reporting engine (HTML, PDF, STIX/TAXII) |

Note: Only the collector, disassembler, and extractor are functional at this stage.
The behavioral predictor and reporting engine are marked as TODO.

🛠️ Local Installation

Example setup on Linux / WSL2.

  1. Clone the repository
git clone https://github.com/Endless077/Static-Malware-Framework.git
cd Static-Malware-Framework
  2. Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate
  3. Install dependencies
pip install -r requirements.txt
  4. Install external tools
  • Java (JDK 17+): required runtime for Ghidra.
  • Ghidra: required for disassembly (used in headless mode).
  • Radare2: alternative disassembly backend to Ghidra.
  • FLOSS: for extracting obfuscated strings.
  • YARA and yara-python: for signature-based detection.

⚠️ Note:

Follow each tool's official installation documentation to ensure compatibility with your operating system. If a required tool or dependency is missing, the framework reports the problem at runtime with a clear error message.

  5. Configure paths and API keys in config.py
VT_API_KEY: str = os.getenv("VT_API_KEY", "")
VS_API_KEY: str = os.getenv("VS_API_KEY", "")
MB_API_KEY: str = os.getenv("MB_API_KEY", "")
...
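Before running any module, it can help to verify the setup from the steps above. The following is a minimal preflight sketch, not part of SMAAF: the executable names in `REQUIRED_TOOLS` are assumptions about how the tools appear on your `PATH`, and the README does not state whether every API key is mandatory, so treat the key check as advisory.

```python
import os
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ("java", "r2", "floss", "yara")
# Variable names as read by config.py.
API_KEY_VARS = ("VT_API_KEY", "VS_API_KEY", "MB_API_KEY")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def missing_api_keys(names=API_KEY_VARS, env=None):
    """Return the API key variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    for tool in missing_tools():
        print(f"missing tool: {tool}")
    for name in missing_api_keys():
        print(f"missing API key: {name}")
```

Ghidra is usually located through its install directory rather than `PATH`, which is why it is passed explicitly via `--ghidra` and is not included in this check.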

🐳 Docker Installation (Optional)

  1. Build the Docker image
docker build --no-cache -t smaaf:latest .
  2. Run the container
docker run -it --name <container> -p 8000:8000 smaaf:latest

⛏️ Module Helpers

1. Malware Collector

python3 -m collector.main

[LOG] INFO - 2025-07-31 00:00:00 -  ________               _        _       _______  _____
[LOG] INFO - 2025-07-31 00:00:00 - |_   __  |             / |_     / \     |_   __ \|_   _|
[LOG] INFO - 2025-07-31 00:00:00 -   | |_ \_|,--.   .--. `| |-'   / _ \      | |__) | | |
[LOG] INFO - 2025-07-31 00:00:00 -   |  _|  `'_\ : ( (`\] | |    / ___ \     |  ___/  | |
[LOG] INFO - 2025-07-31 00:00:00 -  _| |_   // | |, `'.'. | |, _/ /   \ \_  _| |_    _| |_
[LOG] INFO - 2025-07-31 00:00:00 - |_____|  \'-;__/[\__) )\__/|____| |____||_____|  |_____|
INFO:     Started server process [12345]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
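The collector is described as tagging samples by type (EXE, APK, ELF, etc.). One plausible way to do this without executing anything is to inspect the file's leading magic bytes and record a hash. The sketch below assumes that approach; `tag_sample` and its output fields are hypothetical and may differ from what the collector actually does.

```python
import hashlib


def tag_sample(data: bytes) -> dict:
    """Return a SHA-256 digest and a coarse format tag for raw sample bytes."""
    if data.startswith(b"MZ"):
        fmt = "PE"  # DOS/PE executables start with "MZ"
    elif data.startswith(b"\x7fELF"):
        fmt = "ELF"
    elif data.startswith(b"PK\x03\x04"):
        # APKs are ZIP archives; confirming an APK would need a manifest check.
        fmt = "ZIP/APK"
    else:
        fmt = "unknown"
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "format": fmt,
        "size": len(data),
    }


if __name__ == "__main__":
    print(tag_sample(b"\x7fELF\x02\x01\x01"))
```

The function only reads bytes, so it stays within the static-analysis boundary described in the disclaimer.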

2. Disassembler (Headless)

python3 -m disassembler.ghidra --help
usage: ghidra.py [-h] --ghidra GHIDRA --scripts SCRIPTS [--output OUTPUT] [--keep-project] file

Extract metadata from disassembly (ghidra)

positional arguments:
  file               Target binary path to analyze.

options:
  -h, --help         show this help message and exit
  --ghidra GHIDRA    Path to Ghidra's home.
  --scripts SCRIPTS  Directory containing useful Ghidra scripts.
  --output OUTPUT    Results output directory.
  --keep-project     Do not delete temporary Ghidra project folder.

python3 -m disassembler.radare2 --help
usage: radare2.py [-h] [--output OUTPUT] [--deep] [--timeout TIMEOUT] file

Extract metadata and disassembly from binaries using radare2.

positional arguments:
  file               Target binary path to analyze.

options:
  -h, --help         show this help message and exit
  --output OUTPUT    Results output directory.
  --deep             Run full deep analysis (aaa) instead of fast (aa).
  --timeout TIMEOUT  Set timeout (in seconds) for radare2 analysis.
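To make the `--deep` and `--timeout` options more concrete, here is a rough sketch of how a wrapper could drive radare2 from Python. It assumes the `r2` executable is on `PATH` and uses radare2's `-q` and `-c` options with the `aa`/`aaa` analysis commands and `aflj` (function list as JSON). The function names are hypothetical, and the actual module may work differently, for example through r2pipe.

```python
import json
import subprocess


def build_r2_command(sample_path: str, deep: bool = False) -> list:
    """Build an r2 invocation that analyzes a file and prints its functions as JSON."""
    analysis = "aaa" if deep else "aa"
    # -q: quit after running the commands; -c: commands to run.
    return ["r2", "-q", "-c", f"{analysis};aflj", sample_path]


def list_functions(sample_path: str, deep: bool = False, timeout: float = 120.0) -> list:
    """Run radare2 and parse the JSON function list.

    Raises subprocess.TimeoutExpired if analysis exceeds the timeout, and
    json.JSONDecodeError if radare2 prints anything besides the JSON list.
    """
    proc = subprocess.run(
        build_r2_command(sample_path, deep),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return json.loads(proc.stdout or "[]")


if __name__ == "__main__":
    print(build_r2_command("sample.bin", deep=True))
```

Keeping command construction separate from execution makes the wrapper easy to test without radare2 installed.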

3. Extractor (Strings + YARA)

python3 -m extractor.main --help
usage: main.py [-h] -m METADATA [-o OUTPUT] [-l LENGTH] [-r RULES] sample

String & Signature Extractor (SS Extractor).

positional arguments:
  sample                Path to the malware sample.

options:
  -h, --help            show this help message and exit
  -m METADATA, --metadata METADATA
                        Path to disassembler metadata directory.
  -o OUTPUT, --output OUTPUT
                        Path to output JSON report.
  -l LENGTH, --length LENGTH
                        Minimum string length for extraction.
  -r RULES, --rules RULES
                        Directory containing YARA rules (optional).
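The `-l LENGTH` option sets a minimum string length. As a simplified illustration of what that threshold means, the sketch below extracts printable ASCII runs from raw bytes using only the standard library. It is not a substitute for FLOSS, which also recovers obfuscated strings, and `extract_ascii_strings` is a hypothetical helper rather than SMAAF code.

```python
import re


def extract_ascii_strings(data: bytes, min_length: int = 4) -> list:
    """Return printable ASCII runs of at least min_length characters."""
    if min_length < 1:
        raise ValueError("min_length must be >= 1")
    # Bytes 0x20-0x7e cover the printable ASCII range, including space.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    blob = b"\x00\x01http://a.b\x00ab\x00cmd.exe\xff"
    print(extract_ascii_strings(blob, min_length=4))
```

A higher minimum length reduces noise from random byte sequences at the cost of missing short but meaningful strings.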

📜 API Reference

FastAPI serves interactive API documentation at the /docs endpoint of your server (e.g., http://localhost:8000/docs). The page lists every available endpoint together with its allowed HTTP methods, required parameters, and expected responses, so you can explore the API without consulting static documentation.
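For scripted use, FastAPI also exposes the machine-readable schema at /openapi.json by default. The sketch below lists the routes found in such a schema; it assumes the server runs on http://localhost:8000 with the default schema URL unchanged, and both helper functions are hypothetical.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_routes(schema: dict) -> list:
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dictionary."""
    routes = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                routes.append((method.upper(), path))
    return sorted(routes)


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the schema that FastAPI serves at /openapi.json by default."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    for method, path in list_routes(fetch_schema()):
        print(method, path)
```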

🙏 Acknowledgements

FastAPI 🚀

FastAPI is a modern web framework for building APIs with Python 3.7+ based on standard Python type hints. It offers high performance with automatic interactive documentation (Swagger UI), WebSocket support, GraphQL integration, CORS middleware, OAuth2 authentication, and more.

More information here

Ghidra ⚙️

Ghidra is a free, open-source software reverse engineering (SRE) suite developed by the NSA. It provides disassembly, decompilation, program analysis, and scripting. SMAAF uses headless mode (analyzeHeadless) to batch-disassemble binaries in automated pipelines.

More information here

radare2 🛠️

radare2 (r2) is an open-source reverse engineering framework offering disassembly, debugging, binary analysis, and powerful scripting via r2pipe. It can be used as an alternative or complement to Ghidra for automated disassembly tasks.

More information here

YARA 🧩

YARA is a pattern-matching tool widely used in malware research to identify families and traits via rules that match byte patterns, strings, and metadata. SMAAF leverages YARA (and yara-python) to detect known signatures and extract IOCs.

More information here

💾 License

This project is licensed under the GNU General Public License v3.0.


🖐 Authors

Project Manager:

🔔 Support

For support, email antonio.garofalo125@gmail.com or contact the project contributors.

📝 Documentation

See the project documentation here.
