Assembly code analysis is a time-consuming process. An effective and efficient assembly code clone search engine can greatly reduce the effort involved, because it identifies cloned parts that have already been analyzed. Kam1n0 is a scalable system that supports assembly code clone search. It allows a user to first index a (large) collection of binaries, and then search for the code clones of a given target function or binary file. A promotional video is available on YouTube.
Kam1n0 tries to solve the efficient subgraph search problem (i.e., the subgraph isomorphism problem) for assembly functions. Given a target function (the middle one in the figure below), it can identify the cloned subgraphs among other functions in the repository (the ones on the left and the right in the figure). Kam1n0 supports a rich comment format and provides an IDA Pro plug-in that exposes its indexing and searching capabilities within IDA Pro.
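To make the idea of a "cloned subgraph" concrete, here is a toy sketch in Python. It is not Kam1n0's search algorithm; it only illustrates the goal. It assumes that block-level matches between the target function and one repository function are already known, and it groups matched blocks into connected regions whose control flow edges are preserved on both sides. The function name `clone_subgraphs` and all inputs are hypothetical.

```python
def clone_subgraphs(target_edges, repo_edges, block_matches):
    """Group matched target blocks into connected clone subgraphs.

    target_edges:  (src, dst) basic-block ids of the target function's CFG
    repo_edges:    (src, dst) basic-block ids of one repository function's CFG
    block_matches: dict mapping a target block id to its matched repository block id
    (All three inputs are assumptions made for this illustration.)
    """
    repo_edge_set = set(repo_edges)
    neighbours = {block: set() for block in block_matches}

    # Keep a target edge only if both ends are matched and the
    # corresponding edge also exists in the repository function.
    for src, dst in target_edges:
        if src in block_matches and dst in block_matches:
            if (block_matches[src], block_matches[dst]) in repo_edge_set:
                neighbours[src].add(dst)
                neighbours[dst].add(src)

    # Collect connected components over the preserved edges.
    seen = set()
    components = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(sorted(component))
    return components


if __name__ == "__main__":
    target = [(0, 1), (1, 2), (2, 3)]
    repo = [(10, 11), (11, 12), (20, 21)]
    matches = {0: 10, 1: 11, 2: 12, 3: 21}
    print(clone_subgraphs(target, repo, matches))
```

In this example, blocks 0, 1, and 2 form one cloned subgraph because their edges map onto edges of the repository function, while block 3 is matched but isolated.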
Kam1n0 was developed by Steven H. H. Ding under the supervision of Benjamin C. M. Fung in the Data Mining and Security Lab at McGill University in Canada. This software won the second prize in the Hex-Rays Plug-In Contest 2015. If you find Kam1n0 useful, please cite our paper:
- S. H. H. Ding, B. C. M. Fung, and P. Charland. Kam1n0: MapReduce-based Assembly Clone Search for Reverse Engineering. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 461-470, San Francisco, CA: ACM Press, August 2016.
In this repository we release the initial version of Kam1n0 and its IDA Pro plug-in. It can run on a single workstation or server, and provides a clone search service through RESTful web services. Users can connect to the server through IDA Pro. Alternatively, it can be deployed on a distributed cluster (next major release).
- [Kam1n0 Core] Added a new symbolic mode. Now it supports cross-architecture sub-graph clone search on the symbolic expression level. Included libvex and z3 library. Supported architecture: x86, AMD64, MIPS32, MIPS64, PowerPC32, PowerPC64, ARM32, and ARM64.
- [Kam1n0 Core] Updated graph search algorithm. Improved scalability & accuracy. Updated default ALSH settings.
- [Kam1n0 Core] Added Visual C++ Redistributable for VS15 dependency (included in the installer, it is for z3).
- [Web UI] In the symbolic mode, we also visualize the control flow graph with abstract syntax tree for each basic block.
- [Web UI] Users can index multiple files at a time.
- [Web UI] Users can directly index idb or i64 files.
- [Web UI] Fixed web UI bugs and improved usability.
- [Web UI] Users can interrupt running jobs through the administration portal.
- [RESTful API] The old API no longer works. Check out the new API after installation.
- [IDA Pro plug-in for Kam1n0] Support composition analysis query.
- [Web UI] Added a web interface for clone search with an assembly function.
- [Web UI] Added a web interface for clone search with a binary file.
- [Kam1n0 Workbench] Added Kam1n0 Workbench for creating and managing multiple repositories on a single workstation.
- [Kam1n0 Core] The binary file clone search result can be shared and browsed on another machine without access to the repository.
- [Kam1n0 Core] Support indexing and searching for large binary files (>40 MB) without limits on system memory.
- [Kam1n0 Core] Support ARM, PowerPC, x86, and AMD64 binaries.
- [Kam1n0 Core] Support user-defined processor architecture.
- [Kam1n0 Core] Optimized index structure supports better scalability and clone search quality.
- [Kam1n0 Core] Kam1n0 no longer skips basic blocks that have fewer than three instructions. Thanks to the new index structure, only single-instruction basic blocks are now skipped.
- [IDA Pro plug-in for Kam1n0] [Experimental] Added assembly fragment search functionality.
- [IDA Pro plug-in for Kam1n0] Added a tree view for browsing a large number of clones.
- The assembly code repositories and configuration files used in previous versions (<1.0.0) are no longer supported by the latest version. See the documentation for how to migrate previous repositories.
- You can index millions of functions in each repository on a single machine. The average response time for a query still stays around 1s; and the average indexing time for a function still stays around 20ms.
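As a back-of-envelope illustration of the figures above, the following Python sketch estimates total indexing time from the quoted average of about 20 ms per function. It assumes functions are indexed one after another and that the average holds across the whole collection; both are simplifying assumptions, and the names are hypothetical.

```python
# Average taken from the figure quoted above; treat it as approximate.
AVG_INDEX_SECONDS_PER_FUNCTION = 0.020


def estimated_indexing_hours(num_functions,
                             seconds_per_function=AVG_INDEX_SECONDS_PER_FUNCTION):
    """Rough sequential indexing time, in hours, for a number of functions."""
    return num_functions * seconds_per_function / 3600.0


if __name__ == "__main__":
    # One million functions at ~20 ms each is 20,000 s, or roughly 5.6 hours.
    print(round(estimated_indexing_hours(1_000_000), 1))
```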
The current release of the Kam1n0 consists of two installers: the server core installer and the IDA Pro plug-in installer for Kam1n0.
| Installer | Included components | Description |
|---|---|---|
| Kam1n0-server.msi | Core engine | Main engine providing service for indexing and searching. |
| | Workbench | A user interface to manage the repositories and the running service. |
| | Web user interface | Web user interface for searching/indexing binary files and assembly functions. |
| | Visual C++ Redistributable for VS15 | Dependency for z3. |
| Kam1n0-client-idaplugin.msi | Plug-in | Connectors and user interface. |
| | Cefpython | Rendering engine for the user interface. |
| | Wxpython | Rendering engine for Cefpython. |
The Kam1n0 core engine is written purely in Java. You need the following dependencies:
- [Required] The latest x64 8.x JRE/JDK distribution from Oracle.
- [Optional] The latest version of IDA Pro with the idapython plug-in installed. The Python plug-in and runtime should have already been installed with IDA Pro. Re-install IDA Pro if necessary.
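Before running the installer, it can help to confirm that a 64-bit Java 8 runtime is on the `PATH`. The Python sketch below is one way to do that; it is not part of Kam1n0. It parses the text printed by `java -version` (which the JVM writes to standard error) and does not check the vendor. The function names are hypothetical.

```python
import re
import subprocess


def parse_java_version(version_output):
    """Return (major_version, is_64bit) parsed from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None, False
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report "1.8.0_x"; later releases report "11.0.2", "17", ...
    if first == "1" and second is not None:
        major = int(second)
    else:
        major = int(first)
    return major, "64-Bit" in version_output


def installed_java():
    """Query the `java` executable found on PATH, if any."""
    try:
        result = subprocess.run(["java", "-version"],
                                capture_output=True, text=True)
    except FileNotFoundError:
        return None, False
    # `java -version` prints to stderr; stdout is included just in case.
    return parse_java_version(result.stderr + result.stdout)


if __name__ == "__main__":
    major, is_64bit = installed_java()
    if major == 8 and is_64bit:
        print("Found a 64-bit Java 8 runtime.")
    else:
        print("A 64-bit Java 8 runtime was not detected on PATH.")
```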
Download the `Kam1n0-server.msi` file from our release page and follow the instructions to install the server. You will be prompted to select an installation path as well as the IDA Pro installation path. The latter is optional if the server does not have to do any disassembling, that is, if clients always submit queries through the Kam1n0 plug-in for IDA Pro. We strongly suggest installing IDA Pro alongside the Kam1n0 server. The current version of Kam1n0 only supports IDA Pro.
The IDA Pro plug-in for Kam1n0 is written in Python for logic and in HTML/JavaScript for rendering. Before installation, it needs the following dependency:
- [Required] The latest version of IDA Pro with the idapython plug-in installed. The Python plug-in and runtime should have already been installed with IDA Pro. Re-install IDA Pro if necessary.
Next, download the `Kam1n0-client-idaplugin.msi` installer from our release page. Follow the instructions to install the plug-in and runtime. Please note that the plug-in has to be installed in the IDA Pro plugins directory, which is located at `$IDA_PRO_PATH$/plugins`. For example, on Windows, the path could be `C:/Program Files (x86)/IDA 6.8/plugins`. The installer will validate the path.
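If you want to check the target directory yourself before running the installer, a minimal Python sketch such as the following can do it. It only tests that a `plugins` folder exists under the given IDA Pro path; it is not the installer's own validation logic, and the function names are hypothetical.

```python
from pathlib import Path


def ida_plugins_dir(ida_pro_path):
    """Return the expected plugins directory under an IDA Pro installation."""
    return Path(ida_pro_path) / "plugins"


def is_valid_plugin_target(ida_pro_path):
    """True if the plugins directory exists under the given IDA Pro path."""
    return ida_plugins_dir(ida_pro_path).is_dir()


if __name__ == "__main__":
    # Example path taken from the text above; adjust it to your installation.
    candidate = "C:/Program Files (x86)/IDA 6.8"
    print(ida_plugins_dir(candidate), is_valid_plugin_target(candidate))
```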
In the previous version of Kam1n0, only a single repository was supported on a workstation, and the configuration files for Kam1n0 stayed in the same folder as the engine executable. Starting from version 1.x.x, Kam1n0 supports multiple repositories on a workstation, and each repository can support a different type of processor architecture. Each repository is given a data directory where you can find its configuration files. More details can be found in our Kam1n0 Workbench tutorial.
- Manage repositories with Kam1n0 Workbench
- Web interface tutorial
- IDA Pro plug-in tutorial
- Working with a cluster
- Create your own processor definition
- Migrate repository from the previous version
- CLI tutorial
The software was developed by Steven H. H. Ding under the supervision of Benjamin C. M. Fung at the McGill Data Mining and Security Lab. Currently, we adopt Apache License Version 2.0. Please refer to LICENSE.txt for details.
Copyright 2015 McGill University. All rights reserved.