Description
What is Kam1n0?
Assembly code analysis is a time-consuming process. An effective and efficient assembly code clone search engine can greatly reduce the effort of this process, since it can identify the cloned parts that have been previously analyzed. Kam1n0 is a scalable system that supports assembly code clone search. It allows a user to first index a (large) collection of binaries, and then search for the code clones of a given target function or binary file.
Kam1n0 tries to solve the efficient subgraph search problem (i.e., the subgraph isomorphism problem) for assembly functions. Given a target function (the middle one in the figure below), it can identify the cloned subgraphs among other functions in the repository (the one on the left and the one on the right, as shown below). Kam1n0 supports a rich comment format, and it has an IDA Pro plug-in that exposes its indexing and searching capabilities inside IDA Pro.
In this repository we release the initial version of Kam1n0 and its plug-in for IDA Pro. It can run on a single workstation or server and provides its clone search service through RESTful web services. Users can connect to the server through IDA Pro. Deployment on a distributed cluster is planned for the next major release.
Installation
The current distribution of the Kam1n0 IDA Pro plug-in is bundled with a local Kam1n0 engine. In order to have it work properly, you need the following dependencies:
- [Required] The latest 8.x JRE/JDK distribution from Oracle (x86).
- [Required] The latest version of IDA Pro with the IDAPython plug-in installed. The Python plug-in and runtime should have already been installed with IDA Pro. Re-install IDA Pro if necessary.
Next, download the latest .msi installation file for Windows from our release page. Follow the instructions to install the plug-in and runtime. Please note that the plug-in has to be installed in the IDA Pro plugins directory, which is located at $IDA_PRO_PATH$/plugins. For example, on Windows, the path could be C:/Program Files (x86)/IDA 6.8/plugins. The installer will validate the path.
Where does Kam1n0 store the data?
At the end of the installation, the installer will ask you to select the path for storing local data and log files. It also creates a folder ~/Kam1n0/ to store plug-in data and errors. The local Kam1n0 engine can be found in the installation path. You can customize its configuration file, kam1n0-conf.xml.
Tutorial
This tutorial first introduces Kam1n0's basic functionality and then walks you through a simple index and search example.
Functionalities
The Kam1n0 engine, together with the plug-in, lets you index and search assembly functions.
These functionalities can be found in:
- IDA Pro Search Toolbar
- IDA Pro Functions Window
- IDA Pro Search Menu
- IDA Pro Edit Menu
- IDA Pro View A (popup menu)
Although you can select functions in the popup menu of the IDA Pro Functions Window to search or index them, the search and index buttons at the other places (e.g., the toolbar) open a Selection Window, which provides more detailed configuration for searching or indexing multiple functions. For example, you can apply different filters and choose which connection you want to use to search or index them. While using the plug-in, we recommend keeping the Output Window open in IDA Pro.
Walk through example
Let's go through a simple index and search case using the engine and plugin.
Preparing the data
Suppose we have two binary files: libpng-1.7.0b54.dll from libpng and zlib-1.2.7.dll from zlib. These two files are included in our release file Kam1n0_IDA_Pro_v0.0.2.zip. We suggest trying them first so that your results are consistent with the following description. You may index other binary files later as you wish. We will index the first binary file, libpng-1.7.0b54.dll, and search the second one, zlib-1.2.7.dll, against it.
Start the engine
To begin with, we need to start the Kam1n0 storage and search engine. You can run it from the Start Menu or from the desktop shortcut.
Kam1n0 is a console application. It is normal to see some warning messages on the first run, as the engine tries to find and create several elements. Please note that if you chose a system path as the storage directory, you need to run the engine as administrator.
Kam1n0 should open a browser with a login page as shown below. The default username and password are both admin. You can change the latter after you have logged in. You can close the browser, as we will use IDA Pro.
Indexing
Open IDA Pro and disassemble the libpng-1.7.0b54.dll binary file as usual. Click on the Manage Connection Button in the toolbar. You are now able to review and edit the connections of the plug-in. There is already a default connection for the local engine. These connections will be stored for future use.
To index the functions, click on the Select Functions to Index Button in the toolbar (or at any of the other aforementioned locations). Check the Select All Functions Option and click the Index Button (shown as Steps 1, 2 and 3 in the image below). Each indexed binary is uniquely identified by its path, and each indexed function by its binary's id and its starting address.
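The identification scheme described above can be pictured with a small Python sketch. The names `FunctionKey`, `binary_id_from_path` and `make_function_key` are hypothetical, and the path normalization is an assumption made for illustration; none of this is Kam1n0's actual data model. The sketch only shows why indexing the same path and starting address again maps to the same entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionKey:
    """Hypothetical key for an indexed function: (binary id, start address)."""
    binary_id: str
    start_address: int


def binary_id_from_path(path: str) -> str:
    # Assumption: the binary id is derived from the path alone.
    # Separators and case are normalized so one Windows path gives one id.
    return path.replace("\\", "/").lower()


def make_function_key(binary_path: str, start_address: int) -> FunctionKey:
    """Build the key that identifies one function of one indexed binary."""
    return FunctionKey(binary_id_from_path(binary_path), start_address)
```

Under this assumption, moving or renaming a binary would produce a new id, so the same file indexed from two different paths would appear as two binaries.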
Wait until the indexing process finishes, as shown in the Progress Form. Detailed progress information is printed in the IDA Output Window. Press the OK Button to close the form when it shows 100%.
Search and add comments
Open IDA Pro and disassemble the target zlib-1.2.7.dll binary file as usual. Click on the Select Functions to Search Button in the toolbar. Suppose we want to search for the adler32 and compress2 functions. Select them using ctrl+click on the list, then click on the Search Button (shown as Step 1 and Step 2 in the image below).
The search should finish within seconds. You will be able to see a progress form and the Clone Graph View.
The Clone Graph View can be dragged, and zoomed in and out with mouse scrolling. Each circle represents a function. Each color represents a different binary. A link between two nodes indicates their similarity. The two blue circles are our selected target functions. By double-clicking on the adler32 node (the blue node in the center), we open the Clone List Window as shown below:
The window lists all the connected nodes with more details about their similarity and binary name. There are three views to inspect each result:
The Flow View
The Flow View explores the cloned control flow graph structure between two functions. The cloned areas are highlighted with different convex hulls. As you can see in this example, even though the two functions have different entry blocks, they share several cloned subgraphs. Each is highlighted using a convex hull of a different color. Currently we ignore blocks with fewer than 4 instructions. Both graphs can be zoomed in and out and dragged. We provide a scroll (blue) for each of them.
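The block-size rule mentioned above can be expressed as a simple filter. This is a minimal sketch: the dictionary layout and the name `comparable_blocks` are assumptions for illustration, and only the threshold of 4 instructions comes from the text.

```python
MIN_INSTRUCTIONS = 4  # blocks with fewer instructions are ignored (per the text)


def comparable_blocks(blocks):
    """Keep only basic blocks that are large enough to be compared.

    blocks: dict mapping a (hypothetical) block id to its list of instructions.
    """
    return {
        block_id: instructions
        for block_id, instructions in blocks.items()
        if len(instructions) >= MIN_INSTRUCTIONS
    }
```

Very small blocks (for example a lone jump) match almost anything, so dropping them plausibly keeps the highlighted subgraphs meaningful.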
The Text-Diff View
The Text-Diff View tries to fully align two assembly functions using a basic string comparison algorithm. It is useful for comparing two functions with a high degree of similarity. Lines with a red background indicate deletions, while lines with a green background indicate additions.
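A line-level alignment of this kind can be approximated with Python's standard `difflib` module. The sketch below is not Kam1n0's implementation; the function name `text_diff` and the sample instructions are made up for illustration. It tags each line as a deletion (`-`), an addition (`+`) or unchanged (a space), which corresponds to the red, green and plain backgrounds.

```python
import difflib


def text_diff(left_lines, right_lines):
    """Align two instruction listings and return (tag, line) pairs.

    tag is '-' for a line only in the left function (deletion),
    '+' for a line only in the right function (addition),
    and ' ' for a line present in both.
    """
    result = []
    for entry in difflib.ndiff(left_lines, right_lines):
        tag, text = entry[0], entry[2:]
        if tag == "?":
            # ndiff hint lines mark intra-line changes; they belong to neither side.
            continue
        result.append((tag, text))
    return result


if __name__ == "__main__":
    left = ["push ebp", "mov ebp, esp", "xor eax, eax", "ret"]
    right = ["push ebp", "mov ebp, esp", "mov eax, 1", "ret"]
    for tag, line in text_diff(left, right):
        print(tag, line)
```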
The Clones View
The Clones View lists the different cloned subgraphs and compares their differences. The panel below the two text views lists these cloned subgraphs as clone groups. Each group consists of pairs of cloned basic blocks between the two functions. These basic blocks belong to the same group because they can be connected in the control flow. When you click on a clone pair, the two text views above jump to the corresponding blocks and compare their differences using string alignment.
In the Clones View, you are able to add rich comments to each assembly instruction of each function. Move the mouse to the line for which you want to add a comment, and click on the + button to show the Comment Form. Markdown is supported.
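The grouping idea can be sketched as a connected-components computation. This is a simplified illustration under stated assumptions, not Kam1n0's algorithm: block ids, the `group_clone_pairs` name and the input layout are hypothetical, and connectivity is checked only on the target function's control flow graph.

```python
from collections import defaultdict


def group_clone_pairs(pairs, edges):
    """Group clone pairs whose target-side blocks are connected in the CFG.

    pairs: list of (target_block, clone_block) tuples (hypothetical ids)
    edges: list of (src_block, dst_block) CFG edges of the target function
    """
    cloned = {target for target, _ in pairs}

    # Undirected adjacency restricted to blocks that have a clone.
    adjacency = defaultdict(set)
    for src, dst in edges:
        if src in cloned and dst in cloned:
            adjacency[src].add(dst)
            adjacency[dst].add(src)

    groups, seen = [], set()
    for start in sorted(cloned):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], set()
        while stack:
            block = stack.pop()
            if block in component:
                continue
            component.add(block)
            stack.extend(adjacency[block] - component)
        seen |= component
        groups.append(sorted(p for p in pairs if p[0] in component))
    return groups
```

In this sketch, two cloned blocks separated by a block without a clone end up in different groups, which matches the idea that a group is a cloned subgraph rather than a loose set of matching blocks.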