AutoC is an automated tool designed to extract and analyze Indicators of Compromise (IoCs) from open-source threat intelligence sources.
- Threat Intelligence Parsing: Parses blogs, reports, and feeds from various OSINT sources.
- IoC Extraction: Automatically extracts IoCs such as IP addresses, domains, file hashes, and more.
- Visualization: Displays extracted IoCs and analysis in a user-friendly interface.
The fastest way to get started with AutoC is to run it with Docker (using docker-compose).
Make sure to set up the .env file with your API keys before running the app (see the Configuration section below for more details).
git clone https://github.com/barvhaim/AutoC.git
cd AutoC
docker-compose up --build
Once the app is up and running, you can access it at http://localhost:8000
- With crawl4ai:
docker-compose --profile crawl4ai up --build
- With Milvus vector database:
docker-compose --profile milvus up --build
- With both:
docker-compose --profile crawl4ai --profile milvus up --build
- Install Python 3.11 or later. (https://www.python.org/downloads/)
- Install the uv package manager (https://docs.astral.sh/uv/getting-started/installation/).
  - For Linux and macOS, you can use the following command:
curl -LsSf https://astral.sh/uv/install.sh | sh
  - For Windows, you can use the following command:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
- Clone the project repository and navigate to the project directory.
git clone https://github.com/barvhaim/AutoC.git
cd AutoC
- Install the required Python packages using uv.
uv sync
- Configure the .env file with your API keys (see the Configuration section below for more details).
Set up API keys by adding them to the .env file (use the .env.sample file as a template).
You can use any of several LLM providers (e.g., IBM watsonx.ai, OpenAI); you will configure which one to use in the next step.
cp .env.sample .env
- watsonx.ai by IBM ("watsonx") - Get API Key
- OpenAI ("openai") - Experimental
- RITS internal IBM ("rits")
- Ollama ("ollama") - Experimental
| Provider (LLM_PROVIDER) | Models (LLM_MODEL) |
|---|---|
| watsonx.ai by IBM (watsonx) | meta-llama/llama-3-3-70b-instruct, ibm-granite/granite-3.1-8b-instruct |
| RITS (rits) | meta-llama/llama-3-3-70b-instruct, ibm-granite/granite-3.1-8b-instruct, deepseek-ai/DeepSeek-V3 |
| OpenAI (openai) | gpt-4.1-nano |
| Ollama (ollama) - Experimental | granite3.2:8b |
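For example, to use watsonx.ai with the first Llama model from the table above, set the following in your .env file (variable names are taken from the table header):
LLM_PROVIDER=watsonx
LLM_MODEL=meta-llama/llama-3-3-70b-instruct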
By default, AutoC uses a combination of the docling and beautifulsoup4 libraries to extract blog post content, fetching the page behind the scenes with the requests library.
Alternatively, you can use Crawl4AI, which fetches the blog post content with a headless browser; this is more reliable but requires additional setup.
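For reference, the default (non-Crawl4AI) fetch path boils down to something like the following simplified sketch. It uses only requests and BeautifulSoup and omits the docling step, so treat it as an illustration rather than AutoC's actual parser:

```python
import requests
from bs4 import BeautifulSoup

def fetch_article_text(url: str) -> str:
    """Fetch a blog post and return its readable text (simplified sketch)."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Drop script/style tags so only visible content remains
    for tag in soup(["script", "style"]):
        tag.decompose()
    return soup.get_text(separator="\n", strip=True)

if __name__ == "__main__":
    print(fetch_article_text("https://example.com")[:500])
```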
To enable Crawl4AI, you need a Crawl4AI backend server, which can be run using Docker:
docker-compose --profile crawl4ai up -d
The crawl4ai service uses a profile configuration, so it only starts when explicitly requested with the --profile crawl4ai flag.
Then set the following environment variables in the .env file to point to the Crawl4AI server:
USE_CRAWL4AI_HEADLESS_BROWSER_HTML_PARSER=true
CRAWL4AI_BASE_URL=http://localhost:11235
AutoC processes analyst questions about articles in two modes:
- Individual mode (default): Each question is processed separately with individual LLM calls
- Batch mode: All questions are processed together in a single LLM call for improved performance
To enable batch mode, set the environment variable in the .env file:
QNA_BATCH_MODE=true
You can also control this via API settings by including "qna_batch_mode": true in your request.
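For example, a request that enables batch mode might look like the sketch below. The /api/extract endpoint path and the payload layout are assumptions for illustration only; the documented part is the "qna_batch_mode": true setting.

```python
import requests

# Hypothetical endpoint path and payload shape -- adjust to the actual AutoC API.
response = requests.post(
    "http://localhost:8000/api/extract",
    json={
        "url": "https://example.com/threat-report",  # blog post to analyze
        "settings": {"qna_batch_mode": True},        # documented setting name
    },
    timeout=300,
)
print(response.status_code, response.json())
```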
Benefits of batch mode:
- Reduces the number of API calls from N (one per question) to a single call
- Potentially faster processing for multiple questions
- More cost-effective for large question sets
- Automatic fallback to individual mode if batch processing fails
AutoC supports Retrieval-Augmented Generation (RAG) for intelligent context retrieval during Q&A processing:
- Standard mode (default): Uses the entire article content as context for answering questions
- RAG mode: Intelligently retrieves only the most relevant chunks of content for each question
To enable RAG mode, set the environment variable in the .env file:
QNA_RAG_MODE=true
You can also control this via API settings by including "qna_rag_mode": true in your request.
Benefits of RAG mode:
- More targeted and relevant answers by focusing on specific content sections
- Improved answer quality for long articles by reducing noise
- Better handling of multi-topic articles
- Automatic content chunking and semantic search
- Efficient processing of large documents
Note: RAG mode only works with individual Q&A processing mode. When batch mode (QNA_BATCH_MODE=true) is enabled, RAG mode is automatically disabled and the full article content is used as context.
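The following minimal sketch summarizes how the two settings interact, as described above (an illustration of the documented behavior, not AutoC's actual code):

```python
def resolve_qna_mode(qna_batch_mode: bool, qna_rag_mode: bool) -> str:
    """Sketch of the documented precedence between batch mode and RAG mode."""
    if qna_batch_mode:
        # Batch mode wins: RAG is disabled and the full article content is
        # used as context for a single LLM call covering all questions.
        return "batch (full article context, RAG disabled)"
    if qna_rag_mode:
        # Individual mode with RAG: each question gets only the most relevant
        # retrieved chunks as context.
        return "individual + RAG (retrieved chunks as context)"
    # Default: each question is answered against the full article content.
    return "individual (full article context)"

print(resolve_qna_mode(qna_batch_mode=True, qna_rag_mode=True))  # batch wins
```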
RAG Configuration:
RAG mode requires a Milvus vector database. Configure the connection in your .env file:
RAG_MILVUS_HOST=localhost
RAG_MILVUS_PORT=19530
RAG_MILVUS_USER=
RAG_MILVUS_PASSWORD=
RAG_MILVUS_SECURE=false
To run Milvus with Docker:
docker-compose --profile milvus up -d
How it works:
- Article content is automatically chunked and indexed into Milvus vector store
- For each analyst question, the most relevant content chunks are retrieved
- Only the relevant context is sent to the LLM for answer generation
- Vector store is automatically cleaned up after processing
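As a rough illustration of this flow, the sketch below uses pymilvus's MilvusClient together with a toy embed() placeholder standing in for a real embedding model; the collection name, chunk size, and helper function are assumptions, not AutoC's implementation:

```python
import hashlib
from pymilvus import MilvusClient

def embed(text: str, dim: int = 128) -> list[float]:
    """Toy hashed bag-of-words embedding; a real embedding model goes here."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[int(hashlib.md5(word.encode()).hexdigest(), 16) % dim] += 1.0
    return vec

article_text = "The actor used spearphishing emails. The payload beaconed to 203.0.113.7 over DNS."
question = "Which IP address did the payload beacon to?"

client = MilvusClient(uri="http://localhost:19530")
collection = "autoc_article_chunks"  # assumed collection name

# 1. Chunk the article content and index it into the Milvus vector store
chunks = [article_text[i:i + 500] for i in range(0, len(article_text), 500)]
client.create_collection(collection_name=collection, dimension=128)
client.insert(
    collection_name=collection,
    data=[{"id": i, "vector": embed(c), "text": c} for i, c in enumerate(chunks)],
)

# 2. Retrieve the most relevant chunks for the analyst question
hits = client.search(
    collection_name=collection,
    data=[embed(question)],
    limit=3,
    output_fields=["text"],
)
context = "\n\n".join(hit["entity"]["text"] for hit in hits[0])

# 3. Only `context` (not the whole article) would be sent to the LLM here

# 4. Clean up the vector store after processing
client.drop_collection(collection_name=collection)
```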
AutoC can detect MITRE ATT&CK TTPs in the blog post content, which can be used to identify the techniques and tactics used by the threat actors.
To enable MITRE ATT&CK TTP detection, set the following environment variables in the .env file:
HF_TOKEN=<your_huggingface_token>
DETECT_MITRE_TTPS_MODEL_PATH=dvir056/mitre-ttp # Hugging Face model path for MITRE ATT&CK TTPs detection
Information about model training: https://github.com/barvhaim/attack-ttps-detection?tab=readme-ov-file#-mitre-attck-ttps-classification
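If you want to try the detection model on its own, a minimal sketch with the Hugging Face transformers library could look like this (it assumes the checkpoint is a standard text-classification model and is not AutoC's internal code):

```python
import os
from transformers import pipeline

# Assumes HF_TOKEN and DETECT_MITRE_TTPS_MODEL_PATH are set as in the .env example above
classifier = pipeline(
    "text-classification",
    model=os.environ.get("DETECT_MITRE_TTPS_MODEL_PATH", "dvir056/mitre-ttp"),
    token=os.environ.get("HF_TOKEN"),
    top_k=3,  # return the top candidate techniques with their scores
)

snippet = "The malware established persistence via a scheduled task and exfiltrated data over DNS."
print(classifier(snippet))
```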
Run the AutoC tool with the following command:
uv run python cli.py extract --help (to see the available options)
uv run python cli.py extract --url <blog_post_url>
Assuming the app .env file is configured correctly, you can run the app using one of the following options:
To run the app locally, you'll need Node.js 20 and npm installed on your machine. We recommend using nvm to manage Node.js versions.
cd frontend
nvm use
npm install
npm run build
Once the build is complete, you can run the app using the following command from the root directory:
cd ..
uv run python -m uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
Once the app is up and running, you can access it at http://localhost:8000
For development purposes, you can run the app in development mode using the following command:
Start the backend server:
uv run python -m uvicorn main:app --reload
In a separate terminal, start the frontend development server:
cd frontend
nvm use
npm install
npm run build
npm run dev
Once the app is up and running, you can access it at http://localhost:5173
Make sure you have Claude Desktop, the uv package manager, and Python installed on your machine.
Clone the project repository and navigate to the project directory.
Install the required Python packages using uv.
uv sync
Edit the Claude Desktop config file and add the following lines to the mcpServers section:
{
"mcpServers": {
"AutoC": {
"command": "uv",
"args": [
"--directory",
"/PATH/TO/AutoC",
"run",
"mcp_server.py"
]
}
}
}
Restart Claude Desktop, and you should see the AutoC MCP server in the list of available MCP servers.