
[Enhancement] When observable data is too big, use hash #2288

Closed
@To-om

Description


Request Type

Enhancement

Feature Description

Fields that contain more than 32k of data cannot be indexed and break the index engine. The aim of this issue is to store huge observable data in a dedicated unindexed field (named fullData) and to store its hash in the indexed field instead of the real value. This change must be implemented in observable creation and in observable search (in properties).
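For illustration, here is a minimal Scala sketch of the approach, on both the creation and the search side. The size threshold, the SHA-256 algorithm, the prefix format and the helper names are assumptions for this sketch, not TheHive's actual implementation; the commits below only confirm that the hash is prefixed with its algorithm.

import java.security.MessageDigest

// Minimal sketch of the idea described above, not TheHive's actual code.
object ObservableData {
  private val MaxIndexedSize = 32 * 1024 // approximate "32k" indexable limit mentioned in the issue

  /** Algorithm-prefixed hash stored in the indexed `data` field. */
  def hash(value: String): String =
    "SHA-256|" + MessageDigest
      .getInstance("SHA-256")
      .digest(value.getBytes("UTF-8"))
      .map("%02x".format(_))
      .mkString

  /** On creation: small values are indexed as-is; huge values are indexed
    * by hash and the original is kept in the unindexed `fullData` field. */
  def split(value: String): (String, Option[String]) =
    if (value.length <= MaxIndexedSize) (value, None)
    else (hash(value), Some(value))

  /** On search: the same transformation must be applied to the query value
    * so that a filter on huge data matches the stored hash. */
  def searchValue(value: String): String =
    if (value.length <= MaxIndexedSize) value else hash(value)
}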

Existing data must also be processed, but schema evolution cannot be used because the index may already be broken. The processing can use the immense term processing of ScalliGraph (TheHive-Project/ScalliGraph#17).

In order to fix existing data, the following configuration must be set:

db.janusgraph {
  immenseTermProcessing: {
    data: observableHashToIndex
  }
}

This makes the next startup slower because the whole database must be crawled.
IMPORTANT: This configuration should be present for only one startup, to fix the data. It should be removed as soon as the process is finished.
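The crawl itself is handled by ScalliGraph's immense term processing; the sketch below only illustrates the kind of per-vertex rewrite a processor such as observableHashToIndex applies. The vertex label "Data", the property names, the threshold and the hash format are assumptions, and this is not ScalliGraph's actual API.

import java.security.MessageDigest
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource
import scala.jdk.CollectionConverters._

// Conceptual illustration only: move oversized `data` values into the
// unindexed `fullData` property and keep an algorithm-prefixed hash in `data`.
object FixOversizedData {
  private val MaxIndexedSize = 32 * 1024 // approximate "32k" limit from the issue

  private def hash(value: String): String =
    "SHA-256|" + MessageDigest
      .getInstance("SHA-256")
      .digest(value.getBytes("UTF-8"))
      .map("%02x".format(_))
      .mkString

  def run(g: GraphTraversalSource): Unit =
    g.V()
      .hasLabel("Data")                         // assumed label of observable data vertices
      .toList
      .asScala
      .filter(_.value[String]("data").length > MaxIndexedSize)
      .foreach { vertex =>
        val original = vertex.value[String]("data")
        vertex.property("fullData", original)   // full value kept in the unindexed field
        vertex.property("data", hash(original)) // indexed field now holds the hash
      }
}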

Activity

added this to the 4.1.16 milestone on Dec 14, 2021
self-assigned this on Dec 14, 2021
added a commit that references this issue on Dec 14, 2021
    #2288 Use hash for big data (8a4ac2b)
added 2 commits that reference this issue on Dec 14, 2021
    #2288 Use hash for big data (2e02312)
    #2288 Prefix hash with algorithm (e20a231)


Participants

@jeromeleonard @To-om
