[Bug] Continued performance issues after upgrade to 4.1.1 #1896

Closed
@jhk70

Description

Continued performance issues after upgrade to 4.1.1

Request Type

Bug

Work Environment

| Question | Answer |
|---|---|
| OS version (server) | Ubuntu |
| OS version (client) | 18.04 |
| TheHive version / git hash | 4.1.1 (docker image 4.1.1-2) |
| Package Type | Docker |
| Browser type & version | Various |

Problem Description

After upgrading from 4.0.5-1 to 4.1.0 and then 4.1.1:

  1. Audit entries don't show in the application's "live stream" view.
  2. I get the familiar "AuditSrv" error after a while.
  3. The "Data Index Status" section of the "Platform Status" page does not load (i.e. the user session times out before it loads).
    This behaviour was consistent across 4.1.0 and 4.1.1.
    The Audit table has 1,265,475 entries.

Steps to Reproduce

  1. Upgrade TheHive as described here.
  2. Configure a local Lucene index.
  3. Start the server.
  4. Use the server.

Complementary information

Other observations / debug actions:

  1. During initial indexing, there were a number of "org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend" errors. Removing the MAX_HEAP_SIZE and HEAP_NEWSIZE settings on Cassandra eliminated these.

  2. Shortly after the upgrade, there was evidence of memory exhaustion. More RAM was added to the host, and TheHive was given 16 GB of heap via -e JAVA_OPTS='-Xms16g -Xmx16g'.

  3. Without the "Platform Status" page, I have been able to reindex with curl:
    curl -k "https://<host>:9000/api/v1/admin/index/Case/reindex" -H 'Authorization: Bearer *authwibble*'
    I have re-run this for each index, and the logs show that each run completes successfully.

  4. Snippets from the Audit reindex logs:

    Mar 25 21:39:52 hivehost01 docker[26287]: [info] o.t.s.m.Database [00000020|] Reindex job is running: 1265475 record(s) indexed
    Mar 25 21:39:53 hivehost01 docker[26287]: [info] o.j.g.d.m.ManagementSystem [|] Index update job successful for [AuditRequestidMainaction]
    Mar 25 21:39:53 hivehost01 docker[26287]: [info] o.t.s.m.Database [00000020|] Reindex job is finished
    
    Mar 25 21:47:59 hivehost01 docker[26287]: [info] o.t.s.m.Database [00000020|] Reindex job is running: 0 record(s) indexed
    Mar 25 21:48:00 hivehost01 docker[26287]: [info] o.t.s.m.Database [00000020|] Reindex job is running: 0 record(s) indexed
    Mar 25 21:48:01 hivehost01 docker[26287]: [info] o.j.g.o.j.IndexRepairJob [|] Found index Audit
    Mar 25 21:48:01 hivehost01 docker[26287]: [info] o.t.s.m.Database [00000020|] Reindex job is running: 0 record(s) indexed
    Mar 25 21:48:02 hivehost01 docker[26287]: [info] o.j.g.d.m.ManagementSystem [|] Index update job successful for [Audit]
    Mar 25 21:48:02 hivehost01 docker[26287]: [info] o.t.s.m.Database [00000020|] Reindex job is finished
    
  5. Our implementation had been "misusing" tags (per the 4.1.0 release blog) and had some long tags containing links to raw alerts etc. This showed up as a 6-second load time on /api/v1/query?name=list-tags. I have deleted these tags from the "Custom Tags" view. Is it possible that something in the Audit content could be causing this? Is it possible to truncate / compact the Audit table?

  6. Probably unrelated but I see this on start of the server:
    Mar 25 21:27:40 hivehost01 docker[26287]: [warn] c.d.d.c.RequestHandler [|] Query '[4 bound values] SELECT column1,value,writetime(value) AS writetime,ttl(value) AS ttl FROM thehive.graphindex WHERE key=:key AND column1>=:sliceStart AND column1<:sliceEnd LIMIT :maxRows;' generated server side warning(s): Read 947 live rows and 5788 tombstone cells for query SELECT * FROM thehive.graphindex WHERE key = 022689a05461e7 AND column1 >= 00 AND column1 < ff LIMIT 5000; token -8419547459570797906 (see tombstone_warn_threshold)

  7. I have deleted and reconfigured the index multiple times. After a restart (and before reindexing), the "Platform Status" page loads (all indexes = "ERROR"). After I click "Reindex" on Audit, the indexing completes and the same performance issue is present. I can then no longer refresh / view the Index Status section of the Platform Status page.
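
For anyone repeating the curl-based reindex from item 3 across several indexes, a minimal sketch of a loop is below. The endpoint path is the one from the command above; `THEHIVE_URL`, `THEHIVE_TOKEN`, `DRY_RUN` and the function names are hypothetical names chosen for this example, and the index names have to be supplied by the caller (only Case and Audit are named in this report).

```shell
#!/usr/bin/env bash
# Sketch only: trigger a reindex for each index name passed as an argument.
# THEHIVE_URL, THEHIVE_TOKEN and DRY_RUN are hypothetical variable names.

reindex_url() {
  # Build the admin endpoint for one index, dropping a trailing slash
  # from the base URL if there is one.
  printf '%s/api/v1/admin/index/%s/reindex' "${THEHIVE_URL%/}" "$1"
}

reindex_all() {
  local name
  for name in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the request instead of sending it.
      echo "GET $(reindex_url "$name")"
    else
      # -k skips TLS verification, as in the original command.
      curl -k -sS --fail "$(reindex_url "$name")" \
        -H "Authorization: Bearer ${THEHIVE_TOKEN}" || return 1
    fi
  done
}
```

With the two variables set, `reindex_all Case Audit` would send one request per index and stop at the first failure; setting `DRY_RUN=1` only prints the URLs that would be requested.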

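To compare reindex runs like the ones quoted in item 4, the log lines can be condensed to one line per finished index. This is a rough sketch that assumes the last "record(s) indexed" count seen before an "Index update job successful" line belongs to that index; the report does not confirm that pairing, and `summarize_reindex` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: read TheHive log lines on stdin and print "<index> <count>"
# for every "Index update job successful" line.

summarize_reindex() {
  awk '
    /Reindex job is running:/ {
      # Remember the most recent progress count.
      for (i = 1; i <= NF; i++) if ($i == "running:") last = $(i + 1)
    }
    /Index update job successful for/ {
      # The index name is the last field, wrapped in square brackets.
      name = $NF
      gsub(/^\[|\]$/, "", name)
      print name, (last == "" ? "unknown" : last)
      last = ""
    }
  '
}
```

Fed the log excerpt above (for example via `docker logs <container> 2>&1 | summarize_reindex`, where the container name is a placeholder), it would print `AuditRequestidMainaction 1265475` followed by `Audit 0`.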