-
Notifications
You must be signed in to change notification settings - Fork 261
Add configurable archivalDays (30/60/90) with UI, backend, proto, and cron support #3226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add configurable archivalDays (30/60/90) with UI, backend, proto, and cron support #3226
Conversation
...ection-backend/src/main/java/com/akto/threat/backend/cron/ArchiveOldMaliciousEventsCron.java
Show resolved
Hide resolved
} catch (Exception ignore) { | ||
// leave context unset for non-numeric db names | ||
} | ||
processDatabase(dbName, nowSeconds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid processDatabase call if context id wasn't set(catch)
|
||
private void ensureCollectionExists(MongoDatabase db, String collectionName) { | ||
boolean exists = false; | ||
try (MongoCursor<String> it = db.listCollectionNames().cursor()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 things -
- Check if there's a better way to do this other than iterating on all collections, if not then it's fine
- Do not create a new collection if it doesn't exist for a particular database. Skip any processing in this case
Object val = doc.get("archivalDays"); | ||
if (val instanceof Number) { | ||
long days = ((Number) val).longValue(); | ||
if (days == 30L || days == 60L || days == 90L) return days; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, just take the return the value of days. Add a sanity check like -
if days < 30 || days > 90:
return default_val
also make these as constants
if (!(retentionDays == 30L || retentionDays == 60L || retentionDays == 90L)) { | ||
retentionDays = DEFAULT_RETENTION_DAYS; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need of this, handle in fetchretentiondays, view the comment
int totalMoved = 0; | ||
|
||
while (true) { | ||
List<Document> batch = new ArrayList<>(BATCH_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's do an exercise here. Let me know, i can help you with it. Insert around 1 million documents, where around almost all of them are older. Measure how much time it takes to delete. Can help you with the data insertion script
|
||
int totalMoved = 0; | ||
|
||
while (true) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a log for how long it take for each iteration in while loop
try (MongoCursor<Document> cursor = source | ||
.find() | ||
.sort(Sorts.ascending("detectedAt")) | ||
.limit(batch) | ||
.cursor()) { | ||
while (cursor.hasNext()) { | ||
oldestDocs.add(cursor.next()); | ||
} | ||
} | ||
|
||
if (oldestDocs.isEmpty()) break; | ||
|
||
final List<Document> docsSnapshot = new ArrayList<>(oldestDocs); | ||
this.scheduler.submit(() -> { | ||
try { | ||
dest.insertMany(docsSnapshot); | ||
} catch (Exception e) { | ||
logger.errorAndAddToDb("Async archive insert failed in db " + dbName + ": " + e.getMessage(), LoggerMaker.LogDb.RUNTIME); | ||
} | ||
}); | ||
|
||
Set<Object> ids = new HashSet<>(); | ||
for (Document d : oldestDocs) { | ||
ids.add(d.get("_id")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why all this repetitive code. Ideally we should be calling the same functions which were adding to temp collection and deleting the docs in malicious_events
|
||
{ | ||
int val = updatedConfig.getArchivalDays(); | ||
if (val == 30 || val == 60 || val == 90) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add < and > checks as commented earlier
} | ||
{ | ||
int val = updatedConfig.getArchivalDays(); | ||
if (val == 30 || val == 60 || val == 90) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar comment as above
long deleted = deleteByIds(source, ids, dbName); | ||
|
||
totalDeleted += deleted; | ||
toRemove -= deleted; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove toRemove variable. Let's put a check where we break out of while loop if more than 1_00_000 docs have been deleted, or no documents are left to be deleted
// leave context unset for non-numeric db names | ||
} | ||
if (accId != null) { | ||
processDatabase(dbName, nowSeconds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename processDatabase to a more sutiable name, which indicates what kinda operation is happening inside the function. for ex - deleteDocumentsFromDatabase
No description provided.