Conversation
Add warning utility inside the utils sub-package usagelogger
Arrange multipart and warnings utility inside the utils directory middleware
usagelogger middleware
…-python into enhance/new-structure
…queue and thread Added a mechanism that takes each payload, puts it on a queue, and spawns one extra background thread. That thread reads data from the queue, adds a WAF threat score, then appends the result to a list. That list is later submitted as a batch to reduce network overhead. http_logger, base_logger
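The mechanism described in that commit could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `BatchLogger` and the methods `_score` and `_post_batch` are hypothetical stand-ins for the real scoring and submission logic in `http_logger`/`base_logger`.

```python
import queue
import threading

class BatchLogger:
    """Sketch: payloads go on a queue; one background thread batches them."""

    def __init__(self, batch_size=10):
        self.queue = queue.Queue()
        self.batch_size = batch_size
        # One extra daemon thread does the background work.
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def submit(self, payload):
        # Called on the request path; enqueue only, no network I/O here.
        self.queue.put(payload)

    def _drain(self):
        batch = []
        while True:
            payload = self.queue.get()           # blocks until work arrives
            batch.append(self._score(payload))   # e.g. add WAF threat score
            if len(batch) >= self.batch_size:
                self._post_batch(batch)          # one request for the whole batch
                batch = []

    def _score(self, payload):
        payload["threat_score"] = 0.0  # placeholder for the real scoring step
        return payload

    def _post_batch(self, batch):
        pass  # stub: single POST submitting the bundled payloads
```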
Remove `ml-waf` from logger
Deprecate old root-dir middlewares and move them to the middlewares dir Breaks everything from v2.x.x
…ocess Adding a queue-empty break at 100, as per the suggestion of @monrax
I have yet to test it and measure performance, but it looks good! I think there's still a need for another worker thread to handle the POST request itself, though. We don't expect it to add significant overhead, but it is still a network I/O operation that should happen in the background (while payloads are being added to the queue, but also while they are being processed). As for adding a bound to the queue, perhaps we should check its size instead of using a counter variable without a lock. I would suggest using the …
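One way to bound the queue without a manually-counted variable (the suggestion above is cut off, so this is only an assumed reading) is to let `queue.Queue(maxsize=...)` enforce the bound itself; its operations are already thread-safe, so no extra lock is needed. The function name `enqueue` is illustrative:

```python
import queue

payloads = queue.Queue(maxsize=100)  # the bound is enforced by the queue itself

def enqueue(payload):
    """Try to enqueue without blocking; report whether the payload was accepted."""
    try:
        payloads.put_nowait(payload)  # thread-safe; no manual counter or lock
        return True
    except queue.Full:
        return False                  # caller may count/log the dropped payload
```

`put_nowait` raises `queue.Full` when `maxsize` is reached, so the request path never blocks and never races on a shared counter.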
I think creating new threads might be too expensive compared to just leaving one daemon thread running from logger initialization. Maybe we can have two queues, one per worker thread: an unbounded one where we put payloads to be processed (compression, JSON serialization) by a worker thread, and a bounded queue where that same worker puts its messages while the other worker thread bundles them and handles the I/O operation. The goal would be that putting payloads is never blocked, but putting processed messages to bundle and send is, so that it pauses when the max bundle size is reached (while that is happening, payloads would just pile up on the unbounded queue). I can work on a first version of what that would look like. Lastly, is there a reason for us to create a dictionary with the submission url if …
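The two-queue design described above could look roughly like this. This is a sketch of the proposal, not code from the repository; `MAX_BUNDLE`, `process_worker`, and `bundle_worker` are hypothetical names, and the bounded `processed` queue is what provides the backpressure described (the producer side pauses there while the unbounded `raw` queue absorbs new payloads):

```python
import json
import queue
import threading
import zlib

MAX_BUNDLE = 50

raw = queue.Queue()                           # unbounded: puts never block
processed = queue.Queue(maxsize=MAX_BUNDLE)   # bounded: the backpressure point

def process_worker():
    """Serialize and compress payloads; blocks on put() when the bundle is full."""
    while True:
        payload = raw.get()
        msg = zlib.compress(json.dumps(payload).encode("utf-8"))
        processed.put(msg)  # pauses here while the bundler drains the queue

def bundle_worker(post):
    """Collect processed messages and hand them to the I/O callable as one batch."""
    while True:
        bundle = [processed.get()]  # wait for at least one message
        while len(bundle) < MAX_BUNDLE:
            try:
                bundle.append(processed.get_nowait())
            except queue.Empty:
                break
        post(bundle)  # single network operation per bundle
```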
I just pushed some changes to a new branch here: 0cd8193 I ended up not using the two queues, but instead just using a list as a buffer. I wanted to create the dispatcher thread when the logger is initialized and change its internal loop to an infinite loop, so that it would sleep when the queue is empty. However, we would need to find a way to "flush" the buffer list before that happens. As I'm writing this, I realize a similar result could be achieved by just not checking for the … One change introduced in this commit is doing the compression after everything is bundled, instead of compressing each message and then joining them together (after thinking about it, I figured it would be best to do it that way, since compression works better as the amount of data to look for patterns and similarities in increases).
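The compression point above can be demonstrated with a small comparison. This is an illustration of the idea, not the commit's code; the function names are hypothetical, and `gzip` stands in for whatever compressor the logger uses. Compressing after bundling lets the compressor exploit repetition across messages (repeated JSON keys, similar values), which per-message compression cannot:

```python
import gzip
import json

def bundle_then_compress(payloads):
    """Join payloads as NDJSON first, then compress the whole bundle once."""
    ndjson = "\n".join(json.dumps(p) for p in payloads)
    return gzip.compress(ndjson.encode("utf-8"))

def compress_per_message(payloads):
    """The alternative: compress each message separately, then join the blobs."""
    return b"".join(
        gzip.compress(json.dumps(p).encode("utf-8")) for p in payloads
    )
```

For batches of similarly-shaped log payloads, the bundled form is typically much smaller, since per-message compression pays the header overhead repeatedly and never sees cross-message redundancy.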
That commit looks good, but could you please check why the unit tests are not passing?
ndjsonbatching