
Stable changes for V3.0 #67

Merged

Anyesh merged 62 commits into master from v3.0.x on Feb 14, 2022
Conversation


Anyesh (Contributor) commented Jan 22, 2022

  • Added background submission
  • Added NDJSON batching
  • Arranged middlewares
  • Added optional custom fields on submit

Anyesh added 30 commits June 23, 2021 10:27
Add warning utility inside the utils sub package

usagelogger
Arrange multipart and warnings utility inside the utils directory

middleware
Anyesh added 19 commits October 20, 2021 20:13
…queue and thread

Added a mechanism that takes the payload, puts it on a queue, and spawns one extra background thread. That thread reads the data from the queue, adds the WAF threat score, and appends it to a list. Later that list is submitted as a batch to reduce network overhead.

http_logger, base_logger
Remove `ml-waf` from logger
Deprecate old root-dir middlewares and move them to the middlewares dir

Breaks everything from v2.x.x
…ocess

Add a queue-empty break at 100, as per the suggestion of @monrax
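The queue-and-thread mechanism the commits above describe can be sketched roughly as follows. This is an illustrative reconstruction, not the library's actual code; the names `submit`, `_worker`, `_score`, and `BATCH_SIZE` are assumptions, and appending to `sent_batches` stands in for the batched network submission:

```python
# Rough sketch: payloads go on a queue; one background thread drains it,
# adds a WAF threat score, and submits accumulated payloads as a batch.
import queue
import threading

BATCH_SIZE = 100              # assumed batch bound
payload_queue = queue.Queue()
sent_batches = []             # stands in for the network submission

def _score(payload):
    # Placeholder for the WAF threat-scoring step.
    payload["waf_score"] = 0.0
    return payload

def _worker():
    batch = []
    while True:
        try:
            payload = payload_queue.get(timeout=0.5)
        except queue.Empty:
            break  # queue drained; background thread can stop
        batch.append(_score(payload))
        if len(batch) >= BATCH_SIZE:
            sent_batches.append(batch)  # one network call per batch
            batch = []
    if batch:
        sent_batches.append(batch)  # flush the partial batch

def submit(payload):
    # Producer side: enqueue; the background worker does the rest.
    payload_queue.put(payload)

worker = threading.Thread(target=_worker, daemon=True)
```

Submitting a whole list in one request is what reduces the per-payload network overhead mentioned in the commit message.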
monrax (Member) commented Jan 24, 2022

I have yet to test it and measure performance, but it looks good! I think there's still the need to add another worker thread to handle the POST request itself, though. We don't expect it to add any significant overhead, but it is still a network I/O operation that should happen in the background (while payloads are being added to the queue but also while they are being processed). In terms of adding a bound to the queue, perhaps we should check its size instead of using a counter variable without a lock. I would suggest using the maxsize argument to initialize the queue itself, but I don't think we want to block at any point.
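Bounding the queue with `maxsize` while never blocking the producer, as the comment suggests, could look like the following sketch. `MAX_PENDING` and `enqueue` are assumed names, not part of the PR:

```python
# Bounded queue via maxsize; put_nowait keeps the caller from ever
# blocking, raising queue.Full instead when the bound is hit.
import queue

MAX_PENDING = 1000  # assumed bound, not from the PR
q = queue.Queue(maxsize=MAX_PENDING)

def enqueue(payload):
    """Enqueue without blocking the caller; report drops on overflow."""
    try:
        q.put_nowait(payload)
        return True
    except queue.Full:
        return False  # caller can count or log the dropped payload
```

Checking `q.full()` or `q.qsize()` would replace the unlocked counter variable, since the queue maintains its own lock internally.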

monrax (Member) commented Jan 24, 2022

I think creating new threads might be too expensive compared to just leaving one daemon thread running since logger initialization. I believe the Queue.get method will block only the worker thread and not the main thread, so while there are no new submissions (and all previous ones have been bundled and sent) that thread will just sleep until there's a new message to submit.

Maybe we can have two queues for the two worker threads: an unbounded one where we will put payloads to be processed (compression, JSON serialization) by a worker thread, and one bounded queue where that same worker will put messages while the other worker thread bundles them and handles the I/O operation. The goal would be that the process of putting payloads is never blocked, but the process of putting processed messages to bundle and send is, so that it would pause when the max bundle size is reached (while that is happening the payloads would just pile up on the unbounded queue). I can work on a first version of how that would look.
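The two-queue design proposed here can be sketched as below. It is a hypothetical illustration of the idea, not code from the branch; the names `raw_q`, `msg_q`, `processor`, `sender`, and `MAX_BUNDLE` are all assumptions, and appending to `out` stands in for the POST request:

```python
# Two-queue pipeline: producers put on an unbounded queue and never
# block; backpressure lives on the bounded queue between the workers.
import json
import queue
import threading

MAX_BUNDLE = 50  # assumed bundle size / bound on the second queue

raw_q = queue.Queue()                    # unbounded: producers never block
msg_q = queue.Queue(maxsize=MAX_BUNDLE)  # bounded: pauses the processor

def processor():
    # Worker 1: serialize payloads. msg_q.put blocks when the bounded
    # queue is full, so raw payloads pile up on raw_q instead.
    while True:
        payload = raw_q.get()
        if payload is None:       # sentinel: shut down the pipeline
            msg_q.put(None)
            break
        msg_q.put(json.dumps(payload))

def sender(out):
    # Worker 2: bundle processed messages and perform the I/O (here,
    # appending an NDJSON string to `out` stands in for the POST).
    bundle = []
    while True:
        msg = msg_q.get()
        if msg is None:
            break
        bundle.append(msg)
        if len(bundle) >= MAX_BUNDLE:
            out.append("\n".join(bundle))
            bundle = []
    if bundle:
        out.append("\n".join(bundle))  # flush the final partial bundle
```

With this split, a slow network only stalls the sender and, via the bounded queue, the processor; callers putting payloads on `raw_q` are never blocked.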

Lastly, is there a reason for us to create a dictionary with the submission url if self.url is already available from the worker thread? (Same with self.skip_compression, self.agent, and the logger version for the headers.)

monrax (Member) commented Jan 26, 2022

I just pushed some changes to a new branch here: 0cd8193 I ended up not using the two queues, but instead just using a list as a buffer. I wanted to create the dispatcher thread when the logger is initialized, and change its internal loop to an infinite loop so that it would sleep when the queue is empty. However, we would need to find a way to "flush" the buffer list before that happens.

As I'm writing this, I realize that a similar result could be achieved by just not checking for the "submission_thread" worker thread existence after putting in the payload, and instead using submit as a dispatcher of a limited number of worker threads that would consume from the same queue, creating their own batches and submitting them accordingly (I think because of this condition I initially assumed the __internal_submission method to be a dispatcher instead of a network I/O worker thread).

One change introduced in this commit is doing the compression after it is all bundled instead of compressing each message and then joining them together (after thinking about it I figured it would be best to do it that way, since compression works better as the amount of data available to look for patterns and similarities in increases).
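The effect described here is easy to demonstrate: compressing one NDJSON bundle beats compressing each message separately, because the compressor can exploit redundancy across messages. The message shape below is synthetic, chosen only to illustrate the point:

```python
# Compare per-message compression vs. compressing the whole bundle once.
import json
import zlib

# Synthetic, repetitive log messages (hypothetical shape).
msgs = [{"id": i, "path": "/api/items", "status": 200} for i in range(500)]

# Strategy A: compress each message individually, then concatenate.
per_message = b"".join(
    zlib.compress(json.dumps(m).encode()) for m in msgs
)

# Strategy B: bundle first (NDJSON-style), then compress the batch once.
bundled = zlib.compress(
    "\n".join(json.dumps(m) for m in msgs).encode()
)

# `bundled` comes out much smaller: cross-message repetition (keys,
# shared values) is visible to the compressor only in strategy B.
```

The gap widens as messages share more structure, which request/response logs typically do.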

Anyesh (Contributor, Author) commented Jan 31, 2022

> I just pushed some changes to a new branch here: 0cd8193 I ended up not using the two queues, but instead just using a list as a buffer. I wanted to create the dispatcher thread when the logger is initialized, and change its internal loop to an infinite loop so that it would sleep when the queue is empty. However, we would need to find a way to "flush" the buffer list before that happens.
>
> As I'm writing this, I realize that a similar result could be achieved by just not checking for the "submission_thread" worker thread existence after putting in the payload, and instead using submit as a dispatcher of a limited number of worker threads that would consume from the same queue, creating their own batches and submitting them accordingly (I think because of this condition I initially assumed the __internal_submission method to be a dispatcher instead of a network I/O worker thread).
>
> One change introduced in this commit is doing the compression after it is all bundled instead of compressing each message and then joining them together (after thinking about it I figured it would be best to do it that way, since compression works better as the amount of data available to look for patterns and similarities in increases).

That commit looks good but could you please check why unit tests are not passing?

@Anyesh Anyesh merged commit ba89843 into master Feb 14, 2022