Introducing monitoring #1960
base: main
Just a warning: after about 2 days, the gathered stats take up about 16 GB on my relatively small server, which mainly gets a lot of AP requests but not that many user requests.
Maybe the stats should then get their own DB connection configuration. That way admins can store the data somewhere where filling up is not much of a problem, or even choose a DB that supports compression.
This is really exciting! But yeah, you definitely want this in a separate DB, or maybe even in a time-series-optimized DB, especially given the sizes.
I think I can bring the size down a lot if I separate the query string out into another table and only save a reference to it in the "instances" of the query. I also want to make some things optional. At the moment it records the parameters of each query as well, which might be overkill or just unnecessary, so I want that to be optional.
And it is quite a lot of data, so you might not want to monitor for that long anyway. Otherwise you just don't get through it :D
We need monitoring of the monitoring, that is monitoring. 😅
Ok, so I pushed a few changes:
I have it live on gehirneimer, without saving the parameters to the DB. Let's see how much space it takes up in 2 days :)
- add monitoring capabilities for troubleshooting purposes. This cannot be enabled or disabled from the UI, you have to go into the `.env` files for this
- Add monitoring capabilities for the biggest factors: curl requests (AP), twig rendering (frontend) and of course database querying
- 2 sources can start an execution context: an incoming request and a started message
- added an admin UI for inspecting requests and their response times, with an overview grouped by route name
- Add filtering form and dto to the monitoring overview
- Fix `Could not convert PHP type 'array' to 'json', as an 'Malformed UTF-8 characters, possibly incorrectly encoded' error was triggered by the serialization` and add a test for it
- move chart data generation to the controller
Because the twig render is like a flame graph, there can be nested templates of the same type, which the previous code did not support. Move to a "stack"-like first-in, last-out model.
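To illustrate why a stack handles nested renders of the same template, here is a minimal language-agnostic sketch in Python. It is not the code from this PR: `RenderNode`, `RenderStack`, and the injectable `clock` are hypothetical names made up for the example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class RenderNode:
    """One template render, with the renders nested inside it (hypothetical)."""
    template: str
    started: float
    duration: float = 0.0
    children: list["RenderNode"] = field(default_factory=list)


class RenderStack:
    """Tracks nested renders with a stack: last started, first finished."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._stack: list[RenderNode] = []
        self.roots: list[RenderNode] = []

    def start(self, template: str) -> None:
        node = RenderNode(template, self._clock())
        # Attach to the render currently in progress, if there is one.
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.roots.append(node)
        self._stack.append(node)

    def stop(self) -> RenderNode:
        # The most recently started render is the one that ends next, so the
        # template name is not needed to find the matching start.
        node = self._stack.pop()
        node.duration = self._clock() - node.started
        return node
```

Because `stop()` never looks the render up by template name, two nested renders of the same template do not overwrite each other, which is the case a name-keyed map cannot represent.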
Reason: we want to group performance by route name and message class. Before this we grouped performance by route name and transport, which does not make much sense.
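As a rough sketch of what grouping by route name or message class could look like, assuming each execution context boils down to a `(key, duration_ms)` pair. The function name `group_durations` and the sample keys are hypothetical and only illustrate the aggregation, not the PR's actual queries.

```python
from collections import defaultdict
from statistics import mean


def group_durations(samples):
    """Group (key, duration_ms) samples and return count, total and mean per key.

    The key is a route name for requests or a message class name for
    messages; both are plain strings in this sketch.
    """
    buckets = defaultdict(list)
    for key, duration_ms in samples:
        buckets[key].append(duration_ms)
    return {
        key: {"count": len(values), "total": sum(values), "mean": mean(values)}
        for key, values in buckets.items()
    }


# Hypothetical sample data: two requests on one route, one handled message.
stats = group_durations([
    ("example_route", 10.0),
    ("example_route", 30.0),
    ("ExampleMessage", 5.0),
])
```

Keeping both the total and the mean per key is what lets an overview switch between "where does the time go overall" and "which route or message is slow per call".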
This will end the execution context **after** the response has been sent to keep the monitoring overhead to a minimum from a user's perspective, as quite a few entities will be created on this event.
- to save space in the DB:
  - make parameter storing optional via env var
  - similar queries/statements will not take up space for each call, but will be saved to a separate table and referenced by hash
- Make the overview chart also display aggregated stats based on the current filter
- Add a dropdown to switch between the total time and the mean time in the overview chart
- Add a gradient to the twig renders either based on the percentage of the total duration or the parent duration
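The space saving from referencing queries by hash can be sketched like this. It is an in-memory stand-in, not the PR's schema: `QueryStore`, its field names, the `store_parameters` flag, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


class QueryStore:
    """In-memory stand-in for two tables: unique query strings and executions."""

    def __init__(self, store_parameters=False):
        # Hypothetical flag mirroring "parameter storing optional via env var".
        self.store_parameters = store_parameters
        self.query_strings = {}  # hash -> SQL text, stored once
        self.executions = []     # one small row per call

    def record(self, sql, parameters, duration_ms):
        # SHA-256 is an assumption; the PR does not say which hash is used.
        digest = hashlib.sha256(sql.encode("utf-8")).hexdigest()
        # The (potentially long) SQL text is only stored the first time.
        self.query_strings.setdefault(digest, sql)
        self.executions.append({
            "query_hash": digest,
            "duration_ms": duration_ms,
            "parameters": list(parameters) if self.store_parameters else None,
        })
        return digest
```

With this layout, a statement that runs thousands of times costs one stored string plus thousands of short rows, instead of thousands of copies of the full query text.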
- `router->matchRequest` throws an exception when it cannot match the route -> catch that
- actually pass the whole request to the router, otherwise it could not match headers which makes basically all AP requests fail
The main goal of this PR is to add monitoring capabilities, so we can see what our messengers are doing in the background. Additionally, I think the graph in the overview (even though it is from a dev instance) shows pretty well that we have to tackle our twig rendering. My guess is that this is a big problem on larger instances, like fedia.io.
Some screenshots: