Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debug Logging and Attempt to resolve caching issues #469

Merged
merged 5 commits into from
Sep 30, 2021
Merged

Conversation

fmigneault
Copy link
Collaborator

  • add more debug logging during effective perm/acl resoluion
  • add session merge/reconnect of resource object between service/acl caches

relates to #466

@fmigneault fmigneault self-assigned this Sep 30, 2021
@github-actions github-actions bot added api Something related to the API operations db Issues related to database connection, migration or data models doc Documentation improvements or building problem plugin Service plugin labels Sep 30, 2021
@codecov-commenter
Copy link

codecov-commenter commented Sep 30, 2021

Codecov Report

Merging #469 (b9bbd9e) into master (8e63353) will increase coverage by 0.00%.
The diff coverage is 76.19%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #469   +/-   ##
=======================================
  Coverage   79.28%   79.29%           
=======================================
  Files          70       70           
  Lines        8836     8846   +10     
  Branches     1207     1196   -11     
=======================================
+ Hits         7006     7014    +8     
- Misses       1550     1554    +4     
+ Partials      280      278    -2     
Impacted Files Coverage Δ
magpie/adapter/__init__.py 62.79% <ø> (ø)
magpie/api/management/resource/resource_utils.py 80.34% <0.00%> (-0.70%) ⬇️
magpie/api/management/user/user_utils.py 96.72% <ø> (-0.03%) ⬇️
magpie/models.py 87.82% <50.00%> (-0.16%) ⬇️
magpie/services.py 81.33% <77.77%> (+0.27%) ⬆️
magpie/adapter/magpieowssecurity.py 69.15% <100.00%> (+0.29%) ⬆️
magpie/api/management/service/service_utils.py 74.46% <100.00%> (ø)
magpie/permissions.py 92.49% <100.00%> (+0.02%) ⬆️
magpie/register.py 46.47% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 94186a2...b9bbd9e. Read the comment docs.

@fmigneault
Copy link
Collaborator Author

@tlvu
Once validated, I will tag a new patch version with added logs.
You can try to see if the added session merge operation resolves your issue.

When re-testing, see if you can enable DEBUG logging for both Magpie and Twitcher so we can have a clear idea of which steps are happening in case the caching-related error still occurs.

If it does happen, I would also appreciate if you could try running Twitcher with gunicorn instead of waitress.
I can see your logs in #466 are formed as follows (waitress):

2021-09-09 15:24:02,563 ERROR [TWITCHER:33][waitress-0] ...

While mine are as follows (gunicorn):

2021-09-29 22:43:58,303 DEBUG [TWITCHER:177][ThreadPoolExecutor-0_1]  ...

I'm not sure if their worker/threading vs memory sharing strategies are equivalent, but I can never replicate the error you get when using gunicorn. Because waitress only reports -0 rather than 0_1 as gunicorn does, it have a feeling their multiprocessing are handled differently, which could impact how in-memory caches work between parallel requests.

@tlvu
Copy link
Contributor

tlvu commented Sep 30, 2021

I would also appreciate if you could try running Twitcher with gunicorn instead of waitress.

Isn't this choice in the Twitcher docker image, the CMD or ENTRYPOINT line?

If in birdhouse-deploy, please make a PR and I can enable it on our staging server and try another round of Jenkins run to smoke out the problem.

Have you been able to reproduce it from a real PAVICS stack on your end? Otherwise, you can hit our staging server https://medus.ouranos.ca. I got it reproduced a few times.

Copy link
Contributor

@tlvu tlvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, ignore my f-string comment if compat with python 3.5 and below is needed or if any other reasons. This is not blocking at all.

magpie/adapter/magpieowssecurity.py Show resolved Hide resolved
@fmigneault
Copy link
Collaborator Author

Isn't this choice in the Twitcher docker image, the CMD or ENTRYPOINT line?

The docker offers both waitress and gunicorn. It is selected based on the INI file.
The default Twitcher configuration uses gunicorn by default since v0.5.4 (Magpie now points at v0.5.6).
https://github.com/bird-house/twitcher/blob/0.5.x/CHANGES.rst#054-2020-10-29
v0.5.3...0.5.x#diff-28b1353404

It is overridden in birdhouse-deploy:
https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/config/twitcher/twitcher.ini.template#L83-L85

I have not been able to test on a full stack. I will push the tag now to make the patch available.
I will be unavailable most of tomorrow so you can give it a try first and not have wait after me.

@fmigneault fmigneault merged commit d2490d0 into master Sep 30, 2021
@fmigneault fmigneault deleted the debug-logs branch September 30, 2021 03:40
@tlvu
Copy link
Contributor

tlvu commented Sep 30, 2021

I have not been able to test on a full stack. I will push the tag now to make the patch available.

Sorry, I was not clear. I meant reproduce the faulty behavoir with the current old version of Magpie/Twitcher that is in birdhouse-deploy right now but on the full PAVICS on your side.

I will be unavailable most of tomorrow so you can give it a try first and not have wait after me.

No rush. I am working on Geoserver this week. We had a power outage at the beginning of the week with problems so I already lost some time there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Something related to the API operations db Issues related to database connection, migration or data models doc Documentation improvements or building problem plugin Service plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants