Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hero alt text #889

Merged
merged 3 commits into from
Nov 5, 2024
Merged

Fix hero alt text #889

merged 3 commits into from
Nov 5, 2024

Conversation

symroe
Copy link
Member

@symroe symroe commented Nov 5, 2024

The alt text PR did fix the problem, but a bug means we've not been running migrations and that's not been failing the CI.

This is an attempt to fix both the problems:

  1. Try to detect errors when invoking Lambda
  2. Fix the thing that's causing the error
  3. Some drive-by house keeping

@symroe symroe requested a review from chris48s November 5, 2024 11:04
Copy link

codecov bot commented Nov 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.69%. Comparing base (e7d1cec) to head (faadb5a).
Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #889      +/-   ##
==========================================
- Coverage   77.73%   77.69%   -0.04%     
==========================================
  Files          28       28              
  Lines         566      565       -1     
==========================================
- Hits          440      439       -1     
  Misses        126      126              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@symroe symroe force-pushed the fix-hero-alt-text branch from 3e309a4 to e17c6ff Compare November 5, 2024 12:32
@@ -116,7 +116,7 @@ jobs:
- run:
name: "Migrate database"
command: |
aws lambda invoke --function-name DCWebsiteManagementFunction --payload '{ "command": "migrate", "args": ["--no-input"] }' --cli-binary-format raw-in-base64-out -
aws lambda invoke --function-name DCWebsiteManagementFunction --payload '{ "command": "migrate", "args": ["--no-input"] }' --cli-binary-format raw-in-base64-out - | grep 'FunctionError": ""'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a while to work out what we're trying to do here is exit with a non-zero if FunctionError is equal to some value other than empty string "". Might be useful to add a comment to this one.

Some follow up questions:

  1. This will fail the build if FunctionError is equal to something other than empty string, which is good. But as far as I can see there's nothing in this PR that actually aims to fix the migrations failing. Do we know why this is happening/what the fix is? i.e: this change in itself won't cause the migration to apply correctly next time we deploy.
  2. At the point where we attempt to apply the migrations, we've already deployed the new release. So if we fail the build here, we deploy the new lambda, but then we'll just stop here (and not invalidate the CF cache or make a new sentry release). Is that right, and if so, is that a good outcome? i.e: is deploying new code and not invalidating the CloudFront cache potentially creating another class of problem?
  3. Is this a problem we think exists on other DC projects?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh.. hang on.

Is

-      Handler: democracy_club.lambda_awsgi.management_handler
+      Handler: democracy_club.lambda_wsgi.management_handler

the fix for failing migrations?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sorry, I didn't write a lot of explanation here!

The problem is the typo above: we invoke a Lambda that exists but it throws a 500 because the handler code doesn't exist at that path.

The invocation "works" in that, we have permission to call a function that exists, but "fails" because of this error.

This means we need a way to get a return code that's non-0 somehow, and I thought grepping the JSON was...a way of doing that. I can't see anything in the aws-cli docs about exit codes in this situation.

In terms of when we run the code and the implication on the deploy...I don't know. We should run the migrations after the code has changed, but how to rollback a failed deploy is a bigger question. At least this makes the error a hard fail that we can be notified about, rather than just working without us knowing.

In terms of other projects...I don't think so. Maybe WCIVF does the same sort of thing?

Copy link
Member

@chris48s chris48s Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yeah OK. We basically have no clean rollback for failed deploys, so that's a bigger issue.

  1. I think in lieu of any better way to assess success/failure of this command, grep is fine but lets explain in a comment what we're doing here.

  1. In terms of other repos affected by this issue, I think the EC2/CDK-based repos are probably fine. YNR is its own special beast.

It looks like if we have this problem here then I reckon we have the same issue on:

WCIVF
https://github.com/DemocracyClub/WhoCanIVoteFor/blob/36d9c34d6a04eaea2cc9bf9f81815354a6bfd27d/Makefile#L26-L31

devs.DC
https://github.com/DemocracyClub/aggregator-api/blob/0d9c876c619979392211d41a4c346a3a3213b80c/.circleci/config.yml#L137-L140

so lets fix those as well, rather than just leaving a known issue as a trap for ourselves to fall into in future.

Then the other 2 places I expected to find this issue were leaflets and ec-api-proxy. They both appear to be running the migrations against the prod database directly from CI instead of invoking a lambda

which is.. different
Is there a reason we've taken different approaches to this in different places? (I realise WCIVF is an outlier compared to the other apps we're looking at here as it is EC2/SAM rather than lambda/SAM)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, pushed a comment

We can't use the status code as that indicates if the _request to invoke_ the function worked, not if _the function_ worked.

FunctionError will return `Unhandled` for generic 500 errors of the function, or an empty string if nothing failed.
Couple of tiny things I noticed when looking at the alt text bug.

This removes a duplicate query and reduces the amount of data we're loading into memory
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants