Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename instances of "monkeypox" to "mpox" #220

Merged
merged 2 commits into from
Nov 1, 2023
Merged

Rename instances of "monkeypox" to "mpox" #220

merged 2 commits into from
Nov 1, 2023

Conversation

trvrb
Copy link
Member

@trvrb trvrb commented Nov 1, 2023

This commit goes through the repo and basically does a find-and-replace of "monkeypox" to "mpox, except for a handful of locations like the #monkeypox-updates Slack notification.

This shouldn't change appearance of build outputs except for:

  • Updating to "Built by nextstrain/mpox"
  • Updating page titles to be more descriptive of what each page is showing in terms of lineages / clades
  • A bug fix to the page description to link to proper build URLs (currently these go to 404)

This has the side effect that intermediate files will begin to propagate to https://data.nextstrain.org/files/workflows/mpox/sequences.fasta.xz, etc... rather than /files/workflows/monkeypox. This may affect users who've been relying on the current URLs. Nothing should break for these users as the old files will continue to exist but they will no longer receive updated data. I think this is an okay trade-off in terms of downstream impacts.

Copy link
Member

@victorlin victorlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending one small change which I'll address.

ingest/vendored/README.md Show resolved Hide resolved
@corneliusroemer
Copy link
Member

corneliusroemer commented Nov 1, 2023

If people request from data.nextstrain.org we could be able to do URL forwarding, couldn't we? Wouldn't that be better than serving stale data?

Otherwise, for those requesting directly from AWS not via data.nextstrain.org, we have a choice between serving stale data or deleting the old file so that workflows error - each have advantages and disadvantages. Might be worth doing a code search on Github to see who uses the data and ping them.

Copy link
Member

@corneliusroemer corneliusroemer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I would consider setting a HTTP redirect from https://data.nextstrain.org/files/workflows/monkeypox to https://data.nextstrain.org/files/workflows/mpox if this is possible

This commit goes through the repo and basically does a find-and-replace of "monkeypox" to "mpox, except for a handful of locations like the #monkeypox-updates Slack notification.

This shouldn't change appearance of build outputs except for:
- Updating to "Built by nextstrain/mpox"
- Updating page titles to be more descriptive of what each page is showing in terms of lineages / clades
- A bug fix to the page description to link to proper build URLs (currently these go to 404)

This has the side effect that intermediate files will begin to propagate to https://data.nextstrain.org/files/workflows/mpox/sequences.fasta.xz, etc... rather than `/files/workflows/monkeypox`. This may affect users who've been relying on the current URLs. Nothing should break for these users as the old files will continue to exist but they will no longer receive updated data. I think this is an okay trade-off in terms of downstream impacts.
@joverlee521
Copy link
Contributor

except for a handful of locations like the #monkeypox-updates Slack notification.

Addressing this separately in #221

subrepo:
  subdir:   "ingest/vendored"
  merged:   "a0faef5"
upstream:
  origin:   "https://github.com/nextstrain/ingest"
  branch:   "main"
  commit:   "a0faef5"
git-subrepo:
  version:  "0.4.6"
  origin:   "https://github.com/ingydotnet/git-subrepo"
  commit:   "110b9eb"
@victorlin victorlin mentioned this pull request Nov 1, 2023
5 tasks
@corneliusroemer
Copy link
Member

These are all the repos that Github could find using https://data.nextstrain.org/files/workflows/monkeypox that could benefit from URL redirects (17, with a handful of non-Nextstrain repos): https://github.com/search?q=data.nextstrain.org%2Ffiles%2Fworkflows%2Fmonkeypox&type=code

These are the repos that use aws s3 paths that can't be redirected (10 mostly ours): https://github.com/search?q=s3%3A%2F%2Fnextstrain-data%2Ffiles%2Fworkflows%2Fmonkeypox&type=code

@joverlee521
Copy link
Contributor

joverlee521 commented Nov 1, 2023

If people request from data.nextstrain.org we could be able to do URL forwarding, couldn't we?

Yes, definitely possible. Not sure if we need to do it on the Cloudfront side but it's possible to configure S3 website redirections. Also looks like it's possible to redirect requests for an S3 object

Edit: The S3 object redirect is via x-amz-website-redirect-location, which I think is only for setting the website redirection and not for direct S3 object access.

@trvrb
Copy link
Member Author

trvrb commented Nov 1, 2023

The fetch and ingest workflow completed successfully. However, the artifacts ended up at s3://nextstrain-datafiles/workflows/mpox/branch/rename-to-mpox/. However, it looks good besides this. I'm assuming the best course of action is to merge this PR when it passes review and then to run fetch and ingest again from main so that the files are placed in the proper s3 location.

@trvrb
Copy link
Member Author

trvrb commented Nov 1, 2023

Thanks @victorlin for the fix with commit b14d6fd. I'm going to go ahead and merge this now.

@trvrb trvrb merged commit d8a8a97 into master Nov 1, 2023
1 check passed
@trvrb trvrb deleted the rename-to-mpox branch November 1, 2023 20:11
@trvrb
Copy link
Member Author

trvrb commented Nov 1, 2023

Fetch and ingest is running from master here: https://github.com/nextstrain/mpox/actions/runs/6724598765 and should update S3 files to specify s3://nextstrain-datafiles/workflows/mpox/. I'll monitor and update here when complete.

@joverlee521
Copy link
Contributor

We discussed redirection rules for monkeypox during today's dev chat and I added @tsibley 's suggested redirection rules to the nextstrain-data S3 bucket.

So now access through data.nextstrain.org should redirect to mpox

$ curl -I https://data.nextstrain.org/files/workflows/monkeypox/metadata.tsv.gz
HTTP/2 301 
content-length: 0
location: https://data.nextstrain.org/files/workflows/mpox/metadata.tsv.gz
date: Wed, 01 Nov 2023 23:55:24 GMT
server: AmazonS3
x-cache: Miss from cloudfront
via: 1.1 ac3f0425be668a2439884bb8cbd3ccd8.cloudfront.net (CloudFront)
x-amz-cf-pop: SFO53-C1
x-amz-cf-id: 14QBYLPsKkmgKazVcOsSE7kPxM8NRaCzXSOe-rcV6odttMGDfwU0EA==

Redirection for the s3:// paths are not possible. However, based on Cornelius's search, the s3:// paths are only used for our internal uploads and slack notifications.

@trvrb
Copy link
Member Author

trvrb commented Nov 2, 2023

Perfect! Thank you @joverlee521.

I can confirm that s3://nextstrain-data/files/workflows/mpox/metadata.tsv.gz and hence https://data.nextstrain.org/files/workflows/mpox/metadata.tsv.gz now exists as expected after running ingest above.

I deleted the monkeypox/ directory in files/workflows/ to reduce confusion. And as expected from your post above, https://data.nextstrain.org/files/workflows/monkeypox/metadata.tsv.gz works to retrieve the mpox/ equivalent.

I believe the next step here is to confirm that a phylogenetics rebuild works as expected. I'll kick this off now.

@trvrb
Copy link
Member Author

trvrb commented Nov 2, 2023

And actually... this testing already happened with the automatic updates. Earlier today actions including https://github.com/nextstrain/mpox/actions/runs/6735424381 ran and produced updated Auspice JSONs like mpox_all-clades.json. I believe we're set in this case. The corresponding issue #168 is now closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants