RFC: Monorepo #336
Conversation
After extensive testing in the PoC, everything is working: no existing functionality is broken, and a lot of things are simplified, merged, or otherwise improved. In my testing, the monorepo migration was a net positive, so I suggest we move forward with it.
rfcs/20221124-monorepo.md (outdated):

> 1. Central place for all technical documentation, enabling documentation for different parts of the stack to cross-reference other pieces and stay current with changes in other places.
> 1. Enables the infra to deploy the code to coexist with the code itself. Apart from the private secrets that will still need to be encrypted, the IaC files could be organised identical to the code.
> the IaC files could be organised identical to the code.

Just a nit, but this isn't entirely true. We could maybe spread some Terraform modules out this way, but it would be a pain to have so many places to search for modules. If the Terraform code is merged into the monorepo as well, we'd be better off assuming that it will live, in its totality, in a separate directory.
Ah, my lack of knowledge in the IaC space is really showing here. I would appreciate it if you could add any infra issues (especially any dealbreakers) and any infra benefits that you see popping up due to this merge. Thanks!
From an infra perspective I don't think there are any benefits, really. The only thing I could see is if we did this issue: https://github.com/WordPress/openverse-infrastructure/issues/220. The benefit of that issue could be a more easily reusable workflow with better variables in the deployment workflows we have now. Basically, everything used to define the deployment workflows is a set of variables that we could theoretically move into GitHub, though it's just more manual management of those details. We could try to sync them in other ways, like with a GitHub secret holding a JSON blob that gets passed through, so the workflow call is something like:
```yaml
- uses: ./actions/deploy.yml
  with:
    config: ${{ secrets.NUXT_DEPLOYMENT_CONFIG }}
```
And then the workflow parses the JSON to get the configuration.
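On the called side, that could look something like this (just a sketch; the `config` input and the `cluster`/`service` keys are illustrative, not the real shape of our settings):

```yaml
# deploy.yml: hypothetical callable deployment workflow that receives its
# whole configuration as one JSON blob and unpacks it with fromJSON().
name: Deploy
on:
  workflow_call:
    inputs:
      config:
        description: "JSON blob with all deployment settings"
        required: true
        type: string

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Deploy using parsed config
        env:
          CLUSTER: ${{ fromJSON(inputs.config).cluster }}
          SERVICE: ${{ fromJSON(inputs.config).service }}
        run: echo "Deploying $SERVICE to cluster $CLUSTER"
```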
Note: For folks without access to the infrastructure repository, the issue above describes switching to a callable workflow instead of a composite action. I've found composite actions a little more tedious, so I avoided trying to make the current deployment workflows super modular. I think a callable workflow would be easier to work with for a lot of reasons and could make a more modular/generic single-deployment workflow file a little more realistic.
We could do all of this in our current setup, but a monorepo would end up with fewer places to sync the secrets to and fewer places to put the deployment action.
Actually, that reminds me, if the infrastructure is in the same repository, the deployment workflow outputs could be automatically merged whenever changes are made to them. Right now we have to manually sync them from the infrastructure repository to wherever they're needed. So that would be a nice simplification of that process as well (one less thing to remember to do!) 🙂
rfcs/20221124-monorepo.md (outdated):

> - In fact, we employ a number of hacks to install and configure pre-commit for the frontend. Merging it with the API eliminates the need for such hacks.
> 1. The entire system can be integration tested during releases. The real API, populated with test data, can even replace the Talkback server.
Interesting idea! For this to be feasible, we would need to come up with a definitive solution to preventing the API from making network requests when it executes a search. Right now Playwright tests do not depend at all on any external network and that needs to remain the case.
I think it can be done (maybe something like an environment variable flag to not make requests) and I can envision it being glorious when it works.
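As a rough sketch of what the Playwright job could then do (assuming a hypothetical `API_DISABLE_NETWORK_CALLS` flag that the API would need to grow; the service and script names are illustrative too):

```yaml
# Sketch: run the real API as the Playwright backend with outbound
# network calls disabled via a (not yet existing) environment flag.
playwright:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v3
    - name: Start the API with test data and no outbound requests
      env:
        API_DISABLE_NETWORK_CALLS: "true"
      run: docker compose up -d api # "api" service name is illustrative
    - name: Run Playwright against the real API
      run: pnpm test:playwright # hypothetical script name
```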
rfcs/20221124-monorepo.md (outdated):

> 1. The entire system can be integration tested during releases. The real API, populated with test data, can even replace the Talkback server.
> The `WordPress/openverse-api` repo will absorb the `WordPress/openverse-frontend` repo. The `WordPress/openverse-catalog` will also be merged, _later_.
Will there be a period of time during which frontend work is happening only in `openverse-api`? Is the long-term plan to merge everything into `openverse-api` and then eventually rename that to just `openverse`?
Yes. The timeline will be:

1. Pause work on the frontend.
2. Merge `WordPress/openverse-frontend` into `WordPress/openverse-api`.
3. Transfer issues and other relevant GitHub models.
4. Get the frontend functional, enable deploys again.
5. Resume frontend work in `WordPress/openverse-api`.
6. Archive `WordPress/openverse-frontend`.
The one week to one fortnight period mentioned in the RFC is the time between step 1 and step 5.
rfcs/20221124-monorepo.md (outdated):

> #### Get the timing right
> The first step will be to release the frontend, call a code freeze and pause work on it. This is to prevent the frontend repo from continuing to drift as we merge a snapshot of it with the API.
Should we schedule a bug bash before this, so that we can identify and resolve any outstanding issues before we enter the period where we won't be able to do so without great disruption?
Yes, that would be ideal (maybe even necessary). Before embarking on this potential two-week freeze, it would be best for the frontend to be in a state that can go two weeks without maintenance.
If we have no high- or critical-priority tickets, that would indicate an optimal time to work on this move. Getting the timing right is one of the main things here.
If we are planning to do a bug discovery session during the Istanbul meetup in January, maybe targeting this for late February (or later/earlier depending on what bugs we find) could be good, provided everyone is on board. It'd be nice to know that ahead of time as well, so that we could block off the time and expect frontend feature work to stop during a specific period rather than just "sometime in the future". Did you have any particular alternative scheduling ideas in mind?
The bug bash we had in Venice was very productive so repeating it seems logical. But there are two potential issues with that:
- We only identified and filed issues in the last meetup, fixing them afterwards.
- I can't speak for everyone but my productivity immediately after the meetup was very low.
IMO the best time for this migration would be when all the frontend devs' productivity has gone down (less activity = easier migration), so if we could iron out the bugs before the meetup, the meetup would be the best time to migrate because we'd all be together to collaborate on the migration process.
We do need to set a date for this, but only after the iframe removal project is deployed and has been stable for a week or two. Other than that, we can make any time frame work.
Maybe we should try to do a bug bash before the meetup then. We could even start the monorepo migration during the meetup given the core contributors will not likely be spending much time making direct frontend contributions. That way we could easily pair on any unexpected issues that crop up.
Though that would be hard if not impossible to do before the iframe project, given the current blockers and ambiguous time frame for resolving them 😢
The period of the code freeze is a big concern for me.
Having a bug bash before it is a really great idea. It's really difficult to plan, because we probably need a sync meeting to identify bugs, [very rough estimates] a day or two to document all of them, and then 1-3 weeks to fix everything, especially considering the slow pace of the PR review process. We would probably only want to fix the `high` and `critical` priority bugs.
I wonder how this process would be affected by the features that we plan to work on now: navigation menu improvements and the homepage. Also, if we plan to work on the addition of analytics next, would that need to be finished before the monorepo move, or fully postponed till after the move?
Starting the monorepo migration during the meetup is great for the reasons Sara mentions. However, I'm not sure that the process itself is a great use of our time at the meetup. Someone will have to be working on the migration steps during the meetup and in the week after that.
rfcs/20221124-monorepo.md (outdated):

> This is a quick process.
> 1. Move the entire content of frontend inside a `frontend/` directory, except the following top-level files and folders. Please comment if you can add to this list.
What is the plan for making `pnpm` work in the monorepo? Do we want to go ahead and make `pnpm` live at the root of the repository and treat the frontend as a `pnpm` monorepo package? https://pnpm.io/workspaces
I think this would be necessary for us to be able to (reasonably) merge the browser extension and the JS that is currently in WordPress/openverse into the monorepo... otherwise we'll be fighting weird fights with pnpm in different directories 🤔
That was an issue. Currently I had everything working with `pnpm` with `package.json` moved into the `frontend/` directory (no package manager at the root), but I definitely should look into the workspaces feature.
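If we adopt workspaces, the root config could be as small as this (a sketch; only `frontend` exists today, and the `packages/*` entry is a hypothetical future home for the browser extension and other JS):

```yaml
# pnpm-workspace.yaml at the repository root
packages:
  - "frontend"
  # Hypothetical future home for the browser extension and other JS packages:
  - "packages/*"
```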
rfcs/20221124-monorepo.md (outdated):

> - `.prettierignore` (symlink into the `frontend/` directory)
> - `.eslintrc.js` (symlink into the `frontend/` directory)
> - `.eslintignore` (symlink into the `frontend/` directory)
Why do these need to be symlinked?
That's an error on my part. I was performing lots of experiments in https://github.com/dhruvkb/monopenverse and struggling to get linting to work, so I tried almost everything. When I finally got it to work, I forgot to update the RFC with the right approach.
Fixed in f2122d8.
> I will need more information about this because IANAL.
> - LICENSE (both repos)
Given Openverse still uses MIT, we probably have very little to worry about here, FWIW.
Nice, but I had a vague feeling that merging repositories could have legal implications for those that use them somehow.
> - CONTRIBUTING.md (both repos)
> - CONTRIBUTORS.md (API only; also why?)
> - DOCUMENTATION_GUIDELINES.md (API only)
> - TESTING_GUIDELINES.md (frontend only)
We would probably want to move this into the frontend directory, I would think. It is extremely frontend specific. Or would it go into the Sphinx docs?
My long-term goal is for Sphinx docs to be the single source of reference for all developers, not just API devs.
Some Markdown files that people generally look for in the repo root could be left there but ideally, even they would just point to the relevant Sphinx pages for the real stuff.
I love that plan!
rfcs/20221124-monorepo.md (outdated):

> ## Migration path
> First we will merge the API and the frontend. This decision was made for the following reasons.
This reasoning makes a lot of sense to me, and I'm sold on merging API/frontend first and separately from the Catalog. How much more difficult do you anticipate eventually merging the Catalog to be? Do you think there's value in planning that work first, or can that be tabled until after the API/frontend are merged?
I am fairly confident that merging the catalog would not cause major pain later on, mostly because it's another Python project. The API and ingestion server are both independent Python projects that have coexisted in harmony since the beginning so adding a third does not seem problematic.
We could definitely plan for the eventual catalog-API merge, but I abstained from that in the RFC because
- I am not the most experienced with the catalog so I felt a bit underqualified to write about that in the RFC
- I didn't want to delay the RFC till I could understand the nuances of the catalog enough to include the portion about merging it
One note regarding translations I'd like to make sure is mentioned in this RFC: we will have to contact a member of the WP Meta team (who works on wp.org) to change the path at which GlotPress reads our translation files, to both update to the new repository and filepath. I believe the current path used is
rfcs/20221124-monorepo.md (outdated):

> - `.eslintignore`
> - <s>`.gitignore`</s> (better to move it into the `frontend/` directory and update some absolute paths)
> 1. Create the final commit on the `WordPress/openverse-frontend` repo. After the merge we might want to add a notice about the migration to the `README.md` file but GitHub's built-in archival process could suffice here.
rfcs/20221124-monorepo.md (outdated):

> The first step will be to release the frontend, call a code freeze and pause work on it. This is to prevent the frontend repo from continuing to drift as we merge a snapshot of it with the API.
> This can prove difficult given how productive our team is, so we will need to channel this productivity towards the catalog in the meantime. I can foresee the end-to-end migration taking one week (ideal scenario) to one fortnight (worst case scenario).
> I can foresee the end-to-end migration taking one week (ideal scenario) to one fortnight (worst case scenario).
@dhruvkb Could you unpack this time estimate a bit more for me? I am curious because I could see these tasks taking 1-2 hrs or so and being treated like a deploy, where a few folks work on it together synchronously. I also think that would have the benefit of minimizing the disruption to contributors. Are there long-running tasks I'm not considering?
I would also like to see some concrete implementation details about, for example, how we plan to make a PR for some of the meta/internal changes to the repo (the unified CODEOWNERS file, a new README, the modified actions, etc.). Could we have those changes prepared in a draft PR that we only merge after the frontend `main` branch has been combined with the API? And regarding that frontend step, do we plan to do it via PR in the GitHub UI? Details like that will give me much more confidence in our final approach.
The missing implementation details (such as writing new documentation, updating references) are what will take the maximum time in the 1-2 week range I proposed for this plan.
The technical merge is quick, and getting the functionality working is a day or two. The remaining 3 days are for identifying these issues and fixing them. I kept 5 more days as a backup in case something unforeseen comes up, but given we do sufficient planning beforehand, we will not need the second week (underpromise, overdeliver 😉).
It might be hard to make a draft PR with the changes (because of the merge conflicts), but one thing we could do is start a new branch, do the merge on it and eventually replace `main` with it. That seems like a great way to see everything in action, but a lot of branch-name-dependent actions will be affected when we promote the other branch.
In any case, the frontend cannot be merged via the GitHub UI; it will have to be done over the CLI. In all my usage of GitHub I don't think it's possible to make a PR from one arbitrary repo to another: the GitHub UI only provides merge functionality from a fork to the upstream, which is not the case here.
I'd still really like to see a breakdown of the tasks involved in the merge, their execution order, and a time estimate for each step. We could do that when we write an implementation plan for this RFC. Two days maximum for a code freeze isn't too disruptive but definitely a concern in the case of high priority or critical bugs we might identify, even with a bug bash and small code freeze beforehand.
This looks great, @dhruvkb. I'm really excited to move forward with this and hope it's what the team wants. Some questions:
When I cloned, I observed one problem with Prettier in VS Code: the root-level
Also, 👍 the combined codeowners file is excellent!
rfcs/20221124-monorepo.md (outdated):

> 1. The entire system can be integration tested during releases. The real API, populated with test data, can even replace the Talkback server as long as we can turn off all network calls and enable 100% reliable output.
> 1. The entire system can be integration tested during releases. The real API, populated with test data, can replace the Talkback server as long as we disable network calls and make output deterministic.
I love this! 🚀
We would have to make sure that the API response types never differ from the sample data, though. For example, if we say that the license URL is always present, and it's present in the test data, but absent in the real API responses, that would break the app even though the tests pass.
Next step would be integrating the catalog so the supplier of the data would also be a part of the tests 👍.
1. Create "stack: \*" labels to help with issue and PR management. | ||
Spoiler/foreshadowing: these labels will be used for more things later. |
Perhaps this should happen before the migration? If we created the labels in advance, we could then apply each label to all issues in the respective repositories using the cli (I think this would be a very simple command to write), before transferring. The labels would be preserved during the transfer.
Yes, that makes more sense than transferring the issues, keeping a list of them and then applying labels.
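For example, something like this with the GitHub CLI (a sketch; the label name and color are placeholders):

```sh
# Create the label, then bulk-apply it to every open issue in the
# frontend repo; the transfer will preserve the labels.
gh label create "stack: frontend" --repo WordPress/openverse-frontend --color C2E0C6
gh issue list --repo WordPress/openverse-frontend --state open --limit 1000 \
  --json number --jq '.[].number' |
  xargs -I{} gh issue edit {} --repo WordPress/openverse-frontend \
    --add-label "stack: frontend"
```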
> working with split repos quite productively for over a year. My proposition is
> that the monorepo solutions are better than workarounds.
> The [integration](#step-4-integration) section in the latter part of the
This appears out of date, as Step 4 now refers to a documentation merge.
The rationale and migration path look great to me. I am really sold on the idea of a monorepo and excited for the benefits I foresee.
I still think, however, that our implementation plan needs more refinement. What is here is a great start, but there are a lot of specific cleanup items and steps I would like to see documented.
For this reason, I would approve this PR as-is (with implementation details removed) if we create a separate implementation plan. Otherwise there are things I would like to see expanded on:
1. Infra repo considerations
In the infra repository, we have code which syncs many things across the repositories:
- The required status checks and branch protections
- The collaborators and their levels of access
- Mapping secrets across multiple repositories (which you do mention)
What should the timing of changes be there? My thought would be that we should create a PR in advance which corrects the collaborators, repos, and secrets mapping by removing the to-be-archived repos. Then we would apply the terraform changes and merge the PR right after the actual repos are migrated.
2. Small details
Some small thoughts that are important but I'm not sure where to put them:
- What preparation of the frontend and API repos do we need to do before the merge? No open PRs? For how long?
- We should make the PR minimum review count for each repo > 2 on the day of the monorepo merge, to block any accidental PR merges.
3. The timing and implementation order
I suspect we could do much of the implementation work in advance, to dramatically simplify the day-of changes necessary to implement the monorepo. The idea would be to do as much work non-destructively and in advance as possible, so that our code freeze can be as short as possible. The general idea would be:
- Create all stack labels far in advance, and sync them to all repositories. On the day of launch, use a CLI command to bulk-apply `frontend` and `api` stack labels to all open issues in each respective repository. When issues are transferred later, the labels will be preserved. We can also make the "this repo is archived" notice PRs in advance of the actual monorepo merge too.
- In `WordPress/openverse`, create a "monorepo shared files" PR containing all of the merged/unified files from `monopenverse` that you listed:
| Expected conflicts | Resolution |
| --- | --- |
| `.github/CODEOWNERS` | Folderwise separation of owners, use from @dhruvkb/monoverse |
| `.prettierignore` | Merge entries from all repos, use from @dhruvkb/monoverse |
| `.pre-commit-config.yaml` | Python + Node.js linting, use from @dhruvkb/monoverse |
| `justfile` | Split into many smaller `justfile`s and one root `justfile` |
| `.gitignore` | Split into many smaller `.gitignore`s and one root `.gitignore` |
| `CONTRIBUTING.md` | Write new, referencing doc site |
| `README.md` | Write new, referencing doc site |
This PR would be merged on the day of the monorepo migration as a first step. Crucially here, we address and fix all 'Step 4. Documentation merge' issues prior to the monorepo merge, rather than after it has happened.
- In `frontend`, create a branch which moves all code into the `frontend/` directory and removes the root-level docs which would conflict with the monorepo. On the day of the merge, we make sure this branch is up to date with `main` changes and ready to go. When we combine it with the `openverse` repo, we use the command you mentioned (`git pull frontend main --allow-unrelated-histories`) but with the name of this created branch instead of `main`.
- The same approach is followed for the API, creating a branch with any conflicted files removed.
- After the root-level doc and action PR is merged, both histories are immediately merged in and the issue transfer commands are run. Then we apply the Terraform changes from the infra PR, with the new list of combined required status checks, secrets to be synced, and so on. Finally, we archive the repos.
This general approach allows us to have all of the combined root-level files prepared in advance in one PR, and to make the repo-specific file changes in advance on a branch before the actual monorepo merge. I believe it would reduce our "down time" to a few hours on one day; the history merge itself is sketched below.
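For reference, the day-of history merge would be something like this (a sketch; `monorepo-prep` stands in for the prepared branch described above):

```sh
# From a checkout of the target repository:
git remote add frontend https://github.com/WordPress/openverse-frontend.git
git fetch frontend
# Join the two unrelated histories using the prepared branch:
git pull frontend monorepo-prep --allow-unrelated-histories
```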
I would also suggest we have 1-3 people working on it synchronously like a launch to iron out issues on the day.
Does that make sense? Is there something major I'm overlooking? I really appreciate your acknowledgement that there will definitely be things we miss, and things that might break after combining the repos.
In general, I want to be really, really careful and detailed about the actual implementation plan before we execute this. The best approach might be to keep this open indefinitely, after holiday AFK, for further implementation discussion, rather than making two separate documents. Alternatively, like I said, I would support approving this PR if the plan was to make a separate implementation plan.
I'll also be awaiting my co-reviewer @sarayourfriend's thoughts, which I will take into consideration too.
> With this done, the development on the API and frontend can continue inside their subdirectories. The development of both parts will be independent, at least until we reach [long-term consolidation](#step-5-long-term-consolidation).
> Workflows with major changes:
> - `ci_cd.yml` from the API will absorb `ci.yml` from the frontend
What approach will we take to avoid running API tests when only frontend code is changed and vice-versa? Combining the CI workflows will preclude us from using the workflow `paths` filter. Did you have a different solution in mind?
We can use the `paths-filter` action to do that.
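Roughly like this per job (a sketch; the filter name, paths, and test command are illustrative):

```yaml
# Sketch: guard a single job's steps with dorny/paths-filter.
name: CI
on: pull_request

jobs:
  test-frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: dorny/paths-filter@v2
        id: filter
        with:
          filters: |
            frontend:
              - "frontend/**"
      - name: Run frontend tests
        if: steps.filter.outputs.frontend == 'true'
        run: just frontend/test # hypothetical recipe name
```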
rfcs/20221124-monorepo.md (outdated):

> The soon-to-be-ECS-based API and ECS-based frontend will continue to deploy to staging via the CI + CD pipeline, with deployment starting as soon as all CI checks have passed. They will use similar code as the frontend auto-deploy for staging used currently.
> These will be separate jobs with specific [path-based filters](https://github.com/dorny/paths-filter).
What is the benefit of using this `paths-filter` action rather than the built-in GitHub workflow `paths` settings for workflow triggers? Similar to my other question: https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#using-filters-to-target-specific-paths-for-pull-request-or-push-events
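That is, something like:

```yaml
# Built-in workflow-level path filtering: the entire workflow is skipped
# unless a matching file changed.
on:
  push:
    paths:
      - "frontend/**"
```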
This action allows us to filter jobs instead of applying the filter to the entire workflow. That way we can have a complete workflow where jobs run based on the outcome of other jobs with `needs`. For example, we can run the jobs that deploy the docs to GH Pages and the Docker images to GHCR only when the CI jobs have passed.
If these jobs were in separate workflows, I don't think it's possible to run one workflow's jobs only when jobs in a different workflow have passed.
It is possible to have cross-workflow dependencies, actually: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_run
I only have two concerns with using the action:
- Extremely repetitive and annoying code to maintain in the workflows (we'd need to guard each frontend/API job with the same repeated step at the top of the job). Note that we could use YAML anchors to fix this slightly (still more repetition here compared to `paths`), but `actionlint` does not currently support them. We'd either need to contribute an upstream PR or stop using `actionlint`, the latter of which I think is unwise given how helpful it is at catching syntax errors locally.
- It still actually uses a runner and worker time to complete, whereas the built-in `paths` filter bypasses the run entirely, therefore not eating up any of our quota. Considering the sheer number of jobs we're going to end up having in a single repository by the end of this migration, cutting down on the number of jobs that are run is prudent in this respect. This is one reason why we added all the concurrency configuration to our workflows, for example.
If you think either of these are not a concern, then that is fine. The first should be documented as an expectation and accepted trade off. I think the second would bite us later on, but I can accept that I may be being too cautious.
One more thing that crossed my mind as I was writing this is whether our concurrency settings will need to change for jobs. My hunch is no, but it's worth looking into to confirm as they can be quite nasty to think through.
`workflow_run` workflows depend on the "OR" combination of multiple workflows. This is different from `needs`, which depends on the "AND" combination of multiple jobs.
The annoying-to-maintain code might be true, but it can be overcome by abstracting the `paths-filter` job into a separate action with some `inputs` (similar to the `setup-env` action), thus limiting the annoyance to a single file.
Point 2 could be an issue, as it will spin up a job and cancel it if the criterion fails. An alternative could be to have jobs that only match the condition of the paths; let's say jobs called `is_api`, `is_frontend`, etc. The other jobs can `needs` on them so that they're skipped if their code paths are untouched. I think it will be a bit of a hassle to manage the pass/fail/skip cascades though, so it's something that needs a bit of a dry-run first.
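Sketched out (job names, filters, and paths are illustrative):

```yaml
# Sketch: a dedicated detection job gates everything downstream via `needs`.
name: CI
on: pull_request

jobs:
  is-frontend:
    runs-on: ubuntu-latest
    outputs:
      changed: ${{ steps.filter.outputs.frontend }}
    steps:
      - uses: actions/checkout@v3
      - uses: dorny/paths-filter@v2
        id: filter
        with:
          filters: |
            frontend:
              - "frontend/**"

  frontend-tests:
    needs: is-frontend
    # Skipped entirely (no runner spun up) when no frontend paths changed.
    if: needs.is-frontend.outputs.changed == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "frontend tests run here"
```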
> The other jobs can `needs` on them so that they're skipped if their code paths are untouched. I think it will be a bit of a hassle to manage the pass/fail/skip cascades though, so it's something that needs a bit of a dry-run first.

I hadn't considered that: using a single job (or a handful of jobs) that determines which part of the project is being changed. It could actually be more flexible, as the jobs could be `is-frontend` as you suggested, but also, if we know of specific cross-app dependencies that are fragile, we could have jobs for those as well (`is-<whatever>-endpoint`, etc.). Then `needs` on other jobs actually sets up excellent self-documenting workflows and does not cause all the actions to run even if they're not needed. We'll have "failed" workflows, I guess, for when one of the project detection jobs doesn't match. I wonder how we would enforce "required" actions for different parts of the application? Can a job be required but skipped due to not matching the `needs`? I don't think so, in which case the secondary callable workflow you mentioned, that takes a list of the relevant project parts for the job, while still tedious, at least allows some level of abstraction. Then we can still have required jobs that just automatically pass if they aren't relevant to the part of the code being changed.
rfcs/20221124-monorepo.md (outdated):

> ##### Production
> For production, we will not be able to use GitHub Releases and will have to use a manually-triggered workflow to build the assets and tag them. The tag can be set via a workflow input (simple) or can be calculated based on the update scope of major, minor or patch (not as simple).
Another way to simplify this is for us to switch to SHA-based tags for release deployment. Rather than creating a new build, just use the build tagged with the `HEAD` SHA.
The slight issue with SHA tags would be that it's harder to identify versioning with them. Given two hashes, it's not possible to say which is the newer version.
Ah, good point!
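For reference, the workflow-input option mentioned in the RFC excerpt above could be as simple as this (a sketch):

```yaml
# Manually-triggered release workflow; the tag is supplied as an input.
name: Release
on:
  workflow_dispatch:
    inputs:
      tag:
        description: "Version tag to build and publish (e.g. v3.4.1)"
        required: true

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Building and tagging ${{ github.event.inputs.tag }}"
```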
I would prefer that the implementation plan be here and thoroughly discussed in one place rather than a partially complete RFC that's approved and merged without a concrete implementation plan. Keeping this PR open seems like a fair tradeoff for centralising all conversation about the entire undertaking. I've already started work on eliminating some possible merge conflicts (such as the
Based on the medium urgency of this PR, the following reviewers are being gently reminded to review this PR: @AetherUnbound. Excluding weekend days, this PR was updated 4 day(s) ago. PRs labelled with medium urgency are expected to be reviewed within 4 weekdays. @dhruvkb, if this PR is not ready for a review, please draft it to prevent reviewers from getting further unnecessary pings.
Originally it was very important for me to have an implementation plan outlined in this PR, but as I understand it, @dhruvkb has completed multiple dry-runs of this migration in testing environments and is confident in the procedure to run it. Since this is also a one-off migration which will be executed by @dhruvkb, and the timing of it occurring after our openverse.org launch tomorrow is ideal, I am comfortable approving this PR as-is.
Once our project process is more refined, I'll be stricter about adhering to it.
Looking forward to it!
Fixes
Fixes #205 by @zackkrida
Description
This PR writes the monorepo RFC.
Proof of Concept can be seen at <s>https://github.com/dhruvkb/monopenverse</s> https://github.com/dhruvkb/monoverse.