Write the monorepo RFC
dhruvkb committed Nov 30, 2022
1 parent 2eb31d5 commit b49d7fc
Showing 1 changed file with 178 additions and 0 deletions.
178 changes: 178 additions & 0 deletions rfcs/20221124-monorepo.md
@@ -0,0 +1,178 @@

# RFC: Monorepo

**Status:** 🚧 WIP, comments are welcome nonetheless

## Reviewers

- [ ] <your name here>
- [ ] <your name here>

## Rationale

For a comprehensive discussion of the pros, the cons and the counterpoints to each, see the [discussion](https://github.com/WordPress/openverse/issues/192). Repeating that debate is not the purpose of this RFC.

This RFC summarily lists the benefits and then, with the twin assumptions of a monorepo being ultimately beneficial and the decision to migrate being finalised in the above discussion, proceeds to go into the implementation details.

### Exclusive benefits of monorepo

This only includes things that cannot be accomplished without the use of a monorepo.

1. Single place to go for issues, PRs and all activity. Currently tickets are scattered across several repos, and any ticket that affects more than a single layer must be opened in each of the affected repos.

1. Singular copy (different from synced independent copies) of scaffolding code such as Git hooks, lint rules and common workflows. This is distinctly better than elaborate sync workflows.

1. Central place for all technical documentation, enabling documentation for different parts of the stack to cross-reference other pieces and stay current with changes in other places.

1. Enables the infrastructure code that deploys the application to coexist with the application code itself. Apart from the private secrets, which will still need to be encrypted, the IaC files could be organised identically to the code.

1. Milestones that span multiple layers of the stack are only possible within a single repo. GitHub scopes milestones to one repository and there is no workaround for this.

The [integration](#step-4-integration) section in the latter part of the document describes more interesting outcomes made possible by the monorepo. They may or may not be exclusive to monorepos but they're surely made easier by it.

## Migration path

First we will merge the API and the frontend. This decision was made for the following reasons.

1. API and frontend are tightly linked. The frontend is a direct consumer of what the API produces.

1. The API and frontend form the "service" side of Openverse that directly faces the users (both API consumers and search engine users).

1. The frontend uses ECS deployments and the API is well on the same track. This makes it possible for them to share some deployment code.

1. To the RFC author, the API and frontend are very familiar so merging them would be easier. Adding a third component would make the task daunting.

1. Merging incurs a productivity hit during the initial transition, so merging everything in one fell swoop is not ideal.

1. The API’s comprehensive tooling for developer documentation can benefit frontend devs and create a unified docs site for contributors.

1. The API is already organised by stack folders so the `frontend/` directory will fit right in with the others like `api/` and `ingestion_server/`.

1. The API and frontend share identical tooling for Git hooks, linting and formatting. We will fight our tools less and encounter minimal friction.

   - In fact, we employ a number of hacks to install and configure pre-commit for the frontend. Merging it with the API eliminates the need for such hacks.

1. The entire system can be integration tested during releases. The real API, populated with test data, can even replace the Talkback server.

The `WordPress/openverse-api` repo will absorb the `WordPress/openverse-frontend` repo. The `WordPress/openverse-catalog` repo will also be merged, _later_.

### Reference

I'm following the steps listed below in a fork at [@dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse/). You can refer to the fork, but note that it comes from a place of haste and has not been treated with the same level of love and care that the final migration will receive.

### Step 0: Prerequisites

#### Get the timing right

The first step will be to release the frontend, call a code freeze and pause work on it. This is to prevent the frontend repo from continuing to drift as we merge a snapshot of it with the API.

This can prove difficult given how productive our team is, so we will need to channel this productivity towards the catalog in the meantime. I can foresee the end-to-end migration taking one week (ideal scenario) to one fortnight (worst case scenario).

### Step 1: Merge with histories

This is a quick process.

1. Move the entire content of frontend inside a `frontend/` directory, except the following top-level files and folders. Please comment if you can add to this list.

   - `.github/`
   - `.editorconfig`
   - `justfile`
   - `.pre-commit-config.yaml`
   - `.prettierignore` (symlink into the `frontend/` directory)
   - `.eslintrc.js` (symlink into the `frontend/` directory)
   - `.eslintignore` (symlink into the `frontend/` directory)
   - <s>`.gitignore`</s> (better to move it into the `frontend/` directory and update some absolute paths)

1. Create the final commit on the `WordPress/openverse-frontend` repo. After the merge we might want to add a notice about the migration to the `README.md` file, but GitHub's built-in archival process could suffice here.

1. Merge the frontend repo's `main` branch into the `WordPress/openverse-api` repo's `main` branch (see Git docs for `--allow-unrelated-histories`). There will be some conflicts but they will be small and infrequent. [[implementation details](#conflict-resolution)]

1. Create "stack: \*" labels to help with issue and PR management. Spoiler/foreshadowing: these labels will be used for more things later.

1. Migrate issues from `WordPress/openverse-frontend` to `WordPress/openverse-api`. @obulat has done prior work in this department (when we migrated from CC Search to Openverse) but that might not be as useful because in this case, we can directly transfer the issues, retaining all their comments. Apply the "stack: frontend" label to them. [[implementation details](#issue-transfer)]

With this done, we can archive the frontend repo.
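
The move and merge in this step can be sketched as a small planning script. This is only an illustration under stated assumptions: the function names, the `frontend` remote name and the inclusion of `.git` in the keep-list are mine and not part of the RFC, and the script only builds the plan rather than running any Git command.

```python
from pathlib import PurePosixPath

# Top-level entries that stay at the repo root, per the list in step 1.
# `.git` is an added assumption: the Git directory itself must never move.
KEEP_AT_ROOT = {
    ".git",
    ".github",
    ".editorconfig",
    "justfile",
    ".pre-commit-config.yaml",
    ".prettierignore",
    ".eslintrc.js",
    ".eslintignore",
}


def plan_moves(top_level_entries, subdir="frontend"):
    """Return (source, destination) pairs for everything that moves into `subdir`."""
    moves = []
    for entry in sorted(top_level_entries):
        if entry in KEEP_AT_ROOT or entry == subdir:
            continue
        moves.append((entry, str(PurePosixPath(subdir) / entry)))
    return moves


def merge_commands(frontend_url, remote="frontend", branch="main"):
    """Git commands to run inside a clone of the API repo.

    The remote name and URL are hypothetical placeholders.
    """
    return [
        ["git", "remote", "add", remote, frontend_url],
        ["git", "fetch", remote],
        ["git", "merge", "--allow-unrelated-histories", f"{remote}/{branch}"],
    ]


if __name__ == "__main__":
    for source, destination in plan_moves([".github", "src", "package.json", ".gitignore"]):
        print("git mv", source, destination)
    for command in merge_commands("https://example.invalid/frontend.git"):
        print(" ".join(command))
```

Note that `git mv` needs the `frontend/` directory to exist before the first move, so it has to be created up front.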

#### Conflict resolution

The following conflicts may occur during merge.

- `.prettierignore`: concatenate
- `.pre-commit-config.yaml`: use from [dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse)
- Workflows can conflict but they can be renamed and kept alongside each other, _for now_.
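
For the `.prettierignore` case, "concatenate" could look like the sketch below. Dropping exact duplicate patterns is my assumption and not something the RFC asks for, and the function name is hypothetical. Whether the frontend patterns also need a `frontend/` prefix after the move is not handled here.

```python
def concatenate_ignore_files(api_text, frontend_text):
    """Concatenate two ignore files, dropping exact duplicate patterns."""
    seen = set()
    merged = []
    for line in api_text.splitlines() + frontend_text.splitlines():
        pattern = line.strip()
        # Blank lines and comments are kept as-is; only real patterns are deduplicated.
        if pattern and not pattern.startswith("#"):
            if pattern in seen:
                continue
            seen.add(pattern)
        merged.append(line.rstrip())
    return "\n".join(merged) + "\n"


if __name__ == "__main__":
    print(concatenate_ignore_files("dist/\nnode_modules/\n", "node_modules/\n.nuxt/\n"), end="")
```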

#### Issue transfer

As far as I can tell, issue transfer can only be performed via the GitHub GraphQL API ([docs](https://docs.github.com/en/graphql/reference/mutations#transferissue)) and not via the REST API. From my limited testing, transferred issues seem to retain labels (provided they exist in the target repo).

An implementation of the GraphQL API call (albeit in Go) is available in `hub` and the [code for it](https://github.com/github/hub/commit/4c2e44146988dfb385a26f649298f274a5017756) is available in their GitHub repo for reference.

However, instead of writing the code ourselves, we can install `hub` and run a small script that calls `hub` once per issue to migrate them one by one. That'll be a hack but it's okay since this is a one-off task.
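
For reference, the underlying `transferIssue` mutation is small enough to sketch directly. This is not the plan of record, only an illustration: the function names are hypothetical, the token is assumed to have write access to both repos, and the mutation takes GraphQL node IDs (not issue numbers), which would have to be looked up first.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.github.com/graphql"

TRANSFER_ISSUE = """
mutation($issueId: ID!, $repositoryId: ID!) {
  transferIssue(input: {issueId: $issueId, repositoryId: $repositoryId}) {
    issue { number url }
  }
}
"""


def build_transfer_payload(issue_node_id, target_repo_node_id):
    """Build the JSON body for one transferIssue call."""
    return {
        "query": TRANSFER_ISSUE,
        "variables": {
            "issueId": issue_node_id,
            "repositoryId": target_repo_node_id,
        },
    }


def transfer_issue(token, issue_node_id, target_repo_node_id):
    """Send the mutation and return the decoded JSON response."""
    body = json.dumps(build_transfer_payload(issue_node_id, target_repo_node_id))
    request = urllib.request.Request(
        GRAPHQL_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Since labels only seem to survive the transfer when they already exist in the target repo, the "stack: \*" labels should be created before any call like this is made.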

### Step 2. Restore workflows

The workflows of both the API and frontend will need some refactoring to start, and pass, again. The [monopenverse](https://github.com/dhruvkb/monopenverse) fork has already updated these workflows; their status is listed below.

[monopenverse](https://github.com/dhruvkb/monopenverse) showcases a `setup-env` action that sets up Node.js, Python, Just and other dependencies and can be used in every workflow.

The `ci_cd.yml` workflow from the API has been very nicely combined with the `ci.yml` workflow from the frontend. Redundant steps were eliminated.

The following workflows have been successfully combined:

- actionlint ✅
- bundle_size.yml ✅
- ci_cd.yml (API) + ci.yml (frontend) [merged]
- Playwright tests from ci.yml (frontend) ✅
- draft_release.yml ✅
- generate_pot.yml ✅
- gh_pages.yml ✅
- migration_safety_warning.yml ✅
- subscribe_to_label.yml ✅
- label_new_pr.yml ✅
- pr_closed.yml ✅
- pr_label_check.yml ✅
- new_issues.yml ✅
- pr_ping.yml ✅

The following have not been verified to work:

- renovate.yml
- rollback.yml
- ghcr.yml
- push_docker_image.yml

With this done, the development on the frontend can continue inside the subdirectory.

### Step 3. Buff the rough edges

There will be a few rough edges that I cannot foresee and we can fix those as we spot them. But by this point we should be in a position where we can continue to build the API and the frontend independently but from one repo.

1. The action `banyan/auto-label` will need to be configured (`auto-label.json`) to add the "stack: \*" labels based on the modified directory.
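
The directory-to-label mapping that the configuration needs to encode can be sketched as follows. This is not the action's config format, only the logic it should reproduce. The directory names come from the migration rationale above; the label names other than "stack: frontend" are hypothetical.

```python
from pathlib import PurePosixPath

# Top-level directory -> label. Only "stack: frontend" is named in this RFC;
# the other label names are hypothetical placeholders.
STACK_LABELS = {
    "frontend": "stack: frontend",
    "api": "stack: api",
    "ingestion_server": "stack: ingestion server",
}


def stack_labels(changed_paths):
    """Return the sorted "stack: *" labels implied by a PR's changed paths."""
    labels = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        # Files at the repo root have a single part and belong to no stack.
        if len(parts) > 1 and parts[0] in STACK_LABELS:
            labels.add(STACK_LABELS[parts[0]])
    return sorted(labels)


if __name__ == "__main__":
    print(stack_labels(["frontend/src/pages/index.vue", "api/catalog/urls.py", "README.md"]))
```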

### Step 4. Integration

This is the long term combination of code for the frontend and the API.

#### Combined lint

All lint steps can be combined in `.pre-commit-config.yaml`. This also simplifies CI, since the separate lint jobs can now be merged.

See the combined lint in action at [monopenverse](https://github.com/dhruvkb/monopenverse).

### Step 5. Documentation merge

The following documentation files will need reorganisation or merge.

- README.md (both repos)
- CODE_OF_CONDUCT.md (both repos)
- CONTRIBUTING.md (both repos)
- CONTRIBUTORS.md (API only; also why?)
- DOCUMENTATION_GUIDELINES.md (API only)
- TESTING_GUIDELINES.md (frontend only)
- DEPLOYMENT.md (frontend only)

I will need more information about the following file because IANAL.

- LICENSE (both repos)
