Preliminary Implementation of Wiki Farm Support for Canasta #250

Closed
wants to merge 16 commits into from

Conversation

chl178
Contributor

@chl178 chl178 commented Jun 5, 2023

Related Issue #57

This pull request outlines my progress on the project as part of the Google Summer of Code (GSoC) program. The objective of the project is to implement support for a wiki farm in Canasta, a Docker-based MediaWiki distribution.

Background

Canasta offers a streamlined way to set up a feature-rich MediaWiki instance on virtually any server. However, it does not currently support running multiple wikis (a wiki farm) within the same container. This project aims to fill that gap. The wikis could differ by directory (e.g., example.com/a, example.com/b), by subdomain (e.g., a.example.com, b.example.com), or by entirely separate domain (e.g., example1.com, example2.com).

In addition, this project intends to extend Canasta's command-line interface (CLI) to support wiki farm configuration. This would allow administrators to effortlessly create, manage, and delete individual wikis.

Implementation Approach

To simplify management, I've implemented a shared settings file that applies to all wikis. In addition, each wiki is identified by a unique ID and has its own settings file, so skins, extensions, and configuration can be defined per wiki. In this way, multiple wikis can run independently in a single container.

For wikis under directories, I have written a script, generatewikihtaccess.sh, that automatically generates the .htaccess files used to manage access permissions.
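To make the lookup idea concrete, here is a minimal shell sketch of mapping a request's host and path to a wiki ID. It is only an illustration: the loader in this PR is PHP (FarmConfigLoader.php), the function name `resolve_wiki_id` is hypothetical, and the hard-coded host names stand in for whatever wikis.yaml actually contains.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: map "host/first-path-segment" to a wiki ID.
# The real mapping is read from wikis.yaml; the hosts below are placeholders.
resolve_wiki_id() {
  local host="$1" path="$2"

  # Reduce "/a/index.php?title=X" to its first segment, "a".
  local segment="${path#/}"
  segment="${segment%%[/?]*}"

  case "$host/$segment" in
    example.com/a)    echo "a" ;;      # wiki selected by directory
    example.com/b)    echo "b" ;;
    k.example.com/*)  echo "k" ;;      # wiki selected by subdomain
    example2.com/*)   echo "local" ;;  # wiki selected by separate domain
    *)                return 1 ;;      # unknown host or directory
  esac
}
```

Calling `resolve_wiki_id example.com /a/index.php` prints `a`, and an unrecognised host makes the function return a non-zero status, which a caller could turn into a 404.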

Test

As you can see, three different wikis (a, k, and local) run independently in one container.
Screenshot from 2023-06-04 19-33-23
Screenshot from 2023-06-04 19-34-56
Screenshot from 2023-06-04 19-35-12
They are under different domain names and different directories.

Next Steps

While this initial implementation is a significant step towards our goal, there is still considerable work to do. The next phase of this project will focus on developing the Canasta CLI to make it even more user-friendly and efficient in managing wiki farms.
I'm looking forward to feedback and suggestions on how we can improve this implementation and successfully deliver the project's objectives.

@chl178
Contributor Author

chl178 commented Jun 5, 2023

For people who want to test it locally

As this is an early version of wiki farm support for Canasta, it does not yet support CLI installation and can only be installed manually using docker-compose. You may refer to the official docker-compose manual installation documentation for guidance.
The test template for docker-compose.

  1. Modify your hosts file to add an alias for 127.0.0.1 and restart the Apache service.
    Screenshot from 2023-06-04 19-54-27
  2. Download my Canasta code and build an image with: docker build -t canastafarm .
  3. Replace the web image in the docker-compose.override.yml file with the image you've just built.
    Screenshot from 2023-06-04 19-59-09
  4. Modify the Caddyfile in the docker-compose config directory, adding an alias in the form http://{corresponding alias}.
    Screenshot from 2023-06-04 19-59-42
    Note: Because this is local testing, the site may be unreachable over https, since the web server has no SSL certificate. Change https to http to disable Caddy's HTTPS.
  5. Copy .env.example to .env, and modify MW_SITE_SERVER and MW_SITE_FQDN in the .env file, changing them both to start with http.
    Screenshot from 2023-06-04 20-02-23
  6. Create a wikis.yaml file and place it in the config directory.
    Screenshot from 2023-06-04 20-03-20
  7. Run the command docker-compose up -d
  8. Navigate to the installation address and install MediaWiki.
  9. After installation, run the installer again and choose a different database for the next wiki (repeat once for each wiki you want to install).
    Screenshot from 2023-06-04 20-09-57
    Screenshot from 2023-06-04 20-12-37
  10. Run the command docker-compose down
  11. Place LocalSettings.php in the config directory, and add CommonSettings.php and each wiki's customized LocalSettings_wikiID.php.
    Screenshot from 2023-06-04 20-04-46
    Screenshot from 2023-06-04 20-13-27
  12. Run the command docker-compose up -d to complete the installation.
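Step 5 can be scripted rather than edited by hand. The sketch below is a hedged example, not part of this PR: the function name `use_plain_http` is made up, and it assumes each key sits on its own line in the form KEY=https://... with no quoting.

```bash
#!/usr/bin/env bash
# Hypothetical helper for step 5: switch the two site URL settings in a
# .env file from https:// to http://. All other lines are left untouched.
# A backup of the original file is kept next to it with a .bak suffix.
use_plain_http() {
  local env_file="$1"
  sed -i.bak -E 's#^(MW_SITE_SERVER|MW_SITE_FQDN)=https://#\1=http://#' "$env_file"
}
```

After `cp .env.example .env`, running `use_plain_http .env` would apply the change. If the values in your .env do not include a scheme at all, the command changes nothing and the file still needs a manual edit.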

My docker-compose structure:
Screenshot from 2023-06-04 20-21-38

@chl178
Contributor Author

chl178 commented Jun 5, 2023

@yaronkoren @jeffw16

@@ -1,5 +1,9 @@
RewriteEngine On

# Capture original request URL
RewriteCond %{THE_REQUEST} \s(.*?)\s
RewriteRule ^ - [E=ORIGINAL_URL:%1]
Collaborator

Are there any benefits over using $_SERVER['REQUEST_URI']?

$path = null;

// Retrieve the server name and request path if available
if (isset($_SERVER['SERVER_NAME'])) {
Collaborator

AFAIK, we don't have UseCanonicalName and ServerName set in the Canasta Apache configuration, which makes SERVER_NAME spoofable; see https://shiflett.org/blog/2006/server-name-versus-http-host

Member

What exactly is the problem with it being spoofable? Perhaps it is the intention of a legitimate user to access a specific wiki by setting it manually in a cURL request or something like that.

@vedmaka
Collaborator

vedmaka commented Jul 16, 2023

Some general comments:

  • It looks like some changes to the job runner scripts (mwjobrunner.sh, mwsitemapgen.sh, mwtranscoder.sh) are also necessary so that they run against all the farm wikis
  • The update.php run also needs to be changed to support multiple wikis
  • Image directories ($wgUploadDirectory) also need to be set per wiki; otherwise, files will be overwritten
  • I'm not sure about Canasta's preference here, but it might be nice to have farm mode controlled by an ENV variable rather than by the presence of the wikis.yaml or CommonSettings.php file

Please do not treat my comments as a call to action but as suggestions
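As a rough illustration of the update.php point, one option is to loop over the wiki IDs and invoke the script once per wiki. This is only a sketch under assumptions: the function name and the DRY_RUN switch are hypothetical, the IDs are passed in by hand rather than read from wikis.yaml, and it assumes the farm setup honours MediaWiki's standard --wiki maintenance option.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run update.php once per wiki ID given as an argument.
# MW_HOME is the variable already used by the Canasta scripts.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_update_for_wikis() {
  local mw_home="${MW_HOME:?MW_HOME must be set}"
  local wiki_id
  for wiki_id in "$@"; do
    local cmd=(php "$mw_home/maintenance/update.php" --wiki="$wiki_id" --quick)
    if [[ "${DRY_RUN:-0}" == "1" ]]; then
      printf '%s\n' "${cmd[*]}"
    else
      # Stop at the first wiki whose update fails.
      "${cmd[@]}" || return 1
    fi
  done
}
```

The same loop shape would apply to the job runner, sitemap, and transcoder scripts, with the maintenance script path swapped out.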

@@ -180,9 +186,11 @@ cd "$MW_HOME" || exit

########## Run maintenance scripts ##########
echo "Checking for LocalSettings..."
-if [ -e "$MW_VOLUME/config/LocalSettings.php" ]; then
+if [ -e "$MW_VOLUME/config/LocalSettings.php" ] || [ -e "$MW_VOLUME/config/CommonSettings.php" ]; then
Member

Again, should change this to support any .php file in that directory.
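One way to read this suggestion is to test for any .php file in the config directory instead of naming specific files. The following is a minimal sketch with a hypothetical function name, not the code that ended up in the PR.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: succeed if the directory directly contains at least
# one regular file ending in .php, whatever its name.
has_php_settings() {
  local config_dir="$1"
  [[ -n "$(find "$config_dir" -maxdepth 1 -type f -name '*.php' 2>/dev/null | head -n 1)" ]]
}
```

In run-apache.sh the condition could then become `if has_php_settings "$MW_VOLUME/config"; then`, so neither LocalSettings.php nor CommonSettings.php needs to be listed explicitly.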

@chl178
Contributor Author

chl178 commented Jul 18, 2023

Some general comments:

  • It looks like some changes to the job runner scripts (mwjobrunner.sh, mwsitemapgen.sh, mwtranscoder.sh) are also necessary so that they run against all the farm wikis
  • The update.php run also needs to be changed to support multiple wikis
  • Image directories ($wgUploadDirectory) also need to be set per wiki; otherwise, files will be overwritten
  • I'm not sure about Canasta's preference here, but it might be nice to have farm mode controlled by an ENV variable rather than by the presence of the wikis.yaml or CommonSettings.php file

Please do not treat my comments as a call to action but as suggestions

Thank you, @vedmaka, for your comments and feedback.
We've already initiated the process of resolving the issues concerning the upload and cache directories. This should ensure that files are not overwritten and each wiki in the farm has its own distinct space.

Regarding the ENV variables, we've had discussions with Daniel, a respected MediaWiki community developer. His advice was not to define too many environment variables. As the current practice in the MediaWiki community is to use a YAML file for setting up farms, we've decided to follow this established practice.

We're taking into account your points on modifying the job runner scripts (mwjobrunner.sh, mwsitemapgen.sh, mwtranscoder.sh), and incorporating support for multiple wikis in update.php.

Thanks again for your suggestions. They're not direct calls to action, but we highly value your input.
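For the upload and cache directories, the general idea of giving each wiki its own space can be sketched as below. The directory layout (images/<wiki ID> and cache/<wiki ID> under a base directory) and the function name are assumptions for illustration only; the layout actually chosen in the PR may differ.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: create a separate upload and cache directory for each
# wiki ID so that files from different wikis cannot overwrite each other.
# Usage: ensure_wiki_dirs <base_dir> <wiki_id>...
ensure_wiki_dirs() {
  local base_dir="$1"
  shift
  local wiki_id
  for wiki_id in "$@"; do
    mkdir -p "$base_dir/images/$wiki_id" "$base_dir/cache/$wiki_id" || return 1
  done
}
```

Each wiki's settings file would then point $wgUploadDirectory at its own images subdirectory.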

}
run_maintenance_scripts() {
# Iterate through all the .sh files in /maintenance-scripts/ directory
for maintenance_script in $(find /maintenance-scripts/ -maxdepth 1 -mindepth 1 -type f -name "*.sh"); do
Member

Could we make this order predictable (i.e. load lexicographically) rather than nondeterministic?
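A possible way to get a predictable order is to sort the output of find before looping over it. The sketch below uses hypothetical function names and only prints what it would run; it assumes script file names contain no newlines.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: list the *.sh files directly under a directory in
# lexicographic order, one path per line. LC_ALL=C keeps the order the same
# regardless of the host locale.
list_maintenance_scripts() {
  local script_dir="$1"
  find "$script_dir" -maxdepth 1 -mindepth 1 -type f -name '*.sh' | LC_ALL=C sort
}

# Walk the sorted list; a real implementation would launch each script here.
show_maintenance_order() {
  local script
  while IFS= read -r script; do
    echo "Would run: $script"
  done < <(list_maintenance_scripts "$1")
}
```

With this in place, numeric prefixes in the file names would control the order in which the scripts start.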

@@ -0,0 +1,44 @@
#!/bin/bash
Member

Rename file to 1_mw_job_runner.sh

Member

Sorry, I only saw this now. I disagree with this - whether or not these three maintenance scripts should get renamed, the renaming should not happen in this patch.

@@ -0,0 +1,33 @@
#!/bin/bash
Member

Rename file to 2_mw_transcoder.sh

@@ -1,5 +1,13 @@
#!/bin/bash
Member

Rename file to 3_mw_sitemap_generator.sh

@@ -0,0 +1,44 @@
#!/bin/bash

RJ=$MW_HOME/maintenance/runJobs.php
Member

Please don't abbreviate this variable. It was confusing to me what it referred to until I found its definition. Please rename it to RUN_JOBS_SCRIPT_PATH.

It might be more verbose, but it's more descriptive. Verbose but descriptive variable names are typically more helpful than concise but nondescriptive variable names.

@@ -0,0 +1,33 @@
#!/bin/bash

RJ=$MW_HOME/maintenance/runJobs.php
Member

Please see my comments on the other file about the naming of this variable.

Member

@jeffw16 jeffw16 left a comment

There are simple code style issues that should be fixable with a linter. Let me know when you've fixed them all, and then I can continue my code review.

@chl178 chl178 closed this Sep 12, 2023