We have gathered here a few things that should be helpful if you want to participate in MWoffliner development or hack on it.
To set up MWoffliner locally:
git clone https://github.com/openzim/mwoffliner.git
cd mwoffliner
npm ci
To run it (this is only an example):
./node_modules/.bin/ts-node-esm ./src/cli.ts --mwUrl=https://bm.wikipedia.org --adminEmail=XXX
or
npm start -- --mwUrl=https://bm.wikipedia.org --adminEmail=XXX
We use TypeScript for development. You can find all .ts files in src/**. The best way to develop is to run npm run watch in a terminal, and then execute MWoffliner via the pre-configured debugger in Visual Studio Code. The compiled .js files are not committed to Git, but they are published to NPM.
We follow a nearly exact tslint:recommended scheme; you can see more information here: ./tslint.json
It's best to use TSLint to check your code as you develop. This project is pre-configured for development with VSCode and the TSLint plugin.
To run the automated tests with coverage collection (both unit and e2e):
npm test # or "npm run test"
To run the automated tests with coverage collection and all debug messages printed (both unit and e2e):
npm run test-verbose
To run the automated tests without coverage collection (both unit and e2e), which is a bit faster:
npm run test-without-coverage
To run the unit tests with coverage collection:
npm run test:unit-coverage
To run the unit tests without coverage collection:
npm run test:unit
To run the end-to-end tests with coverage collection:
npm run test:e2e-coverage
To run the end-to-end tests without coverage collection:
npm run test:e2e
To run a specific test with coverage collection:
npm run test:pattern test/e2e/wikisource.e2e.test.ts -- --coverage
To run a specific test without coverage collection:
npm run test:pattern test/e2e/wikisource.e2e.test.ts
To run tests whose path matches a regex pattern (this example runs all e2e tests):
npm run test:pattern '^.*e2e.*\.test\.ts'
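To preview which files a pattern would select before launching a long run, the matching can be sketched in a few lines. This is only an illustration: it assumes the pattern is applied as a regular expression to each test file path. Apart from test/e2e/wikisource.e2e.test.ts, the paths below are hypothetical, and `selectByPattern` is a made-up helper, not part of MWoffliner.

```typescript
// Sample test file paths. Only the first one appears in this guide;
// the other two are hypothetical and exist purely for illustration.
const candidatePaths: string[] = [
  'test/e2e/wikisource.e2e.test.ts',
  'test/unit/example.test.ts',
  'test/e2e/example.e2e.test.ts',
]

// Same pattern as in the command above.
const e2ePattern = /^.*e2e.*\.test\.ts/

// Hypothetical helper: keep only the paths that the pattern matches.
function selectByPattern(paths: string[], pattern: RegExp): string[] {
  return paths.filter((path) => pattern.test(path))
}

console.log(selectByPattern(candidatePaths, e2ePattern))
```

With these sample paths, the two files under test/e2e are kept and the unit test is filtered out.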
For the S3 tests to pass, create a '.env' file at the root of your MWoffliner code directory and configure the S3 URL, including credentials, in it. Example:
S3_URL=https://s3.region.amazonaws.com/?bucketName=S3_BUCKET_NAME&keyId=S3_KEY_ID&secretAccessKey=S3_ACCESS_KEY
... or just ensure the S3_URL environment variable is properly set.
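If the S3 tests fail, a quick sanity check of the S3_URL value can help rule out a malformed setting. The sketch below is not MWoffliner's own parsing code. It only assumes the shape shown in the example above: an endpoint plus bucketName, keyId and secretAccessKey query parameters. The `S3Settings` type and the `parseS3Url` function are hypothetical names.

```typescript
// Hypothetical shape for the values carried by S3_URL.
interface S3Settings {
  endpoint: string
  bucketName: string
  keyId: string
  secretAccessKey: string
}

// Hypothetical helper: split an S3_URL value into its parts and
// fail loudly when one of the expected query parameters is missing.
function parseS3Url(s3Url: string): S3Settings {
  const url = new URL(s3Url)
  const required = (name: string): string => {
    const value = url.searchParams.get(name)
    if (!value) {
      throw new Error(`S3_URL is missing the "${name}" query parameter`)
    }
    return value
  }
  return {
    endpoint: url.origin,
    bucketName: required('bucketName'),
    keyId: required('keyId'),
    secretAccessKey: required('secretAccessKey'),
  }
}

// Uses the placeholder value from the example above.
const exampleSettings = parseS3Url(
  'https://s3.region.amazonaws.com/?bucketName=S3_BUCKET_NAME&keyId=S3_KEY_ID&secretAccessKey=S3_ACCESS_KEY',
)
console.log(exampleSettings.endpoint, exampleSettings.bucketName)
```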
There is a pre-configured debug config for VSCode; just open the debugging tab.
Advice for debugging mwoffliner issues:
- For pre-packaged Kiwix downloads, look at the scripts at https://github.com/kiwix/maintenance/tree/master/mwoffliner
- If both, then you may need separate corrections for each.
- Create Parsoid output to understand what mwoffliner is working with, including checking whether the error is with the Parsoid output itself. For Wikimedia wikis you can easily generate and view the output in your browser using the Parsoid REST interface. Example URLs:
  - Mobile (most pages): https://en.wikivoyage.org/api/rest_v1/page/mobile-sections/Hot_springs (note: Mobile Content Service endpoints are now deprecated)
  - Desktop (main page): https://es.wikipedia.org/api/rest_v1/page/html/Espa%C3%B1a
- If the error is with the Parsoid output:
  - Mark the issue in openzim/mwoffliner with the "parsoid/mediawiki" tag.
  - It's good to reach out to the Parsoid team to open a corresponding bug and reference it. Even so, keep the openzim/mwoffliner bug open until the Parsoid bug is fixed.
  - Consider whether a workaround in mwoffliner is possible and worthwhile.
- Make a small test case to use as you develop rather than processing a large wiki. In particular, the --articleList argument is useful. Run mwoffliner with --help for details on this and other flags that may be useful.
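When checking Parsoid output for many pages, building the desktop REST URL by hand gets tedious. The following sketch derives it from a wiki host and a page title, following the pattern of the desktop example URL above. `parsoidHtmlUrl` is a hypothetical helper, and replacing spaces with underscores is an assumption based on the usual MediaWiki title convention.

```typescript
// Hypothetical helper: build the desktop Parsoid HTML URL for a page,
// following the /api/rest_v1/page/html/<title> pattern shown above.
function parsoidHtmlUrl(wikiHost: string, title: string): string {
  // Assumption: titles use underscores in place of spaces.
  const normalizedTitle = title.replace(/ /g, '_')
  // Percent-encode the title so non-ASCII characters are URL-safe.
  return `https://${wikiHost}/api/rest_v1/page/html/${encodeURIComponent(normalizedTitle)}`
}

// '\u00f1' is the letter ñ, written as an escape to keep this file ASCII.
console.log(parsoidHtmlUrl('es.wikipedia.org', 'Espa\u00f1a'))
// → https://es.wikipedia.org/api/rest_v1/page/html/Espa%C3%B1a
```

Opening the printed URL in a browser shows the HTML that mwoffliner receives for that page.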
To publish/release, it's best to use a clean clone of the project:
- Clone: git clone https://github.com/openzim/mwoffliner.git
- Update package.json
- Commit: :package: Release version vX.X.X
- Run: git tag vX.X.X
- Run: git push origin vX.X.X
Whenever a tag is pushed, the CI automatically publishes to npmjs.com.
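Because a pushed tag triggers publication, it may be worth confirming that the tag and the package.json version agree before pushing. The sketch below is one possible check, not part of the project's tooling. It assumes that the package.json update in the steps above is the version bump and that tags always have the form vX.X.X. `tagMatchesVersion` is a hypothetical helper.

```typescript
// Hypothetical pre-push check: the tag must look like vX.X.X and must
// equal "v" followed by the version string found in package.json.
function tagMatchesVersion(tag: string, packageVersion: string): boolean {
  const tagFormat = /^v\d+\.\d+\.\d+$/
  return tagFormat.test(tag) && tag === `v${packageVersion}`
}

// Made-up version numbers, for illustration only.
console.log(tagMatchesVersion('v1.2.3', '1.2.3')) // true
console.log(tagMatchesVersion('v1.2.4', '1.2.3')) // false
```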
First, please read the contributing guidelines for our parent project, openZIM. They cover the general process.