New UI step 1 #68

Merged
merged 21 commits into from
Oct 23, 2023
5 changes: 4 additions & 1 deletion .github/workflows/Publish.yml
@@ -16,16 +16,19 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version-file: pyproject.toml
python-version-file: scraper/pyproject.toml
architecture: x64

- name: Build packages
working-directory: scraper
run: |
pip install -U pip build
python -m build --sdist --wheel

- name: Upload to PyPI
uses: pypa/gh-action-pypi-publish@release/v1.8
with:
packages-dir: scraper/dist/

- name: Build and push Docker image
uses: openzim/docker-publish-action@v10
36 changes: 33 additions & 3 deletions .github/workflows/QA.yml
@@ -7,7 +7,7 @@ on:
- main

jobs:
check-qa:
check-scraper-qa:
runs-on: ubuntu-22.04

steps:
@@ -16,19 +16,49 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version-file: pyproject.toml
python-version-file: scraper/pyproject.toml
architecture: x64

- name: Install dependencies (and project)
- name: Install dependencies
working-directory: scraper
run: |
pip install -U pip
pip install -e .[lint,check,scripts,test]

- name: Check black formatting
working-directory: scraper
run: inv lint-black

- name: Check ruff
working-directory: scraper
run: inv lint-ruff

- name: Check pyright
working-directory: scraper
run: inv check-pyright

check-zimui-qa:
runs-on: ubuntu-22.04

steps:
- uses: actions/checkout@v3

- name: Set up Node.js
uses: actions/setup-node@v3
with:
node-version-file: zimui/.node-version

- name: Install JS dependencies
working-directory: zimui
run: |
yarn install

- name: Check prettier
working-directory: zimui
run: |
yarn format

- name: Check eslint
working-directory: zimui
run: |
yarn lint
33 changes: 28 additions & 5 deletions .github/workflows/Tests.yml
@@ -7,7 +7,7 @@ on:
- main

jobs:
run-tests:
test-scraper:
runs-on: ubuntu-22.04

steps:
@@ -16,39 +16,62 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version-file: pyproject.toml
python-version-file: scraper/pyproject.toml
architecture: x64

- name: Install dependencies (and project)
working-directory: scraper
run: |
pip install -U pip
pip install -e .[test,scripts]

- name: Run the tests
working-directory: scraper
run: inv coverage --args "-vvv"

- name: Upload coverage report to codecov
uses: codecov/codecov-action@v3
with:
token: ${{ secrets.CODECOV_TOKEN }}

build_python:
build-scraper:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3

- name: Set up Python
uses: actions/setup-python@v4
with:
python-version-file: pyproject.toml
python-version-file: scraper/pyproject.toml
architecture: x64

- name: Ensure we can build Python targets
working-directory: scraper
run: |
pip install -U pip build
python3 -m build --sdist --wheel

build_docker:
build-zimui:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3

- name: Set up Node.js
uses: actions/setup-node@v3
with:
node-version-file: zimui/.node-version

- name: Install dependencies
working-directory: zimui
run: |
yarn install

- name: Build
working-directory: zimui
run: |
yarn build

build-docker:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
10 changes: 0 additions & 10 deletions .gitignore
@@ -368,16 +368,6 @@ pyrightconfig.json

# assets that we download
.dockerignore
src/kolibri2zim/templates/assets/bootstrap/
src/kolibri2zim/templates/assets/pdfjs/
src/kolibri2zim/templates/assets/videojs/
src/kolibri2zim/templates/assets/jquery.min.js
src/kolibri2zim/templates/assets/ogvjs/
src/kolibri2zim/templates/assets/videojs-ogvjs.js
src/kolibri2zim/templates/assets/epub.min.js
src/kolibri2zim/templates/assets/bootstrap-icons/
src/kolibri2zim/templates/assets/jszip.min.js
src/kolibri2zim/templates/assets/perseus/

# output dir
output
20 changes: 18 additions & 2 deletions .pre-commit-config.yaml
@@ -4,8 +4,24 @@ repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.4.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: trailing-whitespace
- id: end-of-file-fixer
- repo: https://github.com/pre-commit/mirrors-prettier
rev: v3.0.3
hooks:
- id: prettier
files: zimui\/.*$ # files in zimui folder
- repo: https://github.com/pre-commit/mirrors-eslint
rev: v8.51.0
hooks:
- id: eslint
types: [file]
files: zimui\/src\/.*(?:\.[jt]sx?|\.vue)$ # *.js, *.jsx, *.ts, *.tsx, *.vue in zimui/src folder
args:
- --ignore-path
- zimui/.eslintignore
- --config
- zimui/.eslintrc.cjs
- repo: https://github.com/psf/black
rev: "23.3.0"
hooks:
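
Note on the eslint hook's `files` pattern above: the sketch below checks which paths the regular expression selects. It assumes pre-commit applies `files` with Python's `re.search` (which is my understanding of its behavior); the candidate paths are hypothetical and only serve as illustration.

```python
import re

# Pattern copied verbatim from the eslint hook's `files` entry.
ESLINT_FILES = re.compile(r"zimui\/src\/.*(?:\.[jt]sx?|\.vue)$")


def eslint_would_check(path: str) -> bool:
    """Return True when the path matches the hook's `files` pattern."""
    # Assumption: `files` is applied with re.search, so the match may start
    # anywhere in the path; only the end of the path is anchored by `$`.
    return ESLINT_FILES.search(path) is not None


# Hypothetical paths, not taken from the repository.
candidates = [
    "zimui/src/App.vue",
    "zimui/src/main.ts",
    "zimui/src/components/Topic.tsx",
    "zimui/vite.config.ts",
    "zimui/src/assets/logo.svg",
    "scraper/src/kolibri2zim/entrypoint.py",
]
selected = [path for path in candidates if eslint_would_check(path)]
```

Only files under `zimui/src` with a `.js`, `.jsx`, `.ts`, `.tsx` or `.vue` extension are selected; a config file at the root of `zimui` is left to the prettier hook alone.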
25 changes: 25 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,31 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Small fixes in invoke tasks

### Changed
- Scraper (Python code) has been moved to the `scraper` subfolder
- Vue.JS is now used as the main UI framework; all its code is in the `zimui` subfolder; it is rendered with Vite to produce a static website
- QA and Tests workflows have been adapted
  - to the new folder structure
  - to also QA and test the Vue.JS part
- pre-commit hooks have been configured for the Vue.JS part
- Dockerfile has been adapted to first build the Vue.JS part in a dedicated stage and then embed the generated files into the final Python-based image
- Topics are stored as JSON files in the ZIM
  - JSON is generated by pydantic
  - these files are consumed by the Vue.JS UI
  - content (video, audio, PDF, ePub, ...) is still rendered by Jinja2 as before
- URLs are meaningful slugs
  - generated by the Python slugify library
  - from the Kolibri node title
  - should two distinct nodes have the same title, resulting in the same slug, conflicts are handled with a _1, _2, ... suffix
- changes in the ZIM "folder" structure:
  - files generated by Vite are placed in /
  - thumbnails are placed in /thumbnails
  - JSON files generated to render topics are placed in /topics
  - most Kolibri content (video, audio, ePub, PDF) is placed in /files (some content is still placed at the root to avoid breaking behavior that proved hard to fix for now; this will be tackled in a specific issue for each content type)
- legacy MANIFEST.in has been deleted (left-over from migration to hatch)
- `is_front` property has been adjusted when adding the item to the ZIM
- one new CLI argument `--zimui-dist` to specify the folder where zimui has been built (by Vite)


- Dockerfile: split installation of Python dependencies for more efficiency
- Github workflow: publish `dev` tag on every push to `main` branch
- Github workflow: build Docker image + test its startup
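
Note on the slug rule described in the CHANGELOG entry above: a minimal sketch of how title-based slugs with `_1`, `_2`, ... conflict suffixes could work. It is not the code from this PR. `simple_slug` is a standard-library stand-in for the slugify library, and `SlugRegistry` and its method names are hypothetical.

```python
import re
import unicodedata


def simple_slug(title: str) -> str:
    """Stand-in for the slugify library: lowercase ASCII words joined by '-'."""
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


class SlugRegistry:
    """Hypothetical helper that hands out one unique slug per node."""

    def __init__(self) -> None:
        self._by_node: dict[str, str] = {}
        self._used: set[str] = set()

    def slug_for(self, node_id: str, title: str) -> str:
        # A node that was already seen keeps the slug it received first.
        if node_id in self._by_node:
            return self._by_node[node_id]
        base = simple_slug(title)
        slug = base
        suffix = 0
        # Distinct nodes with the same title get _1, _2, ... appended.
        while slug in self._used:
            suffix += 1
            slug = f"{base}_{suffix}"
        self._used.add(slug)
        self._by_node[node_id] = slug
        return slug


registry = SlugRegistry()
first = registry.slug_for("node-a", "Álgebra Básica")
second = registry.slug_for("node-b", "Álgebra Básica")
third = registry.slug_for("node-c", "Álgebra Básica")
again = registry.slug_for("node-a", "Álgebra Básica")
```

Keying the registry on the node identifier keeps a node's URL stable when it is visited more than once, while the suffix only appears for genuinely distinct nodes that share a title.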
27 changes: 20 additions & 7 deletions Dockerfile
@@ -1,3 +1,10 @@
FROM node:20-alpine as zimui

WORKDIR /src
COPY zimui /src
RUN yarn install --frozen-lockfile
RUN yarn build

FROM python:3.11-bookworm
LABEL org.opencontainers.image.source https://github.com/openzim/kolibri

@@ -12,22 +19,28 @@ RUN apt-get update \
pip

# Copy pyproject.toml and its dependencies
COPY pyproject.toml README.md get_js_deps.sh hatch_build.py /src/
COPY src/kolibri2zim/__about__.py /src/src/kolibri2zim/__about__.py
COPY README.md /src/
COPY scraper/pyproject.toml scraper/get_web_deps.sh scraper/hatch_build.py /src/scraper/
COPY scraper/src/kolibri2zim/__about__.py /src/scraper/src/kolibri2zim/__about__.py

# Install Python dependencies
RUN pip install --no-cache-dir /src
RUN pip install --no-cache-dir /src/scraper

# Copy code + associated artifacts
COPY src /src/src
COPY *.md LICENSE *.py /src/
COPY scraper/src /src/scraper/src
COPY *.md LICENSE /src/

# Install + cleanup
RUN pip install --no-cache-dir /src \
&& rm -rf /src
RUN pip install --no-cache-dir /src/scraper \
&& rm -rf /src/scraper

# Copy zimui build output
COPY --from=zimui /src/dist /src/zimui

# default output directory
RUN mkdir -p /output
WORKDIR /output

ENV KOLIBRI_ZIMUI_DIST=/src/zimui

CMD ["kolibri2zim", "--help"]
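
Note on `ENV KOLIBRI_ZIMUI_DIST` above: the diff does not show how the scraper reads this variable. One plausible wiring, sketched below under that assumption, is to use it as the default of the new `--zimui-dist` argument. The parser name, the `../zimui/dist` fallback and the help text are hypothetical.

```python
import argparse
import os


def build_parser(environ=None) -> argparse.ArgumentParser:
    """Build a parser whose --zimui-dist default comes from the environment."""
    env = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="kolibri2zim-sketch")
    parser.add_argument(
        "--zimui-dist",
        dest="zimui_dist",
        # Assumption: the environment variable only supplies a default;
        # an explicit command-line value always takes precedence.
        default=env.get("KOLIBRI_ZIMUI_DIST", "../zimui/dist"),
        help="folder where zimui has been built (by Vite)",
    )
    return parser


# Inside the Docker image, the variable set in the Dockerfile applies.
in_docker = build_parser({"KOLIBRI_ZIMUI_DIST": "/src/zimui"}).parse_args([])
# A value passed on the command line overrides the environment.
explicit = build_parser({"KOLIBRI_ZIMUI_DIST": "/src/zimui"}).parse_args(
    ["--zimui-dist", "/tmp/dist"]
)
# Without the variable, the hypothetical local fallback is used.
local = build_parser({}).parse_args([])
```

With this arrangement the Docker image needs no extra flag, while a local run after `yarn build` can point at any build folder.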
4 changes: 0 additions & 4 deletions MANIFEST.in

This file was deleted.

50 changes: 42 additions & 8 deletions README.md
@@ -16,30 +16,38 @@ HTML files folder of it before creating a ZIM off of it.
Requirements
------------

* Node 20.x
* Python 3.11
* [`ffmpeg`](https://ffmpeg.org/) for video transcoding (only used with `--use-webm` or `--low-quality`).
* `curl` and `unzip` to install Javascript dependencies. See `get_js_deps.sh` if you want to do it manually.
* `curl` and `unzip` to install Javascript dependencies. See `get_web_deps.sh` if you want to do it manually.

Installation
------------

## Virtualenv
### Virtualenv

`kolibri2zim` is a Python3 software. If you are not using the
`kolibri2zim` is a Python3 software. If you are not using the
[Docker](https://docker.com) image, you are advised to use it in a
virtual environment to avoid installing software dependencies on your system.

```bash
python3 -m venv env # Create virtualenv
source env/bin/Activate # Activate the virtualenv
source env/bin/activate # Activate the virtualenv
pip3 install kolibri2zim # Install dependencies
kolibri2zim --help # Display kolibri2zim help
```

Call `deactivate` to quit the virtual environment.

See `requirements.txt` for the list of python dependencies.
See `pyproject.toml` for the list of python dependencies.

## Docker
To test EPUB and PDF rendering, a potentially useful command is:

```bash
kolibri2zim --name "Biblioteca Elejandria" --output /output --tmp-dir /tmp --zim-file Biblioteca_Elejandria.zim --channel-id "fed29d60e4d84a1e8dcfc781d920b40e" --node-ids 'd92c07655128458f8248416154b18a68,89fe2f86ee3f4fbaa7fb2bf9bd56d088,75f99e6b97d14b14a4e74762ad77391f,89fe2f86ee3f4fbaa7fb2bf9bd56d088'
```

### Docker

```bash
docker run -v my_dir:/output ghcr.io/openzim/kolibri2zim kolibri2zim --help
@@ -56,8 +64,34 @@ Development
Before contributing be sure to check out the
[CONTRIBUTING.md](CONTRIBUTING.md) guidelines.

To test EPUB and PDF rendering, a potentially useful command is:
```bash
Some useful test channels:

- 7f744ce8d28b471eaf663abd60c92267: a very minimal channel with all kinds of content
- 9f15f4e9aeaa48b5ae271e5749d6fe80: a small channel with significantly nested items and all kinds of content

### Building and running the scraper locally

You have to:

- build the `zimui` frontend, which will be embedded inside the ZIM (and rebuild it every time you modify `zimui`)
- run the `scraper` to retrieve the Kolibri channel content and build the ZIM

Sample commands:

```
cd zimui
yarn install
yarn build
cd ../scraper
hatch run kolibri2zim --name "Biblioteca Elejandria" --output output --zim-file Biblioteca_Elejandria.zim --channel-id "fed29d60e4d84a1e8dcfc781d920b40e" --node-ids 'd92c07655128458f8248416154b18a68,89fe2f86ee3f4fbaa7fb2bf9bd56d088,75f99e6b97d14b14a4e74762ad77391f,89fe2f86ee3f4fbaa7fb2bf9bd56d088'
```

### Running scraper with Docker

Run the official image (published on GHCR.io); the ZIM will be available in the `output` sub-folder of the current working directory.

```
docker run --rm -it -v $(pwd)/output:/output ghcr.io/openzim/kolibri2zim:latest \
kolibri2zim --name "Biblioteca Elejandria" --output /output --tmp-dir /tmp --zim-file Biblioteca_Elejandria.zim --channel-id "fed29d60e4d84a1e8dcfc781d920b40e" --node-ids 'd92c07655128458f8248416154b18a68,89fe2f86ee3f4fbaa7fb2bf9bd56d088,75f99e6b97d14b14a4e74762ad77391f,89fe2f86ee3f4fbaa7fb2bf9bd56d088'
```

File renamed without changes.
9 changes: 7 additions & 2 deletions get_js_deps.sh → scraper/get_web_deps.sh
@@ -25,7 +25,7 @@ fi
SCRIPT_PATH="$( cd "$(dirname "$0")" ; pwd -P )"
ASSETS_PATH="${SCRIPT_PATH}/src/kolibri2zim/templates/assets"

echo "About to download JS assets to ${ASSETS_PATH}"
echo "About to download web assets to ${ASSETS_PATH}"

echo "getting pdf.js"
curl -L -O https://github.com/mozilla/pdf.js/releases/download/v2.6.347/pdfjs-2.6.347-es5-dist.zip
@@ -100,7 +100,12 @@ mkdir -p $ASSETS_PATH/perseus
mv standalone-perseus-1.1.4/* $ASSETS_PATH/perseus
rm -rf standalone-perseus-1.1.4/
rm -f v1.1.4.zip
sed -i $SEDEXT '1s/""/"assets\/perseus\/"/' $ASSETS_PATH/perseus/build/frame-perseus.js
sed -i $SEDEXT '1s/""/"..\/assets\/perseus\/"/' $ASSETS_PATH/perseus/build/frame-perseus.js

echo "getting lato font"
mkdir -p $ASSETS_PATH/fonts
curl -L -o $ASSETS_PATH/fonts/lato-v24-latin-regular.ttf https://dev.kiwix.org/fonts/lato/lato-v24-latin-regular.ttf
curl -L -o $ASSETS_PATH/fonts/lato-v24-latin-regular.woff2 https://dev.kiwix.org/fonts/lato/lato-v24-latin-regular.woff2

if command -v fix_ogvjs_dist > /dev/null; then
echo "fixing JS files"