New UI step 1 #68

Merged on Oct 23, 2023 (21 commits)
Changes from 13 commits
5 changes: 4 additions & 1 deletion .github/workflows/Publish.yml
@@ -16,16 +16,19 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version-file: pyproject.toml
python-version-file: scraper/pyproject.toml
architecture: x64

- name: Build packages
working-directory: scraper
run: |
pip install -U pip build
python -m build --sdist --wheel

- name: Upload to PyPI
uses: pypa/gh-action-pypi-publish@release/v1.8
with:
packages-dir: scraper/dist/

- name: Build and push Docker image
uses: openzim/docker-publish-action@v10
36 changes: 33 additions & 3 deletions .github/workflows/QA.yml
@@ -7,7 +7,7 @@ on:
- main

jobs:
check-qa:
check-scraper-qa:
runs-on: ubuntu-22.04

steps:
@@ -16,19 +16,49 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version-file: pyproject.toml
python-version-file: scraper/pyproject.toml
architecture: x64

- name: Install dependencies (and project)
- name: Install dependencies
working-directory: scraper
run: |
pip install -U pip
pip install -e .[lint,check,scripts,test]

- name: Check black formatting
working-directory: scraper
run: inv lint-black

- name: Check ruff
working-directory: scraper
run: inv lint-ruff

- name: Check pyright
working-directory: scraper
run: inv check-pyright

check-zimui-qa:
runs-on: ubuntu-22.04

steps:
- uses: actions/checkout@v3

- name: Set up Node.js
uses: actions/setup-node@v3
with:
node-version-file: zimui/.node-version

- name: Install JS dependencies
working-directory: zimui
run: |
yarn install

- name: Check prettier
working-directory: zimui
run: |
yarn format

- name: Check eslint
working-directory: zimui
run: |
yarn lint
33 changes: 28 additions & 5 deletions .github/workflows/Tests.yml
@@ -7,7 +7,7 @@ on:
- main

jobs:
run-tests:
test-scraper:
runs-on: ubuntu-22.04

steps:
@@ -16,39 +16,62 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version-file: pyproject.toml
python-version-file: scraper/pyproject.toml
architecture: x64

- name: Install dependencies (and project)
working-directory: scraper
run: |
pip install -U pip
pip install -e .[test,scripts]

- name: Run the tests
working-directory: scraper
run: inv coverage --args "-vvv"

- name: Upload coverage report to codecov
uses: codecov/codecov-action@v3
with:
token: ${{ secrets.CODECOV_TOKEN }}

build_python:
build-scraper:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3

- name: Set up Python
uses: actions/setup-python@v4
with:
python-version-file: pyproject.toml
python-version-file: scraper/pyproject.toml
architecture: x64

- name: Ensure we can build Python targets
working-directory: scraper
run: |
pip install -U pip build
python3 -m build --sdist --wheel

build_docker:
build-zimui:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3

- name: Set up Node.js
uses: actions/setup-node@v3
with:
node-version-file: zimui/.node-version

- name: Install dependencies
working-directory: zimui
run: |
yarn install

- name: Build
working-directory: zimui
run: |
yarn build

build-docker:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
20 changes: 10 additions & 10 deletions .gitignore
@@ -368,16 +368,16 @@ pyrightconfig.json

# assets that we download
.dockerignore
src/kolibri2zim/templates/assets/bootstrap/
src/kolibri2zim/templates/assets/pdfjs/
src/kolibri2zim/templates/assets/videojs/
src/kolibri2zim/templates/assets/jquery.min.js
src/kolibri2zim/templates/assets/ogvjs/
src/kolibri2zim/templates/assets/videojs-ogvjs.js
src/kolibri2zim/templates/assets/epub.min.js
src/kolibri2zim/templates/assets/bootstrap-icons/
src/kolibri2zim/templates/assets/jszip.min.js
src/kolibri2zim/templates/assets/perseus/
scraper/src/kolibri2zim/templates/assets/bootstrap/
scraper/src/kolibri2zim/templates/assets/pdfjs/
scraper/src/kolibri2zim/templates/assets/videojs/
scraper/src/kolibri2zim/templates/assets/jquery.min.js
scraper/src/kolibri2zim/templates/assets/ogvjs/
scraper/src/kolibri2zim/templates/assets/videojs-ogvjs.js
scraper/src/kolibri2zim/templates/assets/epub.min.js
scraper/src/kolibri2zim/templates/assets/bootstrap-icons/
scraper/src/kolibri2zim/templates/assets/jszip.min.js
scraper/src/kolibri2zim/templates/assets/perseus/

# output dir
output
20 changes: 18 additions & 2 deletions .pre-commit-config.yaml
@@ -4,8 +4,24 @@ repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.4.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: trailing-whitespace
- id: end-of-file-fixer
- repo: https://github.com/pre-commit/mirrors-prettier
rev: v3.0.3
hooks:
- id: prettier
files: zimui\/.*$ # files in zimui folder
- repo: https://github.com/pre-commit/mirrors-eslint
rev: v8.51.0
hooks:
- id: eslint
types: [file]
files: zimui\/src\/.*(?:\.[jt]sx?|\.vue)$ # *.js, *.jsx, *.ts, *.tsx, *.vue in zimui/src folder
args:
- --ignore-path
- zimui/.eslintignore
- --config
- zimui/.eslintrc.cjs
- repo: https://github.com/psf/black
rev: "23.3.0"
hooks:
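
The `files` pattern on the eslint hook is a Python regular expression that pre-commit searches for in each candidate path. A minimal sketch of which paths it selects is shown here; the file names in `candidates` are hypothetical and only exercise the pattern.

```python
import re

# Pattern copied from the eslint hook's `files` key in the hunk above.
ESLINT_FILES = re.compile(r"zimui\/src\/.*(?:\.[jt]sx?|\.vue)$")

# Hypothetical repository paths, used only to exercise the pattern.
candidates = [
    "zimui/src/App.vue",
    "zimui/src/main.ts",
    "zimui/src/components/Card.tsx",
    "zimui/src/assets/logo.png",
    "zimui/vite.config.ts",
    "scraper/src/kolibri2zim/scraper.py",
]


def eslint_targets(paths):
    """Return the paths whose name matches the hook's `files` pattern."""
    return [path for path in paths if ESLINT_FILES.search(path)]


matched = eslint_targets(candidates)
print(matched)
```

Because the pattern has no leading `^`, it is a substring search: any path that contains `zimui/src/` and ends with one of the listed extensions is selected.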
27 changes: 20 additions & 7 deletions Dockerfile
@@ -1,3 +1,10 @@
FROM node:20-alpine as zimui

WORKDIR /src
COPY zimui /src
RUN yarn install --frozen-lockfile
RUN yarn build

FROM python:3.11-bookworm
LABEL org.opencontainers.image.source https://github.com/openzim/kolibri

@@ -12,22 +19,28 @@ RUN apt-get update \
pip

# Copy pyproject.toml and its dependencies
COPY pyproject.toml README.md get_js_deps.sh hatch_build.py /src/
COPY src/kolibri2zim/__about__.py /src/src/kolibri2zim/__about__.py
COPY README.md /src/
COPY scraper/pyproject.toml scraper/get_js_deps.sh scraper/hatch_build.py /src/scraper/
COPY scraper/src/kolibri2zim/__about__.py /src/scraper/src/kolibri2zim/__about__.py

# Install Python dependencies
RUN pip install --no-cache-dir /src
RUN pip install --no-cache-dir /src/scraper

# Copy code + associated artifacts
COPY src /src/src
COPY *.md LICENSE *.py /src/
COPY scraper/src /src/scraper/src
COPY *.md LICENSE /src/

# Install + cleanup
RUN pip install --no-cache-dir /src \
&& rm -rf /src
RUN pip install --no-cache-dir /src/scraper \
&& rm -rf /src/scraper

# Copy zimui build output
COPY --from=zimui /src/dist /src/zimui

# default output directory
RUN mkdir -p /output
WORKDIR /output

ENV KOLIBRI_ZIMUI_DIST=/src/zimui

CMD ["kolibri2zim", "--help"]
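
The final image exposes the built frontend through `KOLIBRI_ZIMUI_DIST=/src/zimui`. How the scraper consumes that variable is not visible in this part of the diff; the sketch below only illustrates one plausible way to resolve it. The helper name `resolve_zimui_dist` and the fallback path `../zimui/dist` are assumptions made for illustration.

```python
import os
from pathlib import Path

# Hypothetical fallback for local runs; the real default is not shown in this diff.
DEFAULT_ZIMUI_DIST = "../zimui/dist"


def resolve_zimui_dist(environ=None):
    """Return the zimui build directory, preferring KOLIBRI_ZIMUI_DIST if set."""
    env = os.environ if environ is None else environ
    return Path(env.get("KOLIBRI_ZIMUI_DIST", DEFAULT_ZIMUI_DIST))


# Inside the Docker image the variable is set by the ENV instruction.
docker_path = resolve_zimui_dist({"KOLIBRI_ZIMUI_DIST": "/src/zimui"})
# Without the variable, the assumed local fallback is used.
local_path = resolve_zimui_dist({})
```

Passing the environment as an argument keeps the helper easy to test without touching the real process environment.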
4 changes: 0 additions & 4 deletions MANIFEST.in

This file was deleted.

44 changes: 37 additions & 7 deletions README.md
@@ -16,30 +16,37 @@ HTML files folder of it before creating a ZIM off of it.
Requirements
------------

* Node 20.x
* Python 3.11
* [`ffmpeg`](https://ffmpeg.org/) for video transcoding (only used with `--use-webm` or `--low-quality`).
* `curl` and `unzip` to install Javascript dependencies. See `get_js_deps.sh` if you want to do it manually.

Installation
------------

## Virtualenv
### Virtualenv

`kolibri2zim` is a Python3 software. If you are not using the
`kolibri2zim` is a Python3 software. If you are not using the
[Docker](https://docker.com) image, you are advised to use it in a
virtual environment to avoid installing software dependencies on your system.

```bash
python3 -m venv env # Create virtualenv
source env/bin/Activate # Activate the virtualenv
source env/bin/activate # Activate the virtualenv
pip3 install kolibri2zim # Install dependencies
kolibri2zim --help # Display kolibri2zim help
```

Call `deactivate` to quit the virtual environment.

See `requirements.txt` for the list of python dependencies.
See `pyproject.toml` for the list of python dependencies.

## Docker
To test EPUB and PDF rendering, a potentially useful command is:
```bash
kolibri2zim --name "Biblioteca Elejandria" --output /output --tmp-dir /tmp --zim-file Biblioteca_Elejandria.zim --channel-id "fed29d60e4d84a1e8dcfc781d920b40e" --node-ids 'd92c07655128458f8248416154b18a68,89fe2f86ee3f4fbaa7fb2bf9bd56d088,75f99e6b97d14b14a4e74762ad77391f,89fe2f86ee3f4fbaa7fb2bf9bd56d088'
```

### Docker

```bash
docker run -v my_dir:/output ghcr.io/openzim/kolibri2zim kolibri2zim --help
@@ -56,8 +63,31 @@ Development
Before contributing be sure to check out the
[CONTRIBUTING.md](CONTRIBUTING.md) guidelines.

To test EPUB and PDF rendering, a potentially useful command is:
```bash
Some useful test channels:
- 7f744ce8d28b471eaf663abd60c92267: a very minimal channel with all kinds of content
- 9f15f4e9aeaa48b5ae271e5749d6fe80: a small channel with significantly nested items and all kinds of content

### Building and running the scraper locally

You have to:
- build the `zimui` frontend, which will be embedded inside the ZIM (and rebuild it every time you modify `zimui`)
- run the `scraper` to retrieve the Kolibri channel and build the ZIM

Sample commands:
```
cd zimui
yarn install
yarn build
cd ../scraper
hatch run kolibri2zim --name "Biblioteca Elejandria" --output output --zim-file Biblioteca_Elejandria.zim --channel-id "fed29d60e4d84a1e8dcfc781d920b40e" --node-ids 'd92c07655128458f8248416154b18a68,89fe2f86ee3f4fbaa7fb2bf9bd56d088,75f99e6b97d14b14a4e74762ad77391f,89fe2f86ee3f4fbaa7fb2bf9bd56d088'
```

### Running scraper with Docker

Run the official image (published on ghcr.io); the ZIM will be available in the `output` sub-folder of the current working directory.

```
docker run --rm -it -v $(pwd)/output:/output ghcr.io/openzim/kolibri2zim:latest \
kolibri2zim --name "Biblioteca Elejandria" --output /output --tmp-dir /tmp --zim-file Biblioteca_Elejandria.zim --channel-id "fed29d60e4d84a1e8dcfc781d920b40e" --node-ids 'd92c07655128458f8248416154b18a68,89fe2f86ee3f4fbaa7fb2bf9bd56d088,75f99e6b97d14b14a4e74762ad77391f,89fe2f86ee3f4fbaa7fb2bf9bd56d088'
```
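
The `--node-ids` value in the sample commands lists `89fe2f86ee3f4fbaa7fb2bf9bd56d088` twice. Whether `kolibri2zim` de-duplicates the list itself is not shown here, so the sketch below uses a hypothetical helper, `parse_node_ids`, to illustrate splitting such a value while keeping the first occurrence of each ID.

```python
def parse_node_ids(raw):
    """Split a comma-separated ID list, dropping blanks and repeated IDs."""
    seen = set()
    ordered = []
    for node_id in raw.split(","):
        node_id = node_id.strip()
        if node_id and node_id not in seen:
            seen.add(node_id)
            ordered.append(node_id)
    return ordered


# Value copied from the sample commands above.
NODE_IDS = (
    "d92c07655128458f8248416154b18a68,"
    "89fe2f86ee3f4fbaa7fb2bf9bd56d088,"
    "75f99e6b97d14b14a4e74762ad77391f,"
    "89fe2f86ee3f4fbaa7fb2bf9bd56d088"
)

unique_ids = parse_node_ids(NODE_IDS)
print(",".join(unique_ids))
```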

File renamed without changes.
2 changes: 1 addition & 1 deletion get_js_deps.sh → scraper/get_js_deps.sh
@@ -100,7 +100,7 @@ mkdir -p $ASSETS_PATH/perseus
mv standalone-perseus-1.1.4/* $ASSETS_PATH/perseus
rm -rf standalone-perseus-1.1.4/
rm -f v1.1.4.zip
sed -i $SEDEXT '1s/""/"assets\/perseus\/"/' $ASSETS_PATH/perseus/build/frame-perseus.js
sed -i $SEDEXT '1s/""/"..\/assets\/perseus\/"/' $ASSETS_PATH/perseus/build/frame-perseus.js

if command -v fix_ogvjs_dist > /dev/null; then
echo "fixing JS files"
File renamed without changes.
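
The `sed` expression changed in `get_js_deps.sh` replaces the first `""` on line 1 of `frame-perseus.js` with a path literal, now `"../assets/perseus/"` instead of `"assets/perseus/"`. A rough Python equivalent is sketched below; the `sample` string is a hypothetical stand-in, since the real contents of `frame-perseus.js` are not part of this diff.

```python
def patch_first_line(source, prefix="../assets/perseus/"):
    """Replace the first empty string literal on line 1, like sed's 1s/""/"…"/."""
    first, newline, rest = source.partition("\n")
    first = first.replace('""', f'"{prefix}"', 1)
    return first + newline + rest


# Hypothetical file contents, used only to show the substitution.
sample = 'var base = "";\nvar other = "";\n'
patched = patch_first_line(sample)
print(patched)
```

Only the first line is touched, and only its first `""`; later lines keep their empty literals, which mirrors the `1s` address and the absence of the `g` flag in the `sed` command.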
5 changes: 4 additions & 1 deletion pyproject.toml → scraper/pyproject.toml
@@ -8,7 +8,7 @@ authors = [{ name = "Kiwix", email = "dev@kiwix.org" }]
keywords = ["kiwix", "zim", "offline", "kolibri"]
requires-python = ">=3.11"
description = "Make ZIM file from Kolibri Channels"
readme = "README.md"
readme = "../README.md"
license = { text = "GPL-3.0-or-later" }
classifiers = [
"Programming Language :: Python :: 3",
@@ -22,6 +22,8 @@ dependencies = [
"pif==0.8.2",
"beautifulsoup4==4.9.3",
"retrying==1.3.4",
"pydantic==2.4.2",
"python-slugify==8.0.1",
]
dynamic = ["version"]

@@ -105,6 +107,7 @@ all = "inv checkall --args '{args}'"
[tool.black]
line-length = 88
target-version = ['py311']
exclude="(src/kolibri2zim/templates/.*|.hatch/.*)"

[tool.ruff]
target-version = "py311"
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -102,7 +102,7 @@ def get_node_children(self, node_id, left=None, right=None):
right = node["right"]

for row in self.get_rows(
"SELECT id, title, kind "
"SELECT id, title, description, kind, lft as left, rght as right "
"FROM content_contentnode WHERE lft > ? AND rght < ? "
"AND parent_id=?"
"ORDER BY level ASC",
File renamed without changes.
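
The updated query in `get_node_children` now also returns `description` plus the `lft`/`rght` bounds (aliased as `left`/`right`), so callers can descend into a child without looking its bounds up again. The sketch below runs the same SELECT against an in-memory SQLite table. The table contents, the standalone function signature, and the parameter order `(left, right, node_id)` are assumptions for illustration; the real method body is truncated in this diff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE content_contentnode ("
    "id TEXT, title TEXT, description TEXT, kind TEXT, "
    "lft INTEGER, rght INTEGER, level INTEGER, parent_id TEXT)"
)
# Hypothetical nodes laid out as a nested set: root(1,8) > t1(2,5) > v1(3,4), d1(6,7).
conn.executemany(
    "INSERT INTO content_contentnode VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("root", "Channel", "", "topic", 1, 8, 0, None),
        ("t1", "Topic 1", "First topic", "topic", 2, 5, 1, "root"),
        ("v1", "Video 1", "A video", "video", 3, 4, 2, "t1"),
        ("d1", "Doc 1", "A document", "document", 6, 7, 1, "root"),
    ],
)


def get_node_children(conn, node_id, left, right):
    """Return the direct children of node_id located inside (left, right)."""
    rows = conn.execute(
        "SELECT id, title, description, kind, lft as left, rght as right "
        "FROM content_contentnode WHERE lft > ? AND rght < ? "
        "AND parent_id=? "  # trailing space added here; the diff has none
        "ORDER BY level ASC",
        (left, right, node_id),
    )
    return [dict(row) for row in rows]


children = get_node_children(conn, "root", 1, 8)
for child in children:
    print(child["id"], child["left"], child["right"])
```

Each returned row carries its own `left` and `right`, which can be passed straight back into the function to list that child's children.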