Skip to content

Conversation

@parvit
Copy link

@parvit parvit commented Jul 25, 2022

This PR responds to issue openzim/sotoki/issues/243.

Disables the full text indexing and compression by default so that it's memory cost is only payed if requested (which can be an issue with big sites).

@rgaudin
Copy link
Member

rgaudin commented Jul 25, 2022

As explained in openzim/sotoki#243 (comment) I believe this is not needed as it is possible to disable it. Please confirm and we'll close this PR.

@rgaudin
Copy link
Member

rgaudin commented Jul 26, 2022

Checked ; works as expected.

@rgaudin rgaudin closed this Jul 26, 2022
@parvit
Copy link
Author

parvit commented Jul 26, 2022

Very well, just know however that if you pass the program with memray you'll still see the memory cost associated even when disabling it (because the defaults of the library still have effect).

@rgaudin
Copy link
Member

rgaudin commented Jul 27, 2022

Very well, just know however that if you pass the program with memray you'll still see the memory cost associated even when disabling it (because the defaults of the library still have effect).

Compression and indexing are both handled by libzim and only take action after the call to start(). Defaults only call those config_* methods which can be called several times before start.

As stated above, I've tested that it works as expected: by calling .config_compression and .config_indexing to disable both I end up with an uncompressed ZIM that does not include the full-text index.

@parvit
Copy link
Author

parvit commented Jul 27, 2022 via email

@rgaudin
Copy link
Member

rgaudin commented Jul 27, 2022

That's right. As @kelson42 said somewhere, this is the wanted behavior for 99.9% or our users. Compression and indexing should not have a significant impact on memory and if it does it may be a bug in libzim. I'll get to testing sotoki without both in the coming days so we can have a way forward.

I think the key here is that zimscraperlib.zim.creator.Creator inherits from libzim.writer.Creator so API might seem smaller than it actually is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants