Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mbstring dependency to composer.json #561

Merged
merged 1 commit into from
Feb 28, 2018

Conversation

PhrozenByte
Copy link
Contributor

Parsedown requires PHP's mbstring extension due to

$shortage = 4 - mb_strlen($line, 'utf-8') % 4;

Parsedown never worked without the mbstring extension and users were experiencing PHP errors. Most PHP installations have mbstring enabled (even though it's not one of PHP's default extensions, it's a default extension on virtually any Linux distribution), so IMO there's no need for action here besides making this implicit dependency explicit. Users which can't install mbstring on their platform can still use a polyfill like https://github.com/symfony/polyfill-mbstring, but since it's comparibly slow and users should rather install the mbstring extension whenever possible, we shouldn't make the polyfill the default. Letting users know about this is IMO the best way to deal with it.

Also see #541 - what should be closed IMO

@aidantwoods
Copy link
Collaborator

I don't contest that Parsedown should declare the dependencies it uses, but I wonder whether we have an opportunity to make Parsedown a little more compatible here.

Given that we only use mb_strlen from mbstring, I wonder whether Parsedown should implement this single function as an internally available polyfill to avoid needing the dependency. (The implementation in https://github.com/symfony/polyfill-mbstring is pretty short for what its worth, and looks very similar to something already suggested).

Of course, in the current version this dependency should be explicit. I just wonder whether a better solution would be to support vanilla PHP instead of declaring a dependency that just makes Parsedown a little faster?

@PhrozenByte
Copy link
Contributor Author

PhrozenByte commented Feb 28, 2018

The problem is not code complexity, but performance. The polyfill takes 38 times (!) longer than native mbstring. When calling native mb_strlen() using Parsedown's code above and the first paragraph of Lorem ipsum as input text takes around 0.058 seconds for 10,000 rounds (i.e. 10,000 lines), but 2.204 seconds using the mb_strlen() polyfill (testing on my local machine with PHP 5.6). Naturally this isn't a credible benchmark, but there's a reason why the creators of the polyfill tell people to rather install native mbstring 😉 The polyfill is intended as last resort solution, no more.

And there's a second catch: Even when using the polyfill we still require the iconv extension. iconv is one of PHP's default extensions, but still can be disabled, so we definitly still need a explicit dependency. There's virtually no difference to requiring mbstring, mbstring is available virtually anywhere (I'd even say that not having mbstring enabled is rather a misconfiguration of PHP, virtually any web application depends on mbstring and that's the reason why all Linux distributions enable mbstring by default).

Parsedown is a library, not an end-user application. We should leave this decision up to (1) the application developer (as soon as they require the polyfill the ext-mbstring requirement is fulfilled), and (2) the developer should IMO leave this decision up to the user (but there might be application developers with a different opinion on this, but we shouldn't enforce application developers on using the polyfill).

If you don't have native mbstring, install native mbstring. The polyfill really is just a last resort solution - and the user should be aware of the fact that the polyfill is a last resort solution and he sacrifices one of Parsedown's main goals: performance. Silently using a polyfill is a bad idea IMO.

@aidantwoods
Copy link
Collaborator

aidantwoods commented Feb 28, 2018

I think the performance impact doesn't really matter, because the performance impact would only be felt if the slow polyfill implementation was actually used. If mb_strlen is available then obviously we shouldn't use the alternative (slower) implementation ;p

i.e. If don't have mbstring then Parsedown would be slower, but would at least work. If you do have mbstring installed then there is no difference.

Would be something like

protected static function mb_strlen(string $s, $enc = null)
{
    static $polyfillNeeded;

    if (!isset($polyfillNeeded)) { $polyfillNeeded = !function_exists('mb_strlen'); }

    return $polyfillNeeded ? self::slower_mb_strlen($s, $enc) : mb_strlen($s, $enc);
}

This would still give the end user a decision: if they want to (and can?) install mbstring then Parsedown will be faster, if not then at least it'll still work.

You do make a good point about iconv also being an extension though (albeit being a core/default one), so relying on this instead of mbstring probably doesn't gain too much.

Maybe (as you say) it is just simpler to require the dependency we actually need, not having mbstring enabled is a pretty weird situation anyway ;p (and probably doesn't happen too often?)

@PhrozenByte
Copy link
Contributor Author

and probably doesn't happen too often?

One in a million or something like that? This problem virtually doesn't exist for production servers. We recently learned that there are millions of Parsedown installations out there - and there are exactly two Issues about this, this one count in 😆

Maybe (as you say) it is just simpler to require the dependency we actually need

Exactly.

This would still give the end user a decision

Simply put, it's opt-in vs. opt-out. I'd strongly vote for opt-in - not because I think that this makes Parsedown slower for users with native mbstring (it won't), but because I think that using the polyfill is never a good idea. By using a opt-in strategy users are forced to think about the problem and make a decision (and the decision should always be "install native mbstring" in the end 😆 and the "one in a billion" user who really wants to use the polyfill can still run composer require symfony/polyfill-mbstring and the warning disappears). However, a opt-out strategy rather makes this problem invisible and users won't understand why Parsedown is so damn slow on their server... It's because of a misconfiguration, but this isn't obvious. Since Parsedown is a library I assume that everyone here knows what to do when a error message like "PHP extension mbstring is missing" is shown 😆

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants