Skip to content

zend_atol() cleanup #7788

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Conversation

arnaud-lb
Copy link
Member

@arnaud-lb arnaud-lb commented Dec 18, 2021

This PR is mostly a rebase of #4132 and continuation of 989205e.

#4132 primarily changes the zend_atol() function, which is used to parse sizes/quantity values in ini settings, so that a warning is triggered if a value has an invalid format. This warning is helpful as it makes it easier to notice configuration errors.

A warning or a deprecation is the right solution here, since alternatives introduce breaking changes and/or have security implications.

There are little changes compared to the original PR:

  • The warning messages specify that the fallback behavior is for backwards compatibility
  • The changes around zend_atoi() was handled already in 989205e
  • Switched the tests to a ext/zend-test function, and added expectations about the return value of zend_atol()

There are a few things we could consider after this change:

  • Deprecating zend_atoi() and removing it at some point in the future
  • Renaming zend_atol() to something reflecting its behavior. This would make it less surprising and would reduce the chances of accidentally using zend_atol() at a place where the intent is not to parse a size or quantity. Since the function is only used for parsing ini values now, we could move it to Zend/zend_ini and chose a name like zend_ini_parse_quantity(). We could keep a deprecated wrapper/alias for some time.
  • Ultimately removing the fallback and turning the warning to an error, although it may have security implication preventing us from doing that

@arnaud-lb arnaud-lb marked this pull request as ready for review December 18, 2021 16:07
/* Ignore trailing whitespace */
while (str_len && ZEND_IS_WHITESPACE(str[str_len-1])) --str_len;
if (!str_len) return 0;

/* Perform following multiplications on unsigned to avoid overflow UB.
* For now overflow is silently ignored -- not clear what else can be
* done here, especially as the final result of this function may be
* used in an unsigned context (e.g. "memory_limit=3G", which overflows
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good. I've accidentally made the mistake GB in the past, which is ignored, and a warning would be useful.

I wonder if it'd also make sense to warn about overflow (including after multiplication for suffixes) at the same time, with this change making this helper possibly emit a warning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your review @TysonAndre

The implementation looks good. I've accidentally made the mistake GB in the past, which is ignored, and a warning would be useful

All credit goes to @sgolemon who did the original PR

I wonder if it'd also make sense to warn about overflow (including after multiplication for suffixes) at the same time, with this change making this helper possibly emit a warning

Agreed, this seems like a good idea. Do you think it should be in this PR or a separate PR ? I would have a preference for a separate PR

Copy link
Contributor

@TysonAndre TysonAndre Dec 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good

It also seemed like emitting a warning was safe in all the places that did this, as far as I could tell - e.g. nothing opcache would think was a side effect free opcode would use zend_strtol that I could find and start having side effects/throwing

All credit goes to sgolemon who did the original PR

I looked at that while reviewing this, just confirming that it still looks good after a rebase.

Agreed, this seems like a good idea. Do you think it should be in this PR or a separate PR ? I would have a preference for a separate PR

Separate PR if the suffix checking is uncontroversial, same PR if it ended up requiring an RFC.

https://externals.io/message/105646 didn't have much feedback (I don't remember seeing the email, though) - negative or otherwise. Hopefully there are no objections.

Another email should be sent to see if opinions had changed or to describe the latest changed behavior in more details, including

  • severity (E_WARNING)
  • Error message
  • Names of affected functions/macros
  • Affected functions (ini_set, ini parsing, ffi, upload progress, etc., and how they changed)
  • How this will affect future php versions (I'm assuming none and this will continue emitting an E_WARNING, I can't see how else this would be done (throwing wouldn't work for ini settings))
  • PECL extensions using zend_atol and other macros may now need to check for exceptions thrown by error handlers configured with https://www.php.net/set_error_handler and may need to be updated for tests, source code, examples, etc. if those were incorrectly using suffixes

@TysonAndre
Copy link
Contributor

Since the function is only used for parsing ini values now, we could move it to Zend/zend_ini and chose a name like zend_ini_parse_quantity(). We could keep a deprecated wrapper/alias for some time.

It ought not to, but somehow is, and I assume that's a bug that could be fixed separately from this PR.

https://www.php.net/ftp_size https://datatracker.ietf.org/doc/html/rfc3659#section-4.4 https://www.postgresql.org/docs/9.3/libpq-exec.html#LIBPQ-PQCMDTUPLES
(those are always integer strings as far as I'd expect? )

» ag 'zend_atol\b' -l
Zend/zend_long.h
Zend/zend_operators.h
Zend/zend_ini.c
Zend/zend.c
Zend/zend_alloc.c
Zend/zend_operators.c
ext/opcache/zend_accelerator_module.c
ext/zend_test/test.c
ext/bcmath/bcmath.c
ext/mysqlnd/php_mysqlnd.c
ext/pdo_pgsql/pgsql_statement.c
ext/pdo_pgsql/pgsql_driver.c
ext/ftp/ftp.c
ext/standard/basic_functions.c
ext/session/session.c
main/fastcgi.c
main/main.c
sapi/apache2handler/sapi_apache2.c
sapi/cli/php_cli.c
sapi/cli/php_cli_server.c

arnaud-lb and others added 2 commits January 14, 2022 12:15
zend_atol() parses integers with size suffixes (like "128M"). Here zend_atol() is
replaced with ZEND_ATOL() when the intent was not to parse a size or quantity.

Co-authored-by: Sara Golemon <pollita@php.net>
zend_atol() don't just do number parsing.
They also check for a 'K', 'M', or 'G' at the end of the string,
and multiply the parsed value out accordingly.

Unfortunately, they ignore any other non-numerics between the
numeric component and the last character in the string.
This means that numbers such as the following are both valid
and non-intuitive in their final output.

* "123KMG" is interpreted as "123G" -> 132070244352
* "123G " is interpreted as "123 " -> 123
* "123GB" is interpreted as "123B" -> 123
* "123 I like tacos." is also interpreted as "123." -> 123

This diff adds warnings for these cases when the output
would be a potentially unexpected, and unhelpful value.

Co-authored-by: Sara Golemon <pollita@php.net>
@arnaud-lb arnaud-lb force-pushed the zend-atol-cleanup branch 6 times, most recently from 59b7b36 to f0bffb7 Compare January 14, 2022 16:43
@arnaud-lb
Copy link
Member Author

@TysonAndre thank you for your reviews!

Following your comments, and after thinking more about this PR, I've found that the most important problems I wanted to address are these:

  • Making the non-intuitive behavior more visible to users, so that mistakes can be more easily noticed and fixed
  • Preventing accidental usage of zend_atol() / zend_atoi() in the future, or renaming them, since their behavior is not reflected by their name

I've also decided to not modify the behavior of any ini settings for now, apart from adding a warning, as this makes the change fully backwards compatible.

I've created a variant of this PR here: #7951

Copy link
Member

@cmb69 cmb69 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also decided to not modify the behavior of any ini settings for now, apart from adding a warning, as this makes the change fully backwards compatible.

Thank you! I think this is the right way to approach this for now. I've sent a mail about this PR to the internals mailing list some days ago; so far, no reply.

@arnaud-lb
Copy link
Member Author

Thank you for the email @cmb69 !

What do you think of #7951 ? This is the PR in which I made the changes mentioned in my previous comment (I originally made them here and then reverted, it's possible that you approved just before that, sorry for the confusion)

@cmb69
Copy link
Member

cmb69 commented Jan 16, 2022

This PR, as is, looks fine to me. We skip leading whitespace, check for a number (and warn otherwise, falling back to 0), again skip whitespace, and then get the suffix (if any), warning if it is not known but interpreting it like previously. So, in short, everything works like before, but there are warnings about probably unintended values.

Separating #7951 is welcome for me.

@salarmehr
Copy link

any update about this PR?

@cmb69
Copy link
Member

cmb69 commented May 25, 2022

What's the status here? If we want to get this into PHP 8.2 (and I think we want to), we should go ahead. :)

@cmb69
Copy link
Member

cmb69 commented May 27, 2022

Oh, I had a closer look at PR #7951, and agree that that is preferable to this PR.

@arnaud-lb arnaud-lb closed this May 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants