unexpected handling of mysql_enable_utf8/mysql_enable_utf8 => 0 [rt.cpan.org #119079] #208

Open
mbeijen opened this issue Nov 15, 2017 · 0 comments
Labels
utf8 Unicode and UTF-8 handling

mbeijen commented Nov 15, 2017

Migrated from rt.cpan.org#119079 (status was 'open')

Requestors:

From tzs@eacceleration.com on 2016-11-30 20:18:37:

At line 1884 of dbdimp.c, the code handles the mysql_enable_utf8/mysql_enable_utf8mb4 flags in the options hash:

        if ((svp = hv_fetch(hv, "mysql_enable_utf8mb4", 20, FALSE)) && *svp && SvTRUE(*svp)) {
          mysql_options(sock, MYSQL_SET_CHARSET_NAME, "utf8mb4");
        }
        else if ((svp = hv_fetch(hv, "mysql_enable_utf8", 17, FALSE)) && *svp) {
          /* Do not touch imp_dbh->enable_utf8 as we are called earlier
           * than it is set and mysql_options() must be before:
           * mysql_real_connect()
           */
          mysql_options(sock, MYSQL_SET_CHARSET_NAME,
                        (SvTRUE(*svp) ? "utf8" : "latin1"));
          if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
            PerlIO_printf(DBIc_LOGPIO(imp_xxh),
                          "mysql_options: MYSQL_SET_CHARSET_NAME=%s\n",
                          (SvTRUE(*svp) ? "utf8" : "latin1"));
        }

The logic here leads to surprising results. If you pass this when connecting:

mysql_enable_utf8mb4 => 1

you get MYSQL_SET_CHARSET_NAME set to utf8mb4, as you would expect. Similarly,

mysql_enable_utf8 => 1

gives MYSQL_SET_CHARSET_NAME set to utf8, just as expected. Not specifying either option leaves MYSQL_SET_CHARSET_NAME unset, so you presumably get whatever the underlying MySQL default is.

Where it gets weird is if you give one of these:

mysql_enable_utf8mb4 => 0

or

mysql_enable_utf8 => 0

The former is equivalent to not specifying an option, and so you get the underlying mysql default for MYSQL_SET_CHARSET_NAME.

The latter, however, results in MYSQL_SET_CHARSET_NAME set to latin1.

It seems quite counterintuitive that if I want latin1, the way to get it is with mysql_enable_utf8 => 0, and mysql_enable_utf8mb4 => 0 will not get that (unless that happens to be the underlying default).
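
To make the asymmetry easier to see, here is a minimal standalone sketch of the branch structure quoted above. It is not the driver's code: the `opt_state` enum and the `requested_charset` function are hypothetical names introduced only for illustration, and the Perl hash lookup (`hv_fetch`/`SvTRUE`) is reduced to a three-state value per option. A `NULL` result stands for "mysql_options() is never called for MYSQL_SET_CHARSET_NAME".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical three-state model of a connect option:
 * key not passed at all, passed with a false value, passed with a true value. */
typedef enum { OPT_ABSENT, OPT_FALSE, OPT_TRUE } opt_state;

/*
 * Mirrors the if / else-if structure of the quoted dbdimp.c snippet.
 * Returns the charset name that would be handed to
 * mysql_options(sock, MYSQL_SET_CHARSET_NAME, ...), or NULL when that
 * call is skipped and the underlying default applies.
 */
const char *requested_charset(opt_state utf8mb4, opt_state utf8)
{
    /* First branch: only a TRUE mysql_enable_utf8mb4 has any effect. */
    if (utf8mb4 == OPT_TRUE)
        return "utf8mb4";

    /* Second branch: mysql_enable_utf8 acts as soon as the key is
     * present, so a false value actively selects latin1. */
    if (utf8 != OPT_ABSENT)
        return (utf8 == OPT_TRUE) ? "utf8" : "latin1";

    /* Neither branch taken: charset is left untouched. */
    return NULL;
}
```

Under this model, `requested_charset(OPT_FALSE, OPT_ABSENT)` returns `NULL` while `requested_charset(OPT_ABSENT, OPT_FALSE)` returns `"latin1"`, which is exactly the difference described above.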

-- 
Tim Smith
tzs@eacceleration.com




From michielb@cpan.org on 2016-12-01 08:12:50:

Hi Tim!

On Wed 30 Nov 2016 15:18:37, tzs@eacceleration.com wrote:

> It seems quite counterintuitive that if I want latin1, the way to get
> it is with mysql_enable_utf8 => 0, and mysql_enable_utf8mb4 => 0 will
> not get that (unless that happens to be the underlying default).


You're right on that! Also, the behaviour of mysql_enable_utf8 => 0 is not documented.

My proposal would be to explicitly document this behavior and leave it at that. Do you agree?

--
Michiel

From pali@cpan.org on 2017-03-01 13:26:03:


@dveeden dveeden added the utf8 Unicode and UTF-8 handling label Oct 5, 2023