unexpected handling of mysql_enable_utf8/mysql_enable_utf8 => 0 [rt.cpan.org #119079] #208

Open
@mbeijen

Migrated from rt.cpan.org#119079 (status was 'open')

Requestors:

From tzs@eacceleration.com on 2016-11-30 20:18:37:

At line 1884 of dbdimp.c, it handles the mysql_enable_utf8/mysql_enable_utf8mb4 flags in the options hash:

    if ((svp = hv_fetch(hv, "mysql_enable_utf8mb4", 20, FALSE)) && *svp && SvTRUE(*svp)) {
      mysql_options(sock, MYSQL_SET_CHARSET_NAME, "utf8mb4");
    }
    else if ((svp = hv_fetch(hv, "mysql_enable_utf8", 17, FALSE)) && *svp) {
      /* Do not touch imp_dbh->enable_utf8 as we are called earlier
       * than it is set and mysql_options() must be before:
       * mysql_real_connect()
       */
      mysql_options(sock, MYSQL_SET_CHARSET_NAME,
                    (SvTRUE(*svp) ? "utf8" : "latin1"));
      if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
        PerlIO_printf(DBIc_LOGPIO(imp_xxh),
                      "mysql_options: MYSQL_SET_CHARSET_NAME=%s\n",
                      (SvTRUE(*svp) ? "utf8" : "latin1"));
    }

The logic here leads to surprising results. If you pass this when connecting:

mysql_enable_utf8mb4 => 1

you get MYSQL_SET_CHARSET_NAME set to utf8mb4, as you would expect. Similarly,

mysql_enable_utf8 => 1

gives MYSQL_SET_CHARSET_NAME set to utf8, just as expected, and not specifying either option results in MYSQL_SET_CHARSET_NAME not being set, so you presumably get whatever the underlying mysql default is.

Where it gets weird is if you give one of these:

mysql_enable_utf8mb4 => 0

or

mysql_enable_utf8 => 0

The former is equivalent to not specifying an option, and so you get the underlying mysql default for MYSQL_SET_CHARSET_NAME.

The latter, however, results in MYSQL_SET_CHARSET_NAME set to latin1.

It seems quite counterintuitive that if I want latin1, the way to get it is with mysql_enable_utf8 => 0, and mysql_enable_utf8mb4 => 0 will not get that (unless that happens to be the underlying default).
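
To make the asymmetry easier to see, here is a minimal standalone sketch of the decision the quoted branch appears to make. It is not the driver's code: the `opt_state` enum and the `charset_for` function are hypothetical names introduced only for illustration, and the three states stand in for "key absent from the hash", "key present but false", and "key present and true".

```c
#include <stddef.h>

/* Hypothetical tri-state for one connection attribute (not a DBD::mysql type). */
typedef enum { OPT_ABSENT, OPT_FALSE, OPT_TRUE } opt_state;

/*
 * Models the quoted if/else-if chain. Returns the charset name that would be
 * passed to mysql_options(sock, MYSQL_SET_CHARSET_NAME, ...), or NULL when the
 * chain makes no call and the client default is left in place.
 */
static const char *charset_for(opt_state utf8mb4, opt_state utf8)
{
    /* First branch: only a true mysql_enable_utf8mb4 matches. */
    if (utf8mb4 == OPT_TRUE)
        return "utf8mb4";

    /* Second branch: any present mysql_enable_utf8 matches, true or false. */
    if (utf8 != OPT_ABSENT)
        return utf8 == OPT_TRUE ? "utf8" : "latin1";

    /* Neither branch taken: MYSQL_SET_CHARSET_NAME is not set. */
    return NULL;
}
```

Under this model, `charset_for(OPT_FALSE, OPT_ABSENT)` yields NULL (client default) while `charset_for(OPT_ABSENT, OPT_FALSE)` yields "latin1", which is the inconsistency described above.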

-- 
Tim Smith
tzs@eacceleration.com




From michielb@cpan.org on 2016-12-01 08:12:50:

Hi Tim!

On Wed 30 Nov 2016 15:18:37, tzs@eacceleration.com wrote:

> It seems quite counterintuitive that if I want latin1, the way to get
> it is with mysql_enable_utf8 => 0, and mysql_enable_utf8mb4 => 0 will
> not get that (unless that happens to be the underlying default).


You're right on that! Also, the behaviour of mysql_enable_utf8 => 0 is not documented.

My proposal would be to explicitly document this behavior and leave it at that. Do you agree?

--
Michiel

From pali@cpan.org on 2017-03-01 13:26:03:

