Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undefined array index for locale tr causing error #5305

Open
passioneight opened this issue Mar 1, 2022 · 6 comments
Open

Undefined array index for locale tr causing error #5305

passioneight opened this issue Mar 1, 2022 · 6 comments

Comments

@passioneight
Copy link

Current behaviour

When using the locale tr, the array_change_key_case function calls do not properly lower case the array indexes. For example, Index_Type will be converted to Index_type.

Using PHP8, this will result in an error.

Here are the related files that cause the error for us (in order of exectuion):

public function listTableIndexes($table)
{
$sql = $this->_platform->getListTableIndexesSQL($table, $this->_conn->getDatabase());
$tableIndexes = $this->_conn->fetchAllAssociative($sql);
return $this->_getPortableTableIndexesList($tableIndexes, $table);
}

public function getListTableIndexesSQL($table, $database = null)
{
if ($database !== null) {
return 'SELECT NON_UNIQUE AS Non_Unique, INDEX_NAME AS Key_name, COLUMN_NAME AS Column_Name,' .
' SUB_PART AS Sub_Part, INDEX_TYPE AS Index_Type' .
' FROM information_schema.STATISTICS WHERE TABLE_NAME = ' . $this->quoteStringLiteral($table) .
' AND TABLE_SCHEMA = ' . $this->quoteStringLiteral($database) .
' ORDER BY SEQ_IN_INDEX ASC';
}
return 'SHOW INDEX FROM ' . $table;
}

protected function _getPortableTableIndexesList($tableIndexes, $tableName = null)
{
foreach ($tableIndexes as $k => $v) {
$v = array_change_key_case($v, CASE_LOWER);
if ($v['key_name'] === 'PRIMARY') {
$v['primary'] = true;
} else {
$v['primary'] = false;
}
if (strpos($v['index_type'], 'FULLTEXT') !== false) {
$v['flags'] = ['FULLTEXT'];
} elseif (strpos($v['index_type'], 'SPATIAL') !== false) {
$v['flags'] = ['SPATIAL'];
}
// Ignore prohibited prefix `length` for spatial index
if (strpos($v['index_type'], 'SPATIAL') === false) {
$v['length'] = isset($v['sub_part']) ? (int) $v['sub_part'] : null;
}
$tableIndexes[$k] = $v;
}
return parent::_getPortableTableIndexesList($tableIndexes, $tableName);
}

Note that there seem to be many files that use the array_change_key_case function.

Expected behaviour

Either lowercasing should be done using mb_* methods or do not use different values for index and column aliases, where the latter would probably be a BC.

Steps to reproduce

  1. Make sure you are using the tr locale.
  2. Use the MySqlSchemaManager and call listTableIndexes

Also see #4490 as this issue seems to have had the same problem, but was closed for some reason.

@passioneight
Copy link
Author

FYI, using the method, linked in the description, actually works for us. We created a patch for our specific use case, which added the method like so:

private function array_change_key_case_unicode($arr, $c = CASE_LOWER) {
    $c = ($c == CASE_LOWER) ? MB_CASE_LOWER : MB_CASE_UPPER;
    foreach ($arr as $k => $v) {
        $ret[mb_convert_case($k, $c, "UTF-8")] = $v;
    }
    return $ret;
}

Of course, all calls to array_change_key_case were then replaced by a call to $this->array_change_key_case_unicode(...).

@morozov
Copy link
Member

morozov commented Mar 1, 2022

When using the locale tr, the array_change_key_case function calls do not properly lower case the array indexes. For example, Index_Type will be converted to Index_type.

Is it the expected behavior of the tr locale or it's some bug? As far as I can tell from the documentation,

Returns string with all alphabetic characters converted to lowercase.

Note that 'alphabetic' is determined by the current locale.

The conversion should be done on a per-character basis regardless of the character's position.

How does one reproduce this issue in a development environment? Which operating system is it reproducible on? Apart from having the locale available in the system, how does one set it for the PHP application?

Which of the tr locales are you using? I see the following available in my operating system:

$ grep tr /etc/locale.gen
#tr_CY.UTF-8 UTF-8  
#tr_CY ISO-8859-9  
#tr_TR.UTF-8 UTF-8  
#tr_TR ISO-8859-9

@passioneight
Copy link
Author

Is it the expected behavior of the tr locale or it's some bug? As far as I can tell from the documentation,

To me, it is the expected behaviour, due to the documentation you mentioned. However, accessing an array index should not cause any errors, unless there is an explicit check for the key followed by a throw ... kind of line - i.e.:

if(!array_key_exists($key, $array)) {
   throw new FooException("Missing required key '$key'");
}

In this case an exception is just fine, otherwise not - at least to me.

The conversion should be done on a per-character basis regardless of the character's position.

I also think that there shouldn't be a conversion at all. Personally, I'd rather have an enum to avoid having different values in the first place. But that's probably some kind of backwards compatibility. Not sure, though, as I'm not too savvy of this repo.

How does one reproduce this issue in a development environment? Which operating system is it reproducible on? Apart from having the locale available in the system, how does one set it for the PHP application?

We are working on a project using Pimcore with Docker on a Mac. Which means that this can be used to reproduce the issue locally, provided that the tr language is enabled and copy of a document is created.

I'm aware that this is probably not what you had in mind when asking for reproducible steps. So, this was just for full disclosure.

I've had a look at Pimcore's code (which builds on top of Symfony). So, ultimately, the locale is set, using Locale::setDefault($locale); (where $locale holds the value tr).

Which of the tr locales are you using? I see the following available in my operating system

Here's what your command outputs for us:

# tr_CY ISO-8859-9
# tr_CY.UTF-8 UTF-8
# tr_TR ISO-8859-9
# tr_TR.UTF-8 UTF-8

Using setlocale(LC_ALL, 0) to retrieve the current locale, the following is returned:

LC_CTYPE=tr_TR.utf8;LC_NUMERIC=C;LC_TIME=tr_TR.utf8;LC_COLLATE=tr_TR.utf8;LC_MONETARY=tr_TR.utf8;LC_MESSAGES=tr_TR.utf8;LC_PAPER=tr_TR.utf8;LC_NAME=tr_TR.utf8;LC_ADDRESS=tr_TR.utf8;LC_TELEPHONE=tr_TR.utf8;LC_MEASUREMENT=tr_TR.utf8;LC_IDENTIFICATION=tr_TR.utf8

@morozov
Copy link
Member

morozov commented Mar 2, 2022

I also think that there shouldn't be a conversion at all.

You're right. It exists due to the issues like #5226 where the results returned by the query may have been processed by the portability middleware and needs to be adjusted back to a predictable case. If we find a way not to use the middleware within the schema managers, this conversion will be no longer necessary.

To me, it is the expected behaviour, due to the documentation you mentioned.

What exactly in the documentation suggests that the case conversion behavior depends on the position of the character?

I'm aware that this is probably not what you had in mind when asking for reproducible steps.

You're right. Right now, I don't have the environment to reproduce this issue. At least, its subset like:

$row = array_change_key_case(['Index_Type' => 1], CASE_LOWER);
// expected: ['Index_type' => 1]

If you provide the instructions on building such an environment in Docker and reproducing the above behavior, it may help address this issue sooner than later.

@morozov
Copy link
Member

morozov commented Mar 2, 2022

According to https://forge.typo3.org/issues/76804#note-8,

There is no upper-case "i" in that locale.

So that's probably the problem, not the position of the character in the string.

@morozov
Copy link
Member

morozov commented Mar 3, 2022

The issue can be reproduced on Linux as follows:

  1. Uncomment tr_TR.UTF-8 in /etc/locale.gen
  2. Regenerate locales via locale-gen
  3. Run the following script:
    <?php
    
    setlocale(LC_CTYPE, 'tr_TR.UTF-8');
    echo setlocale(LC_CTYPE, 0), PHP_EOL;
    // tr_TR.UTF-8
    echo strtolower('Index_Type'), PHP_EOL;
    // Index_type

@morozov morozov changed the title [BUG]: undefined array index for locale tr causing error Undefined array index for locale tr causing error Mar 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants