[BUG] Phalcon\Tag helpers mangle UTF-8 characters #1681

Closed
rlaffers opened this issue Dec 12, 2013 · 15 comments

Comments

@rlaffers
Contributor

    <?php 
        // inside any of my views
        echo $this->tag->textField(array('fieldname', 'id' => 'ľščťžýáíéôňď', 'value' => 'ľščťžýáíéôňď') );
        echo $this->tag->submitButton('ľščťžýáíéôňď') 
    ?>

What was expected:
[image: tag_helper_broken2]

What I got:
[image: tag_helper_broken1]

(yes, I have the correct charset specified for the page. If I put raw HTML into the same view, all characters are rendered fine)

<!-- this works fine -->    
<input type="text" name="fieldname" value="ľščťžýáíéôňď" id="ľščťžýáíéôňď">
<input type="submit" value="ľščťžýáíéôňď">
@xboston
Contributor

xboston commented Dec 12, 2013

Try disabling autoescaping: http://api.phalconphp.com/en/1.2.3/Phalcon/Tag#setAutoescape-details

Phalcon\Tag::setAutoescape(false);

@rlaffers
Contributor Author

@xboston Thanks, that helped. Case closed.

@rlaffers
Contributor Author

I'm going to reopen this issue. Although xboston's workaround works, it is suboptimal - turning off autoescaping opens your site to XSS attacks if you use form-related view helpers. That makes the view helpers useless if your site uses accented characters.
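
For anyone who does turn autoescaping off in the meantime, every value then has to be escaped by hand before it ends up in an attribute. A minimal sketch, assuming plain PHP's htmlspecialchars() with an explicit UTF-8 charset; the function name render_text_input is hypothetical and not part of Phalcon:

```php
<?php
/**
 * Hypothetical helper (not a Phalcon API): builds a text input by hand
 * and escapes both attribute values as UTF-8.
 */
function render_text_input($name, $value)
{
    // ENT_QUOTES escapes both single and double quotes, which matters
    // because the values are placed inside quoted attributes.
    return sprintf(
        '<input type="text" name="%s" value="%s">',
        htmlspecialchars($name, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($value, ENT_QUOTES, 'UTF-8')
    );
}

// Accented characters pass through untouched, markup characters do not.
echo render_text_input('fieldname', 'ľščťžýáíéôňď"><script>'), "\n";
```

This only covers the values you remember to escape yourself, which is why it is no substitute for working autoescaping.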

@rlaffers rlaffers reopened this Dec 16, 2013
@ghost

ghost commented Dec 16, 2013

@rlaffers what version of Phalcon do you use?

In 1.3.0 everything seems to work fine, see this test case: https://github.com/sjinks/cphalcon/blob/1.3.0/ext/tests/issue-1700.phpt

@rlaffers
Contributor Author

I'm using 1.2.4. Will switch to 1.3.0 in a couple of days.

@ghost

ghost commented Dec 16, 2013

1.2.4 does work for me: https://www.diigo.com/item/image/3zdqq/nubc?size=o

(page charset is utf-8).

Could you please give me the code that reproduces the issue?

@rlaffers
Contributor Author

Hmm, the error does not manifest itself on a freshly created project, so it must be caused by something in how my current project is set up. I will track down the offending piece of code and report back here.

@ghost

ghost commented Jan 6, 2014

echo Phalcon\Tag::textField(['test','value'=>'ą ć ę ł ń ś ź ż Ó ó']);

If you then submit the form several more times, other non-ASCII characters get broken as well!

Why does Phalcon normalize form values to UTF-32? How can this normalization be disabled?

@ghost

ghost commented Jan 6, 2014

@Politechniczny do you happen to have the code that reproduces the issue? No matter how many times I submit the form, I still see correct UTF-8.

@ghost

ghost commented Jan 6, 2014

@sjinks XAMPP 1.8.3, PHP 5.5.3 VC2012 TS, Phalcon 1.2.4, libmbfl 1.3.2, Windows 7 SP1

This is the form.volt template used for all forms:

{{content()}}

<form method="post">
    <h1>{{title|e}}</h1>
    <table>
        {% for field in form %}
        <tr>
            <td>{{field.getLabel()}}</td>
            <td>{{field.render()}}</td>
            {% if form.hasMessagesFor(field.getName()) %}
            <td>
                {% for msg in form.getMessagesFor(field.getName()) %}
                <div class="msg">{{msg}}</div>
                {% endfor %}
            </td>
            {% endif %}
        </tr>
        {% endfor %}
        <tr>
            <td colspan="2" style="text-align:center">
                <input type="submit" value="Zapisz">
            </td>
        </tr>
    </table>
</form>

UserForm.php

<?php

use Phalcon\Forms\Element\Check;
use Phalcon\Forms\Element\Password;
use Phalcon\Forms\Element\Text;
use Phalcon\Forms\Form;
use Phalcon\Validation\Validator\PresenceOf;
use Phalcon\Validation\Validator\StringLength;  

class UserForm extends Form
{
    public function initialize(User $user, $options=null)
    {
        //Add login field
        $login = new Text('login', array(
            'maxlength' => 25,
            'pattern'   => '.{1,25}',
            'required'  => 'required',
            'autofocus' => 'autofocus'
        ));
        $login->addValidator(new StringLength(array(
            'min' => 200, //Fake value to make the form be returned back
            'max' => 25,
            'messageMinimum' => 'Login musi mieć min 2 znaki!',
            'messageMaximum' => 'Login jest za długi - max 25 znaków!'
        )));
        $login->setLabel('Login');
        $this->add($login);

        //Add full name field
        $name = new Text('name');
        $name->addValidator(new PresenceOf(array(
            'message' => 'Podaj imię i nazwisko!'
        )));
        $name->setLabel('Imię i nazwisko');
        $this->add($name);

        //Add password field
        $pass = new Password('pass');
        $pass->addValidators(array(
            new StringLength(array(
                'min' => 5,
                'messageMinimum' => 'Hasło za krótkie!'
            ))
        ));
        $pass->setLabel('Hasło');
        $this->add($pass);

        //Prepare active field
        $active = new Check('active');
        $active->setLabel('Konto aktywne');
        $this->add($active);

        //Prepare admin field
        $admin = new Check('admin');
        $admin->setLabel('Jest adminem');
        $this->add($admin);
    }
}

UsersController.php

<?php
class UsersController extends Phalcon\Mvc\Controller
{
    public function addAction()
    {
        $this->form(new User);
        $this->view->title = 'Dodaj użytkownika';
    }
    public function editAction($id)
    {
        if(!$user = User::findFirstByID($id))
        {
            return $this->dispatcher->forward(['action'=>'index']);
        }
        $this->form($user);
        $this->view->title = 'Edytuj użytkownika '.$user->login;
    }
    private function form(User $user)
    {
        $form = new UserForm($user);
        if($this->request->isPost())
        {
            $form->bind($_POST, $user);

            if($form->isValid($_POST))
            {
                if($user->save())
                {
                    $this->view->disable();
                    $this->response->redirect('users');
                }
                else
                {
                    foreach($user->getMessages() as $msg)
                    {
                        $this->flash->error($msg);
                    }
                }
            }
        }
        $this->view->form = $form;
        $this->view->pick('form');
    }
}

@ghost

ghost commented Jan 13, 2014

Works fine for me, no matter how many times I submit the form: https://www.diigo.com/item/image/3zdqq/m2ju

My guess is that you forgot the <meta charset="utf-8"> tag in the <head>.

Is anybody else able to reproduce this behavior?

@pgasiorowski
Contributor

I'm able to reproduce this in IE 11: when I don't add <meta charset="UTF-8"> to the code, the browser sometimes detects the page as "Western European (Windows)", even though the file is saved as UTF-8.

<?php

$di = new Phalcon\Di\FactoryDefault();

$tagService = $di->get('tag');
$tagService->setDi($di);
?><!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
</head>
<body>
<?php
echo '<form method="post">';
echo Phalcon\Tag::textField(['ľščťžýáíéôňď','value'=>'ą ć ę ł ń ś ź ż Ó ó']);
echo Phalcon\Tag::submitButton('ľščťžýáíéôňď');
echo '</form>';


foreach($_POST as $name => $value)
{
     echo $name,' => ', $value, '<br>';
}
?>
</body>
</html>

@rlaffers
Contributor Author

@sjinks No, I have included the charset meta tag in the output - that's not the problem here. The characters are mangled only when the string is passed through the form helpers. UTF-8 in plain HTML tags works fine. I'm going to track the problem down.

@rlaffers
Contributor Author

OK, the problem manifests when using the \Phalcon\Translate\Adapter\Gettext library from incubator. More specifically, the offending line from it is:

setlocale(LC_ALL, $options['locale']);

When I comment this line out, the form helpers work fine even with UTF-8 characters. I checked the locale being passed, "sk_SK", on my system with locale -a, and it is installed. The problem persists with any other locale as well (e.g. "en_US").

To reproduce:

// put this into your service.php
require __DIR__ . '/../library/incubator/Library/Phalcon/Translate/Adapter/Gettext.php';
$di->set('i18n', function() use ($di) {
    $i18n = new \Phalcon\Translate\Adapter\Gettext(array(
        'locale' => 'en_US',
        'file' => 'default',
        'directory' => __DIR__ . '/../locale'
    ));
    return $i18n;
});        
$di->get('i18n');

I don't think this is a bug in the Gettext adapter. It is the simple call to setlocale() which upsets form helpers.

@ghost

ghost commented Jan 13, 2014

Looks like setlocale() breaks mb_detect_encoding() :-(

What if you set locale to en_US.UTF-8 instead of just en_US?
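
A rough sketch of that suggestion, assuming a POSIX-style system where UTF-8 locales are named like sk_SK.UTF-8 or sk_SK.utf8 (Windows uses different names). setlocale() accepts several candidate names and applies the first one that is installed; the wrapper name set_utf8_locale is hypothetical:

```php
<?php
/**
 * Hypothetical wrapper: tries only the UTF-8 variants of a locale and
 * deliberately never falls back to the bare name (e.g. plain "sk_SK").
 *
 * Returns the locale string that was applied, or false if none exists.
 */
function set_utf8_locale($base)
{
    // setlocale() walks the candidates in order and stops at the first
    // one the system knows about.
    return setlocale(LC_ALL, $base . '.UTF-8', $base . '.utf8');
}

$applied = set_utf8_locale('sk_SK');
if ($applied === false) {
    echo "No UTF-8 variant of sk_SK is installed\n";
} else {
    echo "Locale set to {$applied}\n";
}
```

Whether the incubator Gettext adapter passes the 'locale' option through unchanged is something to verify against its source before relying on this.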

@ghost

ghost commented Jan 13, 2014

Actually the issue is not mb_detect_encoding(), the issue is isalnum():

<?php

setlocale(LC_ALL, 'sk_SK.UTF-8');
var_dump(ctype_alnum("\xD3"));
var_dump(ctype_alnum("\xF3"));

setlocale(LC_ALL, 'sk_SK');
var_dump(ctype_alnum("\xD3"));
var_dump(ctype_alnum("\xF3"));

yields

bool(false)
bool(false)
bool(true)
bool(true)

In the latter case both \xD3 and \xF3 are output as is:

        if (value < 256 && isalnum(value)) {
            smart_str_appendc(&escaped_str, (unsigned char) value);
            continue;
        }

and this forms invalid UTF-8.
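
To illustrate the idea in userland PHP: the pass-through test has to be limited to ASCII letters and digits, whatever the locale says. This is only a sketch of the technique, not the actual Phalcon code or fix; the function name escape_attr_ascii_only is hypothetical, and mb_ord() assumes PHP 7.2 or later with mbstring enabled:

```php
<?php
/**
 * Hypothetical sketch: escapes a UTF-8 string for use in an HTML
 * attribute, passing through ASCII letters and digits only.
 */
function escape_attr_ascii_only($input)
{
    // Split into UTF-8 characters; preg_split() returns false when the
    // input is not valid UTF-8.
    $chars = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $out = '';
    foreach ($chars as $ch) {
        // Explicit ASCII ranges instead of ctype_alnum(), whose answer
        // for bytes such as \xD3 depends on LC_CTYPE.
        if (preg_match('/^[A-Za-z0-9]$/D', $ch)) {
            $out .= $ch;
        } else {
            // Everything else becomes a hexadecimal character reference.
            $out .= sprintf('&#x%X;', mb_ord($ch, 'UTF-8'));
        }
    }
    return $out;
}

echo escape_attr_ascii_only("a\u{D3}1"), "\n"; // prints a&#xD3;1
```

With this test, U+00D3 can never be emitted as the single byte \xD3, so the output stays valid no matter what setlocale() was called with.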

@bliuchak

@sjinks, @phalcon
The issue still exists.

Method Phalcon\Escaper::escapeHtmlAttr() still does not work as desired when using phalcon v 1.2.6.

Here is info from phpinfo():

phalcon

Phalcon Framework => enabled
Phalcon Version => 1.2.6

Please try the following code to reproduce the issue:

<pre>
<?php
$di = new Phalcon\DI\FactoryDefault();
$escaper = new Phalcon\Escaper();
echo 'Get Encoding: '.$escaper->getEncoding();echo "\n";
//strange behavior!!! Why the encoding is "ISO-8859-1"
echo 'Detect Encoding: '.$escaper->detectEncoding('ąćęłńśźżÓó');echo "\n";
//strange behavior!!! Now encoding is correct "utf-8"
echo 'Detect Encoding (String with spaces): '.$escaper->detectEncoding(' ą ć ę ł ń ś ź ż Ó ó');echo "\n";
// this method works as desired
echo 'Escape Html: '.$escaper->escapeHtml('ąćęłńśźżÓó');echo "\n";
// this method is still buggy
echo 'Escape Html Attribute: '.$escaper->escapeHtmlAttr('ąćęłńśźżÓó');echo "\n";
?>
</pre>

I get the following output:

Get Encoding: utf-8
Detect Encoding: ISO-8859-1
Detect Encoding (String with spaces): UTF-8
Escape Html: ąćęłńśźżÓó
Escape Html Attribute: ąćęłńśźżÓó

Expected behavior: Phalcon\Escaper::escapeHtmlAttr() should work in the same way as Phalcon\Escaper::escapeHtml().
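
As a cross-check that does not depend on detection heuristics, the same strings can be validated against UTF-8 directly. A small sketch, assuming the mbstring extension and a source file saved as UTF-8; the helper name is_valid_utf8 is hypothetical:

```php
<?php
/**
 * Hypothetical helper: validates a string against one known encoding
 * instead of asking mbstring to guess among several.
 */
function is_valid_utf8($value)
{
    return mb_check_encoding($value, 'UTF-8');
}

var_dump(is_valid_utf8('ąćęłńśźżÓó'));           // bool(true)
var_dump(is_valid_utf8(' ą ć ę ł ń ś ź ż Ó ó')); // bool(true)
var_dump(is_valid_utf8("\xD3\xF3"));             // bool(false), raw single-byte characters
```

If the first two calls report true on the affected machine, the input itself is fine and only the detection step is guessing wrong.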

@phalcon
Collaborator

phalcon commented Mar 17, 2014

Same code gives me:

Get Encoding: utf-8
Detect Encoding: UTF-8
Detect Encoding (String with spaces): UTF-8
Escape Html: ąćęłńśźżÓó
Escape Html Attribute: ąćęłńśźżÓó

Phalcon 1.2.6
Multibyte Support => enabled
Multibyte string engine => libmbfl
libmbfl version => 1.3.2
Multibyte regex (oniguruma) version => 5.9.2

@bliuchak
Copy link

@phalcon
Updated to Phalcon 1.3.1
Unable to reproduce this issue anymore.

Thanks for the help!

@scrnjakovic
Contributor

@phalcon
Using that code I get:

Get Encoding: utf-8
Detect Encoding: ISO-8859-1
Detect Encoding (String with spaces): UTF-8
Escape Html: ąćęłńśźżÓó
Escape Html Attribute: ąćęłńśźżÓó

Phalcon Version:1.3.1
Multibyte Support => enabled
Multibyte string engine => libmbfl
libmbfl version => 1.3.2
Multibyte regex (oniguruma) version => 5.9.2

Server: nginx 1.4.6
PHP: 5.5.9-1ubuntu4 (fpm-fcgi)
OS: Ubuntu 14.04 LTS

Although adding 'charset' => 'utf8' to the DB adapter initialization array made the characters appear correctly, I'm still worried about getting ISO-8859-1 there.

Any thoughts?

@tonytcb

tonytcb commented Sep 22, 2014

@xboston Thanks man! It saved me.
