Query fails with mismatched input 'X' expecting {<EOF>, '&', '|'} #400

Closed
lnicolas opened this issue Apr 22, 2015 · 4 comments

Comments

@lnicolas

Hello,

We are using ANNIS for a German learner corpus in which some attributes contain German characters.
Unfortunately, when the attribute being searched contains a "special" character (anything outside [a-z], I believe), the query cannot be performed.

Thanks for the great work.
Regards,

Lionel

@thomaskrause
Member

Hi, sorry for the delay; I was on holiday and could not check my email.

In general, umlauts and other special characters should work fine (e.g. https://korpling.german.hu-berlin.de/annis3/#_q=bGVtbWE9ImRhZsO8ciI&_c=cGNjMg&cl=5&cr=5&s=0&l=10 or even https://korpling.german.hu-berlin.de/annis3/scriptorium#_q=bm9ybT0i4rKb4rKf4rKp4rKn4rKJIg&_c=YWJyYWhhbS5vdXIuZmF0aGVy&cl=5&cr=5&s=0&l=10&_seg=d29yZA ), but some Tomcat configurations can cause problems:
http://korpling.github.io/ANNIS/doc/admin-configure-webapp.html#admin-configure-tomcat-utf8

Since we use a web service, and this web service is sometimes behind a proxy web server, it is also possible for the URLs to get garbled along the way. If you are running the backend web service or the frontend web application behind a proxy, please send me more details about your configuration.

Best,

Thomas

@lnicolas
Author

Hello Thomas,

Thanks for the examples, and no problem at all about the delay; I'm glad you are taking the time to answer!
A proxy server should not be the problem, as I can reproduce the issue on publicly available ANNIS instances.

I noticed that the bug happens when the special character is inside the attribute name, but not at its start or end.

When the special character is at the start or at the end, the bug is different: the "special" characters are silently ignored.
=> https://korpling.german.hu-berlin.de/annis3/#_q=w7xsZW1tYcO8PSJkYWbDvHIi&_c=cGNjMg&cl=5&cr=5&s=0&l=10

When it is in the middle, the query fails with "mismatched input 'X' expecting {<EOF>, '&', '|'}".
=> https://korpling.german.hu-berlin.de/annis3/#_q=bGVtw7xtYT0iZGFmw7xyIg&_c=cGNjMg&cl=5&cr=5&s=0&l=10

Regards

Lionel
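
For readers who want to see which queries the two links above actually run: the `_q` parameter in the URL fragment appears to be the query text encoded as unpadded Base64 over UTF-8. That is an observation from these links, not something documented in this thread, and the helper name `decode_annis_query` is made up for this sketch.

```python
import base64
from urllib.parse import urlsplit


def decode_annis_query(url: str) -> str:
    """Return the query text carried in the `_q` fragment parameter.

    Assumption: `_q` holds unpadded Base64 of the UTF-8 query string.
    """
    fragment = urlsplit(url).fragment
    # Split the fragment by hand so that '+' is never turned into a space.
    params = dict(part.partition("=")[::2] for part in fragment.split("&") if part)
    encoded = params["_q"]
    # Restore the '=' padding that the URL omits.
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


edges = "https://korpling.german.hu-berlin.de/annis3/#_q=w7xsZW1tYcO8PSJkYWbDvHIi&_c=cGNjMg&cl=5&cr=5&s=0&l=10"
middle = "https://korpling.german.hu-berlin.de/annis3/#_q=bGVtw7xtYT0iZGFmw7xyIg&_c=cGNjMg&cl=5&cr=5&s=0&l=10"

print(decode_annis_query(edges))   # ülemmaü="dafür"
print(decode_annis_query(middle))  # lemüma="dafür"
```

So the first link puts the umlaut at both ends of the attribute name, and the second puts it in the middle; the value `"dafür"` is the same in both.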

@thomaskrause
Member

We discussed this internally, and we currently don't plan to introduce support for non-ASCII characters in attribute names (of course, we still support them in the values).

Allowing all characters in the names might introduce some tricky parsing problems, e.g. if a user types the quotation mark ” (U+201D) instead of the proper " (U+0022) in the query. There would be many more corner cases than there are now, and simply renaming the annotations seems easier than dealing with all of them. I will also make sure that the new version of the ANNIS import format converter in Pepper handles this gracefully.

However, I updated the parser, and the error messages should now be consistent. For example, an umlaut before or after the annotation name is now reported as a lexer error instead of being silently ignored. The error message also now explicitly states that the token could not be recognized.

https://korpling.german.hu-berlin.de/annis3-snapshot/#_q=w7xsZW1tYcO8PSJkYWbDvHIi&_c=cGNjMg&cl=5&cr=5&s=0&l=10
https://korpling.german.hu-berlin.de/annis3-snapshot/#_q=bGVtw7xtYT0iZGFmw7xyIg&_c=cGNjMg&cl=5&cr=5&s=0&l=10
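
For anyone taking the "rename the annotations" route before import, here is a rough sketch of what such a renaming step could look like. It is not part of ANNIS or Pepper. The function names are hypothetical, and the pattern used for an "ASCII-safe" name is an assumption; the actual AQL grammar may accept a somewhat different set of characters.

```python
import re
import unicodedata

# Assumed shape of an ASCII-safe annotation name (not taken from the AQL grammar).
ASCII_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# German-specific replacements, applied before generic accent stripping.
GERMAN = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def is_ascii_name(name: str) -> bool:
    """True if the name already matches the assumed ASCII-only pattern."""
    return ASCII_NAME.fullmatch(name) is not None


def ascii_annotation_name(name: str) -> str:
    """Map an annotation name to an ASCII-only replacement."""
    candidate = name.translate(GERMAN)
    # Decompose remaining accented letters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", candidate)
    candidate = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside the assumed character set becomes an underscore.
    candidate = re.sub(r"[^A-Za-z0-9_]", "_", candidate)
    # Names must not be empty or start with a digit under the assumed pattern.
    if not candidate or candidate[0].isdigit():
        candidate = "_" + candidate
    return candidate


print(ascii_annotation_name("lemüma"))   # lemuema
print(ascii_annotation_name("ülemmaü"))  # uelemmaue
```

Two different original names can collapse to the same ASCII name (for example `lemüma` and `lemuema`), so a real conversion would need to check for collisions within the corpus.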

@lnicolas
Author

lnicolas commented Jun 2, 2015

Ciao Thomas,

After reading your explanation, I completely agree with your conclusion.
Thanks for taking the time to consider the question.
Regards,

Lionel
