Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] CSV output doesn't handle entries with quotes but not commas. #3050

Open
Michael-S opened this issue Oct 1, 2024 · 1 comment · May be fixed by #3063
Open

[BUG] CSV output doesn't handle entries with quotes but not commas. #3050

Michael-S opened this issue Oct 1, 2024 · 1 comment · May be fixed by #3063
Labels
bug Something isn't working

Comments

@Michael-S
Copy link

What is the bug?
CSV output of entries like "a" b or a " b should, according to RFC 4180, escape them to be """a"" b" and "a "" b", respectively. See section 2.5, https://www.rfc-editor.org/rfc/rfc4180

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Add values to a test index.
#!/usr/bin/env bash
# Assuming OPENSEARCH_SERVER is the server,
# OPENSEARCH_USER is the host, OPENSEARCH_PASSWORD is the password.
# and the user has the rights to create and query this index.
# Add an entry to the index that has two quotes in it, not at the boundaries.
curl -XPUT "https://${OPENSEARCH_SERVER}:9200/csv_error/_doc/1" \
    -H 'Content-Type: application/json' \
    -d'{  "field1": "\"a\" b" }' --user "${OPENSEARCH_USER}":"${OPENSEARCH_PASSWORD}"
# Add an entry that has a quote in it.
curl -XPUT "https://${OPENSEARCH_SERVER}:9200/csv_error/_doc/2" \
    -H 'Content-Type: application/json' \
    -d'{ "field1": "\"a\" b" }' --user "${OPENSEARCH_USER}":"${OPENSEARCH_PASSWORD}"
# Optional, add an entry that has no quotes or commas.
curl -XPUT "https://${OPENSEARCH_SERVER}:9200/csv_error/_doc/3" \
    -H 'Content-Type: application/json' \
    -d'{ "field1": "a b" }' --user "${OPENSEARCH_USER}":"${OPENSEARCH_PASSWORD}"
# Optional, add an entry that has commas.
curl -XPUT "https://${OPENSEARCH_SERVER}:9200/csv_error/_doc/4" \
    -H 'Content-Type: application/json' \
    -d'{ "field1": "a, b" }' --user "${OPENSEARCH_USER}":"${OPENSEARCH_PASSWORD}"
  1. Query the index to CSV output.
#!/usr/bin/env bash
curl -XPOST "https://${OPENSEARCH_SERVER}:9200/_plugins/_sql?format=csv" \
    -H 'Content-Type: application/json' \
    -d'{ "query": "SELECT * FROM csv_error LIMIT 50" }' \
     --user "${OPENSEARCH_USER}":"${OPENSEARCH_PASSWORD}"
  1. Response data is:
"a" b  # expected to be """a"" b"
a " b  # expected to be "a "" b"
a b     # correct
"a, b" # correct

What is the expected behavior?
The CSV output escapes double quotes properly.

What is your host/environment?

  • OS: Ubuntu 20.04
  • Version - 2.17.0.0
  • Plugins - default

Do you have any additional context?
As far as I can tell, the problem appears to be in function quoteIfRequired in FlatResponseBase.java,

protected String quoteIfRequired(String separator, String cell) {
but it's possible changing it there might affect other output formats.

@Michael-S Michael-S added bug Something isn't working untriaged labels Oct 1, 2024
Michael-S added a commit to Michael-S/opensearch-sql-plugin that referenced this issue Oct 9, 2024
Michael-S added a commit to Michael-S/opensearch-sql-plugin that referenced this issue Oct 9, 2024
Fixes opensearch-project#3050

Signed-off-by: Michael Swierczek <mike.swierczek@tizra.com>
@Michael-S Michael-S linked a pull request Oct 9, 2024 that will close this issue
5 tasks
Michael-S added a commit to Michael-S/opensearch-sql-plugin that referenced this issue Oct 9, 2024
Fixes opensearch-project#3050

Signed-off-by: Mike Swierczek <441523+Michael-S@users.noreply.github.com>
@dblock dblock removed the untriaged label Oct 21, 2024
@dblock
Copy link
Member

dblock commented Oct 21, 2024

[Catch All Triage - 1, 2]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants