Skip to content

SQL: Add text formatting support for multivalue #68606

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Feb 9, 2021

Conversation

bpintea
Copy link
Contributor

@bpintea bpintea commented Feb 5, 2021

This adds the text formatting for multivalue fields (available through the ARRAY function),
that is:

  • CSV and TSV exports;
  • TXT formatting.

This adds the text formatting for multivalue doc fields, that is:
- CSV and TSV exports
- TXT formatting
@bpintea bpintea marked this pull request as ready for review February 8, 2021 08:13
Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, left 2 comments, a minor one and a question regarding the data type.

@@ -1055,6 +1055,16 @@ private void executeQueryWithNextPage(String format, String expectedHeader, Stri
assertEquals(0, getNumberOfSearchContexts(client(), "test"));
}

public void testMultiValueQueryText() throws IOException {
index("{\"text\":[" + toJson("one") + "," + toJson("two, three") + "," + toJson("\"four\"") + "], \"number\" : [1, [2, 3], 4] }");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you use toJson only for the values but not also for the field names?

Copy link
Contributor Author

@bpintea bpintea Feb 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C&p inertia. Thx, updated it.
Because of the spoteless formatting, which looks pretty bad. But I've applied it now anyways...

}

String typeName = fromJava(o).typeName();
headers.add(new ColumnInfo("index", typeName + "_column", typeName));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember our take here, shouldn't the type name reveal that it's an array? something like byte_array or _array (postgres format) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the 3rd argument should be typeName + "_array"? I've updated it anyways, to be clearer, but it's not used in these tests (the formatters don't look at the type, only content).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, yep, I get, the columnInfo is not actually asserted, but it's clearer with the _array suffix. Thx!

- extend `toJson` use in one test;
- add + `"_array"` in column def in one test.
- which is pretty ugly in this case.
Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thx!


public void testCsvMultiValueIntegers() {
String text = CSV.format(req(), withData(singletonList(asList(
asList((byte)1, (byte)2), asList((short)3, (short)4), asList(5, 6), asList(7L, 8L)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitty picky: whitespace between the casting and the casted value.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bpintea bpintea added the :Analytics/SQL SQL querying label Feb 9, 2021
@elasticmachine elasticmachine added the Team:QL (Deprecated) Meta label for query languages team label Feb 9, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

@bpintea bpintea added the v8.0.0 label Feb 9, 2021
Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 173 to 193
sb.append('[');
Collection<?> values = (Collection<?>) value;
values.forEach(x -> {
// quote strings and `\`-escape the `"` character inside them
if (x instanceof String) { // TODO: IPs should ideally not be quoted.
sb.append('"');
sb.append(((String) x).replace("\"", "\\\""));
sb.append('"');
} else if (x == null) {
sb.append("NULL");
} else {
sb.append(toString(x, sqlVersion));
}
sb.append(',');
});
if (sb.length() > 1) { // strip trailing comma
sb.deleteCharAt(sb.length() - 1);
}
sb.append(']');
return sb.toString();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use StringJoiner since it's part of the JDK and handles the exact case of prefix/suffix plus delimiter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks, applied.

- white-space style;
- s/StringBuilder/StringJoiner/
@@ -169,27 +170,19 @@ public static String toString(Object value, SqlVersion sqlVersion) {

// multivalue
if (value instanceof Collection) {
final StringBuilder sb = new StringBuilder();
sb.append('[');
final StringJoiner sj = new StringJoiner(",", "[", "]");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 nice!

Copy link
Contributor

@palesz palesz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

if (x instanceof String) { // TODO: IPs should ideally not be quoted.
sj.add('"' + ((String) x).replace("\"", "\\\"") + '"');
} else if (x == null) {
sj.add("NULL");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicky: we return uppercase NULL, but here we return lowercase null for simple null objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, GH only updated the page with your comment after I hit merge.
The link in the comment doesn't seem to resolve correctly, but I assume you refer to the first if in the function?
I followed Posgresql's capitalised formatting for lists. Generally (and debatable) nulls are capitalised in textual formats, so I'd be in favour for updating the listing of simple null objects.

As extra context:
The NULL would appear "alone" in CSVs and TSVs -- a simple null object wouldn't be output at all in these formats (they'd be represented by two consecutive commas or tabs).
In CLI, the NULL would also appear to the letter in arrays output and the CLI chooses how to represent JSON nulls by itself (which is currently small lettered).
In txt formatting output the difference would be visible and "uncontrollable".

@bpintea bpintea merged commit 0d40d65 into elastic:feat/sql-multivalue Feb 9, 2021
@bpintea bpintea deleted the feat/multivalue_text branch February 9, 2021 20:55
bpintea added a commit that referenced this pull request Feb 11, 2021
* Add text formatting support for multivalue

This adds the text formatting for multivalue doc fields, that is:
- CSV and TSV exports
- TXT formatting
bpintea added a commit that referenced this pull request Mar 16, 2021
* Add text formatting support for multivalue

This adds the text formatting for multivalue doc fields, that is:
- CSV and TSV exports
- TXT formatting
bpintea added a commit that referenced this pull request Mar 25, 2021
* Add text formatting support for multivalue

This adds the text formatting for multivalue doc fields, that is:
- CSV and TSV exports
- TXT formatting
bpintea added a commit that referenced this pull request May 20, 2021
* Add text formatting support for multivalue

This adds the text formatting for multivalue doc fields, that is:
- CSV and TSV exports
- TXT formatting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/SQL SQL querying >enhancement Team:QL (Deprecated) Meta label for query languages team v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants