Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested array in tuple array is incorrectly deserialized #1324

Closed
GFriedrich opened this issue Apr 23, 2023 · 6 comments · Fixed by #1326 or #1344
Closed

Nested array in tuple array is incorrectly deserialized #1324

GFriedrich opened this issue Apr 23, 2023 · 6 comments · Fixed by #1326 or #1344
Labels

Comments

@GFriedrich
Copy link

GFriedrich commented Apr 23, 2023

Describe the bug

After upgrading from ClickHouse HTTP driver 0.3.2 to 0.4.4 I found that data from nested numeric arrays in tuples is not correctly deserialized. E.g. instead of getting [(0, [1, 2]), (1, [2, 3])] I got [(0, [2, 3]), (1, [2, 3])]
This is because the inner array gets recycled and overwrites the previous values instead of using a fresh array or copying the array after reading.

Steps to reproduce

Create a statement that returns in a column an array of tuples which in turn also contains a numeric array.

Expected behaviour

The values should be correctly deserialized.

Configuration

Environment

  • Client version: 0.4.4
@GFriedrich GFriedrich added the bug label Apr 23, 2023
@zhicwu
Copy link
Contributor

zhicwu commented Apr 24, 2023

Hi @GFriedrich, thanks for the reporting. It's indeed an issue being introduced in 0.4.0, but I thought it's been fixed in 0.4.1.

Could you provide table schema and/or sample code for me to reproduce the issue? Judging from the value you provided, the column is something like Array(Tuple(UInt16, Array(UInt32))), but I was not able to reproduce the issue even with additional test cases based on your description.

@Test(groups = "integration")
public void testNestedArrayInTuple() throws SQLException {
    Properties props = new Properties();
    try (ClickHouseConnection conn = newConnection(props);
            ClickHouseStatement stmt = conn.createStatement()) {
        // nested values on same row
        Assert.assertFalse(stmt.execute("drop table if exists test_nested_array_in_tuple; "
                + "create table test_nested_array_in_tuple(id UInt64, v1 Tuple(Array(Int32)), v2 Tuple(Array(Int32)))engine=Memory; "
                + "insert into test_nested_array_in_tuple values(1, ([1, 2]), ([2, 3]))"));
        try (ResultSet rs = stmt.executeQuery("select * from test_nested_array_in_tuple order by id")) {
            Assert.assertTrue(rs.next());
            Assert.assertEquals(rs.getInt(1), 1);
            Assert.assertEquals(((List<?>) rs.getObject(2)).size(), 1);
            Assert.assertEquals(((List<?>) rs.getObject(2)).get(0), new int[] { 1, 2 });
            Assert.assertEquals(((List<?>) rs.getObject(3)).size(), 1);
            Assert.assertEquals(((List<?>) rs.getObject(3)).get(0), new int[] { 2, 3 });
            Assert.assertFalse(rs.next());
        }

        // nested values on same column
        Assert.assertFalse(stmt.execute("drop table if exists test_nested_array_in_tuple; "
                + "create table test_nested_array_in_tuple(id UInt64, val Tuple(Array(Int32)))engine=Memory; "
                + "insert into test_nested_array_in_tuple values(1, ([1, 2])), (2, ([2, 3]))"));
        try (ResultSet rs = stmt.executeQuery("select * from test_nested_array_in_tuple order by id")) {
            Assert.assertTrue(rs.next());
            Assert.assertEquals(rs.getInt(1), 1);
            Assert.assertEquals(((List<?>) rs.getObject(2)).size(), 1);
            Assert.assertEquals(((List<?>) rs.getObject(2)).get(0), new int[] { 1, 2 });
            Assert.assertTrue(rs.next());
            Assert.assertEquals(rs.getInt(1), 2);
            Assert.assertEquals(((List<?>) rs.getObject(2)).size(), 1);
            Assert.assertEquals(((List<?>) rs.getObject(2)).get(0), new int[] { 2, 3 });
            Assert.assertFalse(rs.next());
        }

        // deeper nested level and more elements
        Assert.assertFalse(stmt.execute("drop table if exists test_nested_array_in_tuple; "
                + "create table test_nested_array_in_tuple(id UInt64, val Array(Tuple(UInt16,Array(UInt32))))engine=Memory; "
                + "insert into test_nested_array_in_tuple values(1, [(0, [1, 2]), (1, [2, 3])]), (2, [(2, [4, 5]), (3, [6, 7])])"));
        try (ResultSet rs = stmt.executeQuery("select * from test_nested_array_in_tuple order by id")) {
            Assert.assertTrue(rs.next());
            Assert.assertEquals(rs.getInt(1), 1);
            Assert.assertEquals(((Object[]) rs.getObject(2)).length, 2);
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).size(), 2);
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).get(0), UnsignedShort.ZERO);
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).get(1), new int[] { 1, 2 });
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).size(), 2);
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).get(0), UnsignedShort.ONE);
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).get(1), new int[] { 2, 3 });
            Assert.assertTrue(rs.next());
            Assert.assertEquals(((Object[]) rs.getObject(2)).length, 2);
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).size(), 2);
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).get(0),
                    UnsignedShort.valueOf((short) 2));
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).get(1), new int[] { 4, 5 });
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).size(), 2);
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).get(0),
                    UnsignedShort.valueOf((short) 3));
            Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).get(1), new int[] { 6, 7 });
            Assert.assertFalse(rs.next());
        }
    }
}

@zhicwu zhicwu linked a pull request Apr 24, 2023 that will close this issue
3 tasks
@GFriedrich
Copy link
Author

Hey @zhicwu,
thanks already for having a look into this.
I'll try to create a sample tomorrow - but can't promise anything.
In worst case I'll get back at this next weekend.

@zhicwu zhicwu reopened this Apr 24, 2023
@zhicwu
Copy link
Contributor

zhicwu commented Apr 24, 2023

Cool, thanks @GFriedrich. I'll create a patch release to fix that, once we can reproduce the issue.

@GFriedrich
Copy link
Author

Hey @zhicwu,
I was finally able to reproduce the issue with the tests you've added.
So the problem with your test not being able to reproduce the issue is, that the very last testcase with // deeper nested level and more elements is not going through the same code path as it did for my scenario.
Your code path is calling the following deserializers:
ArrayDeserializer -> TupleDeserializer -> BinaryDataProcessor::readIntegerArray
But the last one here is an optimization which doesn't take place for me - likely because my result is a lot more complex.
It is important that the latter one is also an ArrayDeserializer.
You may enforce this by using a Decimal instead of a nested UInt32 like so:

Assert.assertFalse(stmt.execute("drop table if exists test_nested_array_in_tuple; "
        + "create table test_nested_array_in_tuple(id UInt64, val Array(Tuple(UInt16,Array(Decimal(10,0)))))engine=Memory; "
        + "insert into test_nested_array_in_tuple values(1, [(0, [1, 2]), (1, [2, 3])]), (2, [(2, [4, 5]), (3, [6, 7])])"));
try (ResultSet rs = stmt.executeQuery("select * from test_nested_array_in_tuple order by id")) {
    Assert.assertTrue(rs.next());
    Assert.assertEquals(rs.getInt(1), 1);
    Assert.assertEquals(((Object[]) rs.getObject(2)).length, 2);
    Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).size(), 2);
    Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).get(0), UnsignedShort.ZERO);
    Assert.assertEquals(((BigDecimal[]) ((List<?>) ((Object[]) rs.getObject(2))[0]).get(1))[0], BigDecimal.valueOf(1));
    Assert.assertEquals(((BigDecimal[]) ((List<?>) ((Object[]) rs.getObject(2))[0]).get(1))[1], BigDecimal.valueOf(2));
    Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).size(), 2);
    Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).get(0), UnsignedShort.ONE);
    Assert.assertEquals(((BigDecimal[]) ((List<?>) ((Object[]) rs.getObject(2))[1]).get(1))[0], BigDecimal.valueOf(2));
    Assert.assertEquals(((BigDecimal[]) ((List<?>) ((Object[]) rs.getObject(2))[1]).get(1))[1], BigDecimal.valueOf(3));
    Assert.assertTrue(rs.next());
    Assert.assertEquals(((Object[]) rs.getObject(2)).length, 2);
    Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).size(), 2);
    Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[0]).get(0),
            UnsignedShort.valueOf((short) 2));
    Assert.assertEquals(((BigDecimal[]) ((List<?>) ((Object[]) rs.getObject(2))[0]).get(1))[0], BigDecimal.valueOf(4));
    Assert.assertEquals(((BigDecimal[]) ((List<?>) ((Object[]) rs.getObject(2))[0]).get(1))[1], BigDecimal.valueOf(5));
    Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).size(), 2);
    Assert.assertEquals(((List<?>) ((Object[]) rs.getObject(2))[1]).get(0),
            UnsignedShort.valueOf((short) 3));
    Assert.assertEquals(((BigDecimal[]) ((List<?>) ((Object[]) rs.getObject(2))[1]).get(1))[0], BigDecimal.valueOf(6));
    Assert.assertEquals(((BigDecimal[]) ((List<?>) ((Object[]) rs.getObject(2))[1]).get(1))[1], BigDecimal.valueOf(7));
    Assert.assertFalse(rs.next());
}

If you run this code, you'll see that it fails as the first inner array will be overwritten by the second inner array.
I hope that helps.
Thanks in advance for a fix. 👍

@GFriedrich GFriedrich changed the title Nested array in tuples is incorrectly deserialized Nested array in tuple array is incorrectly deserialized Apr 29, 2023
@zhicwu
Copy link
Contributor

zhicwu commented Apr 30, 2023

Thanks @GFriedrich for helping me to reproduce the issue along with all the details and suggestions. It's caused by the tricky way trying to reduce object creation when the inner array length stays the same. I've fixed the issue in local, but I need to think it over to understand if there's better way without impacting performance and memory usage.

Will definitely include the fix in v0.4.6.

@GFriedrich
Copy link
Author

Can confirm that with version 0.4.6 things are now working as expected for me.
Thanks again @zhicwu for your quick help. Much appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants