
Confused about columnar cache regression test #261

Open
@japinli

Description

What's wrong?

Hi,

While reading the regression test columnar_cache.sql, I noticed the following test case:

CREATE TABLE big_table (
  id INT,
  firstname TEXT,
  lastname TEXT
) USING columnar;

INSERT INTO big_table (id, firstname, lastname)
  SELECT i,
         CONCAT('firstname-', i),
         CONCAT('lastname-', i)
    FROM generate_series(1, 1000000) as i;

-- get some baselines from multiple chunks
SELECT firstname,
       lastname,
       SUM(id)
  FROM big_table
 WHERE id < 1000
 GROUP BY firstname,
       lastname
UNION
SELECT firstname,
       lastname,
       SUM(id)
  FROM big_table
 WHERE id BETWEEN 15000 AND 16000
 GROUP BY firstname,
       lastname
 ORDER BY firstname;


-- enable caching
SET columnar.enable_column_cache = 't';

-- the results should be the same as above
SELECT firstname,
       lastname,
       SUM(id)
  FROM big_table
 WHERE id < 1000
 GROUP BY firstname,
       lastname
UNION
SELECT firstname,
       lastname,
       SUM(id)
  FROM big_table
 WHERE id BETWEEN 15000 AND 16000
 GROUP BY firstname,
       lastname
 ORDER BY firstname;

The comment says the second result should be the same as the first, but the results recorded in columnar_cache.out differ: the first query (before caching is enabled) returns 2000 rows, while the second (with columnar.enable_column_cache on) returns only 999 rows.

Is this expected?
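
For reference, here is a minimal sketch of how the two branches could be counted separately with the cache off and on. It assumes big_table is populated exactly as in the test above, so every id has its own (firstname, lastname) pair and the UNION removes no rows. The column aliases branch_1 and branch_2 are names I made up for illustration.

```sql
-- Cache off: count the rows each UNION branch should contribute.
SET columnar.enable_column_cache = 'f';
SELECT count(*) FILTER (WHERE id < 1000)                  AS branch_1,
       count(*) FILTER (WHERE id BETWEEN 15000 AND 16000) AS branch_2
  FROM big_table;

-- Cache on: the same counts, for comparison with the run above.
SET columnar.enable_column_cache = 't';
SELECT count(*) FILTER (WHERE id < 1000)                  AS branch_1,
       count(*) FILTER (WHERE id BETWEEN 15000 AND 16000) AS branch_2
  FROM big_table;
```

By plain arithmetic, ids 1 to 999 give 999 rows and ids 15000 to 16000 (inclusive) give 1001 rows, which adds up to the 2000 rows of the first query. The 999 rows of the second query match the size of the first branch alone, so it looks as if the rows for id BETWEEN 15000 AND 16000 may be missing when the cache is enabled, but I have not confirmed that.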

