HIVE-29474: DESCRIBE FORMATTED for columns produces wrong output when…#6333

Open
tanishq-chugh wants to merge 2 commits into apache:master from tanishq-chugh:desc_for_testing


tanishq-chugh commented Feb 23, 2026

… the column datatype is STRUCT

What changes were proposed in this pull request?

Fix the output of DESCRIBE FORMATTED for STRUCT columns.

Why are the changes needed?

Currently, when a table has a STRUCT column, running DESCRIBE FORMATTED on that column produces the wrong output.
For example:
CREATE TABLE tbl_t (id int, point STRUCT<x:INT, y:INT>);
DESCRIBE FORMATTED tbl_t point;

gives the following output:
[image: screenshot of the query output]

As shown, col_name and data_type are wrong; they should be point and struct<x:int,y:int> respectively.

Does this PR introduce any user-facing change?

Yes

How was this patch tested?

Manual Testing & Qtest


soumyakanti3578 left a comment

HiveMetaStoreUtils#getFieldsFromDeserializer should be agnostic of where it is called from, so adding a boolean parameter to it is not ideal.

Also, I don't like having to pass false at every existing call site, where the change is irrelevant, just so that we can pass true in a couple of places.

It seems we are running into this issue because the tableName in method getFieldsFromDeserializer is default.[tbl].point. This is then split into names:

    String[] names = tableName.split("\\.");
    String last_name = names[names.length - 1];
    for (int i = 2; i < names.length; i++) {

and since the length of names is 3, we go into the for loop to iterate through the fields of point.

You could probably just pass the colName instead of desc.getColumnPath() in:

Hive.getFieldsFromDeserializer(desc.getColumnPath(), deserializer, context.getConf()));

in DescTableOperation.java. This will force it to skip over the for loop as there will just be 1 item in names.
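The splitting behaviour described above can be checked in isolation. The following is a minimal sketch, not Hive code: the class `ColumnPathSplitSketch` and the helper `nestedLookups` are hypothetical, and the loop body only counts iterations instead of resolving struct fields. It assumes the path is `default.tbl_t.point` for the example table.

```java
public class ColumnPathSplitSketch {

    // Hypothetical helper: mirrors only the split and the loop bounds quoted
    // in the review, and counts how many times the loop body would run.
    static int nestedLookups(String path) {
        String[] names = path.split("\\.");
        int iterations = 0;
        for (int i = 2; i < names.length; i++) {
            iterations++; // the real method would descend into a field here
        }
        return iterations;
    }

    public static void main(String[] args) {
        // Full path: three parts, so the loop body runs once (for "point").
        System.out.println(nestedLookups("default.tbl_t.point")); // prints 1

        // Bare column name: one part, so the loop is skipped entirely.
        System.out.println(nestedLookups("point")); // prints 0
    }
}
```

With the full path the loop is entered once, which matches the reviewer's diagnosis; with the bare column name it is never entered.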

Probably you won't need to change anything anywhere else. Please try this and let's see if all the tests pass.
