[SPARK-49966][SQL] Codegen Support for JsonToStructs(from_json) #48466

Closed
panbingkun wants to merge 2 commits

Contributor

@panbingkun panbingkun commented Oct 15, 2024

What changes were proposed in this pull request?

This PR adds codegen support for JsonToStructs (from_json).

Why are the changes needed?

  • Improve codegen coverage.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Passes GitHub Actions (GA) and existing unit tests (e.g., JsonFunctionsSuite#*from_json*).

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Oct 15, 2024
@panbingkun panbingkun marked this pull request as ready for review October 15, 2024 05:45
@panbingkun
Contributor Author

cc @MaxGekk @cloud-fan

code"""
|${eval.code}
|$resultType $resultTerm = ($resultType) $refEvaluator.evaluate(
| ${eval.isNull} ? null : ${eval.value});
Member

Why do you need this check? It seems like evaluate() already does this.

Contributor Author

Yes, it's redundant. I have already removed it.

Contributor Author

@panbingkun panbingkun Oct 16, 2024

The code generated for an example looks roughly like this:

  • Before
    boolean localtablescan_isNull_0 = localtablescan_row_0.isNullAt(0);
    UTF8String localtablescan_value_0 = localtablescan_isNull_0 ?
      null : (localtablescan_row_0.getUTF8String(0));
    InternalRow project_result_0 = (InternalRow) ((org.apache.spark.sql.catalyst.expressions.json.JsonToStructsEvaluator) references[1] /* evaluator */).evaluate(
      localtablescan_isNull_0 ? null : localtablescan_value_0);
    boolean project_isNull_0 = project_result_0 == null;
    InternalRow project_value_0 = null;
    if (!project_isNull_0) {
      project_value_0 = project_result_0;
    }
    project_mutableStateArray_0[0].reset();

    project_mutableStateArray_0[0].zeroOutNullBytes();
  • After
    boolean localtablescan_isNull_0 = localtablescan_row_0.isNullAt(0);
    UTF8String localtablescan_value_0 = localtablescan_isNull_0 ?
      null : (localtablescan_row_0.getUTF8String(0));
    InternalRow project_result_0 = (InternalRow) ((org.apache.spark.sql.catalyst.expressions.json.JsonToStructsEvaluator) references[1] /* evaluator */).evaluate(localtablescan_value_0);
    boolean project_isNull_0 = project_result_0 == null;
    InternalRow project_value_0 = null;
    if (!project_isNull_0) {
      project_value_0 = project_result_0;
    }
    project_mutableStateArray_0[0].reset();

    project_mutableStateArray_0[0].zeroOutNullBytes();
  • The extra null check at the call site is clearly unnecessary, since evaluate() already handles a null input, so it has been removed.
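
The redundancy can be sketched in plain Java, the language of the generated code. This is a minimal illustration under stated assumptions, not Spark's implementation: NullSafeEvaluatorSketch and its String-based evaluate() are hypothetical stand-ins for JsonToStructsEvaluator, and the "parsed:" prefix is only a placeholder for real JSON parsing.

```java
// Hypothetical stand-in for JsonToStructsEvaluator; not Spark's real class.
final class NullSafeEvaluatorSketch {

    // Returns null for a null input, mirroring the null handling that the
    // review thread says evaluate() already performs.
    static String evaluate(String json) {
        if (json == null) {
            return null;
        }
        return "parsed:" + json; // placeholder for real JSON parsing
    }

    // "Before" shape: the call site repeats the null check.
    static String before(boolean isNull, String value) {
        return evaluate(isNull ? null : value);
    }

    // "After" shape: the value is passed straight through.
    static String after(String value) {
        return evaluate(value);
    }

    public static void main(String[] args) {
        // Upstream code sets the value to null whenever isNull is true,
        // so both shapes receive the same argument in each case.
        String nullCaseBefore = before(true, null);
        String nullCaseAfter = after(null);
        String valueCaseBefore = before(false, "{\"a\":1}");
        String valueCaseAfter = after("{\"a\":1}");

        if (nullCaseBefore != nullCaseAfter) {
            throw new AssertionError("null case differs");
        }
        if (!valueCaseBefore.equals(valueCaseAfter)) {
            throw new AssertionError("non-null case differs");
        }
    }
}
```

Because the generated code already sets localtablescan_value_0 to null whenever localtablescan_isNull_0 is true, both call shapes pass the same argument, so dropping the ternary does not change the result.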

Member

MaxGekk commented Oct 16, 2024

+1, LGTM. Merging to master.
Thank you, @panbingkun.

@MaxGekk MaxGekk closed this in 2a13011 Oct 16, 2024

override def nullSafeEval(json: Any): Any = evaluator.evaluate(json.asInstanceOf[UTF8String])

override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
Contributor

Is it possible to use Invoke with Literal(new JsonToStructsEvaluator(...), ObjectType(...)) to rewrite the expression?
