
[TASK][MEDIUM] Spark engine query results support reading from HDFS #5377

Closed
@pan3793

Description


Code of Conduct

  • I agree to follow this project's Code of Conduct.

Search before creating

  • I have searched in the task list and found no similar tasks.

Mentor

  • I have sufficient knowledge of and experience with this task, and I volunteer to mentor contributors to complete it.

Skill requirements

  • Basic knowledge of the Scala programming language
  • Familiarity with Apache Spark

Background and Goals

A client's SQL query may return a result set so large that the Spark engine fails, with the driver crashing due to OOM.
Although the number of result rows can be limited by configuring kyuubi.operation.result.max.rows, a single row that is too large can still cause an OOM.
If the engine supported writing query results to HDFS or another storage system, it could fetch the results back from HDFS when the client requests them, avoiding the OOM problem.
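
For reference, the row-count limit mentioned above is an ordinary Kyuubi configuration entry; the value here is illustrative:

    kyuubi.operation.result.max.rows=10000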

Implementation steps

  1. Identify execution plans that actually produce query results
  2. Estimate the output size of the execution plan (see the first sketch after this list)
    org.apache.spark.sql.catalyst.plans.logical.statsEstimation.EstimationUtils#getSizePerRow
  3. Use the df.write API to write the execution plan's query results to HDFS (see the second sketch)
  4. Implement an iterator that reads the data back from HDFS and returns it to the client (also covered by the second sketch)
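
A minimal sketch of step 2, assuming the result is available as a DataFrame. EstimationUtils.getSizePerRow and the plan statistics are Spark Catalyst internals (they may be package-private, so real code typically lives under an org.apache.spark.sql.* package); the helper name estimatedResultSize is hypothetical:

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.catalyst.plans.logical.statsEstimation.EstimationUtils

    object ResultSizeSketch {
      // Hypothetical helper: estimate total output size by multiplying
      // Catalyst's per-row size estimate by the estimated row count.
      def estimatedResultSize(df: DataFrame): BigInt = {
        val plan = df.queryExecution.optimizedPlan
        // Derive a per-row byte size from the plan's output attributes
        val sizePerRow = EstimationUtils.getSizePerRow(plan.output)
        // rowCount is only present when statistics are available; fall back
        // to the plan's overall sizeInBytes estimate otherwise
        plan.stats.rowCount match {
          case Some(rows) => sizePerRow * rows
          case None       => plan.stats.sizeInBytes
        }
      }
    }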
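
And a sketch of steps 3 and 4, assuming Parquet as the intermediate format; the staging path and the spillAndRead helper are hypothetical. Dataset.toLocalIterator pulls one partition at a time to the driver, so the full result never has to fit in driver memory:

    import scala.collection.JavaConverters._
    import org.apache.spark.sql.{DataFrame, Row, SparkSession}

    object ResultSpillSketch {
      // Hypothetical staging location; a real implementation would derive a
      // unique path per operation and clean it up when the operation closes.
      val stagingDir = "hdfs:///tmp/kyuubi/engine-results/op-12345"

      def spillAndRead(spark: SparkSession, df: DataFrame): Iterator[Row] = {
        // Step 3: persist the query result to HDFS instead of collecting it
        df.write.parquet(stagingDir)
        // Step 4: read the result back lazily, one partition at a time
        spark.read.parquet(stagingDir).toLocalIterator().asScala
      }
    }

Note that round-tripping through files does not guarantee the original row order, and error handling plus cleanup of the staging directory are omitted here.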

Additional context

Original reporter is @cxzl25
