[Subtask] support fileset DDL operations for spark-connector #2461
Comments
Hi @FANNG1, what do you think of this?
From the user's perspective, Spark SQL normally operates on tables. How should Spark operate on a fileset? cc @jerryshao
Refer to Databricks' volume, which provides DDL operations for volumes. cc @FANNG1 @jerryshao
I think Spark/Spark SQL can support operating on fileset data via SQL/RDD/DataFrame by using #1700, so we don't have to do anything more. The link above is about manipulating the volume (fileset) itself using SQL, which requires a SQL extension. Currently, we don't have a plan to do it.
cc @coolderli
@jerryshao Do we have a plan to support Fileset operations such as List Files, Drop Files, and so on? If we want to implement TTL, we may need an interface to operate on the Fileset. We may have some ambiguity about the positioning of the Fileset. The Fileset is managed by Gravitino, and we already support creating a table through Gravitino, so why not support creating a Fileset? Some users may prefer to use SQL rather than the UI. Actually, I think it is truly not consistent with the position of Gravitino. But we can supply tools or actions to help users manage the Fileset. It may not be our highest priority right now, but we can implement it later.
I'm not saying we won't do it; what I said is that we don't currently have a plan to do it. ML users/DS can use our Python client to manage filesets, which is much more straightforward than using SQL (which needs a separate query engine like Spark besides the ML engine). DE can use the Java client in their programs (such as a Spark program) to achieve this. Providing a SQL interface is just an alternative to Java/Python, and I don't see it as a must-have for now. So IMO, achieving this in SQL is not a super high priority. If you have a concrete scenario that requires SQL support, we can have an offline discussion about it.
I much appreciate your response. No offense intended. I completely agree with your point of view that this is not the highest priority right now.
Describe the subtask
Support fileset DDL, such as create, drop, etc.
Parent issue
#1227