-
Notifications
You must be signed in to change notification settings - Fork 971
Allow user to provide batch id on submitting batch job #4390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
kyuubi-server/src/main/scala/org/apache/kyuubi/server/api/v1/BatchesResource.scala
Outdated
Show resolved
Hide resolved
kyuubi-server/src/main/scala/org/apache/kyuubi/server/api/v1/BatchesResource.scala
Show resolved
Hide resolved
turboFei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left some comments
|
seems mistake, let me think it twice. |
|
How about add a |
I think it makes things more complex, IMO let the user provide a custom batch id does not has many risks, we already restrict the batch id MUST be a UUID. |
|
cc @turboFei @ulysses-you it's ready for the next review. |
Codecov Report
@@ Coverage Diff @@
## master #4390 +/- ##
============================================
- Coverage 53.24% 53.24% -0.01%
Complexity 13 13
============================================
Files 569 569
Lines 31111 31146 +35
Branches 4203 4208 +5
============================================
+ Hits 16565 16583 +18
- Misses 12979 12992 +13
- Partials 1567 1571 +4
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
|
TODO: render the batchInfo in kyuubi-ctl |
Tracking in #4438 |
|
Thanks, merged to master/1.7 |
### _Why are the changes needed?_ This PR proposes to allow the user to provide a batch id on submitting a batch job. If the batch id already existed in metastore, Kyuubi ignores this submission and just returns the existing one, w/ a marker in response, this could avoid duplicated batch job submission. Talking about the implementation, the key things are How does the user set the custom batch id? - User can optionally set the `kyuubi.batch.id` in `conf: Map[String, String]`, and the value must be a UUID, for Java users, it can be generated by `UUID.randomUUID().toString()` How does the Kyuubi Server detect the duplication? - It's simple in single Kyuubi Server instance case, Kyuubi just needs to look up the metastore before creating a batch job - In HA mode, suppose the user requests to create the batch jobs w/ the same batch id concurrently, multiple Kyuubi Servers may process the request and try to insert to metastore DB at the same time, but only the first insertion success, others will fail w/ "duplicated key", Kyuubi Server needs to catch this error and return the existing batch job information instead of creating a new one. How does the user know if the returned batch job is new created or duplicated? - a new field `batchInfo: Map[String, String]` is added to the response, and for duplicated batch job, `"kyuubi.batch.duplicated": "true"` will be contained. ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4390 from pan3793/batch-id. Closes #4390 b6917ba [Cheng Pan] move constant to rest client 79ef1b5 [Cheng Pan] flaky test f822285 [Cheng Pan] it 88bdfa5 [Cheng Pan] ut fd8bc22 [Cheng Pan] ut c820f5e [Cheng Pan] Support user provided batch id on batch job submission Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org> (cherry picked from commit efbaaff) Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_ Close #4438 Follow up of #4390 ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4463 from turboFei/render_batch. Closes #4438 b5ae211 [fwang12] render batch info Authored-by: fwang12 <fwang12@ebay.com> Signed-off-by: fwang12 <fwang12@ebay.com>
### _Why are the changes needed?_ Close #4438 Follow up of #4390 ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4463 from turboFei/render_batch. Closes #4438 b5ae211 [fwang12] render batch info Authored-by: fwang12 <fwang12@ebay.com> Signed-off-by: fwang12 <fwang12@ebay.com> (cherry picked from commit 39eac5a) Signed-off-by: fwang12 <fwang12@ebay.com>
Why are the changes needed?
This PR proposes to allow the user to provide a batch id on submitting a batch job. If the batch id already existed in metastore, Kyuubi ignores this submission and just returns the existing one, w/ a marker in response, this could avoid duplicated batch job submission.
This feature relies on #4418
Talking about the implementation, the key things are
How does the user set the custom batch id?
kyuubi.batch.idinconf: Map[String, String], and the value must be a UUID, for Java users, it can be generated byUUID.randomUUID().toString()How does the Kyuubi Server detect the duplication?
How does the user know if the returned batch job is new created or duplicated?
batchInfo: Map[String, String]is added to the response, and for duplicated batch job,"kyuubi.batch.duplicated": "true"will be contained.How was this patch tested?
Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before make a pull request