Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-28829][k8s] Support prepreparing K8S resources before JM creation #20498

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

bzhaoopenstack
Copy link
Contributor

In this PR, we introduces several things for supporting create another k8s resources before JM deploy creation

What is the purpose of the change

(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)
For supporting customized k8s scheduler, we need to extend the ability of current code tree to support the cases of all customized k8s scheduler during they schedule the real pods. So we'd better to extend all cases during k8s scheduling, such as before/during/after our target resources creation. In this PR, we add the before case. And current code tree had covered during/after cases.

Brief change log

  1. We introduce a new interface func for all decorators to support
    create pre-prepared k8s resources.
  2. We extend a attribute for JM spec for storing the resources list.
  3. Extending the ability for supporting refresh the preprepared
    resource's ownerreference based on original code logic.
  4. Add the ability for creating K8S resource before JM deployment
    creation in Fabric client.

Verifying this change

We can not verify this change at this moment, as this is a internal process in flink-kubernetes now. It doesn't affect the user interface now. The only way we can test it is mocking a JM spec with another k8s resources which is not contained by Flink. And check whether the resources can be created before the JM and refresh the correct ower reference(Flink cluster id-- deployment id).

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not documented)

@bzhaoopenstack bzhaoopenstack changed the title [FLINK-28829] Support prepreparing K8S resources before JM creation [FLINK-28829][k8s] Support prepreparing K8S resources before JM creation Aug 8, 2022
@flinkbot
Copy link
Collaborator

flinkbot commented Aug 8, 2022

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@@ -30,10 +30,15 @@ public class KubernetesJobManagerSpecification {

private List<HasMetadata> accompanyingResources;

private List<HasMetadata> prePreparedResources;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private List<HasMetadata> prePreparedResources;
private final List<HasMetadata> prePreparedResources;

Copy link
Contributor Author

@bzhaoopenstack bzhaoopenstack Oct 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, how about just follow the style of accompanyingResources? If we change this, we should also keep the same with accompanyingResources. And that was not related with this PR.


// check the preprepared resources had been created.
final List<Pod> resultPods = kubeClient.pods().inNamespace(NAMESPACE).list().getItems();
assertThat(resultPods).hasSize(3);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be at least 4 pods ?
1 pod for the JM itself and 3 mock-pod-preprepared-resource-**.

Copy link
Contributor Author

@bzhaoopenstack bzhaoopenstack Oct 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is UT, it won't create any real Pods, just a k8s models objects. Actually, JM will be created by K8S deployment, In the real env, that's true it should be 4 pods.

In this PR, we introduces several things below:
1. We introduce a new interface func for all decorators to support
   create pre-prepared k8s resources.
2. We extend a attribute for JM spec for storing the resources list.
3. Extending the ability for supporting refresh the preprepared
   resource's ownerreference based on original code logic.
4. Add the ability for creating K8S resource before JM deployment
   creation in Fabric client.
@bzhaoopenstack
Copy link
Contributor Author

@flinkbot run azure

Copy link
Contributor

@wangyang0918 wangyang0918 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for creating this PR. After more consideration, I have one concern about whether we really need the pre-prepared resources. If the accompanyingResources are created before the JobManager deployment, then it could satisfy our requirement. Right?

FYI: accompanyingResources is designed for being created before deployment. The current implementation is just to avoid resource leak and save one K8s apiserver request.

final List<HasMetadata> prePreparedResources = kubernetesJMSpec.getPrePreparedResources();

// before create Deployment
this.internalClient.resourceList(prePreparedResources).createOrReplace();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might have resources leak here if the flink client crashed exactly after the pre-prepared resources created.

@@ -630,4 +630,37 @@ private KubernetesConfigMap buildTestingConfigMap() {
.withData(data)
.build());
}

@Test
void testMockPrePreparedResources() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The throws Exception is useless.

@wangyang0918 wangyang0918 self-assigned this Oct 19, 2022
@jrfaller
Copy link

Hi! As you probably know, this PR is introducing some breaking changes. Our academic tool BreakBot helps developers to better assess the impact that these BCs may have on client projects. While this can be on purpose, we found out that some clients among the most popular on GitHub may be faced with broken code.
You can find the full BreakBot report in our fork repository: report for PR.

We hope this information is valuable to you, and apologize otherwise.
If you're willing to help, we would kindly ask for your help to fill in a 5-minutes survey about the report.
Your feedback will help us improve the tool and help us in our research!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants