Fix metadata_writer service connection to a proper GRPC server #3087

Closed
areshytko wants to merge 1 commit

areshytko commented Feb 14, 2020

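The Metadata Writer service uses the ml_metadata Python API, which expects a gRPC server, but it tries to connect to the Kubeflow metadata service instead of the TFX metadata service. As a result, the connection fails with the error shown under "Actual behavior" below.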

Expected behavior:

Should connect to metadata server without any errors

Actual behavior before this PR:

Pod fails with the following error in logs:

Failed to access the Metadata store. Exception: "Trying to connect an http1.x server"
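For illustration, here is a minimal sketch of how a writer could point the ml_metadata client at a gRPC endpoint. As far as I know, the error above is what a gRPC client reports when the peer answers in HTTP/1.x, so the target has to be the gRPC metadata server. The environment variable names, the default host `metadata-grpc-service`, and the default port `8080` are assumptions made for this example, not values taken from this PR.

```python
import os


def resolve_metadata_endpoint(env=None):
    """Return (host, port) of the gRPC metadata server.

    The variable names and defaults below are hypothetical; adjust them
    to whatever the deployment actually exposes.
    """
    env = os.environ if env is None else env
    host = env.get("METADATA_GRPC_SERVICE_HOST", "metadata-grpc-service")
    port = int(env.get("METADATA_GRPC_SERVICE_PORT", "8080"))
    return host, port


def connect_to_metadata_store(host, port):
    """Create an ml_metadata client that talks gRPC to host:port."""
    # Imported lazily so that the endpoint resolution above can be used
    # (and tested) without ml_metadata installed.
    from ml_metadata.metadata_store import metadata_store
    from ml_metadata.proto import metadata_store_pb2

    config = metadata_store_pb2.MetadataStoreClientConfig()
    config.host = host
    config.port = port
    return metadata_store.MetadataStore(config)
```

Usage would be along the lines of `store = connect_to_metadata_store(*resolve_metadata_endpoint())`. Keeping the endpoint in configuration rather than hard-coding it makes it easier to switch the service name in the manifests and in the writer together.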

How to test

  1. Build the service image and push it to a Docker registry with the tag metadata_writer:latest.
  2. Create a deployment from the following YAML:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: pipelines
    app.kubernetes.io/instance: metadata-writer
    app.kubernetes.io/managed-by: wla
    app.kubernetes.io/name: metadata-writer-deployment
    app.kubernetes.io/part-of: kubeflow
    app.kubernetes.io/version: 0.2.1
    component: server
  name: metadata-writer
  namespace: kubeflow
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: pipelines
      app.kubernetes.io/instance: metadata-writer
      app.kubernetes.io/managed-by: wla
      app.kubernetes.io/name: metadata-writer
      app.kubernetes.io/part-of: kubeflow
      app.kubernetes.io/version: 0.2.1
      component: server
  template:
    metadata:
      labels:
        app.kubernetes.io/component: pipelines
        app.kubernetes.io/instance: metadata-writer
        app.kubernetes.io/managed-by: wla
        app.kubernetes.io/name: metadata-writer
        app.kubernetes.io/part-of: kubeflow
        app.kubernetes.io/version: 0.2.1
        component: server
        kustomize.component: metadata
    spec:
      serviceAccountName: metadata-writer-sa
      containers:
      - envFrom:
        - configMapRef:
            name: metadata-db-parameters
        - secretRef:
            name: metadata-db-secrets
        image: metadata_writer:latest
        imagePullPolicy: IfNotPresent
        name: container
        env:
        - name: NAMESPACE_TO_WATCH
          value: kubeflow
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: kubeflow
  name: metadata-writer-sa
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  namespace: kubeflow
  name: metadata-writer-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  namespace: kubeflow
  name: metadata-writer-rb
roleRef:
  kind: Role
  name: metadata-writer-role
  apiGroup: rbac.authorization.k8s.io
subjects:
 - kind: ServiceAccount
   name: metadata-writer-sa
   namespace: kubeflow


@googlebot
Collaborator

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
To pass this check, please resolve this problem and then comment "@googlebot I fixed it." If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign ark-kun
You can assign the PR to them by writing /assign @ark-kun in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Hi @areshytko. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@elikatsis
Member

This change shouldn't be made on its own. Please refer to #2889 for context.

I've prepared a PR that targets the issue, but I didn't have the time to test it thoroughly. I should have it ready early next week.

@Ark-kun
Contributor

Ark-kun commented Feb 14, 2020

> I've prepared a PR that targets the issue, but I didn't have the time to test it thoroughly. I should have it ready early next week.

It would be great if the PR is minimal. Is it significantly bigger than changing the service names in the manifests and in Metadata Writer? We'd like to see the PR soon.

@stale

stale bot commented Jun 24, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the lifecycle/stale label Jun 24, 2020
@Bobgy
Contributor

Bobgy commented Jun 24, 2020

this is now integrated with kubeflow

stale bot removed the lifecycle/stale label Jun 24, 2020
Bobgy closed this Jun 24, 2020