
[BUG] Harvester chart marked as modified when upgrading to v1.2-head #5566

Closed
bk201 opened this issue Apr 12, 2024 · 6 comments
Labels: area/longhorn, area/upgrade, kind/bug, not-require/test-plan, priority/0, reproduce/often
Milestone: v1.2.2

Comments

bk201 (Member) commented Apr 12, 2024

Describe the bug

While upgrading Harvester from v1.2.1 to v1.2-head, the apply-manifests job can't continue because the harvester chart is considered "Modified".

To Reproduce
Steps to reproduce the behavior:

  1. Create a harvester v1.2.1 cluster.
  2. Upgrade to v1.2-head.
  3. The hvst-upgrade-xxxxx-apply-manifests job keeps waiting for the Harvester chart to become Ready:

```
Current version: 0.0.0-test-pr-4881-0f11c106, Current state: Modified, Current generation: 4
Sleep for 5 seconds to retry
Current version: 0.0.0-test-pr-4881-0f11c106, Current state: Modified, Current generation: 4
Sleep for 5 seconds to retry
```
  4. Check the harvester Fleet bundle; it is in the Modified state:

```yaml
  summary:
    desiredReady: 1
    modified: 1
    nonReadyResources:
    - bundleState: Modified
      modifiedStatus:
      - apiVersion: apiextensions.k8s.io/v1
        kind: CustomResourceDefinition
        name: engines.longhorn.io
        patch: '{"status":{"acceptedNames":{"kind":"","plural":""},"conditions":[],"storedVersions":[]}}'
      - apiVersion: apiextensions.k8s.io/v1
        kind: CustomResourceDefinition
        name: instancemanagers.longhorn.io
        patch: '{"status":{"acceptedNames":{"kind":"","plural":""},"conditions":[],"storedVersions":[]}}'
      - apiVersion: apiextensions.k8s.io/v1
        kind: CustomResourceDefinition
        name: replicas.longhorn.io
        patch: '{"status":{"acceptedNames":{"kind":"","plural":""},"conditions":[],"storedVersions":[]}}'
      - apiVersion: apiextensions.k8s.io/v1
        kind: CustomResourceDefinition
        name: settings.longhorn.io
        patch: '{"status":{"acceptedNames":{"kind":"","plural":""},"conditions":[],"storedVersions":[]}}'
      name: fleet-local/local
```
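
The bundle state can be inspected directly with kubectl. A minimal sketch, assuming the Harvester bundle is named mcc-harvester in the fleet-local namespace (as in the test steps below) and a reachable cluster:

```shell
# List the bundle and its readiness/state columns
kubectl get bundles -n fleet-local mcc-harvester

# Dump the status summary, including nonReadyResources and their
# modifiedStatus patches
kubectl get bundle mcc-harvester -n fleet-local -o jsonpath='{.status.summary}'
```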

Expected behavior

The upgrade should continue.

Support bundle

Environment

  • Harvester ISO version: v1.2.1 -> v1.2-head (1a44ad3)
  • Underlying Infrastructure (e.g. Baremetal with Dell PowerEdge R630):

Additional context

We can add these diff.comparePatches entries to the Harvester managed chart to bypass the Fleet complaint:

```yaml
    - apiVersion: apiextensions.k8s.io/v1
      jsonPointers:
      - /status/acceptedNames
      - /status/conditions
      - /status/storedVersions
      kind: CustomResourceDefinition
      name: settings.longhorn.io
    - apiVersion: apiextensions.k8s.io/v1
      jsonPointers:
      - /status/acceptedNames
      - /status/conditions
      - /status/storedVersions
      kind: CustomResourceDefinition
      name: replicas.longhorn.io
    - apiVersion: apiextensions.k8s.io/v1
      jsonPointers:
      - /status/acceptedNames
      - /status/conditions
      - /status/storedVersions
      kind: CustomResourceDefinition
      name: instancemanagers.longhorn.io
    - apiVersion: apiextensions.k8s.io/v1
      jsonPointers:
      - /status/acceptedNames
      - /status/conditions
      - /status/storedVersions
      kind: CustomResourceDefinition
      name: engines.longhorn.io
```
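
As a sketch of where such entries go (assuming the chart is managed by a ManagedChart named harvester in the fleet-local namespace — verify the name with `kubectl get managedcharts -A` before editing):

```yaml
# Hypothetical edit via: kubectl edit managedchart harvester -n fleet-local
spec:
  diff:
    comparePatches:
    - apiVersion: apiextensions.k8s.io/v1
      kind: CustomResourceDefinition
      name: engines.longhorn.io
      jsonPointers:
      - /status/acceptedNames
      - /status/conditions
      - /status/storedVersions
    # ...repeat for instancemanagers, replicas, and settings as above
```

With these pointers listed, Fleet ignores drift under /status on the named CRDs when computing the Modified state.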
bk201 added the kind/bug, priority/0, area/longhorn, area/upgrade, and not-require/test-plan labels on Apr 12, 2024
bk201 added this to the v1.2.2 milestone on Apr 12, 2024
bk201 (Member, Author) commented Apr 12, 2024

The issue also shows up in a fresh v1.2-head installation.

harvesterhci-io-github-bot commented Apr 15, 2024

Pre Ready-For-Testing Checklist

  • If labeled: require/HEP Has the Harvester Enhancement Proposal PR submitted?
    The HEP PR is at:

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at:

    New cluster

    • Create a new cluster.

    • Run kubectl get bundles -n fleet-local mcc-harvester; the BUNDLEDEPLOYMENTS-READY column should be 1/1 and STATUS should be empty. For example:

      ```
      NAME            BUNDLEDEPLOYMENTS-READY   STATUS
      mcc-harvester   1/1
      ```

    Upgrade

    • Upgrade from v1.2.1

    • apply-manifest job should not block.

    • After the upgrade finishes, run kubectl get bundles -n fleet-local mcc-harvester, the BUNDLEDEPLOYMENTS-READY should be 1/1 and STATUS should be empty. For example:

      ```
      NAME            BUNDLEDEPLOYMENTS-READY   STATUS
      mcc-harvester   1/1 
      ```
      
  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at:

  • Has the backend code been merged (harvester, harvester-installer, etc.) (including backport-needed/*)?
    The PR is at:

    • Does the PR include the explanation for the fix or the feature?

    • Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
      The PR for the YAML change is at:
      The PR for the chart change is at:

  • If labeled: area/ui Has the UI issue filed or ready to be merged?
    The UI issue/PR is at:

  • If labeled: require/doc, require/knowledge-base Has the necessary document PR submitted or merged?
    The documentation/KB PR is at:

  • If NOT labeled: not-require/test-plan Has the e2e test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue?

    • The automation skeleton PR is at:
    • The automation test case PR is at:
  • If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at:

lanfon72 (Member) commented

I encountered this only 1 or 2 times in around 10 upgrade tests; adding the reproduce/often label.

lanfon72 (Member) commented

Closing as resolved; unable to reproduce it in around 10 upgrade tests from v1.2.1 to v1.2.2-rc1.

Test Information

  • Environment: baremetal DL360G9 3-4 nodes
  • Harvester Version: v1.2.1 -> v1.2.2-rc1 -> v1.3-d2f04e5a-head
  • ui-source Option: Auto

Verify Steps

  1. Install Harvester with any number of nodes
  2. Create an image for VM creation
  3. Create a cluster network and a VM network
  4. Create VM vm1 with a VLAN NIC
  5. Perform the upgrade
  6. The upgrade should succeed

brandboat (Contributor) commented

Confirmed this also happens when upgrading v1.3-head -> master. Should I create another issue or just add a label here? cc @bk201

bk201 (Member, Author) commented May 23, 2024

@brandboat Yes, please open one. We will see this again because we upgraded Longhorn to 1.6.2. Nice find.
