Updated Endpoint Selector to pick the Cluster in Enabled state (in addition to Host state)#10757
…lection (in addition to Host state)
|
@blueorangutan package |
|
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #10757      +/-   ##
============================================
+ Coverage     16.02%   16.14%   +0.11%
- Complexity    13127    13228     +101
============================================
  Files          5652     5652
  Lines        495994   496986     +992
  Branches      60067    60218     +151
============================================
+ Hits          79466    80214     +748
- Misses       407668   407838     +170
- Partials       8860     8934      +74
|
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13121 |
|
@blueorangutan test |
|
@borisstoyanov a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-13082)
|
|
Code LGTM. The pod could be disabled as well. |
|
@sureshanaparti @borisstoyanov could you review Wei's remarks and advise whether this is ready for merging? |
|
@rohityadavcloud, Suresh will be syncing the latest fixes with this one; I'll double-check and LGTM. |
- The pool id is persisted while creating the volume, but it is not reverted when creation fails. On the next create volume attempt, CloudStack cannot find any suitable primary storage, even when pools with enough capacity are available, because the pool is already assigned to the volume, which is in the Allocated state (so the storage pool compatibility check fails). Ensure the volume is not assigned to any pool if create volume fails, so the next creation job can pick a suitable pool.
@weizhouapache disabled pods are checked during deployment planning. For volume operations, only the zone's disabled state is checked, not the pod's, so a host in a disabled pod (with an enabled cluster) can still be returned as an endpoint to perform operations. |
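The pool id reset described in the commit note above can be illustrated with a minimal sketch. Every name here (`VolumeSketch`, `PoolCreatorSketch`, `CreateVolumeSketch`) is a hypothetical stand-in and not CloudStack's actual code; the sketch only assumes that the pool id is written to the volume before creation is attempted and must be cleared on failure.

```java
// Hypothetical stand-in for a volume record; not CloudStack's VolumeVO.
final class VolumeSketch {
    Long poolId;                 // null means "not assigned to any pool"
    String state = "Allocated";  // simplified state, for illustration only
}

// Hypothetical hook that performs the actual creation on a storage pool.
interface PoolCreatorSketch {
    void createOn(long poolId) throws Exception;
}

final class CreateVolumeSketch {
    // Returns true on success; on failure the pool assignment is reverted.
    static boolean createVolume(VolumeSketch volume, long poolId, PoolCreatorSketch creator) {
        volume.poolId = poolId; // the pool id is persisted before creation starts
        try {
            creator.createOn(poolId);
            volume.state = "Ready";
            return true;
        } catch (Exception e) {
            // Reset the assignment so the next attempt can consider every pool again.
            volume.poolId = null;
            return false;
        }
    }

    public static void main(String[] args) {
        VolumeSketch volume = new VolumeSketch();
        boolean first = createVolume(volume, 1L, id -> { throw new IllegalStateException("pool unavailable"); });
        System.out.println("first attempt succeeded: " + first + ", poolId: " + volume.poolId);
        boolean second = createVolume(volume, 2L, id -> { });
        System.out.println("second attempt succeeded: " + second + ", poolId: " + volume.poolId);
    }
}
```

Without the reset in the `catch` branch, the failed volume would stay in the Allocated state while still pointing at the first pool, which is the situation the commit note describes.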
|
@blueorangutan package |
|
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13343 |
|
@blueorangutan test |
|
@borisstoyanov a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-13276)
|
…dition to Host state) (apache#10757)
* Consider the clusters with allocation state 'Enabled' for EndPoint selection (in addition to Host state)
* Reset the pool id when create volume fails on the allocated pool
  - The pool id is persisted while creating the volume, but it is not reverted when creation fails. On the next create volume attempt, CloudStack cannot find any suitable primary storage, even when pools with enough capacity are available, because the pool is already assigned to the volume, which is in the Allocated state (so the storage pool compatibility check fails). Ensure the volume is not assigned to any pool if create volume fails, so the next creation job can pick a suitable pool.
* endpoint check for resize
* update the resize error through callback result instead of exception
* logger fix
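The "endpoint check for resize" and "callback result instead of exception" items in the commit message above can be sketched roughly as follows. The class and method names (`ResizeResultSketch`, `ResizeCallbackSketch`, `resize`) and the error text are assumptions made for illustration; they are not the names used in the PR.

```java
import java.util.function.Consumer;

// Hypothetical result object handed to the caller's callback.
final class ResizeResultSketch {
    private final String error; // null on success

    private ResizeResultSketch(String error) { this.error = error; }

    static ResizeResultSketch success() { return new ResizeResultSketch(null); }
    static ResizeResultSketch failure(String error) { return new ResizeResultSketch(error); }

    boolean isSuccess() { return error == null; }
    String getError() { return error; }
}

final class ResizeCallbackSketch {
    // Reports a missing endpoint through the callback result rather than throwing.
    static void resize(boolean endpointAvailable, Consumer<ResizeResultSketch> callback) {
        if (!endpointAvailable) {
            callback.accept(ResizeResultSketch.failure("No endpoint available to resize the volume"));
            return;
        }
        // The real resize work would happen here; omitted in this sketch.
        callback.accept(ResizeResultSketch.success());
    }

    public static void main(String[] args) {
        resize(false, r -> System.out.println(r.isSuccess() ? "resized" : "failed: " + r.getError()));
        resize(true, r -> System.out.println(r.isSuccess() ? "resized" : "failed: " + r.getError()));
    }
}
```

Routing the error through the result lets the caller handle the failure on the same path as every other resize outcome, instead of needing a separate exception handler.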
Description
Currently, EndPoint selection checks for Host status Up and resource_state Enabled. However, the whole Cluster in which the Host is located can be disabled. This PR updates the EndPoint selection query to also require clusters with allocation state 'Enabled' (in addition to the existing Host state check: status Up and resource_state Enabled), which ensures the Host is chosen from an Enabled cluster.
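As a rough illustration of the combined condition, the sketch below applies the three checks to an in-memory list. The PR itself changes the selection query; the types here (`HostSketch`, `ClusterSketch`, the enums) are simplified, hypothetical stand-ins and not CloudStack's classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified enums for illustration only.
enum Status { Up, Down }
enum ResourceState { Enabled, Disabled }
enum AllocationState { Enabled, Disabled }

final class ClusterSketch {
    final long id;
    final AllocationState allocationState;

    ClusterSketch(long id, AllocationState allocationState) {
        this.id = id;
        this.allocationState = allocationState;
    }
}

final class HostSketch {
    final String name;
    final Status status;
    final ResourceState resourceState;
    final ClusterSketch cluster;

    HostSketch(String name, Status status, ResourceState resourceState, ClusterSketch cluster) {
        this.name = name;
        this.status = status;
        this.resourceState = resourceState;
        this.cluster = cluster;
    }
}

final class EndpointSelectionSketch {
    // Host must be Up and Enabled, and its cluster must also be Enabled.
    static boolean isEligible(HostSketch host) {
        return host.status == Status.Up
                && host.resourceState == ResourceState.Enabled
                && host.cluster.allocationState == AllocationState.Enabled;
    }

    // Returns the first eligible host, or an empty Optional if none qualifies.
    static Optional<HostSketch> selectEndpoint(List<HostSketch> hosts) {
        return hosts.stream().filter(EndpointSelectionSketch::isEligible).findFirst();
    }

    public static void main(String[] args) {
        ClusterSketch enabled = new ClusterSketch(1L, AllocationState.Enabled);
        ClusterSketch disabled = new ClusterSketch(2L, AllocationState.Disabled);
        List<HostSketch> hosts = Arrays.asList(
                new HostSketch("host-in-disabled-cluster", Status.Up, ResourceState.Enabled, disabled),
                new HostSketch("host-in-enabled-cluster", Status.Up, ResourceState.Enabled, enabled));
        System.out.println(selectEndpoint(hosts).map(h -> h.name).orElse("no endpoint"));
    }
}
```

Before the change, the first host in this example would have passed the Host-only checks even though its cluster is disabled.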
This includes changes of PR #10777 (targeted for 4.19.3).
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?