Skip to content

Conversation

@liulinC
Copy link
Collaborator

@liulinC liulinC commented Feb 20, 2024

Just routinely feature branch sync,
No any update on feature branch.

gangj and others added 30 commits December 5, 2023 15:05
This is the initial commit for the update guidance improvement:

1. Update comments for re-purposing host.pending_guidances.
2. Add new pending-guidance fields: host.pending_guidances_recommended,
   host.pending_guidances_full, vm.pending_guidances_recommended and
   vm.pending_guidances_full.
3. Add definition for new guidance: RestartVM,
   reboot_host_on_kernel_livepatch_failure and
   reboot_host_on_xen_livepatch_failure
4. Add DB upgrade rule: replace reboot_host_on_livepatch_failure in
   "host.pending_guidances" with both
   reboot_host_on_kernel_livepatch_failure and
   reboot_host_on_xen_livepatch_failure in
   "host.pending_guidances_recommended”
5. Clear reboot_host_on_kernel_livepatch_failure and
   reboot_host_on_xen_livepatch_failure on reboot
6. CLI: expose those new added pending guidance fields

Signed-off-by: Gang Ji <gang.ji@citrix.com>
As mandatory host guidance must be applied after applying updates,
otherwise there may be e.g. VM failures, it is safe not to enable a host
when its mandatory host guidance is pending:

1. do not enable host on xapi startup if there is pending mandatory
   guidance in the host.
2. raise error "host_pending_mandatory_guidances_not_empty" for API
   host.enable if there is pending mandatory guidance in the host.
3. not enable host when both on host startup and on only xapi startup.

Signed-off-by: Gang Ji <gang.ji@citrix.com>
The guidance in new format in updateinfo.xml file deprecates the
guidance in old format. For backwards compatibility support, both
old and new guidance data will be in updateinfo.xml.

With this commit, XAPI uses the guidance in new format only as the data
in new format has been presented in updateinfo.xml.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
This commit updates the guidance evaluation logic and exposes the result
of evaluation to the response of query on HTTP /updates endpoint.

The response will be consumed by clients like XenCenter. The format of
response is changed in this commit. Regarding this change, the backwards
compatibility is not supported on this interface.

Additionally, the evaluation is simplified by not considering the failed
livapatches any more. Otherwise, the client like XenCenter has to
introduce complicated UI design to populate the guidance change due to
the livepatch failures. This may cause un-necessary carrying out of
livepatch guidances. But the overall impact of carrying out these
guidances is considered as ignorable.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
This commit updates the setting pending guidances logic. Since the
host|VM.pending_guidances are re-purposed to be mandatory guidances, the
setting logic is updated accordingly.

Aslo fix-up setting pending mandatory guidances logic. The VMs' pending lists
were missed.

Meanwhile, the handling logic is changed to be easy for unit testing as well.

Livepatch failures related guidance will not be in mandatory pending
list. They will be handled in another pull request.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
This commit updates and adds unit tests for previous commits.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
Signed-off-by: Ming Lu <ming.lu@cloud.com>
…ge_from_master

Merge master to update guidance improvement feature branch
Signed-off-by: Ming Lu <ming.lu@cloud.com>
This commit adds logic to set pending recommended and full guidance
lists. What's more, the livepatch failures are merged with the pending
livepatch failures. Livepatch failures are placed in pending recommended
guidance list.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
Add unit tests for merge_livepatch_failures.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
In host.apply_updates, some safety checks are added in this commit:
- disallow if there is any outstanding mandatory guidance on the host.
- disallow if new mandatory guidance includes EvacuateHost or RebootHost
  and there are any running or paused VMs on the host.
- automatically disable host (if not yet disabled).

Signed-off-by: Ming Lu <ming.lu@cloud.com>
RestartVM is a task specified in update guidance. It has to be done on
an updated host. Unlike other VM tasks in update guidance, a VM may
escape from the tracking on individual host. E.g. a host evacuation will
migrate a VM to aother host before host.apply_updates call setting the
pending guidance on it. It's also not good to expose an API to set the
pending guidance lists by clients.

The solution in this commit is to set it for all running/paused/suspended
VMs in the pool when the host.apply_updates is called for the coordinator
host which is always the first one to be updated in a pool update.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
Signed-off-by: Ming Lu <ming.lu@cloud.com>
…46747

CP-46747: Expose 'title' field in updateinfo.xml to HTTP /updates
The RestartVM will be cleared when a VM's power state transits to
Halted. So it is not good to add this update guidance action into
shutdown VMs' pending guidance list, although it could be cleared in VM
start also.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
…387034

CA-387034: RestartVM is added to pending guidances of shutdown VMs
Update document for error: "host_pending_mandatory_guidances_not_empty" as
it is used not only when enabling host.

Signed-off-by: Gang Ji <gang.ji@citrix.com>
Add field "last_update_hash" on host object to record the SHA256 checksum
of updateinfo of the most recently applied update on the host.

Signed-off-by: Gang Ji <gang.ji@citrix.com>
CP-43875: Record the repository hash on the host object when updating
The non-empty host pending mandatory guidance will result in the host
blocked from being enabled or updated. This commit introduces a new API
to clear the pending mandatory guidance in emergency cases. Otherwise,
the host might be stuck in such a situation.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
A safety check in host.apply_updates will raise error if there is a
mandatory EvacuateHost guidance but some VMs are running on the host.
Meanwhile, EvacuateHost should not be added into pending list as it
should be be carried out before host.apply_updates. Hence it needs
to be filtered out.

This commit fixes a bug that filering out the EvacuateHost before the
safety check. It should be filtered out after the safety check.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
…45569

CP-45569: Add API Host.emergency_clear_mandatory_guidance
…up-safety-check

Fixup: filter out EvacuateHost before safety check
…te channel

When a user adds more update repositories as enabled ones, the
pool.last_update_sync will be invalid as at the moment not all the
enabled repositories have been synced at the time which is recorded
in pool.last_update_sync.

To avoid confusion, this commit resets the field to epoch so that the
user can be notified to do a pool.sync_updates soon.

Disabling an update repository will not make the pool.last_update_sync
be invalid unless no enabled repositories after the disabling.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
There were 2 problems about host update guidance:

1. RestartToolstack guidance in pool member was cleared when there was
a xapi restart in pool coordinator.

This is because "consider_enabling_host_request" in pool member would be
called when there is a xapi restart in pool coordinator, as the db
connection to the coordinator will be re-established then.

2. CA-387695: RebootHost guidance in pool member got cleared when there
was a xapi restart in pool coordinator.

When xapi is started with a host (re)boot, its command line looks like:
/opt/xensource/bin/xapi -nowatchdog
                        -writereadyfile /var/run/xapi_startup.cookie
                        -writeinitcomplete /var/run/xapi_init_complete.cookie
                        -onsystemboot
then "!Xapi_globs.on_system_boot" will always be true.

Same as above, "consider_enabling_host_request" in pool member would be
called when there is a xapi restart in pool coordinator. And when
"!Xapi_globs.on_system_boot" is true for xapi in the pool member, the
RebootHost guidance in pool member would be cleared when there was a
xapi restart in pool coordinator.

These 2 problems are resolved by clearing host update guidance in
function "Dbsync_slave.update_env", which is called only during xapi
startup.

Signed-off-by: Gang Ji <gang.ji@citrix.com>
When a VM reboots, its qemu-dm-<domid> process restarts with its
domid changed, but its RestartDeviceModel guidance was not cleared.
This issue is resolved in this commit.

When a VM starts, or reboots, xapi will be notified by xenopsd event
to update its "last_start_time", RestartDeviceModel and RestartVM
guidance is cleared there on VM start and VM reboot.

When a VM is shutdown, "force_state_reset_keep_current_operations"
will always be called to set the VM's state to "`Halted", RestartVM
and RestartDeviceModel guidance is cleared there on VM shutdown.

When a VM is suspended, "force_state_reset_keep_current_operations"
will always to called to set the VM's state to "`Suspended",
RestartDeviceModel guidance is cleared there on VM suspend.

When a VM is migrated, "pool_migrate_complete" will always be called,
RestartDeviceModel guidance is cleared there on VM migration.

Below are the tested scenarios:

VM running -> (force)halted
result: RestartDeviceMode RestartVM cleared if any

VM running -> paused
result: Nothing

VM running -> suspended
result: RestartDeviceMode cleared if any

VM halted -> running
result: RestartDeviceMode RestartVM cleared if any

VM paused -> running
result: Nothing

VM paused -> (force)halted
result: RestartDeviceMode RestartVM cleared if any

VM paused -> (force)reboot
result: RestartDeviceMode RestartVM cleared if any

VM suspend -> running
result: Nothing

VM suspend -> force halted
result: RestartDeviceMode RestartVM cleared if any

VM running -> (force)reboot
result: RestartDeviceMode RestartVM cleared if any

VM running -> take snapshot with no memory
result: Nothing

VM running -> take snapshot with memory
result: RestartDeviceMode cleared if any

VM running(with RestartDeviceMode and RestartVM guidance)
-> take snapshot with no memory
-> revert to snapshot with no memory
result: both RestartDeviceMode RestartVM cleared

VM running(with RestartDeviceMode and RestartVM guidance)
-> take snapshot with memory -> add RestartDeviceMode guidance
-> revert to snapshot with memory
result: RestartDeviceMode cleared, RestartVM remains

VM running(with RestartDeviceMode and RestartVM guidance)
-> take snapshot with no memory
-> clear RestartDeviceMode RestartVM guidance by reboot
-> revert to snapshot with no memory
result: both RestartDeviceMode and RestartVM cleared

VM running(with RestartDeviceMode and RestartVM guidance)
-> take snapshot with memory
-> clear RestartDeviceMode RestartVM guidance by reboot
-> revert to snapshot with memory
result: only RestartDeviceMode cleared, RestartVM restored

One side effect from this change:
In case there is any RestartVM or RestartDeviceModel guidance on
a halted VM, then a toolstack restart on pool master will clear
those VM guidance from the halted VMs.

However it should not be a problem as:
1. xapi will not add VM guidance on halted VMs
2. once a VM turns to halted state, the VM guidance will be
   cleared by xapi
So there should not be any VM guidance on halted VMs.

Signed-off-by: Gang Ji <gang.ji@citrix.com>
The checking task status logic in CLI server side is used by multiple
interfaces. It's better to split it out to resolve the duplications.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
minglumlu and others added 23 commits January 30, 2024 13:08
Signed-off-by: Ming Lu <ming.lu@cloud.com>
Signed-off-by: Ming Lu <ming.lu@cloud.com>
Signed-off-by: Ming Lu <ming.lu@cloud.com>
…45572

Add "xe host-show-updates" CLI and update "xe host-apply-updates"
Signed-off-by: Gang Ji <gang.ji@citrix.com>
…rom_master.2

Merge master into update guidance improvement feature branch
So that the existing XenCenter will refuse to connect to the new XAPI.
This aims to force users to use the new XenCenter since the old XenCenter
can't work with the new XAPI with update guidance improvement.

Signed-off-by: Gang Ji <gang.ji@citrix.com>
…new xapi

Add livepatch guidance DB upgrade rule: except
reboot_host_on_livepatch_failure, move any guidance in
"host.pending_guidances" into "host.pending_guidances_recommended".

Signed-off-by: Gang Ji <gang.ji@citrix.com>
Signed-off-by: Ming Lu <ming.lu@cloud.com>
Signed-off-by: Ming Lu <ming.lu@cloud.com>
…ate-datamodel-lifecycle

Update datamodel lifecycle for update guidance feature
The default value of 'flags' is '[`Session; `Async]'. Explicit value
'[`Session]' caused no async support on this API method.

The fix is to use the default value.

Signed-off-by: Ming Lu <ming.lu@cloud.com>
…388699

CA-388699: No async support on VM.restart_device_models
Signed-off-by: Konstantina Chremmou <Konstantina.Chremmou@cloud.com>
[ASAP] CP-47509: Expose RequestHeaders and ResponseHeaders in C# SDK.
the data field is not part of meta. see the XML generator at
ocaml/libs/xapi-rrd/lib/rrd_updates.ml line 124

The issue was introduced when serialization was changed to use yojson:
xapi-project/xcp-rrd@048b0e3

Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
…ance-improvement

Merge feature/update-guidance-improvement branch to master
@codecov
Copy link

codecov bot commented Feb 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (b4f6563) 49.07% compared to head (34dabdf) 49.07%.
Report is 4 commits behind head on feature/py3.

Additional details and impacted files
@@             Coverage Diff              @@
##           feature/py3    #5461   +/-   ##
============================================
  Coverage        49.07%   49.07%           
============================================
  Files               18       18           
  Lines             2319     2319           
============================================
  Hits              1138     1138           
  Misses            1181     1181           
Flag Coverage Δ
python2.7 53.38% <ø> (ø)
python3.11 55.85% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@gangj gangj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last commit in the source branch https://github.com/liulinC/xen-api/tree/private/linl/sync is 34dabdf, it is just the current latest commit in master.

@bernhardkaindl bernhardkaindl self-requested a review February 20, 2024 08:27
@gangj
Copy link
Contributor

gangj commented Feb 20, 2024

A merge commit is introduced in PR: #5454, so we can't just sync from master now as the history in the feature branch is different from master now. We need a PR to merge from master. @liulinC

@liulinC liulinC merged commit a9d70da into xapi-project:feature/py3 Feb 20, 2024
@github-actions
Copy link

pytype_reporter extracted 50 problem reports from pytype output

.

You can check the results of the job here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants