-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: crmsh enhancements, master slave, validations #197
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #197 +/- ##
===========================================
+ Coverage 50.54% 76.79% +26.24%
===========================================
Files 1 3 +2
Lines 91 181 +90
Branches 12 12
===========================================
+ Hits 46 139 +93
+ Misses 45 42 -3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
# Verify if constraint.resource_leader.id exists | ||
- name: Verify resource_leader presence {{ constraint.resource_leader.id }} | ||
ansible.builtin.command: | ||
cmd: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmd: | | |
cmd: >- |
? That is, should the command be one line crm -c {{ __ha_cluster_crm_shadow }} configure show {{ constraint.resource_leader.id }}
? because the yaml string |
operator will preserve newlines, so the command will have newlines in it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richm I have run it with pipes too many times and it never resulted in error, crmsh was able to handle multi line input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richm I have changed all pipes where it was necessary, change is in latest commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richm I have changed pipe to >- where it was possible and I am keeping it for multi-line commands only. Can we close pipe discussion?
@tomjelinek @richm Master/Slave is valid naming for cluster resources, removing them as per "anti-woke" filter would not make sense. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Master / slave resources have been deprecated in Pacemaker 2.0 released in July 2018 and replaced by promotable clone resources. In Pacemaker 3, which is expected to be released this year, they are going to be removed completely. I see no reason for adding a special variable for defining those.
The role already supports configuring promotable clones by the means of promotable
option in ha_cluster_resource_clones
. What would be a difference between a master_slave_clone and a promotable clone? I don't see any difference in the variable structure, except the promotable
key. It looks to me like you are creating a new variable for something that can be defined using already existing variables.
If for whatever reason you need to create master / slave resources in pacemaker, I suggest using clones with promotable: true
to define those. It would be then up to the crmsh portion of the role or the crmsh itself to decide if a promotable clone or a master / slave should be created based on pacemaker or crmsh version. That way, it will be transparent for the role users and the same playbook could be used for both old and new versions of pacemaker as well as both pcs-oriented and crmsh-oriented distros. Such interoperability is one of the goals of the system roles.
@tomjelinek Master Slave resources were indeed deprecated, but SUSE still uses Master Slave for existing clusters. It is replaced by Clone promotable in new Resource Agent SAPHanaSR-angi, but all current SAPHanaSR implementations are documented and recommended to use Master Slave. This is added to ensure proper support coverage and it has no impact on pcs. |
@marcelmamula I'm not arguing that master / slave resources do not need to be supported. My point is different: What is preventing you to use already existing
I don't see any other difference between promotable clones and master / slave resources than this. |
@tomjelinek My idea was to have new code for different type of resource and have it documented. |
@tomjelinek I have looked into variations we have coming for SAP Installation and there are few: I dont believe promotable attribute can be used as deciding factor by itself. Maybe introduce dictionary parameter under clone that I could use, like "ms" or "master". pcs would not need to be adjusted, I would just pick if it is defined or not. |
@tomjelinek I can simplify this msl scenario with hidden attribute under clone dictionary, that would determine ms/clone conditional. There is currently no difference for 2 of our clones in terms of promotable, as it is undefined so it cannot be used for decision. I would love to avoid using id as decision factor. it can be hidden atribute like ms, master, msl as boolean.
|
@tomjelinek master slave code is removed and existing clone is adjusted to expect clone attribute ms: true for this specific use case. It would stay with default clone if it is not used. |
configure delete {{ __ha_cluster_resource_id }} | ||
when: __ha_cluster_resource_status.rc == 0 | ||
check_mode: false | ||
changed_when: not ansible_check_mode | ||
|
||
# Clone is default resource type, unless attribute ms: true is specified | ||
# Pacemaker 2.0 deprecated use of ms, but it is still valid and supported | ||
# solution for SAP Hana clusters on SUSE using SAPHanaSR. | ||
- name: Configure resource clone {{ __ha_cluster_resource_id }} | ||
ansible.builtin.command: | ||
cmd: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmd: | | |
cmd: >- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richm I have changed pipe to >- where it was possible and I am keeping it for multi-line commands only. Can we close pipe discussion?
@@ -47,5 +49,7 @@ | |||
{% endif %} | |||
{% endfor %} | |||
{% endif %} | |||
# crm can get stuck if it encounters error and expects prompt response. | |||
timeout: 60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this? Where is the behavior of the timeout
keyword documented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richm Timeout is ansible task parameter. I have added it because crmsh gets stuck if it encounters error and asks for user input. This is handled with expect in constraint tasks, while keeping resource tasks with timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I see - https://docs.ansible.com/ansible/latest/reference_appendices/config.html#task-timeout
"Version Added: 2.10"
Note that for RedHat os_family, we still have to support Ansible 2.9, and still have many customers using 2.9, until late 2024 when 2.9 support will be ended. Not sure if you have the same restrictions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richm I was doing testing on my execution node with 2.16 so I was not checking version of it.
It seemed to fulfill task of making sure it does not hang forever. Only other way I see to handle it would be using async and then async_status instead of it.
Do you know of other way to ensure that command/shell does not get stuck like this? I dont see any option with crmsh to skip or auto-decline user inputs without using command/shell with echo pipe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richm I was doing testing on my execution node with 2.16 so I was not checking version of it. It seemed to fulfill task of making sure it does not hang forever. Only other way I see to handle it would be using async and then async_status instead of it.
Do you know of other way to ensure that command/shell does not get stuck like this? I dont see any option with crmsh to skip or auto-decline user inputs without using command/shell with echo pipe.
I would say - use timeout:
unless you are sure that you will have some customers using Ansible 2.9
If you must support Ansible 2.9, then you will need to use async
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richm I have removed timeout and replaced it with expect module with response y/n and ERROR.
@richm @tomjelinek I will be pushing one more commit with new functionality that was on todo list for crmsh: rsc/op defaults. |
@tomjelinek @richm Let me know if all your concerns were covered with corrective commits. Pipes were replaced where possible, master slave code was removed and supplemented by ms: true switch, rsc/op code was added. |
[citest] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updating the PR, it looks good to me now. Let's wait for the CI.
Enhancement:
Reason:
Result:
Issue Tracker Tickets (Jira or BZ if any):