enable-rbd-metrics--test_ceph_rbd_metrics_available 4.15 #9506
PR validation on existing cluster
Cluster Name: vavuthuextdevyp1
Cluster Configuration:
PR Test Suite: tier1
PR Test Path: tests/functional/monitoring/prometheus/metrics/test_monitoring_defaults.py::TestCephMonitoringAvailable
Additional Test Params:
OCP VERSION: 4.15
OCS VERSION: 4.15
tested against branch: master
Job UNSTABLE (some or all tests failed).
List of metrics for the test test_ceph_metrics_available that are still unavailable: 'ceph_bluestore_state_aio_wait_lat_sum'. Consulting with Awan Thakkar.
@pytest.fixture(scope="session")
def enable_rbd_metrics(request):
    ct_pod = pod.get_ceph_tools_pod()
shouldn't we add a condition for external mode only?
Good question. Even though the fixture restores the original values of exclude_perf_counters and rbd_stats_pools, running it everywhere may mask a regression bug in which these values are not configured by default.
I will add a skip in this fixture for non-external mode clusters.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 30 days if no further activity occurs.
This pull request has been automatically closed due to inactivity. Please re-open if these changes are still required.
Signed-off-by: Daniel Osypenko <dosypenk@redhat.com>
…unters=false Signed-off-by: Daniel Osypenko <dosypenk@redhat.com>
…unters=false 0.1 Signed-off-by: Daniel Osypenko <dosypenk@redhat.com>
998712d to 4eea528
Even after configuring the external Ceph cluster to report metrics, we still see that some of them are unavailable.
https://url.corp.redhat.com/bbe0c24
Hello @fbalak. In the past I talked with Awan Thakkar, and he did not have a quick answer on this; he was not sure whether it is possible to make all metrics available. He also stated that we do not show metrics to external users, that this is not supported by ODF, and that it has never been part of the ODF product. I also think that on an internal mode cluster ODF by default manages all mgr settings so that the Ceph cluster broadcasts metrics. Trying to make Ceph storage expose metrics through our own manual actions means:
Question, what if I add
Ok, we can add those markers until we resolve how it should work consistently.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: DanielOsypenko, fbalak
This PR was carried over to the master branch. It was originally tested against the old master branch, which was release-4.13.
Copy of #8457.
We need to solve a problem that frequently happens on External mode deployments: test_ceph_rbd_metrics_available fails because Ceph is not configured to enable RBD metrics.
My intention was to enable them before the test and disable them again afterwards.
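A minimal sketch of that enable-then-restore idea, written as a plain context manager rather than the actual fixture in this PR. `run_ceph` is a hypothetical callable that executes a ceph command (for example through the toolbox pod) and returns its stdout; `pool_name` and the `external_mode` flag are likewise assumptions for illustration. Setting exclude_perf_counters to false is inferred from the commit titles and may not match the final change.

```python
from contextlib import contextmanager

# mgr options discussed in this PR
RBD_STATS_POOLS = "mgr/prometheus/rbd_stats_pools"
EXCLUDE_PERF_COUNTERS = "mgr/prometheus/exclude_perf_counters"


@contextmanager
def rbd_metrics_enabled(run_ceph, pool_name, external_mode):
    """Temporarily enable RBD metrics and restore the previous values on exit.

    run_ceph is a hypothetical callable: it takes a ceph command string
    and returns the command's stdout as a string.
    """
    if not external_mode:
        # Internal mode: leave the mgr settings alone so that a missing
        # default still shows up as a test failure.
        yield False
        return

    # Remember the current values so they can be restored afterwards.
    previous = {
        key: run_ceph(f"ceph config get mgr {key}").strip()
        for key in (RBD_STATS_POOLS, EXCLUDE_PERF_COUNTERS)
    }
    run_ceph(f"ceph config set mgr {RBD_STATS_POOLS} {pool_name}")
    run_ceph(f"ceph config set mgr {EXCLUDE_PERF_COUNTERS} false")
    try:
        yield True
    finally:
        for key, value in previous.items():
            if value:
                run_ceph(f"ceph config set mgr {key} {value}")
            else:
                # Option was empty before the test: remove the override.
                run_ceph(f"ceph config rm mgr {key}")
```

In a pytest fixture the same body could sit around a `yield`, with the non-external branch replaced by a skip.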
More info about the problem is here -> https://bugzilla.redhat.com/show_bug.cgi?id=2237412
The test passes on an internal deployment -> http://pastebin.test.redhat.com/1109172
The test fails on my currently available External mode deployment.
I get an error when trying to run any 'ceph' command on the external mode cluster (for instance 'ceph -s'):
via 'oc rsh' to the external toolbox
{CommandFailed}Error during execution of command: oc -n openshift-storage rsh rook-ceph-tools-6ccd65499-cwplh ceph config get mgr mgr/prometheus/rbd_stats_pools --format json-pretty.
Error is
2023-09-11T13:13:12.312+0000 7f815a6f3640 -1 auth: error parsing file /etc/ceph/keyring: error setting modifier for [client.admin] type=key val=admin-secret: Malformed input
2023-09-11T13:13:12.312+0000 7f815a6f3640 -1 auth: failed to load /etc/ceph/keyring: (5) Input/output error
2023-09-11T13:13:12.317+0000 7f815a6f3640 -1 auth: error parsing file /etc/ceph/keyring: error setting modifier for [client.admin] type=key val=admin-secret: Malformed input
2023-09-11T13:13:12.317+0000 7f815a6f3640 -1 auth: failed to load /etc/ceph/keyring: (5) Input/output error
2023-09-11T13:13:12.317+0000 7f815a6f3640 -1 auth: error parsing file /etc/ceph/keyring: error setting modifier for [client.admin] type=key val=admin-secret: Malformed input
2023-09-11T13:13:12.317+0000 7f815a6f3640 -1 auth: failed to load /etc/ceph/keyring: ...
via 'oc debug' to external toolbox
ceph -s
Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')
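The two failures above have different causes (an unparsable keyring versus a missing ceph.conf), so a test helper could tell them apart before deciding how to report the problem. The sketch below is only an illustration: the function name and the returned labels are made up, and the matching relies solely on the message fragments quoted in this description.

```python
def classify_ceph_cli_error(stderr: str) -> str:
    """Map ceph CLI error output to a coarse cause (hypothetical labels)."""
    if "error calling conf_read_file" in stderr:
        # Seen via 'oc debug': the client could not read its config file.
        return "missing-conf"
    if (
        "failed to load /etc/ceph/keyring" in stderr
        or "error parsing file /etc/ceph/keyring" in stderr
    ):
        # Seen via 'oc rsh': the keyring exists but cannot be parsed.
        return "bad-keyring"
    return "unknown"
```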