Description
- mq-metrics-samples version(s) affected by this issue: v5.2.5
I'm using IBM MQ 9.2.0.0, and the AWS CloudWatch metrics collection stops at random with the following trace:
time="2023-08-19T12:30:27Z" level=trace msg="> [getMessage]"
time="2023-08-19T12:30:27Z" level=trace msg="> [getMessageWithHObj]"
time="2023-08-19T12:30:27Z" level=trace msg="< [getMessageWithHObj] rp: 0 Error: nil"
time="2023-08-19T12:30:27Z" level=trace msg="< [getMessage] rp: 0 Error: nil"
time="2023-08-19T12:30:27Z" level=trace msg="> [parsePCFResponse]"
time="2023-08-19T12:30:27Z" level=trace msg="< [parsePCFResponse] rp: 0"
time="2023-08-19T12:30:27Z" level=trace msg="> [getMessage]"
time="2023-08-19T12:30:27Z" level=trace msg="> [getMessageWithHObj]"
time="2023-08-19T12:30:27Z" level=trace msg="< [getMessageWithHObj] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:27Z" level=trace msg="< [getMessage] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:27Z" level=trace msg="< [ProcessPublications] rp: 0"
time="2023-08-19T12:30:27Z" level=debug msg="Polling for object status"
time="2023-08-19T12:30:27Z" level=trace msg="> [CollectQueueManagerStatus]"
time="2023-08-19T12:30:27Z" level=trace msg="> [QueueManagerInitAttributes]"
time="2023-08-19T12:30:27Z" level=trace msg="< [QueueManagerInitAttributes] rp: 1"
time="2023-08-19T12:30:27Z" level=trace msg="> [collectQueueManagerStatus]"
time="2023-08-19T12:30:27Z" level=trace msg="> [statusClearReplyQ]"
time="2023-08-19T12:30:27Z" level=trace msg="< [statusClearReplyQ] rp: 0"
time="2023-08-19T12:30:27Z" level=trace msg="> [statusSetCommandHeaders]"
time="2023-08-19T12:30:27Z" level=trace msg="< [statusSetCommandHeaders] rp: 0"
time="2023-08-19T12:30:27Z" level=trace msg="> [statusGetReply]"
time="2023-08-19T12:30:30Z" level=trace msg="< [statusGetReply] rp: 3 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:30Z" level=trace msg="< [collectQueueManagerStatus] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:30Z" level=trace msg="< [CollectQueueManagerStatus] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:30Z" level=error msg="Error collecting queue manager status: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
time="2023-08-19T12:30:30Z" level=fatal msg="Error collecting status: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
A response on another issue suggested that this error could be caused by a 3-second timeout. The trace is consistent with that: there is a 3-second gap between the "> [statusGetReply]" entry (12:30:27) and the "< [statusGetReply] rp: 3" exit (12:30:30).
However, the MQ server is under very little load, so I don't understand why such a timeout would occur.
Once this occurs, metric collection simply stops (the final log entry is at level=fatal). Is there a way to ignore MQCC_FAILED with MQRC_NO_MSG_AVAILABLE [2033] when a timeout occasionally occurs?