@brianparry
Contributor

This diff makes the following changes to the IPMI SDR poller.

  1. Publish an alert with 'inCondition' asserted for each SDR reporting a fault.
  2. Correct the key used to get saved inCondition values for each sensor ID.
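A minimal sketch of the publishing rule in item 1, written in Python purely for illustration (the poller's actual code is not shown in this conversation). The function name `process_sdr_readings`, the `publish` callback, and the reading fields `sensor_id` and `fault` are all hypothetical. It assumes, per the discussion further down, that `inCondition: false` is published only once, when a sensor returns to a healthy state.

```python
from typing import Callable, Dict, List


def process_sdr_readings(
    readings: List[dict],
    saved_conditions: Dict[str, bool],
    publish: Callable[[dict], None],
) -> None:
    """Publish alerts for one poll cycle and update the saved state in place."""
    for reading in readings:
        sensor_id = reading["sensor_id"]
        was_in_condition = saved_conditions.get(sensor_id, False)
        if reading["fault"]:
            # Published on every poll for as long as the fault persists.
            publish({"sensorId": sensor_id, "inCondition": True})
            saved_conditions[sensor_id] = True
        elif was_in_condition:
            # Published once, on the transition back to healthy.
            publish({"sensorId": sensor_id, "inCondition": False})
            saved_conditions[sensor_id] = False
```

Two consecutive polls that both report a fault on the same sensor produce two `inCondition: true` alerts under this sketch; the consumer is expected to recognise them as the same alert.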

@RackHD/corecommitters @benbp @VulpesArtificem

@brianparry
Contributor Author

Unit tests are failing:
https://roebling.hwimo.lab.emc.com/jenkins/job/Pull%20Request%20Builds/job/on-tasks-dev/136/

I'll fix and update the PR

@brianparry
Contributor Author

Unit tests fixed

@benbp
Contributor

benbp commented Oct 22, 2015

To make sure I'm understanding correctly from the code/your comments:

  1. This fixes a bug where we weren't doing lookups properly because of the '_' and '.' sanitization?
  2. It now publishes an alert every time we see inCondition as true, even if we've already published that same alert before?

Sound right, or off the mark?
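Point 1 can be illustrated with a small Python sketch. It assumes that sensor IDs are sanitized by replacing '.' with '_' before the state is saved; the direction of the replacement and the helper names `sanitize_key`, `save_condition`, and `load_condition` are assumptions, not taken from the diff. The point is only that the read path has to apply the same sanitization as the write path.

```python
from typing import Dict


def sanitize_key(sensor_id: str) -> str:
    # Assumption: '.' is not allowed in stored keys and is replaced with '_'.
    return sensor_id.replace(".", "_")


def save_condition(store: Dict[str, bool], sensor_id: str, in_condition: bool) -> None:
    # Write path: state is stored under the sanitized key.
    store[sanitize_key(sensor_id)] = in_condition


def load_condition(store: Dict[str, bool], sensor_id: str) -> bool:
    # Read path: use the same sanitized key, otherwise the lookup
    # misses the saved value and falls back to the default.
    return store.get(sanitize_key(sensor_id), False)
```

Looking up the raw ID (for example `store.get("Temp.1")`) after saving under the sanitized key would return nothing, which is the kind of missed lookup described above.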

@benbp
Contributor

benbp commented Oct 22, 2015

👍

@brianparry
Contributor Author

@benbp Your understanding is correct. This is a little different from the original logic we requested, but it reduces the probability of missing events. LTAE has logic that understands alert uniqueness, so it will not log a new alert each time the poller runs.

@AlliumApotheosis
Contributor

👍

@benbp
Contributor

benbp commented Oct 22, 2015

@brianparry Sounds good. Note that we still have the same missed-event probability for inCondition: false events, which are only published once. I think the best way to handle this is either to make the IPMI alerts AMQP queue persistent or to change the alerter to create database entries as well as events, so that message delivery is guaranteed.
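A rough Python sketch of the second option (database entries in addition to events), using an in-memory SQLite database only so that the example is self-contained. The table name `alerts` and the functions `open_alert_store`, `record_and_publish`, and `alerts_since` are hypothetical and are not part of the alerter. The idea is that a consumer that missed a one-time event can still recover it from the stored entries.

```python
import json
import sqlite3
from typing import Callable, List


def open_alert_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the (hypothetical) alerts table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS alerts ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
    )
    return db


def record_and_publish(
    db: sqlite3.Connection, alert: dict, publish: Callable[[dict], None]
) -> int:
    """Store the alert first, then publish it; return the stored row id."""
    cursor = db.execute("INSERT INTO alerts (body) VALUES (?)", (json.dumps(alert),))
    db.commit()
    # The event itself may still be lost (for example, nobody is listening),
    # but the entry written above remains available.
    publish(alert)
    return cursor.lastrowid


def alerts_since(db: sqlite3.Connection, last_seen_id: int) -> List[dict]:
    """Return alerts stored after last_seen_id, oldest first."""
    rows = db.execute(
        "SELECT body FROM alerts WHERE id > ? ORDER BY id", (last_seen_id,)
    )
    return [json.loads(body) for (body,) in rows]
```

A consumer would remember the last row id it processed and call `alerts_since` after a reconnect to catch up on anything it missed. A persistent AMQP queue addresses the same gap on the broker side rather than in the database.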

benbp added a commit that referenced this pull request Oct 22, 2015
…nges

Revise 'inCondition' logic in IPMI SDR alert job.
@benbp benbp merged commit 046722c into RackHD:master Oct 22, 2015
@benbp benbp deleted the feature/in_condition_logic_changes branch October 22, 2015 18:33
kellylu2sym pushed a commit to kellylu2sym/on-tasks that referenced this pull request Aug 8, 2017
Test updates, minor fixes to TaskGraph
kellylu2sym pushed a commit to kellylu2sym/on-tasks that referenced this pull request Aug 8, 2017
…ster

* commit '0e729dcaf0f164ad3cfdea9a193a5a984e7f208a':
  MAG-55 Add chassis poller to default and remove power and uid poller
kellylu2sym pushed a commit to kellylu2sym/on-tasks that referenced this pull request Aug 8, 2017
fix houndci false alarm about undefined variable