[Monit] Monitor multiple processes with the same name but using different arguments. #4257

Open
wants to merge 9 commits into
base: master
from
7 changes: 7 additions & 0 deletions dockers/docker-dhcp-relay/base_image_files/monit_dhcp_relay
@@ -0,0 +1,7 @@
###############################################################################
## Monit configuration for dhcp_relay container
## process list
## dhcrelay
###############################################################################
check program monit_dhcrelay with path "/usr/bin/monit_dhcrelay_processes"
    if status != 0 then alert
50 changes: 50 additions & 0 deletions dockers/docker-dhcp-relay/monit_dhcrelay_processes
@@ -0,0 +1,50 @@
#!/usr/bin/python
'''
This script is used to monitor dhcrelay processes in dhcp_relay docker container.
Since Monit can only monitor the process with unique name, it is unable to do
Contributor

s/the process with unique name/processes with unique names/

Contributor Author

Reworded.

this monitoring for dhcrelay processes. Usually there will be multiple dhcrelay
Contributor

s/Usually there will be multiple/There can exist multiple/

Contributor Author

Reworded.

processes which executes a same commad but with different arguments. The number
Contributor

s/commad/command

Contributor Author

Fixed.

of dhcrelay processes is determined by Vlans which have non-empry list of dhcp servers.
Contributor

s/non-empry/non-empty/

Contributor Author

Fixed.

As such, we let Monit to monitor this script which will read number of vlans with
no-empty list of dhcp servers form Config_DB, then find whether there exist a
Contributor

s/no-empty/non-empty/

Contributor Author

Fixed.

Contributor

s/exist/exists/

Contributor Author

Fixed.

process in Linux corresponding to a vlan. If this script fails to find such process,
Contributor

Remove extra space before "such"

Contributor Author

Removed extra space.

it will write an alert message into syslog file.
'''

import os
import subprocess
import re
import sys
import syslog

from swsssdk import ConfigDBConnector

def retrieve_vlans():
    vlans = []

    config_db = ConfigDBConnector()
    config_db.connect()
    vlan_table = config_db.get_table('VLAN')

    for vlan in vlan_table.keys():
        if vlan_table[vlan].has_key('dhcp_servers') and len(vlan_table[vlan]['dhcp_servers']) != 0:
            vlans.append(vlan)

    return vlans

def check_dhcrelay_processes():
    vlans = retrieve_vlans()
    cmd = "sudo monit procmatch '/usr/sbin/dhcrelay -d -m discard'"
Contributor

Rather than making a call to monit, I'd prefer if we use a Python library like psutil.

Contributor Author

I used the psutil library to check whether each of the dhcrelay processes is running. Please help me review.

    cmd_res = subprocess.check_output(cmd, shell=True)

    for vlan in vlans:
        found_process = re.findall(vlan, cmd_res)
        if len(found_process) == 0:
            syslog.syslog(syslog.LOG_ERR, "dhcrelay process with {} is not running.".format(vlan))


def main():
    check_dhcrelay_processes()

if __name__ == '__main__':
    main()
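
The review thread above asks for psutil instead of shelling out to `monit procmatch`. The following is a minimal sketch of that idea, not the code that was eventually committed. `missing_interfaces` and `running_cmdlines` are hypothetical names, and the sketch assumes the VLAN name appears as its own argument on the dhcrelay command line. Comparing whole arguments also avoids `Vlan10` being found inside `Vlan100`, which a plain `re.findall` over the command output would allow.

```python
def missing_interfaces(expected, cmdlines, executable="/usr/sbin/dhcrelay"):
    """Return the names in `expected` that no running `executable` process mentions.

    `cmdlines` is an iterable of argv lists, one per process.
    Assumption: each interface name is passed as a separate argument.
    """
    seen = set()
    for argv in cmdlines:
        # Only consider processes whose first argv entry is the relay binary.
        if argv and argv[0] == executable:
            seen.update(argv[1:])
    return [name for name in expected if name not in seen]


def running_cmdlines():
    """Collect the argv list of every visible process using psutil."""
    # Imported lazily so the matching logic above has no hard dependency on psutil.
    import psutil

    cmdlines = []
    for proc in psutil.process_iter(attrs=["cmdline"]):
        # `cmdline` can be None or empty when the process is inaccessible or gone.
        cmdlines.append(proc.info["cmdline"] or [])
    return cmdlines
```

With these helpers, the check could call `missing_interfaces(retrieve_vlans(), running_cmdlines())` and write one syslog error per returned name.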
7 changes: 7 additions & 0 deletions dockers/docker-teamd/base_image_files/monit_teamd
@@ -0,0 +1,7 @@
###############################################################################
## Monit configuration for teamd container
## process list:
## teamd
###############################################################################
check program monit_teamd with path "/usr/bin/monit_teamd_processes"
    if status != 0 then alert
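
This rule alerts only when the program exits with a non-zero status, whereas the script added in this change logs to syslog and never sets an exit status itself. Below is a minimal sketch of one way to connect the two, assuming the exit code is meant to carry the result. `report_missing` and its `log` parameter are hypothetical, and the default logger writes to stderr where the real script would call `syslog.syslog`.

```python
import sys


def report_missing(expected, running, log=None):
    """Log every expected name absent from `running` and return an exit status.

    Returns 0 when all expected names are running and 1 otherwise, so the
    value can be handed straight to sys.exit().
    """
    if log is None:
        # Stand-in for syslog.syslog(syslog.LOG_ERR, ...) in the real script.
        def log(message):
            sys.stderr.write(message + "\n")

    running = set(running)
    missing = [name for name in expected if name not in running]
    for name in missing:
        log("Teamd process with {} is not running.".format(name))
    return 1 if missing else 0
```

A `main()` built on this would end with `sys.exit(report_missing(expected_names, running_names))`, which gives Monit a status to test.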
48 changes: 48 additions & 0 deletions dockers/docker-teamd/monit_teamd_processes
@@ -0,0 +1,48 @@
#!/usr/bin/python
'''
This script is used to monitor teamd process in teamd docker container.
Since Monit can only monitor the process with unique name,
it is unable to do this monitoring for teamd processes. Usually there will be
multiple teamd processes which executes a same command but with different arguments.
The number of teamd processes is decided by the number of port channels in Config_DB.
As such, we let Monit to monitor this script which will read number of port channels,
then find whether there exist a process in Linux corresponding to a port channel.
If this script fails to find such process, it will write an alert message into syslog file.
'''

import os
import subprocess
import re
import sys
import syslog

from swsssdk import ConfigDBConnector


def retrieve_portchannels():
    port_channels = []

    config_db = ConfigDBConnector()
    config_db.connect()
    port_channel_table = config_db.get_table('PORTCHANNEL')

    for key in port_channel_table.keys():
        port_channels.append(key)

    return port_channels

def check_teamd_processes():
    port_channels = retrieve_portchannels()
    cmd = "sudo monit procmatch '/usr/bin/teamd -r -t '"
Contributor

Rather than making a call to monit, I'd prefer if we use a Python library like psutil.

Contributor Author

@yozhao101 yozhao101 Mar 13, 2020

Good suggestion! I will do that. I also found that the psutil library is not installed by default in the host image.

Contributor Author

I used the psutil library to check whether each of the teamd processes is running. Please help me review.

    cmd_res = subprocess.check_output(cmd, shell=True)

    for port_channel in port_channels:
        found_process = re.findall(port_channel, cmd_res)
        if len(found_process) == 0:
            syslog.syslog(syslog.LOG_ERR, "Teamd process with {} is not running.".format(port_channel))

def main():
    check_teamd_processes()

if __name__ == '__main__':
    main()
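
`re.findall(port_channel, cmd_res)` treats the port channel name as a regular expression and matches substrings, so `PortChannel1` would count as running whenever `PortChannel10` is. The following is a small sketch of a stricter test, assuming names in the `monit procmatch` output are delimited by whitespace. `mentions_interface` is a hypothetical helper, not part of this change.

```python
import re


def mentions_interface(name, procmatch_output):
    """Return True if `name` occurs as a whole whitespace-delimited token."""
    # re.escape keeps characters in the name from being read as regex syntax;
    # the lookarounds require whitespace or a string boundary on both sides.
    pattern = r"(?<!\S)" + re.escape(name) + r"(?!\S)"
    return re.search(pattern, procmatch_output) is not None
```

Swapping this in for the `re.findall` call keeps the rest of the loop unchanged: the error is logged when `mentions_interface(port_channel, cmd_res)` is false.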
2 changes: 2 additions & 0 deletions rules/docker-dhcp-relay.mk
@@ -26,3 +26,5 @@ $(DOCKER_DHCP_RELAY)_CONTAINER_NAME = dhcp_relay
$(DOCKER_DHCP_RELAY)_RUN_OPT += --privileged -t
$(DOCKER_DHCP_RELAY)_RUN_OPT += -v /etc/sonic:/etc/sonic:ro
$(DOCKER_DHCP_RELAY)_FILES += $(SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT)
$(DOCKER_DHCP_RELAY)_BASE_IMAGE_FILES += monit_dhcp_relay:/etc/monit/conf.d
$(DOCKER_DHCP_RELAY)_BASE_IMAGE_FILES += monit_dhcrelay_processes:/usr/bin/monit_dhcrelay_processes
2 changes: 2 additions & 0 deletions rules/docker-teamd.mk
@@ -29,4 +29,6 @@ $(DOCKER_TEAMD)_RUN_OPT += -v /etc/sonic:/etc/sonic:ro
$(DOCKER_TEAMD)_RUN_OPT += -v /host/warmboot:/var/warmboot

$(DOCKER_TEAMD)_BASE_IMAGE_FILES += teamdctl:/usr/bin/teamdctl
$(DOCKER_TEAMD)_BASE_IMAGE_FILES += monit_teamd:/etc/monit/conf.d
$(DOCKER_TEAMD)_BASE_IMAGE_FILES += monit_teamd_processes:/usr/bin/monit_teamd_processes
$(DOCKER_TEAMD)_FILES += $(SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT)