@anmolsachan
Member

Signed-off-by: anmolsachan <asachan@redhat.com>

@anmolsachan anmolsachan force-pushed the sosreport_integration branch from 694159b to 3aec857 Compare April 4, 2017 11:26
@anmolsachan
Member Author

@sankarshanmukhopadhyay @r0h4n Please review.

*** Status of tendrl-node-agent.socket service
*** SELinux configurations
*** Firewall status and configurations
*** Package requirements
Contributor

We should figure out some way to get logs out of etcd.

Member Author

One way to do this is to ship a small script with Tendrl (or as part of the plugin itself) that sosreport can run to capture those logs.

Contributor

You can start by testing the existing etcd plugin for sosreport https://github.com/sosreport/sos/blob/master/sos/plugins/etcd.py

I can see that the plugin doesn't collect the actual data store, but I guess we can add that part in the Tendrl-specific sosreport plugin.

Member Author

Which directories are we looking to get out of etcd? @r0h4n

Member Author

@anmolsachan anmolsachan May 9, 2017

We can use a script like this, call it from the plugin if it exists, and store the dump:

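import json
import os

import etcd
from ruamel import yaml


# Tendrl node-agent configuration, which holds the etcd connection address.
config = "/etc/tendrl/node-agent/node-agent.conf.yaml"
# File the dump is written to, relative to the current working directory.
dump_location = "etcd_dump"


def get_etcd_ip(fil):
    """Return the etcd address from the config file, or None if unavailable."""
    if not os.path.isfile(fil):
        return None
    with open(fil, 'r') as confyml:
        # Use the safe loader; the config holds only plain YAML data.
        cfg = yaml.YAML(typ="safe").load(confyml) or {}
    return cfg.get("etcd_connection", None)


etcd_ip = get_etcd_ip(config)
if etcd_ip:
    keys = ["/queue", "/clusters"]
    data = {}
    # One client is enough for all reads.
    client = etcd.Client(host=etcd_ip, port=2379)
    for key in keys:
        try:
            # Read the whole subtree under each key.
            data[key] = client.read(key, recursive=True).__dict__
        except etcd.EtcdKeyNotFound:
            print("key %s not found" % key)
    with open(dump_location, "w") as dump:
        dump.write(json.dumps(data))
else:
    with open(dump_location, "w") as dump:
        dump.write("Etcd Ip not found.")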
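A rough sketch of how a Tendrl plugin might call the dump script above only when it is present. The script path `DUMP_SCRIPT`, the plugin name, and the helper `dump_commands` are hypothetical; `add_copy_spec` and `add_cmd_output` are the standard sos plugin calls, and the fallback stand-in classes exist only so the sketch runs where sos is not installed.

```python
import os

try:
    # sos 4.x layout
    from sos.report.plugins import Plugin, IndependentPlugin
except ImportError:
    try:
        # sos 3.x layout
        from sos.plugins import Plugin, IndependentPlugin
    except ImportError:
        # Stand-ins so the sketch still runs without sos installed.
        class Plugin(object):
            def add_copy_spec(self, copyspecs):
                print("would copy: %s" % copyspecs)

            def add_cmd_output(self, cmds):
                print("would run: %s" % cmds)

        class IndependentPlugin(object):
            pass

# Hypothetical install location of the dump script (an assumption).
DUMP_SCRIPT = "/usr/libexec/tendrl/etcd_dump.py"


def dump_commands(script_path):
    """Return the commands to run: the dump script only if it exists."""
    if os.path.isfile(script_path):
        return ["python %s" % script_path]
    return []


class TendrlEtcdDump(Plugin, IndependentPlugin):
    """Collect node-agent configuration and, if available, an etcd dump."""

    plugin_name = "tendrl_etcd_dump"  # hypothetical plugin name

    def setup(self):
        # Configuration directory used by the script above.
        self.add_copy_spec("/etc/tendrl/node-agent/")
        for cmd in dump_commands(DUMP_SCRIPT):
            self.add_cmd_output(cmd)
```

Note that `add_cmd_output` captures what a command prints, so the script would probably need to print the dump rather than write it to `etcd_dump`, or the plugin would need to copy that file afterwards.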


** Tendrl-gluster-integration
*** Rpm versions of commons, node-agent and gluster-integration
*** Tendrl-node-agent service status
*** Glusterd service status
Contributor

Gluster has its own plugins for sosreport, so we do not need to worry about that. Simply ensure that the gluster/ceph-specific sosreport plugins are invoked.

Member Author

@anmolsachan anmolsachan Apr 11, 2017

Will change this.

Member Author

@anmolsachan anmolsachan Apr 20, 2017

One concern that came to mind: running the gluster and ceph plugins might increase the size of the generated sos reports. That extra data would be wasted in cases where gluster or ceph is working fine and the problem is with Tendrl services or configs. I don't know whether the size of the sos report is a concern. @r0h4n

Contributor

Size of sos report is not a problem
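One way to make sure the gluster and ceph plugins are invoked is to name them explicitly on the sosreport command line. The sketch below only builds the argument list; the helper name `sosreport_command` is hypothetical, and the plugin names `gluster` and `ceph` are assumed to match the installed sos version. As far as I know these plugins normally enable themselves when the relevant packages are installed, so `--enable-plugins` is a belt-and-braces measure.

```python
def sosreport_command(extra_plugins=("gluster", "ceph"), batch=True):
    """Build a sosreport argument list that force-enables extra plugins."""
    cmd = ["sosreport"]
    if batch:
        # Run without interactive prompts.
        cmd.append("--batch")
    if extra_plugins:
        # Enable these plugins in addition to the ones sos selects itself.
        cmd += ["--enable-plugins", ",".join(extra_plugins)]
    return cmd


print(" ".join(sosreport_command()))
# prints: sosreport --batch --enable-plugins gluster,ceph
```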

*** Configurations in /etc/tendrl/node-node/
*** Status of tendrl-node-agent.socket service
*** SELinux configurations
*** Firewall status and configurations
Member Author

*** If tendrl-tendrl-epel-7.repo is enabled
*** Configurations in /etc/tendrl/node-node/
*** Status of tendrl-node-agent.socket service
*** SELinux configurations
Member Author

*** Installed ruby version
*** Package requirements
*** Gem dependencies
*** Apache httpd process status and configurations
Member Author

@anmolsachan anmolsachan force-pushed the sosreport_integration branch from 45580b2 to cdaa1a8 Compare April 20, 2017 20:51
@anmolsachan anmolsachan force-pushed the sosreport_integration branch from cdaa1a8 to 40dec5c Compare April 20, 2017 20:54

** There is a multi-node failure. Will it be feasible to let the admin run SOS Report on all of the failed nodes?

* Policies in SOS Report decide how it behaves on a particular distribution. It has to be decided which distributions the policies will be written for.
Member Author

Need suggestions on this.

Contributor

If N nodes fail, the admin has to run sosreport on all of those nodes.
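For the multi-node case, running sosreport on every failed node could be scripted along these lines. This is only a sketch: the helper `per_node_commands`, the hostnames, and the use of root over ssh are all assumptions, and the function just builds the commands rather than executing them.

```python
def per_node_commands(nodes, user="root"):
    """Build one ssh invocation per failed node that runs sosreport unattended."""
    return [
        ["ssh", "%s@%s" % (user, node), "sosreport", "--batch"]
        for node in nodes
    ]


# Hypothetical hostnames, for illustration only.
failed_nodes = ["node1.example.com", "node2.example.com"]
for cmd in per_node_commands(failed_nodes):
    print(" ".join(cmd))
```

Each resulting list can be handed to `subprocess` as-is if the admin wants to run the commands from one machine.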

=== Alternatives

* Rather than creating different plugins for different tendrl services, a
single plugin can also be taken into consideration.
Member Author

Need review on this.

Contributor

Please put up a question on the sosreport repo about multi plugin or single plugin approach
https://github.com/sosreport/sos


@r0h4n
Contributor

r0h4n commented Apr 28, 2017

Please send a PR (https://github.com/sosreport/sos) for the Tendrl components which don't have pending review comments.

@mbukatov
Contributor

fyi @Tendrl/qe

@r0h4n
Contributor

r0h4n commented Aug 8, 2017

cc @mbukatov
