Closed
Description
Describe the bug
Hi!
Today I was wondering about a strange "backend unavailable" flapping in icingaweb2 in our qs stage.
Upon investigating I found out, that our secondary master had died:
/var/log/icinga2.log
[2021-01-05 10:53:12 +0100] information/ConfigObjectUtility: Created and activated object 'HOSTNAME!ab06acac-fad3-48d7-bd9c-8fd1991bd53c' of type 'Downtime'.
[2021-01-05 10:53:12 +0100] information/Downtime: Triggering downtime 'HOSTNAME!CPU!06c488b0-2b95-40cf-96c5-3925a01cfd7b' for checkable 'HOSTNAME!CPU'.
[2021-01-05 10:53:12 +0100] information/ConfigObjectUtility: Created and activated object 'HOSTNAME!CPU!06c488b0-2b95-40cf-96c5-3925a01cfd7b' of type 'Downtime'.
[2021-01-05 10:53:12 +0100] information/Downtime: Triggering downtime 'HOSTNAME!Interface Usage!2cb19350-92b5-4fec-9f15-019bb8b228cd' for checkable 'HOSTNAME!Interface Usage'.
[2021-01-05 10:53:12 +0100] information/ConfigObjectUtility: Created and activated object 'HOSTNAME!Interface Usage!2cb19350-92b5-4fec-9f15-019bb8b228cd' of type 'Downtime'.
[2021-01-05 10:53:12 +0100] information/ConfigObjectUtility: Created and activated object 'HOSTNAME!ping4!3eded8b1-a01c-46ef-a3d2-3783b862e8aa' of type 'Downtime'.
[2021-01-05 10:53:13 +0100] information/Checkable: Checkable 'HOSTNAME' has 2 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2021-01-05 10:53:13 +0100] information/Notification: Sending 'DowntimeStart' notification 'HOSTNAME!DOWNTIMENAME' for user 'ICINGAUSER'
[2021-01-05 10:53:13 +0100] information/Notification: Completed sending 'DowntimeStart' notification 'HOSTNAME!DOWNTIMENAME' for checkable 'HOSTNAME' and user 'ICINGAUSER' using command 'DATEVitm-notification'.
[2021-01-05 10:53:17 +0100] information/Downtime: Triggering downtime 'HOSTNAME!ab06acac-fad3-48d7-bd9c-8fd1991bd53c' for checkable 'HOSTNAME'.
[2021-01-05 10:53:17 +0100] information/Checkable: Checkable 'HOSTNAME!ping4' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2021-01-05 10:53:17 +0100] information/Downtime: Triggering downtime 'HOSTNAME!ping4!3eded8b1-a01c-46ef-a3d2-3783b862e8aa' for checkable 'HOSTNAME!ping4'.
[2021-01-05 10:53:38 +0100] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'
[2021-01-05 10:57:22 +0100] information/Checkable: Checkable 'HOSTNAME!CPU' has 1 notification(s). Checking filters for type 'DowntimeEnd', sends will be logged.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!CPU!06c488b0-2b95-40cf-96c5-3925a01cfd7b' of type 'Downtime'.
[2021-01-05 10:57:22 +0100] information/Checkable: Checkable 'HOSTNAME!Interface Usage' has 1 notification(s). Checking filters for type 'DowntimeEnd', sends will be logged.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!Interface Usage!2cb19350-92b5-4fec-9f15-019bb8b228cd' of type 'Downtime'.
[2021-01-05 10:57:22 +0100] information/Checkable: Checkable 'HOSTNAME!ping4' has 1 notification(s). Checking filters for type 'DowntimeEnd', sends will be logged.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!ping4!3eded8b1-a01c-46ef-a3d2-3783b862e8aa' of type 'Downtime'.
[2021-01-05 10:57:22 +0100] information/Checkable: Checkable 'HOSTNAME' has 2 notification(s). Checking filters for type 'DowntimeEnd', sends will be logged.
[2021-01-05 10:57:22 +0100] information/Notification: Sending 'DowntimeEnd' notification 'HOSTNAME!DOWNTIMENAME' for user 'ICINGAUSER'
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!ab06acac-fad3-48d7-bd9c-8fd1991bd53c' of type 'Downtime'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!Interface Usage!DATEVitm-netzwerkservice' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!CPU!DATEVitm-netzwerkservice' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!DOWNTIMENAME' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!NOTIFICATIONNAME-HOST' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!CPU' of type 'Service'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!Interface Usage' of type 'Service'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!ping4!NOTIFICATIONNAME-SERVICE' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!ping4' of type 'Service'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME' of type 'Host'.
[2021-01-05 10:57:22 +0100] critical/Application: Icinga 2 has terminated unexpectedly. Additional information can be found in '/var/log/icinga2/crash/report.1609840642.827869'
[2021-01-05 10:59:17 +0100] information/ApiListener: New client connection for identity 'MASTER01' from [x.x.x.x]:44222
[2021-01-05 10:59:27 +0100] information/ApiListener: New client connection for identity 'MASTER01' from [x.x.x.x]:44266
[2021-01-05 10:59:37 +0100] information/ApiListener: New client connection for identity 'MASTER01' from [x.x.x.x]:44294
[2021-01-05 10:59:47 +0100] information/ApiListener: New client connection for identity 'MASTER01' from [x.x.x.x]:44306
<...>
The same logfiles repeat every 10 seconds until manual icinga2 restart this morning. It seems like the 2nd master was not able to receive any data after this.
/var/log/icinga2/crash/report.1609840642.827869
Application version: 2.11.7-1
System information:
Platform: Red Hat Enterprise Linux Server
Platform version: 7.9 (Maipo)
Kernel: Linux
Kernel version: 3.10.0-1160.6.1.el7.x86_64
Architecture: x86_64
Build information:
Compiler: GNU 4.8.5
Build host: runner-hh8q3bz2-project-322-concurrent-0
Application information:
General paths:
Config directory: /etc/icinga2
Data directory: /var/lib/icinga2
Log directory: /var/log/icinga2
Cache directory: /var/cache/icinga2
Spool directory: /var/spool/icinga2
Run directory: /run/icinga2
Old paths (deprecated):
Installation root: /usr
Sysconf directory: /etc
Run directory (base): /run
Local state directory: /var
Internal paths:
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
Stacktrace:
(0) libc.so.6: gsignal (+0x37) [0x7ff5bdfaa387]
(1) libc.so.6: abort (+0x148) [0x7ff5bdfaba78]
(2) libc.so.6: <unknown function> (+0x78ed7) [0x7ff5bdfeced7]
(3) libc.so.6: <unknown function> (+0x7f3e4) [0x7ff5bdff33e4]
(4) libc.so.6: <unknown function> (+0x814db) [0x7ff5bdff54db]
(5) icinga2: icinga::Notification::ExecuteNotificationHelper(icinga::NotificationType, boost::intrusive_ptr<icinga::User> const&, boost::intrusive_ptr<icinga::CheckResult> const&, bool, icinga::String const&, icinga::String const&) (+0x1df) [0xa31e7f]
(6) icinga2: std::_Function_handler<void (), std::_Bind<std::_Mem_fn<void (icinga::Notification::*)(icinga::NotificationType, boost::intrusive_ptr<icinga::User> const&, boost::intrusive_ptr<icinga::CheckResult> const&, bool, icinga::String const&, icinga::String const&)> (icinga::Notification*, icinga::NotificationType, boost::intrusive_ptr<icinga::User>, boost::intrusive_ptr<icinga::CheckResult>, bool, icinga::String, icinga::String)> >::_M_invoke(std::_Any_data const&) (+0x42) [0xa2bf22]
(7) /usr/lib64/icinga2/sbin/icinga2() [0x907b33]
(8) icinga2: boost::asio::detail::executor_op<boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::function<void ()> >(std::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>, std::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, std::allocator<void>*, boost::system::error_code const&, unsigned long) (+0x106) [0xbfed66]
(9) /usr/lib64/icinga2/sbin/icinga2() [0x6313e1]
(10) /usr/lib64/icinga2/sbin/icinga2() [0x631732]
(11) icinga2: boost_asio_detail_posix_thread_function (+0xf) [0x823d5f]
(12) libpthread.so.0: <unknown function> (+0x7ea5) [0x7ff5be349ea5]
(13) libc.so.6: clone (+0x6d) [0x7ff5be07296d]
***
* This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
* please submit a bug report at https://github.com/Icinga/icinga2 and include this stack trace as well as any other
* information that might be useful in order to reproduce this problem.
***
/var/log/icinga2/error.log
*** Error in `/usr/lib64/icinga2/sbin/icinga2': corrupted size vs. prev_size: 0x00007ff4f404e630 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7f3e4)[0x7ff5bdff33e4]
/lib64/libc.so.6(+0x814db)[0x7ff5bdff54db]
/usr/lib64/icinga2/sbin/icinga2(_ZN6icinga12Notification25ExecuteNotificationHelperENS_16NotificationTypeERKN5boost13intrusive_ptrINS_4UserEEERKNS3_INS_11CheckResultEEEbRKNS_6StringESE_+0x1df)[0xa31e7f]
/usr/lib64/icinga2/sbin/icinga2(_ZNSt17_Function_handlerIFvvESt5_BindIFSt7_Mem_fnIMN6icinga12NotificationEFvNS3_16NotificationTypeERKN5boost13intrusive_ptrINS3_4UserEEERKNS7_INS3_11CheckResultEEEbRKNS3_6StringESI_EEPS4_S5_S9_SD_bSG_SG_EE
E9_M_invokeERKSt9_Any_data+0x42)[0xa2bf22]
/usr/lib64/icinga2/sbin/icinga2[0x907b33]
/usr/lib64/icinga2/sbin/icinga2(_ZN5boost4asio6detail11executor_opINS1_15work_dispatcherIZN6icinga10ThreadPool4PostISt8functionIFvvEEEEbT_NS4_15SchedulerPolicyEEUlvE_EESaIvENS1_19scheduler_operationEE11do_completeEPvPSF_RKNS_6system10err
or_codeEm+0x106)[0xbfed66]
/usr/lib64/icinga2/sbin/icinga2[0x6313e1]
/usr/lib64/icinga2/sbin/icinga2[0x631732]
/usr/lib64/icinga2/sbin/icinga2(boost_asio_detail_posix_thread_function+0xf)[0x823d5f]
/lib64/libpthread.so.0(+0x7ea5)[0x7ff5be349ea5]
/lib64/libc.so.6(clone+0x6d)[0x7ff5be07296d]
======= Memory map: ========
<...>
Caught SIGABRT.
Current time: 2021-01-05 10:57:22 +0100
To Reproduce
A coleague of mine created and deleted hosts (config objects) via API.
Expected behavior
Icinga should survive this.
Your Environment
Include as many relevant details about the environment you experienced the problem in
- Version used (
icinga2 --version
):
2.11.7
- Operating System and version:
RHEL 7.9
- Enabled features (
icinga2 feature list
):
Disabled features: command compatlog debuglog elasticsearch gelf influxdb livestatus opentsdb statusdata syslog
Enabled features: api checker graphite ido-mysql mainlog notification perfdata
- Icinga Web 2 version and modules (System - About):
Icinga Web 2 Version
2.7.4
Git Commit
3ebfb5c38330d22966d24d2ace57d7eb3fb2a6db
PHP Version
7.1.8
Git Commit Datum
2020-08-17
Copyright
© 2013-2021 Das Icinga Projekt
- Config validation (
icinga2 daemon -C
):
[2021-01-07 08:31:46 +0100] information/cli: Icinga application loader (version: 2.11.7-1)
[2021-01-07 08:31:46 +0100] information/cli: Loading configuration file(s).
[2021-01-07 08:31:46 +0100] information/ConfigItem: Committing config item(s).
[2021-01-07 08:31:46 +0100] information/ApiListener: My API identity: MASTER02
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 2 ScheduledDowntimes.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 37 HostGroups.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 NotificationCommand.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 292 Notifications.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 32 Hosts.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 13 Downtimes.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 GraphiteWriter.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 PerfdataWriter.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 9 Zones.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 9 Endpoints.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 5 ApiUsers.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 242 CheckCommands.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 3 TimePeriods.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 User.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 248 Services.
[2021-01-07 08:31:46 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2021-01-07 08:31:46 +0100] information/cli: Finished validating the configuration file(s).
Additional context
I think this is related: the same coleague, the same tests :) #8583
This one here had a different behaviour, so I made a seperate issue.