Skip to content

Icinga has terminated unexpectatedly after API interaction #8587

Closed
@K0nne

Description

@K0nne

Describe the bug

Hi!

Today I was wondering about a strange "backend unavailable" flapping in icingaweb2 in our qs stage.
Upon investigating I found out, that our secondary master had died:

/var/log/icinga2.log

[2021-01-05 10:53:12 +0100] information/ConfigObjectUtility: Created and activated object 'HOSTNAME!ab06acac-fad3-48d7-bd9c-8fd1991bd53c' of type 'Downtime'.
[2021-01-05 10:53:12 +0100] information/Downtime: Triggering downtime 'HOSTNAME!CPU!06c488b0-2b95-40cf-96c5-3925a01cfd7b' for checkable 'HOSTNAME!CPU'.
[2021-01-05 10:53:12 +0100] information/ConfigObjectUtility: Created and activated object 'HOSTNAME!CPU!06c488b0-2b95-40cf-96c5-3925a01cfd7b' of type 'Downtime'.
[2021-01-05 10:53:12 +0100] information/Downtime: Triggering downtime 'HOSTNAME!Interface Usage!2cb19350-92b5-4fec-9f15-019bb8b228cd' for checkable 'HOSTNAME!Interface Usage'.
[2021-01-05 10:53:12 +0100] information/ConfigObjectUtility: Created and activated object 'HOSTNAME!Interface Usage!2cb19350-92b5-4fec-9f15-019bb8b228cd' of type 'Downtime'.
[2021-01-05 10:53:12 +0100] information/ConfigObjectUtility: Created and activated object 'HOSTNAME!ping4!3eded8b1-a01c-46ef-a3d2-3783b862e8aa' of type 'Downtime'.
[2021-01-05 10:53:13 +0100] information/Checkable: Checkable 'HOSTNAME' has 2 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2021-01-05 10:53:13 +0100] information/Notification: Sending 'DowntimeStart' notification 'HOSTNAME!DOWNTIMENAME' for user 'ICINGAUSER'
[2021-01-05 10:53:13 +0100] information/Notification: Completed sending 'DowntimeStart' notification 'HOSTNAME!DOWNTIMENAME' for checkable 'HOSTNAME' and user 'ICINGAUSER' using command 'DATEVitm-notification'.
[2021-01-05 10:53:17 +0100] information/Downtime: Triggering downtime 'HOSTNAME!ab06acac-fad3-48d7-bd9c-8fd1991bd53c' for checkable 'HOSTNAME'.
[2021-01-05 10:53:17 +0100] information/Checkable: Checkable 'HOSTNAME!ping4' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2021-01-05 10:53:17 +0100] information/Downtime: Triggering downtime 'HOSTNAME!ping4!3eded8b1-a01c-46ef-a3d2-3783b862e8aa' for checkable 'HOSTNAME!ping4'.
[2021-01-05 10:53:38 +0100] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'

[2021-01-05 10:57:22 +0100] information/Checkable: Checkable 'HOSTNAME!CPU' has 1 notification(s). Checking filters for type 'DowntimeEnd', sends will be logged.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!CPU!06c488b0-2b95-40cf-96c5-3925a01cfd7b' of type 'Downtime'.
[2021-01-05 10:57:22 +0100] information/Checkable: Checkable 'HOSTNAME!Interface Usage' has 1 notification(s). Checking filters for type 'DowntimeEnd', sends will be logged.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!Interface Usage!2cb19350-92b5-4fec-9f15-019bb8b228cd' of type 'Downtime'.
[2021-01-05 10:57:22 +0100] information/Checkable: Checkable 'HOSTNAME!ping4' has 1 notification(s). Checking filters for type 'DowntimeEnd', sends will be logged.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!ping4!3eded8b1-a01c-46ef-a3d2-3783b862e8aa' of type 'Downtime'.
[2021-01-05 10:57:22 +0100] information/Checkable: Checkable 'HOSTNAME' has 2 notification(s). Checking filters for type 'DowntimeEnd', sends will be logged.
[2021-01-05 10:57:22 +0100] information/Notification: Sending 'DowntimeEnd' notification 'HOSTNAME!DOWNTIMENAME' for user 'ICINGAUSER'
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!ab06acac-fad3-48d7-bd9c-8fd1991bd53c' of type 'Downtime'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!Interface Usage!DATEVitm-netzwerkservice' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!CPU!DATEVitm-netzwerkservice' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!DOWNTIMENAME' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!NOTIFICATIONNAME-HOST' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!CPU' of type 'Service'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!Interface Usage' of type 'Service'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!ping4!NOTIFICATIONNAME-SERVICE' of type 'Notification'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME!ping4' of type 'Service'.
[2021-01-05 10:57:22 +0100] information/ConfigObjectUtility: Deleted object 'HOSTNAME' of type 'Host'.
[2021-01-05 10:57:22 +0100] critical/Application: Icinga 2 has terminated unexpectedly. Additional information can be found in '/var/log/icinga2/crash/report.1609840642.827869'

[2021-01-05 10:59:17 +0100] information/ApiListener: New client connection for identity 'MASTER01' from [x.x.x.x]:44222
[2021-01-05 10:59:27 +0100] information/ApiListener: New client connection for identity 'MASTER01' from [x.x.x.x]:44266
[2021-01-05 10:59:37 +0100] information/ApiListener: New client connection for identity 'MASTER01' from [x.x.x.x]:44294
[2021-01-05 10:59:47 +0100] information/ApiListener: New client connection for identity 'MASTER01' from [x.x.x.x]:44306

<...>
The same logfiles repeat every 10 seconds until manual icinga2 restart this morning. It seems like the 2nd master was not able to receive any data after this.

/var/log/icinga2/crash/report.1609840642.827869

  Application version: 2.11.7-1

System information:
  Platform: Red Hat Enterprise Linux Server
  Platform version: 7.9 (Maipo)
  Kernel: Linux
  Kernel version: 3.10.0-1160.6.1.el7.x86_64
  Architecture: x86_64

Build information:
  Compiler: GNU 4.8.5
  Build host: runner-hh8q3bz2-project-322-concurrent-0

Application information:

General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /run/icinga2

Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid
Stacktrace:

        (0) libc.so.6: gsignal (+0x37) [0x7ff5bdfaa387]
        (1) libc.so.6: abort (+0x148) [0x7ff5bdfaba78]
        (2) libc.so.6: <unknown function> (+0x78ed7) [0x7ff5bdfeced7]
        (3) libc.so.6: <unknown function> (+0x7f3e4) [0x7ff5bdff33e4]
        (4) libc.so.6: <unknown function> (+0x814db) [0x7ff5bdff54db]
        (5) icinga2: icinga::Notification::ExecuteNotificationHelper(icinga::NotificationType, boost::intrusive_ptr<icinga::User> const&, boost::intrusive_ptr<icinga::CheckResult> const&, bool, icinga::String const&, icinga::String const&) (+0x1df) [0xa31e7f]
        (6) icinga2: std::_Function_handler<void (), std::_Bind<std::_Mem_fn<void (icinga::Notification::*)(icinga::NotificationType, boost::intrusive_ptr<icinga::User> const&, boost::intrusive_ptr<icinga::CheckResult> const&, bool, icinga::String const&, icinga::String const&)> (icinga::Notification*, icinga::NotificationType, boost::intrusive_ptr<icinga::User>, boost::intrusive_ptr<icinga::CheckResult>, bool, icinga::String, icinga::String)> >::_M_invoke(std::_Any_data const&) (+0x42) [0xa2bf22]
        (7) /usr/lib64/icinga2/sbin/icinga2() [0x907b33]
        (8) icinga2: boost::asio::detail::executor_op<boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::function<void ()> >(std::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>, std::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, std::allocator<void>*, boost::system::error_code const&, unsigned long) (+0x106) [0xbfed66]
        (9) /usr/lib64/icinga2/sbin/icinga2() [0x6313e1]
        (10) /usr/lib64/icinga2/sbin/icinga2() [0x631732]
        (11) icinga2: boost_asio_detail_posix_thread_function (+0xf) [0x823d5f]
        (12) libpthread.so.0: <unknown function> (+0x7ea5) [0x7ff5be349ea5]
        (13) libc.so.6: clone (+0x6d) [0x7ff5be07296d]

***
* This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
* please submit a bug report at https://github.com/Icinga/icinga2 and include this stack trace as well as any other
* information that might be useful in order to reproduce this problem.
***

/var/log/icinga2/error.log

*** Error in `/usr/lib64/icinga2/sbin/icinga2': corrupted size vs. prev_size: 0x00007ff4f404e630 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7f3e4)[0x7ff5bdff33e4]
/lib64/libc.so.6(+0x814db)[0x7ff5bdff54db]
/usr/lib64/icinga2/sbin/icinga2(_ZN6icinga12Notification25ExecuteNotificationHelperENS_16NotificationTypeERKN5boost13intrusive_ptrINS_4UserEEERKNS3_INS_11CheckResultEEEbRKNS_6StringESE_+0x1df)[0xa31e7f]
/usr/lib64/icinga2/sbin/icinga2(_ZNSt17_Function_handlerIFvvESt5_BindIFSt7_Mem_fnIMN6icinga12NotificationEFvNS3_16NotificationTypeERKN5boost13intrusive_ptrINS3_4UserEEERKNS7_INS3_11CheckResultEEEbRKNS3_6StringESI_EEPS4_S5_S9_SD_bSG_SG_EE
E9_M_invokeERKSt9_Any_data+0x42)[0xa2bf22]
/usr/lib64/icinga2/sbin/icinga2[0x907b33]
/usr/lib64/icinga2/sbin/icinga2(_ZN5boost4asio6detail11executor_opINS1_15work_dispatcherIZN6icinga10ThreadPool4PostISt8functionIFvvEEEEbT_NS4_15SchedulerPolicyEEUlvE_EESaIvENS1_19scheduler_operationEE11do_completeEPvPSF_RKNS_6system10err
or_codeEm+0x106)[0xbfed66]
/usr/lib64/icinga2/sbin/icinga2[0x6313e1]
/usr/lib64/icinga2/sbin/icinga2[0x631732]
/usr/lib64/icinga2/sbin/icinga2(boost_asio_detail_posix_thread_function+0xf)[0x823d5f]
/lib64/libpthread.so.0(+0x7ea5)[0x7ff5be349ea5]
/lib64/libc.so.6(clone+0x6d)[0x7ff5be07296d]
======= Memory map: ========
<...>
Caught SIGABRT.
Current time: 2021-01-05 10:57:22 +0100

To Reproduce

A coleague of mine created and deleted hosts (config objects) via API.

Expected behavior

Icinga should survive this.

Your Environment

Include as many relevant details about the environment you experienced the problem in

  • Version used (icinga2 --version):
    2.11.7
  • Operating System and version:
    RHEL 7.9
  • Enabled features (icinga2 feature list):
Disabled features: command compatlog debuglog elasticsearch gelf influxdb livestatus opentsdb statusdata syslog
Enabled features: api checker graphite ido-mysql mainlog notification perfdata
  • Icinga Web 2 version and modules (System - About):
Icinga Web 2 Version
    2.7.4
Git Commit
    3ebfb5c38330d22966d24d2ace57d7eb3fb2a6db
PHP Version
    7.1.8
Git Commit Datum
    2020-08-17
Copyright
    © 2013-2021 Das Icinga Projekt 
  • Config validation (icinga2 daemon -C):
[2021-01-07 08:31:46 +0100] information/cli: Icinga application loader (version: 2.11.7-1)
[2021-01-07 08:31:46 +0100] information/cli: Loading configuration file(s).
[2021-01-07 08:31:46 +0100] information/ConfigItem: Committing config item(s).
[2021-01-07 08:31:46 +0100] information/ApiListener: My API identity: MASTER02
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 2 ScheduledDowntimes.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 37 HostGroups.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 NotificationCommand.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 292 Notifications.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 32 Hosts.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 13 Downtimes.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 GraphiteWriter.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 PerfdataWriter.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 9 Zones.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 9 Endpoints.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 5 ApiUsers.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 242 CheckCommands.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 3 TimePeriods.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 1 User.
[2021-01-07 08:31:46 +0100] information/ConfigItem: Instantiated 248 Services.
[2021-01-07 08:31:46 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2021-01-07 08:31:46 +0100] information/cli: Finished validating the configuration file(s).

Additional context

I think this is related: the same coleague, the same tests :) #8583
This one here had a different behaviour, so I made a seperate issue.

Metadata

Metadata

Assignees

Labels

area/distributedDistributed monitoring (master, satellites, clients)area/notificationsNotification eventscore/crashShouldn't happen, requires attention

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions