IdoPgsqlConnection: blocks Icinga2 forever on shutdown #9263

Open
@yhabteab

Describe the bug

When ido-pgsql is enabled without a working database, IdoPgsqlConnection sometimes blocks Icinga2 indefinitely on shutdown. The logs show that DbConnection tries to pause the connection, but it gets stuck somewhere in the reconnect handlers and is never properly paused, because DbConnection::Pause() keeps trying to reconnect in order to update the program status. I haven't tested it, but since the code bases are the same, I guess the MySQL connection behaves the same way.

[2022-02-28 08:26:32 +0100] information/Application: Received request to shut down.
[2022-02-28 08:26:32 +0100] information/Application: Shutting down...
[2022-02-28 08:26:32 +0100] information/CheckerComponent: 'checker' stopped.
[2022-02-28 08:26:32 +0100] information/NotificationComponent: 'notification' stopped.
[2022-02-28 08:26:32 +0100] information/DbConnection: Pausing IDO connection: ido-pgsql
[2022-02-28 08:26:32 +0100] critical/IdoPgsqlConnection: Connection to database 'icinga2' with user 'postgres' on '10.211.55.31:5432' failed: "connection to server at "10.211.55.31", port 5432 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?
"
Context:
        (0) Reconnecting to PostgreSQL IDO database 'ido-pgsql'

[2022-02-28 08:26:32 +0100] warning/IdoPgsqlConnection: Exception during database operation: Verify that your database is operational!

The following check would help here, provided it doesn't introduce any other side effects. I'm unsure about that, since the WorkQueue threads would then not be joined. But if no connection was established before, why should we update the program status at all and let the threads join normally?

if (!GetConnected()) {
	ConfigObject::Pause();
	return;
}
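
To make the effect of this guard concrete, here is a minimal, self-contained sketch. It does not use the real Icinga2 classes: `FakeDbConnection`, `UpdateProgramStatus()`, `BasePause()` and the in-memory log are hypothetical stand-ins (with `BasePause()` playing the role of `ConfigObject::Pause()`), and the assumption is that the final program status update is the only step that needs a live connection.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for icinga::DbConnection; NOT the real class.
class FakeDbConnection {
public:
    explicit FakeDbConnection(bool connected) : m_Connected(connected) {}

    bool GetConnected() const { return m_Connected; }
    bool IsPaused() const { return m_Paused; }
    const std::vector<std::string>& GetLog() const { return m_Log; }

    void Pause() {
        // Proposed guard: if no connection exists, skip the status update
        // (which would otherwise trigger reconnect attempts) and pause directly.
        if (!GetConnected()) {
            BasePause();
            return;
        }

        UpdateProgramStatus();
        BasePause();
    }

private:
    // Assumed equivalent of ConfigObject::Pause() in this sketch.
    void BasePause() {
        m_Paused = true;
        m_Log.push_back("paused");
    }

    // Assumed to be the step that requires a database connection.
    void UpdateProgramStatus() { m_Log.push_back("status updated"); }

    bool m_Connected;
    bool m_Paused = false;
    std::vector<std::string> m_Log;
};

// Exercises both paths: with and without an established connection.
void DemoPause() {
    FakeDbConnection offline(false);
    offline.Pause();
    assert(offline.IsPaused());
    assert(offline.GetLog() == std::vector<std::string>{"paused"});

    FakeDbConnection online(true);
    online.Pause();
    assert(online.IsPaused());
    assert((online.GetLog() == std::vector<std::string>{"status updated", "paused"}));
}
```

Calling `DemoPause()` from a `main()` runs both cases. In the disconnected case the object is paused without ever touching the database, which is the behaviour the check is meant to achieve. Whether skipping the WorkQueue join is safe in the real code is the open question raised above, and this sketch does not answer it.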

Additional context

However, if the threads were joined normally, the result would be pretty much the same, since all the QueryQueue handlers would still try to establish a new connection.

frame #0: 0x0000000121026947 dyld`dyld3::MachOLoaded::findClosestSymbol(unsigned long long, char const**, unsigned long long*) const + 563
frame #1: 0x000000012101515a dyld`dyld4::APIs::dladdr(void const*, dl_info*) + 206
frame #2: 0x000000010869447b icinga2`boost::stacktrace::frame::name(this=0x0000700001e75ed0) const at frame_unwind.ipp:94:26
frame #3: 0x00000001086941a2 icinga2`boost::stacktrace::detail::to_string_using_nothing::prepare_function_name(this=0x0000700001e75ff8, addr=0x00000001087a9898) at unwind_base_impls.hpp:23:46
frame #4: 0x0000000108693ead icinga2`boost::stacktrace::detail::to_string_impl_base<boost::stacktrace::detail::to_string_using_nothing>::operator(this=0x0000700001e75ff8, addr=0x00000001087a9898)(void const*) at frame_unwind.ipp:39:15
frame #5: 0x0000000108693a74 icinga2`boost::stacktrace::detail::to_string(frames=0x00007fae181858b0, size=16) at frame_unwind.ipp:76:16
frame #6: 0x00000001086931b1 icinga2`std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > boost::stacktrace::to_string<std::__1::allocator<boost::stacktrace::frame> >(bt=0x00007fae1817e4b8) at stacktrace.hpp:404:12
frame #7: 0x00000001086930e9 icinga2`std::__1::basic_ostream<char, std::__1::char_traits<char> >& boost::stacktrace::operator<<<char, std::__1::char_traits<char>, std::__1::allocator<boost::stacktrace::frame> >(os=0x0000700001e76448, bt=0x00007fae1817e4b8) at stacktrace.hpp:410:18
frame #8: 0x00000001086930a8 icinga2`icinga::operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, icinga::StackTraceFormatter const&)(os=0x0000700001e76448, f=0x0000700001e76288) at stacktrace.cpp:39:5
frame #9: 0x00000001084c608c icinga2`icinga::DiagnosticInformation(ex=0x00007fae18192970, verbose=true, stack=0x0000700001e76758, context=0x0000700001e76728) at exception.cpp:258:32
frame #10: 0x00000001084c68e0 icinga2`icinga::DiagnosticInformation(eptr=0x0000700001e76ae8, verbose=true) at exception.cpp:298:10
frame #11: 0x000000010961c53a icinga2`icinga::IdoPgsqlConnection::ExceptionHandler(this=0x00007fae1900e000, exp=exception_ptr @ 0x0000700001e76ae8) at idopgsqlconnection.cpp:126:49
frame #12: 0x0000000109655d04 icinga2`icinga::IdoPgsqlConnection::Resume(this=0x00007fae1900e488, exp=<unavailable>)::$_13::operator()(boost::exception_ptr) const at idopgsqlconnection.cpp:90:71
frame #13: 0x0000000109655c73 icinga2`decltype(__f=0x00007fae1900e488, __args=0x0000700001e76c68)::$_13&>(fp)(std::__1::forward<boost::exception_ptr>(fp0))) std::__1::__invoke<icinga::IdoPgsqlConnection::Resume()::$_13&, boost::exception_ptr>(icinga::IdoPgsqlConnection::Resume()::$_13&, boost::exception_ptr&&) at type_traits:3694:1
frame #14: 0x0000000109655c02 icinga2`void std::__1::__invoke_void_return_wrapper<void, true>::__call<icinga::IdoPgsqlConnection::Resume(__args=0x00007fae1900e488, __args=0x0000700001e76c68)::$_13&, boost::exception_ptr>(icinga::IdoPgsqlConnection::Resume()::$_13&, boost::exception_ptr&&) at __functional_base:348:9
frame #15: 0x0000000109655bb2 icinga2`std::__1::__function::__alloc_func<icinga::IdoPgsqlConnection::Resume()::$_13, std::__1::allocator<icinga::IdoPgsqlConnection::Resume()::$_13>, void (boost::exception_ptr)>::operator(this=0x00007fae1900e488, __arg=0x0000700001e76c68)(boost::exception_ptr&&) at functional:1558:16
frame #16: 0x00000001096548c1 icinga2`std::__1::__function::__func<icinga::IdoPgsqlConnection::Resume()::$_13, std::__1::allocator<icinga::IdoPgsqlConnection::Resume()::$_13>, void (boost::exception_ptr)>::operator(this=0x00007fae1900e480, __arg=0x0000700001e76c68)(boost::exception_ptr&&) at functional:1732:12
frame #17: 0x00000001087af79a icinga2`std::__1::__function::__value_func<void (boost::exception_ptr)>::operator(this=0x00007fae1900e480, __args=0x0000700001e76c68)(boost::exception_ptr&&) const at functional:1885:16
frame #18: 0x0000000108788210 icinga2`std::__1::function<void (boost::exception_ptr)>::operator(this= Lambda in File idopgsqlconnection.cpp at Line 90, __arg=<unavailable>)(boost::exception_ptr) const at functional:2560:12
frame #19: 0x0000000108787f90 icinga2`icinga::WorkQueue::RunTaskFunction(this=0x00007fae1900e180, func= Lambda in File idopgsqlconnection.cpp at Line 183)> const&) at workqueue.cpp:248:4
frame #20: 0x0000000108788554 icinga2`icinga::WorkQueue::WorkerThreadProc(this=0x00007fae1900e180) at workqueue.cpp:279:3
frame #21: 0x00000001087a9898 icinga2`icinga::WorkQueue::EnqueueUnlocked(this=0x00007fae186f7190)>&&, icinga::WorkQueuePriority)::$_1::operator()() const at workqueue.cpp:63:39
frame #22: 0x00000001087a918c icinga2`boost::detail::thread_data<icinga::WorkQueue::EnqueueUnlocked(std::__1::unique_lock<std::__1::mutex>&, std::__1::function<void ()>&&, icinga::WorkQueuePriority)::$_1>::run(this=0x00007fae186f7040) at thread.hpp:120:17
frame #23: 0x0000000113a8ef51 libboost_thread-mt.dylib`boost::(anonymous namespace)::thread_proxy(void*) + 129
frame #24: 0x00007ff80de954f4 libsystem_pthread.dylib`_pthread_start + 125
frame #25: 0x00007ff80de9100f libsystem_pthread.dylib`thread_start + 15

Labels: area/db-ido (Database output), bug (Something isn't working)
