Skip to content

Conversation

zhangyue1818
Copy link

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @zhangyue1818 welcome!🎊 Thanks for taking the effort to make our project better! 🙌 Keep making such awesome contributions!

xuebinsu and others added 7 commits September 23, 2025 13:19
This patch aims to fix the memory leak issue #10314 .

Previously, the IC history table got pruned only after begining a
new distributed transaction (dtx). As a result, when a transaction
contains many commands, or when a session never begins any
distributed transaction, the history will never get pruned. This can
cause memory leak.

This patch fixes this issue by reverting this part of code to the
last version before dependency on distributed transaction was
introduced. After that, we will only a constant-length IC history.
This ensures that

- The history table takes only constant amount of memory, and
- The history is long enough to handle mismatched packets.
This modernizes and refactors the UDP IC socket setup code somewhat,
drawing inspiration from pgstat_init(), where similar UDP socket setup
is performed.

Significant changes:

* Prefer pg_getaddrinfo_all() to getaddrinfo().
* Consistently use guarded logging for address retries.
* Simplify interface for sizing UDP IC socket send/receive buffers.
…ion (#16458)

Previous PR #13355 fixed issue #10314.
But it introduced a new bug: IC-UDP may hang forever in some scenarios (e.g. lots of IC instances in single one UDF), please see the discussion in #13411.

Enhancing it in this PR:
* for the non-read-only transaction, keep the previous logic (before PR-13355) to prevent the new bug.
* for the read-only transaction, introduce gp_interconnect_cursor_ic_table_size to config the size of Cursor History Table as a workaround.
`SendDummyPacket` eventually calls `getaddrinfo` (which is a reentrant),
however, `getaddrinfo` is not an async-signal-safe function.
`getaddrinfo` internally calls `malloc`, which is strongly advised to
not do within a signal handler as it may cause deadlocks.
Cache the accepted socket information for the listener, so that it can
be reused in `SendDummyPacket()`.

The purpose of `SendDummyPacket` is to exit more quickly; it
circumvents the polling that happens, which eventually times out after 250ms.

Without `SendDummyPacket()`, there will be multiple test failures since
some tests expects the backend connection to terminate almost
immediately.

To view all the async-signal-safe functions, please view
the signal-safety(7) — Linux manual page.

Reviewed-by: Soumyadeep Chakraborty <soumyadeep2007@gmail.com>
Reviewed-by: Andrew Repp <reppa@vmware.com>
Previously on commit 70306db18e2, we removed pg_getaddrinfo_all for
signal handlers. However, in doing so, the capability of supporting both
AF_INET6 and AF_INET was lost; this responsibility must now be handled
by us. The commit mentioned above fixed the issue for AF_INET (IPv4),
but not for AF_INET6 (IPv6).

This commit addresses the situation for both AF_INET and AF_INET6.

Reviewed-by: Soumyadeep Chakraborty <soumyadeep2007@gmail.com>
`sendControlMessage()` method has no retry attempts, this refactor abstracted a `sendto` system call wrapper with retry enabled.

Co-authored-by: zwenlin <zwenlin@vmware.com>
…nd packet" (#17164)

* Add hint message for MTU settings when IC reports ERROR "Failed to send packet".
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants