Event resending: Performance Issues with large number of publications (Batching, Indexing, OoM) #1148

Open
@michalkopacz

Hi Spring Modulith Team,

I've been using Spring Modulith's event publication mechanism (JPA, outbox pattern) and have run into performance problems when a large backlog of unpublished events builds up.

Environment:

Spring Boot Version: 3.4.4
Spring Modulith Version: 1.3.1
Database: PostgreSQL 15
Java Version: 21

Problem Description:

When an external system consuming events is temporarily unavailable, a significant number of events can accumulate in the event_publication table. When the system recovers and Spring Modulith attempts to publish this backlog, I've observed the following issues:

  1. Inefficient Processing: The mechanism seems to fetch all incomplete publications and process them one at a time rather than in batches, which makes clearing a large backlog slow.
  2. Potential OutOfMemory Errors: Fetching a large number of incomplete publications might lead to OoM errors if the implementation tries to load too many event details into memory simultaneously.
  3. Missing Index: The default PostgreSQL schema appears to lack an optimal index for the JpaEventPublicationRepository.findIncompletePublicationsOlderThan(...) query. This can cause performance degradation (e.g., slow queries, high DB CPU) when the event_publication table grows significantly. An index covering (completion_date, publication_date) might be necessary.

Requested Enhancements:

  1. Implement configurable batch processing for fetching and publishing incomplete events (e.g., using pagination or streaming) to improve throughput and prevent OoM errors during backlog processing.
  2. Review and add an optimized index to the default PostgreSQL schema to support efficient querying of incomplete publications by findIncompletePublicationsOlderThan.
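To make the first request more concrete, here is a rough sketch of what bounded, batch-wise resubmission could look like. It is not Spring Modulith code: `Publication`, `PageSource`, `drain` and `inMemory` are hypothetical names made up for illustration, and the in-memory source only stands in for a paged query against event_publication. The idea is keyset pagination on (publication date, id), so that at most one batch is held in memory and publications that fail again are skipped for this run instead of being re-fetched forever.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;

public class BatchedResubmission {

    /** Hypothetical stand-in for one row of the event_publication table. */
    record Publication(long id, Instant publicationDate) {}

    /** Hypothetical paged query: incomplete publications older than a cutoff, oldest first. */
    interface PageSource {
        List<Publication> fetch(Instant olderThan, Publication after, int limit);
    }

    /**
     * Resubmits the backlog batch by batch and returns the number of failed attempts.
     * Only one batch of at most batchSize publications is held in memory at a time.
     */
    static int drain(PageSource source, Instant olderThan, int batchSize,
                     Consumer<Publication> publisher) {
        int failures = 0;
        Publication cursor = null; // last publication seen; null means start from the oldest
        while (true) {
            List<Publication> batch = source.fetch(olderThan, cursor, batchSize);
            if (batch.isEmpty()) {
                return failures;
            }
            for (Publication publication : batch) {
                try {
                    publisher.accept(publication);
                } catch (RuntimeException e) {
                    failures++; // stays incomplete; a later run can pick it up again
                }
            }
            // Advance the cursor past this batch so failed rows are not fetched again in this run.
            cursor = batch.get(batch.size() - 1);
        }
    }

    /** In-memory page source used only to demonstrate the paging contract. */
    static PageSource inMemory(List<Publication> rows) {
        Comparator<Publication> order = Comparator
                .comparing(Publication::publicationDate)
                .thenComparingLong(Publication::id);
        return (olderThan, after, limit) -> rows.stream()
                .filter(p -> p.publicationDate().isBefore(olderThan))
                .filter(p -> after == null || order.compare(p, after) > 0)
                .sorted(order)
                .limit(limit)
                .toList();
    }
}
```

In a real implementation the page source would be a database query with a LIMIT and the same ordering, which is also where a suitable index would matter. The batch size would be the configurable part.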

I believe these changes would significantly improve the robustness and performance of the event publication feature in recovery scenarios.

I am also willing to contribute to implementing these changes if the team is open to external contributions in these areas.

Thanks for considering this issue!
