Dragonfly becomes unresponsive during full sync #4787

@arkorwan

Describe the bug

We run Dragonfly in a two-node master-replica setup. During full sync, the master node becomes largely unresponsive.

We reported this a year ago: https://dragonfly.discourse.group/t/unresponsive-during-full-sync/135. The issue disappeared in v1.15.0, but we have been seeing it again since v1.19.0. The latest version we have tried, v1.25.5, still has the problem.

To Reproduce

  1. Prepare two Dragonfly instances (we use two dockerized instances in our minimal reproducible setup).
  2. Load a sizable number of keys into one instance.
  3. Generate a constant load with a mix of MGET and SET operations.
  4. Start a full sync by making the second instance a replica of the first.
  5. Observe that throughput drops and response times skyrocket.

We previously posted a reproduction script in the Discourse thread linked above.
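For readers without access to that thread, the steps above can be roughly sketched as follows. This is not the script from the Discourse post. It is a minimal stdlib-only illustration that speaks the Redis wire protocol (RESP) directly, and the key names, the 50/50 MGET/SET ratio, the value size, and all helper names (`encode_command`, `run_load`, and so on) are assumptions made for the example.

```python
import math
import random
import socket
import time


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def read_reply(reader):
    """Read one RESP2 reply from a binary file-like object."""
    line = reader.readline()
    if not line:
        raise ConnectionError("connection closed")
    kind, rest = line[:1], line[1:-2]
    if kind in (b"+", b":"):  # simple string or integer
        return rest
    if kind == b"-":  # error reply
        raise RuntimeError(rest.decode())
    if kind == b"$":  # bulk string; a negative size means nil
        size = int(rest)
        return None if size < 0 else reader.read(size + 2)[:-2]
    if kind == b"*":  # array of nested replies
        return [read_reply(reader) for _ in range(int(rest))]
    raise ValueError("unexpected reply type: %r" % kind)


def next_command(rng, key_count, mget_size=10, value_size=64):
    """Pick MGET or SET with equal probability (the ratio is an assumption)."""
    if rng.random() < 0.5:
        keys = ["key:%d" % rng.randrange(key_count) for _ in range(mget_size)]
        return encode_command("MGET", *keys)
    key = "key:%d" % rng.randrange(key_count)
    return encode_command("SET", key, "x" * value_size)


def replicaof_command(master_host, master_port):
    """Command to send to the second instance to start the full sync (step 4)."""
    return encode_command("REPLICAOF", master_host, master_port)


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def run_load(host, port, seconds, key_count=100_000, seed=0):
    """Send commands one at a time; return per-request latencies in seconds."""
    rng = random.Random(seed)
    latencies = []
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("rb")
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            command = next_command(rng, key_count)
            start = time.perf_counter()
            sock.sendall(command)
            read_reply(reader)
            latencies.append(time.perf_counter() - start)
    return latencies
```

One way to use this: preload keys on the master, call `run_load` against it from one process, and partway through send the bytes from `replicaof_command` to the second instance over its own connection. Comparing `percentile(latencies, 99)` and `len(latencies)` between Dragonfly versions should then show whether the full sync affects the master.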

Expected behavior

Full sync should not have this much impact on the master node.

Screenshots

Version without the issue (v1.15.0). A full sync happened right in the middle, but it is barely noticeable.
Image

v1.14.5
Image

v1.25.5
Image

Environment (please complete the following information):

  • OS: Ubuntu 22.04.5 LTS
  • Kernel: Linux 5.15.0-134-generic #145-Ubuntu SMP Wed Feb 12 20:08:39 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
  • Containerized?: Bare Metal
  • Dragonfly Version: problem seen in v1.19.2, v1.20.1, and v1.25.5; not seen in v1.15.0 through v1.18.1

Labels: bug