Skip to content

[-]: fix race conditions in 'fileMetricReader', fixes #1198#1199

Open
gemy26 wants to merge 4 commits intocybertec-postgresql:masterfrom
gemy26:fix-yaml-metrics-race-condition
Open

[-]: fix race conditions in 'fileMetricReader', fixes #1198#1199
gemy26 wants to merge 4 commits intocybertec-postgresql:masterfrom
gemy26:fix-yaml-metrics-race-condition

Conversation

@gemy26
Copy link

@gemy26 gemy26 commented Feb 11, 2026

This PR fixes #1198

  • Add sync.Mutex to fileMetricReader struct.
  • used mutex locks with all operations (Create/Update/Delete).
  • Add concurrent test cases.

Copy link
Collaborator

@0xgouda 0xgouda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to duplicate mu.Lock() mu.Unlock() everywhere we only need it around the critical section in the shared fmr.WriteMetrics()

Also, we should use RWLock instead, and lock the critical section in getMetrics() while reading.

Also, please do the same for yaml sources handler

@0xgouda 0xgouda self-assigned this Feb 11, 2026
@0xgouda 0xgouda added bug Something isn't working metrics Metrics related issues sources What sources and in what way to monitor labels Feb 11, 2026
@gemy26
Copy link
Author

gemy26 commented Feb 12, 2026

Thanks for reply,
If we only lock inside the read and write functions, we can still have a race condition. in case of multiple goroutines may call CreateMetric (or any other operations) at the same time Each goroutine would : GetMetrics() -> modify -> WriteMetrics(). Even if GetMetrics and WriteMetrics are protected, the sequence is not atomic. Two goroutines could read the same initial state, apply different changes, and then one write would overwrite the other, causing data loss.
Did i miss something ?

@0xgouda
Copy link
Collaborator

0xgouda commented Feb 14, 2026

Oh yeah, you are right, sorry for that, I got a bit confused.

But we still should use RWLocks.

@gemy26
Copy link
Author

gemy26 commented Feb 14, 2026

I applied the changes and fixed sources yaml also I implemented getMetricsNoLock/getSourcesNoLock to be able to use RWLock and avoid deadlock in write operations.

Without it: CreateMetric acquires Lock -> Calls GetMetrics() which tries to acquire RLock() = Deadlock.

Please let me know if there's a better approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working metrics Metrics related issues sources What sources and in what way to monitor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Race condition in YAML metrics file operations causes data loss under concurrent access

2 participants