Join memberlist on starting with no retry #4804

Merged on Jul 28, 2022 (1 commit)

Conversation

danielblando
Contributor

Signed-off-by: Daniel Blando <ddeluigg@amazon.com>

What this PR does:
Adds logic to join memberlist during the starting phase, which also avoids the 'empty ring' error. The join on startup has no retries and never fails the service, so it cannot block service start. Retry and failure handling remain the responsibility of the running phase, as before.
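
The split described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `kvService`, `joinFunc`, `flakyJoin`, `errUnreachable`, and the seed address are all hypothetical names made up for the example, and backoff between retries is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnreachable is a hypothetical error for a seed node that cannot be reached.
var errUnreachable = errors.New("seed node unreachable")

// joinFunc is a hypothetical stand-in for the call that contacts the seed nodes.
type joinFunc func(seeds []string) error

// kvService is a hypothetical service with separate starting and running phases.
type kvService struct {
	seeds      []string
	join       joinFunc
	maxRetries int
	joined     bool
}

// starting makes a single best-effort join attempt. A failure is logged and
// swallowed, so it can never block or fail service start.
func (s *kvService) starting() error {
	if len(s.seeds) == 0 {
		return nil
	}
	if err := s.join(s.seeds); err != nil {
		fmt.Println("join on starting failed, leaving retries to running:", err)
		return nil
	}
	s.joined = true
	return nil
}

// running keeps the retry and failure handling: it retries up to maxRetries
// times and returns an error if every attempt fails.
func (s *kvService) running() error {
	if s.joined || len(s.seeds) == 0 {
		return nil
	}
	var lastErr error
	for attempt := 1; attempt <= s.maxRetries; attempt++ {
		if lastErr = s.join(s.seeds); lastErr == nil {
			s.joined = true
			return nil
		}
	}
	return fmt.Errorf("join failed after %d attempts: %w", s.maxRetries, lastErr)
}

// flakyJoin returns a joinFunc that fails its first `failures` calls and
// succeeds afterwards. It only exists to drive the example.
func flakyJoin(failures int) joinFunc {
	calls := 0
	return func(seeds []string) error {
		calls++
		if calls <= failures {
			return errUnreachable
		}
		return nil
	}
}

func main() {
	s := &kvService{seeds: []string{"node-a:7946"}, join: flakyJoin(1), maxRetries: 3}

	err := s.starting()
	fmt.Println("after starting:", err, s.joined)

	err = s.running()
	fmt.Println("after running:", err, s.joined)
}
```

In this sketch a successful join in `starting` means the ring can be populated before the service reports itself as running, while a failed join costs only one attempt and defers everything else to `running`.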

Which issue(s) this PR fixes:
Fixes #4798

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@danielblando danielblando force-pushed the joinMemberlist branch 3 times, most recently from 0cafdb9 to d22d172 Compare July 28, 2022 01:25
@danielblando danielblando marked this pull request as ready for review July 28, 2022 02:08
Signed-off-by: Daniel Blando <ddeluigg@amazon.com>
@alanprot
Member

Thanks for working on this!!

@alvinlin123 alvinlin123 merged commit 004742c into cortexproject:master Jul 28, 2022
alexqyle pushed a commit to alexqyle/cortex that referenced this pull request Aug 2, 2022
Signed-off-by: Daniel Blando <ddeluigg@amazon.com>
alanprot added a commit that referenced this pull request Sep 3, 2022
* Introduced lock file to shuffle sharding grouper

Signed-off-by: Alex Le <leqiyue@amazon.com>

* let redis cache logs log with context (#4785)

* let redis cache logs log with context

Signed-off-by: Mengmeng Yang <mengmengyang616@gmail.com>

* fix import

Signed-off-by: Mengmeng Yang <mengmengyang616@gmail.com>
Signed-off-by: Alex Le <leqiyue@amazon.com>

* DoBatch preference to 4xx if error (#4783)

* DoBatch preference to 4xx if error

Signed-off-by: Daniel Blando <ddeluigg@amazon.com>

* Fix comment

Signed-off-by: Daniel Blando <ddeluigg@amazon.com>
Signed-off-by: Alex Le <leqiyue@amazon.com>

* Updated CHANGELOG and ordered imports

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Fixed lint and removed groupCallLimit

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Changed lock file to json format and make sure planner would not pick up group that is locked by other compactor

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Fix updateCachedShippedBlocks - new thanos (#4806)

Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: Alex Le <leqiyue@amazon.com>

* Join memberlist on starting with no retry (#4804)

Signed-off-by: Daniel Blando <ddeluigg@amazon.com>

* Fix alertmanager log message (#4801)

Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
Signed-off-by: Alex Le <leqiyue@amazon.com>

* Grafana Cloud uses Mimir now, so remove Grafana Cloud as hosted service in documents (#4809)

* Grafana Cloud uses Mimir, fork of Cortex, now

Signed-off-by: Alvin Lin <alvinlin123@gmail.com>

* Improve doc

Signed-off-by: Alvin Lin <alvinlin@amazon.com>
Signed-off-by: Alex Le <leqiyue@amazon.com>

* Created block_locker to handle all block lock file operations. Added block lock metrics.

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Moved lock file heart beat into planner and refined planner logic to make sure blocks are locked by current compactor

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Updated documents

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Return concurrency number of group. Use ticker for lock file heart beat

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Renamed lock file to be visit marker file

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Fixed unit test

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Make sure visited block can be picked by the compactor that visited it

Signed-off-by: Alex Le <leqiyue@amazon.com>

Signed-off-by: Alex Le <leqiyue@amazon.com>
Signed-off-by: Mengmeng Yang <mengmengyang616@gmail.com>
Signed-off-by: Daniel Blando <ddeluigg@amazon.com>
Signed-off-by: Alan Protasio <approtas@amazon.com>
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
Signed-off-by: Alvin Lin <alvinlin@amazon.com>
Signed-off-by: Alex Le <emoc1989@gmail.com>
Co-authored-by: Mengmeng Yang <mengmengyang616@gmail.com>
Co-authored-by: Daniel Blando <ddeluigg@amazon.com>
Co-authored-by: Alan Protasio <approtas@amazon.com>
Co-authored-by: Xiaochao Dong <the.xcdong@gmail.com>
Co-authored-by: Alvin Lin <alvinlin@amazon.com>

Successfully merging this pull request may close these issues.

Follow up of #4068: Fixed race condition causing queries to fail right after querier startup with the "empty ring" error + Memberlist
3 participants