Support index cleaner for rollover indices and add integration tests #1689

pavolloffay · 2019-07-25T13:42:24Z

Resolves #1681
Resolves #1682

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

codecov · 2019-07-25T15:58:49Z

Codecov Report

Merging #1689 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1689   +/-   ##
=======================================
  Coverage   98.49%   98.49%           
=======================================
  Files         193      193           
  Lines        9286     9286           
=======================================
  Hits         9146     9146           
  Misses        111      111           
  Partials       29       29


pavolloffay · 2019-07-25T16:27:45Z

@objectiser could you please have a look?

objectiser

LGTM - just a couple of questions/comments.

objectiser · 2019-07-26T09:04:51Z

plugin/storage/es/esCleaner.py

@@ -15,7 +15,8 @@ def main():
        print('HOSTNAME ... specifies which Elasticsearch hosts URL to search and delete indices from.')
        print('TIMEOUT ...  number of seconds to wait for master node response.'.format(TIMEOUT))
        print('INDEX_PREFIX ... specifies index prefix.')
-        print('ARCHIVE ... specifies whether to remove archive indices (default false).')
+        print('ARCHIVE ... specifies whether to remove archive indices (only works for rollover) (default false).')


Is this a change in behaviour, or did it never work for archive (with daily indices)?

It is just a comment change.

It worked and still works for archive, but it is only supported for archive indices managed by rollover. If there is a single archive index, we do not touch it.

objectiser · 2019-07-26T09:08:49Z

plugin/storage/es/esCleaner.py

@@ -60,13 +64,36 @@ def main():
 def filter_main_indices(ilo, prefix):
    ilo.filter_by_regex(kind='prefix', value=prefix + "jaeger")
    empty_list(ilo, "No indices to delete")
+    ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-read'], exclude=True)


Why are these included, aren't they related to rollover?

They are excluded, see the last parameter

Sorry, poor choice of words. I mean: why were these lines added, given that they seem to concern aliases from the rollover approach? From looking at the tests, it seems the intent is that existing aliases (from rollover) are not touched, but that would imply the user has gone from a rollover environment back to using daily indices?

Yes, I have simplified this in the last commit. Now there is a regex which matches exactly the daily indices, so there is no need to remove rollover indices afterwards.
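
To illustrate the idea of a regex that matches only daily indices, here is a minimal sketch using plain `re` rather than curator. The date-based pattern, the helper names and the sample index names are assumptions made for illustration; they are not the exact regex from the commit.

```python
import re


def daily_index_pattern(prefix=""):
    # Assumption: daily indices end in YYYY-MM-DD, while rollover
    # indices end in a six-digit counter such as 000001.
    return re.compile(re.escape(prefix) + r"jaeger-(span|service)-\d{4}-\d{2}-\d{2}$")


def select_daily(names, prefix=""):
    # Keep only the names that look like daily indices.
    pattern = daily_index_pattern(prefix)
    return [name for name in names if pattern.match(name)]


names = [
    "jaeger-span-2019-07-25",
    "jaeger-service-2019-07-25",
    "jaeger-span-000001",   # rollover index, should be left alone
    "jaeger-span-archive",  # single archive index, should be left alone
]
daily = select_daily(names)
```

With a pattern this strict, rollover and archive indices never enter the candidate list, so no alias-based exclusion is needed in the daily case.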

objectiser · 2019-07-26T09:18:34Z

plugin/storage/es/esCleaner.py

    # This excludes archive index as we use source='name'
    # source `creation_date` would include archive index
    ilo.filter_by_age(source='name', direction='older', timestring='%Y-%m-%d', unit='days', unit_count=int(sys.argv[1]))


+def filter_main_indices_rollover(ilo, prefix):
+    ilo.filter_by_alias(aliases=[prefix + 'jaeger-*-read'])
+    empty_list(ilo, "No indices to delete")


Are these required after each filter, or just at the end? Does it cause problems calling the filter_by_alias on an empty list?

Yes, it's required; otherwise it fails on an empty cluster.
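
A rough sketch of why the check sits between filters. `FakeIndexList`, `NoIndices` and `is_empty` are hypothetical stand-ins written for this example, not the curator classes, and the sketch assumes that a filter raises when it is called on an empty list.

```python
class NoIndices(Exception):
    """Hypothetical stand-in for the error raised on an empty index list."""


class FakeIndexList:
    # Hypothetical stand-in for the index list object used by the cleaner.
    def __init__(self, indices):
        self.indices = list(indices)

    def empty_list_check(self):
        if not self.indices:
            raise NoIndices("index list is empty")

    def filter_by_prefix(self, prefix):
        # Assumption: a filter refuses to run on an empty list.
        self.empty_list_check()
        self.indices = [i for i in self.indices if i.startswith(prefix)]


def is_empty(ilo):
    # Plays the role of empty_list(): detect an empty list before the next filter.
    try:
        ilo.empty_list_check()
    except NoIndices:
        return True
    return False


def clean(ilo, prefixes):
    if is_empty(ilo):
        return "No indices to delete"
    for prefix in prefixes:
        ilo.filter_by_prefix(prefix)
        # Stop here instead of letting the next filter raise.
        if is_empty(ilo):
            return "No indices to delete"
    return ilo.indices
```

Without the check after the first filter, the second call to `filter_by_prefix` in this sketch would raise `NoIndices` whenever the first filter removed everything.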

objectiser · 2019-07-26T09:21:34Z

plugin/storage/es/esCleaner.py

+    # This excludes archive index as we use source='name'
+    # source `creation_date` would include archive index
+    # TODO it might be useful to allow filter_by_space
+    ilo.filter_by_age(source='creation_date', direction='older', unit='days', unit_count=int(sys.argv[1]))


Just wondering about a particular scenario: if the rollover criteria are based on index size (for example) in a low-activity environment, the same index might stay in use for more days than the unit_count, in which case it would be removed even though it is currently in use?

I think you are right, removing by size does not really make sense.
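
The scenario in the question can be reproduced with a small sketch. The index names, creation dates and the `older_than` helper are invented for illustration; the helper only imitates an age filter based on creation date and does not call curator.

```python
from datetime import datetime, timedelta


def older_than(created_by_index, days, now):
    # Select indices created more than `days` ago, regardless of
    # whether they are still being written to.
    cutoff = now - timedelta(days=days)
    return sorted(name for name, created in created_by_index.items() if created < cutoff)


now = datetime(2019, 7, 26)
created_by_index = {
    "jaeger-span-000001": datetime(2019, 7, 1),
    # Low traffic: the size-based rollover has not fired yet, so this
    # index is still the active write target (assumed for the example).
    "jaeger-span-000002": datetime(2019, 7, 10),
}

by_age = older_than(created_by_index, days=7, now=now)
```

Here `by_age` contains both indices, including the one still in use, which is the risk the question raises when age is the only criterion.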

plugin/storage/integration/es_index_cleaner_test.go

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

objectiser · 2019-07-26T13:53:04Z

plugin/storage/es/esCleaner.py

+    ilo.filter_by_regex(kind='regex', value=prefix + "jaeger-(span|service)-\d{6}")
+    empty_list(ilo, "No indices to delete")
+    # do not remove active write indices
+    ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-write'], exclude=True)


If filtering by the regex now, are these two aliases filters required?

See the comment above. We need to exclude current active write indices.
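
A minimal sketch of how the two steps fit together: the regex picks rollover indices, and the alias filter then removes the ones currently written to. The alias map, the `jaeger-service-write` alias and the helper name are assumptions for this example; the real script applies curator filters instead.

```python
import re


def rollover_candidates(aliases_by_index, prefix=""):
    # Step 1: the regex selects rollover indices (six-digit suffix).
    pattern = re.compile(re.escape(prefix) + r"jaeger-(span|service)-\d{6}$")
    # Step 2: anything behind a write alias is still active and must stay.
    # The service write alias name is assumed by analogy with the span one.
    write_aliases = {prefix + "jaeger-span-write", prefix + "jaeger-service-write"}
    return sorted(
        name
        for name, aliases in aliases_by_index.items()
        if pattern.match(name) and not (set(aliases) & write_aliases)
    )


aliases_by_index = {
    "jaeger-span-000001": ["jaeger-span-read"],
    "jaeger-span-000002": ["jaeger-span-read", "jaeger-span-write"],
    "jaeger-service-000001": ["jaeger-service-read", "jaeger-service-write"],
    "jaeger-span-2019-07-25": [],  # daily index, not matched by the regex
}
candidates = rollover_candidates(aliases_by_index)
```

The regex alone cannot tell an old rolled-over index from the current write index, since both share the same naming scheme; only the alias carries that information.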

objectiser · 2019-07-26T13:55:33Z

plugin/storage/es/esCleaner.py

+
+
+def filter_archive_indices_rollover(ilo, prefix):
+    # Remove only rollover archive indices
    # Do not remove active write archive index
    ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-archive-write'], exclude=True)


Would it be better to filter by regex here as well, to be consistent with the change in filter_main_indices_rollover?

Unfortunately, this is not the filter_by_regex method.

Not sure what happened - the comment should have been associated with two lines down - so changing the ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-archive-read']) to be ilo.filter_by_regex(kind='regex', value=prefix + "jaeger-(span|service)-\d{6}") ?

Sorry, I do not understand what you mean.

we use regex to find indices which should be removed

but we also need to exclude indices which are associated with write alias - to do that we use filter for an alias.

I just mean that filter_main_indices and filter_main_indices_rollover both use filter_by_regex to identify the indices to be removed, whereas filter_archive_indices_rollover uses ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-archive-read']). I am suggesting it would be better if filter_archive_indices_rollover also used filter_by_regex, for consistency.

Yes, that sounds better #1693

pavolloffay requested review from black-adder, jpkrohling, objectiser, tiffon, vprithvi and yurishkuro as code owners July 25, 2019 13:42

pavolloffay changed the title ~~Index cleaner rollover itest~~ Support index cleaner for rollover indices and add integration tests Jul 25, 2019

pavolloffay added the storage/elasticsearch label Jul 25, 2019

Support index cleaning for rollover

83e2621

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

pavolloffay force-pushed the index-cleaner-rollover-itest branch from 7f5b45b to 83e2621 Compare July 25, 2019 15:58

objectiser reviewed Jul 26, 2019

Simplify regexes

5f28d63

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

pavolloffay merged commit 5cd5752 into jaegertracing:master Jul 26, 2019

objectiser reviewed Jul 26, 2019

pavolloffay mentioned this pull request Jul 26, 2019

Use find by regex for archive index in index cleaner #1693

Merged
