Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rollover support #267

Merged
merged 4 commits into from
Mar 6, 2019
Merged

Conversation

pavolloffay
Copy link
Member

Signed-off-by: Pavol Loffay ploffay@redhat.com

This at the moment does not work with self-provisioned ES due to curator settings: openshift/origin-aggregated-logging#1517

@jpkrohling
Copy link
Contributor

This change is Reviewable

@jpkrohling
Copy link
Contributor

This will conflict with #265. Please wait to work on this, let's get that one merged first.

@pavolloffay pavolloffay force-pushed the rollover branch 2 times, most recently from 58b6652 to 33b100a Compare March 6, 2019 10:31
@pavolloffay
Copy link
Member Author

Test CR

# setup an elasticsearch with `make es`
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200
        use-aliases: true

Self-provisioned ES does not at the moment - my PR with SG config has been merged but the image is not yet available.

@pavolloffay pavolloffay changed the title WIP: Add rollover support Add rollover support Mar 6, 2019
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
@codecov
Copy link

codecov bot commented Mar 6, 2019

Codecov Report

Merging #267 into master will increase coverage by 0.11%.
The diff coverage is 88.34%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #267      +/-   ##
==========================================
+ Coverage   88.91%   89.02%   +0.11%     
==========================================
  Files          68       70       +2     
  Lines        2886     3089     +203     
==========================================
+ Hits         2566     2750     +184     
- Misses        211      226      +15     
- Partials      109      113       +4
Impacted Files Coverage Δ
pkg/apis/jaegertracing/v1/jaeger_types.go 100% <ø> (ø) ⬆️
pkg/strategy/all-in-one.go 94.59% <0%> (-5.41%) ⬇️
pkg/storage/dependency.go 100% <100%> (ø) ⬆️
pkg/strategy/controller.go 100% <100%> (ø) ⬆️
pkg/storage/elasticsearch_dependencies.go 100% <100%> (ø)
pkg/util/util.go 100% <100%> (ø) ⬆️
pkg/storage/elasticsearch_role.go 100% <100%> (ø) ⬆️
pkg/cronjob/es_index_cleaner.go 100% <100%> (+2.85%) ⬆️
pkg/storage/elasticsearch_secrets.go 100% <100%> (ø) ⬆️
pkg/strategy/production.go 68.57% <20%> (-6.84%) ⬇️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fcb26f4...7324959. Read the comment docs.

Copy link
Contributor

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 21 of 21 files at r1.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @objectiser and @pavolloffay)


pkg/controller/jaeger/jaeger_controller.go, line 155 at r1 (raw file):

	// storage dependencies have to be deployed after ES is ready
	if err := r.handleDependencies(str); err != nil {
		jaeger.Logger().WithError(err).Error("failed to handle the dependencies")

We can get rid of this log entry then, otherwise we end up with two messages when this fails.


pkg/cronjob/es_rollover_test.go, line 15 at r1 (raw file):

func TestCreateRollover(t *testing.T) {
	cj := CreateRollover(v1.NewJaeger("pikachu"))

We are running out of names :) I wonder if 🦄 is a valid name


pkg/cronjob/es_rollover_test.go, line 42 at r1 (raw file):

	unitCount := 7
	j.Spec.Storage.Rollover.UnitCount = &unitCount
	j.Spec.Storage.Rollover.Unit = "minutes"

I was wondering what this unit/unit-count were, I guess I have an idea of what a unit and unit count is now. Have you considered accepting a duration instead of unit + unit_count? I think we have a duration for some other field (or we have a PR with it)


pkg/storage/dependency.go, line 17 at r1 (raw file):

	}
	if EnableRollover(jaeger.Spec.Storage) {
		return elasticsearchDependencies(jaeger)

It would probably make more sense to have a conditional like the above, and decide on the rollover inside the storage-specific function. Like:

if strings.EqualFold(jaeger.Spec.Storage.Type, "elasticsearch") {
		return elasticsearchDeps(jaeger)
}

pkg/strategy/production.go, line 94 at r1 (raw file):

	cDep := collector.Get()
	queryDep := inject.Sidecar(jaeger, inject.OAuthProxy(jaeger, query.Get()))
	c.dependencies = storage.Dependencies(jaeger)

Don't you end up with the rollover cron job two times here? One in the dependencies and one in the cronJobs?


pkg/util/util.go, line 127 at r1 (raw file):

	urlArr := strings.Split(urls, ",")
	if len(urlArr) == 0 {
		return ""

Could you check the test coverage? Looks like this line isn't being tested.

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Copy link
Contributor

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r2.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @objectiser and @pavolloffay)


deploy/examples/simple-prod.yaml, line 13 at r2 (raw file):

      es:
        server-urls: http://elasticsearch:9200
        use-aliases: true

The "simple" examples are supposed to be the simplest way to achieve something. If use-aliases isn't required to get a production setup with ES to work, then please leave it out.


pkg/cronjob/es_rollover.go, line 28 at r2 (raw file):

	}
	one := int32(1)
	ttlHourInSec := int32(60 * 60)

1 hour -- no idea whether this is too much or if it won't be enough in some cases. What happens when the TTL is reached? Will the job be killed? What if there's enough data that justifies a processing of, say, 2h ?


pkg/strategy/all-in-one.go, line 82 at r2 (raw file):

	}

	if storage.EnableRollover(jaeger.Spec.Storage) {

Isn't rollover something specific to Elasticsearch? If so, it doesn't belong to the strategy. I would argue that the same applies to the index cleaner, but we didn't catch that during the review that introduced it, so, no need to change that now.

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Copy link
Member Author

@pavolloffay pavolloffay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 17 of 22 files reviewed, 7 unresolved discussions (waiting on @jpkrohling, @objectiser, and @pavolloffay)


pkg/controller/jaeger/jaeger_controller.go, line 155 at r1 (raw file):

Previously, jpkrohling (Juraci Paixão Kröhling) wrote…

We can get rid of this log entry then, otherwise we end up with two messages when this fails.

Done.


pkg/cronjob/es_rollover_test.go, line 15 at r1 (raw file):

Previously, jpkrohling (Juraci Paixão Kröhling) wrote…

We are running out of names :) I wonder if 🦄 is a valid name

that would work in unit tests 👯


pkg/cronjob/es_rollover_test.go, line 42 at r1 (raw file):

for some other field
Those are pretty much custom values from ES curator. If we go with duration we have to parse it to the curator units and also round it to an appropriate value.


pkg/storage/dependency.go, line 17 at r1 (raw file):

Previously, jpkrohling (Juraci Paixão Kröhling) wrote…

It would probably make more sense to have a conditional like the above, and decide on the rollover inside the storage-specific function. Like:

if strings.EqualFold(jaeger.Spec.Storage.Type, "elasticsearch") {
		return elasticsearchDeps(jaeger)
}

Done.


pkg/strategy/production.go, line 94 at r1 (raw file):

Previously, jpkrohling (Juraci Paixão Kröhling) wrote…

Don't you end up with the rollover cron job two times here? One in the dependencies and one in the cronJobs?

Yes but those have different functions. Dependncies initialize storage and cronjobs do rolover and lookback


pkg/util/util.go, line 127 at r1 (raw file):

Previously, jpkrohling (Juraci Paixão Kröhling) wrote…

Could you check the test coverage? Looks like this line isn't being tested.

there are many lines which are not being tested in codebase

@pavolloffay
Copy link
Member Author

this is ready for re-review

Copy link
Contributor

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: , just a couple of clarifications needed.

Reviewed 6 of 6 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @objectiser and @pavolloffay)


pkg/controller/jaeger/jaeger_controller.go, line 132 at r3 (raw file):

	if jaeger.Spec.Storage.Rollover.ReadTTL != "" {
		if _, err := time.ParseDuration(jaeger.Spec.Storage.Rollover.ReadTTL); err != nil {
			return errors.Wrap(err, "could not parse esRollover.readTTL")

Is the value included in the error message? It's useful to know what's the string that couldn't be parsed.


pkg/cronjob/es_rollover.go, line 88 at r3 (raw file):

			envs = append(envs, corev1.EnvVar{Name: "UNIT_COUNT", Value: strconv.Itoa(d.count)})
		} else {
			jaeger.Logger().WithError(err).Error("Could not parse esRollover.readTTL")

.WithField("readTTL", jaeger.Spec.Storage.Rollover.ReadTTL) so that the user knows what we attempted to parse.


pkg/cronjob/es_rollover.go, line 150 at r3 (raw file):

func parseToUnits(d time.Duration) pythonUnits {
	b := big.NewFloat(d.Hours())

Why don't you just use Seconds()? time.ParseDuration("1m30s").Seconds() returns 90, so, you can just use something like return pythonUnits{units: seconds, count: int(time.ParseDuration("1m30s").Seconds())}


pkg/util/util.go, line 127 at r1 (raw file):

Previously, pavolloffay (Pavol Loffay) wrote…

there are many lines which are not being tested in codebase

I'm aware only of lines with error handling that are hard to test. Are you aware of other places where the coverage is lacking? I'll take a look at the latest coverage report, but we should try to get the best coverage, no matter how the code base currently looks like.

@pavolloffay
Copy link
Member Author

Why don't you just use Seconds()? time.ParseDuration("1m30s").Seconds() returns 90, so, you can just use something like return pythonUnits{units: seconds, count: int(time.ParseDuration("1m30s").Seconds())}

I am with what is there now, it's easier to read/debug.

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
@pavolloffay
Copy link
Member Author

.WithField("readTTL", jaeger.Spec.Storage.Rollover.ReadTTL) so that the user knows what we attempted to parse.

the value is present I have modified the logging to produce nicer logs :)

@pavolloffay pavolloffay merged commit 24937ef into jaegertracing:master Mar 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants