[WIP] Revert "Skip x-pack libbeat tests again as flaky (#10068)" #10179

Merged
merged 2 commits into from
Jan 22, 2019
8 changes: 4 additions & 4 deletions .travis.yml
Original file line number Diff line number Diff line change
@@ -80,10 +80,10 @@ jobs:
env: STRESS_TEST_OPTIONS="-timeout=20m -race -v -parallel 1" TARGETS="-C libbeat stress-tests"
go: $GO_VERSION
stage: test
#- os: linux
# env: TARGETS="-C x-pack/libbeat testsuite"
# go: $GO_VERSION
# stage: test
- os: linux
env: TARGETS="-C x-pack/libbeat testsuite"
go: $GO_VERSION
stage: test

# Metricbeat
- os: linux
2 changes: 1 addition & 1 deletion auditbeat/docker-compose.yml
@@ -1,4 +1,4 @@
version: '2.1'
version: '2.3'
services:
beat:
build: ${PWD}/.
2 changes: 1 addition & 1 deletion filebeat/docker-compose.yml
@@ -1,4 +1,4 @@
version: '2.1'
version: '2.3'
services:
beat:
build: ${PWD}/.
2 changes: 1 addition & 1 deletion heartbeat/docker-compose.yml
@@ -1,4 +1,4 @@
version: '2.1'
version: '2.3'
services:
beat:
build: ${PWD}/.
2 changes: 1 addition & 1 deletion libbeat/docker-compose.yml
@@ -1,4 +1,4 @@
version: '2.1'
version: '2.3'
services:
beat:
build: ${PWD}/.
2 changes: 1 addition & 1 deletion libbeat/scripts/Makefile
@@ -202,7 +202,7 @@ integration-tests: prepare-tests
.PHONY: integration-tests-environment
integration-tests-environment: ## @testing Runs the integration inside a virtual environment. This can be run on any docker-machine (local, remote)
integration-tests-environment: prepare-tests build-image
${DOCKER_COMPOSE} run beat make integration-tests RACE_DETECTOR=$(RACE_DETECTOR) DOCKER_COMPOSE_PROJECT_NAME=${DOCKER_COMPOSE_PROJECT_NAME}
${DOCKER_COMPOSE} run beat make integration-tests RACE_DETECTOR=$(RACE_DETECTOR) DOCKER_COMPOSE_PROJECT_NAME=${DOCKER_COMPOSE_PROJECT_NAME} || docker inspect --format "{{json .State.Health }}" $$(${DOCKER_COMPOSE} ps -q) | jq && ${DOCKER_COMPOSE} logs --tail 200
Contributor

I assume this was for debugging purposes. Should we keep it in?

Contributor Author

@ruflin Yes, we should. When docker-compose fails, it can be hard to reproduce the failure or to get the logs. It took me a few tries to work out how to get useful debug information, and once I found a good way it took 50 runs (7m each) to finally make it fail.

Contributor Author

The above will do two things:

  1. Display the last information for the healthcheck.
  2. Display 200 log lines for each container.

The second is what let me make sense of this error: looking at the logs, it was clear that ES was not in a green state and that authentication had failed.
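
The two diagnostic steps above can be sketched in Python as follows. This is only an illustration, not code from the PR: `summarize_health` and `run_with_diagnostics` are hypothetical names, and the commands in the trailing comment are assumptions modelled on the Makefile line.

```python
import json
import subprocess


def summarize_health(inspect_output):
    """Condense the JSON printed by
    `docker inspect --format "{{json .State.Health }}"` into one line."""
    health = json.loads(inspect_output)
    if not health:
        # Assumption: a container without a healthcheck yields JSON null.
        return "no healthcheck configured"
    log = health.get("Log") or []
    last = log[-1] if log else {}
    return "status={} failing_streak={} last_exit={} last_output={!r}".format(
        health.get("Status"),
        health.get("FailingStreak"),
        last.get("ExitCode"),
        (last.get("Output") or "").strip(),
    )


def run_with_diagnostics(test_cmd, diagnostic_cmds, run=subprocess.call):
    """Run test_cmd; if it fails, run every diagnostic command,
    then return the exit code of test_cmd itself."""
    code = run(test_cmd)
    if code != 0:
        for cmd in diagnostic_cmds:
            run(cmd)  # result ignored so diagnostics cannot mask the failure
    return code


# Example call (needs Docker, so it is left as a comment):
# run_with_diagnostics(
#     ["docker-compose", "run", "beat", "make", "integration-tests"],
#     [["docker-compose", "logs", "--tail", "200"]],
# )
```

One design choice worth noting: the sketch returns the exit code of the test run itself. In a shell chain of the form `A || B && C`, `||` and `&&` have equal precedence and group left to right, so the status of the whole line is that of the last command that ran.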


# Runs the system tests
.PHONY: system-tests
2 changes: 1 addition & 1 deletion testing/environments/5.x.yml
@@ -1,6 +1,6 @@
# This is the latest stable 5x release environment.

version: '2.1'
version: '2.3'
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:5.6.9
2 changes: 1 addition & 1 deletion testing/environments/6.0.yml
@@ -1,6 +1,6 @@
# This is the latest 6.0 environment.

version: '2.1'
version: '2.3'
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch-platinum:6.0.1
2 changes: 1 addition & 1 deletion testing/environments/latest.yml
@@ -1,6 +1,6 @@
# This is the latest released environment.

version: '2.1'
version: '2.3'
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:6.5.4
2 changes: 1 addition & 1 deletion testing/environments/local.yml
@@ -2,7 +2,7 @@
# This is useful for testing locally with a full elastic stack setup.
# All services can be reached through localhost like localhost:5601 for Kibana
# This is not used for CI as otherwise ports conflicts could happen.
version: '2.1'
version: '2.3'
services:
kibana:
ports:
2 changes: 1 addition & 1 deletion testing/environments/snapshot-oss.yml
@@ -1,6 +1,6 @@
# This should start the environment with the latest snapshots.

version: '2.1'
version: '2.3'
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.0.0-alpha1-SNAPSHOT
2 changes: 1 addition & 1 deletion testing/environments/snapshot.yml
@@ -1,6 +1,6 @@
# This should start the environment with the latest snapshots.

version: '2.1'
version: '2.3'
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.0.0-SNAPSHOT
2 changes: 1 addition & 1 deletion x-pack/auditbeat/docker-compose.yml
@@ -1,4 +1,4 @@
version: '2.1'
version: '2.3'
services:
beat:
build: ../../auditbeat
2 changes: 1 addition & 1 deletion x-pack/filebeat/docker-compose.yml
@@ -1,4 +1,4 @@
version: '2.1'
version: '2.3'
services:
beat:
build: ../../filebeat
2 changes: 1 addition & 1 deletion x-pack/functionbeat/docker-compose.yml
@@ -1,4 +1,4 @@
version: '2.1'
version: '2.3'
services:
beat:
build: ${PWD}/.
17 changes: 7 additions & 10 deletions x-pack/libbeat/docker-compose.yml
@@ -1,4 +1,4 @@
version: '2.1'
version: '2.3'
services:
beat:
build: ${PWD}/.
@@ -20,27 +20,24 @@ services:
depends_on:
elasticsearch: { condition: service_healthy }
kibana: { condition: service_healthy }
healthcheck:
interval: 1s
retries: 2400

elasticsearch:
extends:
file: ${ES_BEATS}/testing/environments/${TESTING_ENVIRONMENT}.yml
service: elasticsearch
healthcheck:
test: ["CMD", "curl", "-u", "elastic:changeme", "-f", "http://localhost:9200"]
test: ["CMD-SHELL", 'python -c ''import urllib, json; response = urllib.urlopen("http://elastic:changeme@localhost:9200/_cluster/health"); data = json.loads(response.read()); exit(1) if data["status"] != "green" else exit(0);''']
Contributor

++

Contributor Author

Currently this should only matter for x-pack or anything that uses security; I didn't want to change anything in the other docker-compose files.

Contributor Author

@ruflin Just to clarify, I will not change the healthcheck for the testing/environment files, because as of today nothing other than security assumes that a cluster needs to be green. As we add more tests, or when/if metricbeat has integration tests that require security, we could move all the docker-compose files to that strategy.
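
For readability, the one-line Elasticsearch healthcheck in this hunk can be unrolled into a short script. The sketch below is a Python 3 variant, not the code from the PR (which calls the Python 2 `urllib.urlopen`). `exit_code_for`, `build_request` and `check` are hypothetical names, and sending the credentials as a Basic auth header rather than embedding them in the URL is an assumption made for this sketch.

```python
import base64
import json
import urllib.request


def exit_code_for(body):
    """Return 0 if the cluster health JSON reports "green", else 1."""
    try:
        status = json.loads(body).get("status")
    except (ValueError, AttributeError):
        return 1  # not JSON, or not a JSON object
    return 0 if status == "green" else 1


def build_request(url, user, password):
    """Build a request carrying HTTP Basic credentials."""
    raw = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def check(url="http://localhost:9200/_cluster/health",
          user="elastic", password="changeme", timeout=5):
    """Query the cluster health endpoint and map the answer to an exit code."""
    try:
        request = build_request(url, user, password)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return exit_code_for(response.read().decode("utf-8"))
    except OSError:
        # URLError, HTTPError and socket timeouts are OSError subclasses.
        return 1
```

A healthcheck wrapper would end with `sys.exit(check())`, so that a yellow or red cluster, a failed login, or an unreachable node all count as a failed probe.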

retries: 1200
interval: 1s
interval: 5s
Contributor

Can we use the old one again to speed up builds?

start_period: 60s
environment:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
- "network.host="
- "transport.host=127.0.0.1"
- "http.host=0.0.0.0"
- "xpack.security.enabled=true"
- "xpack.license.self_generated.type=trial"
command: bash -c "bin/elasticsearch-keystore create && echo changeme | bin/elasticsearch-keystore add --stdin bootstrap.password && /usr/local/bin/docker-entrypoint.sh eswrapper"
restart: on-failure:5
- ELASTIC_PASSWORD=changeme

kibana:
depends_on:
@@ -51,6 +48,6 @@ services:
healthcheck:
test: ["CMD-SHELL", 'python -c ''import urllib, json; response = urllib.urlopen("http://elastic:changeme@localhost:5601/api/status"); data = json.loads(response.read()); exit(1) if data["status"]["overall"]["state"] != "green" else exit(0);''']
retries: 1200
interval: 1s
interval: 5s
Contributor

Any reason for this quite long delay and longer interval? I assume this will slow down builds?

Contributor Author

Well, the slowdown is relative for a healthcheck: if you wait 5s instead of 1s, that is 4 fewer requests ES has to deal with before the next check. Since Travis is not the fastest thing in the world, giving it more time to start is a good compromise. You can see that I've only modified these limits for x-pack/libbeat/docker-compose.yml and not the other files; that is because it takes more time to start and to recover the security index that allows authentication.

@ruflin If the test suite becomes flaky or unreliable, we will have to switch to file-based authentication for ES.

Contributor

I would hope that ES can cope with 1 request per sec ;-) For the start_period, I tried to find some good docs, but I assume it initially waits 60s? That seems a bit too much.

Overall my goal is to keep the CI time down: as soon as a service is available, tests should start.

Contributor Author

I think so too, but well, it's Travis and it's Docker; I prefer to give it as much of a chance as possible. Call it being fed up with this, but I would prefer not to play whack-a-mole any more :D

The name start_period is really bad; it should be called grace period.
When the container starts, health checks that fail during that period do not increase the failure count, but if a health check succeeds during that period the container can be marked as healthy. When this happens, the remaining time of the start_period is ignored. So it doesn't really impact startup that much; it just gives the container a bit more time to become healthy.

Contributor

SGTM, good argument ;-)
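
The `start_period` and `interval` behaviour discussed in this thread can be illustrated with a small simulation. This is a simplified model, not Docker's implementation: `simulate_health` and `probes_for` are hypothetical helpers, probe run time is ignored, the first probe is assumed to fire one interval after container start, and a probe landing exactly at the end of the start period is assumed to still be inside it.

```python
def probes_for(ready_at, interval, count):
    """Probe outcomes for a service that starts answering at `ready_at` seconds."""
    return [(i + 1) * interval >= ready_at for i in range(count)]


def simulate_health(probes, interval, retries, start_period):
    """Return (state, seconds_since_start) for a list of probe outcomes."""
    failures = 0
    for i, ok in enumerate(probes):
        t = (i + 1) * interval
        if ok:
            # A success marks the container healthy, even inside start_period.
            return ("healthy", t)
        if t > start_period:
            # Only failures after the start period count towards `retries`.
            failures += 1
            if failures >= retries:
                return ("unhealthy", t)
    return ("starting", len(probes) * interval)


# A service that is ready after 7s is reported healthy without waiting
# for the rest of the 60s start period.
print(simulate_health(probes_for(7, 1, 20), interval=1, retries=1200, start_period=60))
print(simulate_health(probes_for(7, 5, 20), interval=5, retries=1200, start_period=60))
```

Under these assumptions the 5s interval detects the ready service at the 10s probe instead of the 7s probe, so the added latency is bounded by one interval, while the 60s start period adds no waiting time at all.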

start_period: 60s
command: /usr/local/bin/kibana-docker --xpack.security.enabled=true --elasticsearch.username=elastic --elasticsearch.password=changeme
restart: on-failure:5
2 changes: 1 addition & 1 deletion x-pack/metricbeat/docker-compose.yml
@@ -1,4 +1,4 @@
version: '2.1'
version: '2.3'
services:
beat:
build: ../../metricbeat