-
-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Travis VM Infrastructure #337
Conversation
Travis-CI has or will shortly make in early December 2018 a number of beneficial changes to their Linux continuous integration testing infrastructure. Changes that impact pelias/schema are: * Linux infrastructure combined into one (virtualized), from two previously (virtualized and container-based). [0][1] * Offering a more modern, supported Ubuntu Xenial (16.04 LTS). [2] * Modest speed improvements from the fully virtualized-based infrastructure. NOTE: Until openjdk/oraclejdk dependencies can be resolved on modern Ubuntu and Travis-CI environment, keep the image at Ubuntu Trusty (14.04 LTS). Projects using "sudo: false" (container-based infrastructure), have been recommended to remove that configuration soon. In any case, the transition will happen regardless for projects by December 7, 2018. [0] https://blog.travis-ci.com/2018-10-04-combining-linux-infrastructures [1] https://blog.travis-ci.com/2018-11-19-required-linux-infrastructure-migration [2] https://docs.travis-ci.com/user/reference/xenial/
A hard switch-over is coming soon.
Hmm... Yeah that's super weird, I agree it looks like the VMs are sharing data between them. I wonder if there is a new way of specifying the matrix config which enforces separation. |
Agh.. so maybe this? https://docs.travis-ci.com/user/caching/#caches-and-build-matrices An easy test to see if it's caching related is to set |
I don't think it's actually caching. If you look at the build output for a test i did, there are no indices returned from the |
This SO issue might be related, looks like "index already exists" might be returned in cases which are not strictly true and that additional error info might not be showing in our debug output? |
Okay, I figured it out. The duplicate index issue is a red-herring. It turns out Elasticsearch retries on index creation failure (as it does in all other requests, unless you set Running a debug travis build, I did so, and saw the real error:
So what does that mean? Node.js has a (quite reasonable) limit set for maximum header length. Could Elasticsearch be sending back an invalid request that looks like lots of headers? I turned to
The response is pretty...amazing:
It appears Elasticsearch decided it was a good idea to put deprecation warnings in the HTTP headers. We have a lot of them, which we can't remove until we drop ES2 support. As far as I can tell, there is currently no way to disable these headers but there are lots of people complaining. I'll investigate why this only happens on TravisCI. But surely we can fix this somehow. |
Face |
Okay, I think I understand now. I don't believe this particular error is related to Travis VMs at all. I was able to reproduce the error locally on all versions of Node.js but the latest. The 8192 byte header limit was added in the recent security fix Node.js releases, and it appears that Node.js is now adding some followup PRs to do things like allow configuring the limit on the command line, which would help us here. There are actually new config parameters introduced to control deprecation warning headers, but they were only introduced in Elasticsearch 6.3 and don't appear to be backported to 5. |
I've updated this PR to change the Node.js versions to specify 8.13.0 and 10.13.0, the latest version in each release line without the header size limit, and everything now works! I've opened elastic/elasticsearch#36243 to see if Elasticsearch would consider backporting the changes that can bring us to the latest Node.js version safely. If they do, we can use that version and at least set the defaults in our Docker images. |
Due to a limit to header sizes in the latest security releases of Node.js, combined with Elasticsearch's default of sending lots of deprecation warning errors as headers, we need to use slightly older versions of Node.js until either Elasticsearch offers more configuration options, or Node.js releases a CLI option for the header limit. See #337 for details
9d7a594
to
ff69783
Compare
Due to a conflict with large headers sent by Elasticsearch 5, Node.js 10.14.0, which introduces an 8k header limit, is not suitable for use with pelias/schema. Connects pelias/schema#337
Due to a limit to header sizes in the latest security releases of Node.js, combined with Elasticsearch's default of sending lots of deprecation warning errors as headers, we need to use slightly older versions of Node.js until either Elasticsearch offers more configuration options, or Node.js releases a CLI option for the header limit. See #337 for details
This is to prevent conflicts between ES5 deprecation headers (which can be quite large) and the Node.js 10.14.0+ header limit of 8kb. See #337
I realized that changing how we run our Travis tests is not required, only the Node.js version pinning is relevant, and opened #339 with just that change. This PR is no longer needed. |
This change makes our Elasticsearch schema compatible with Elasticsearch 5 and 6. It shouln't have any effect on performance or operation, but it will completely drop compatibility for Elasticsearch 2. The primary change is that Elasticsearch 5 introduces two types of text fields: `text` and `keyword`, whereas Elasticsearch 2 only had 1: `string`. Roughly, a `text` field is for true full text search and a `keyword` field is for simple values that are primarily used for filtering or aggregation (for example, our `source` and `layer` fields). The `string` datatype previously filled both of those roles depending on how it was configured. Fortunately, we had already roughly created a concept similar to the `keyword` datatype in our schema, but called it `literal`. This has been renamed to `keyword` to cut down on the number of terms needed One nice effect of this change is that it removes all deprecation warnings printed by Elasticsearch 5. Notably, as discovered in #337 (comment), these warnings were quite noisy and required special handling to work around Node.js header size restrictions. This special handling can now been removed. Fixes pelias/whosonfirst#457 Connects pelias/pelias#719 Connects pelias/pelias#461
This change makes our Elasticsearch schema compatible with Elasticsearch 5 and 6. It shouldn't have any effect on performance or operation, but it will completely drop compatibility for Elasticsearch 2. The primary change is that Elasticsearch 5 introduces two types of text fields: `text` and `keyword`, whereas Elasticsearch 2 only had 1: `string`. Roughly, a `text` field is for true full text search and a `keyword` field is for simple values that are primarily used for filtering or aggregation (for example, our `source` and `layer` fields). The `string` datatype previously filled both of those roles depending on how it was configured. Fortunately, we had already roughly created a concept similar to the `keyword` datatype in our schema, but called it `literal`. This has been renamed to `keyword` to cut down on the number of terms needed One nice effect of this change is that it removes all deprecation warnings printed by Elasticsearch 5. Notably, as discovered in #337 (comment), these warnings were quite noisy and required special handling to work around Node.js header size restrictions. This special handling can now been removed. Fixes pelias/whosonfirst#457 Connects pelias/pelias#719 Connects pelias/pelias#461
Building on the changes to Travis config in #336, this PR modifies our Elasticsearch setup to be compatible with VM based infrastructure. It's based heavily on the Travis Elasticsearch docs.
However, there currently appears to be an issue where for some reason, the Elasticsearch 5 builds think the Pelias index already exists. It almost looks as if the different builds are sharing data. However, that shouldn't be the case.
Since Travis has already started rolling out the switch to VM based builds, we have to fix this soon, and until then live with failing schema builds.
@missinglink any ideas?