Skip to content

Conversation

@spalger
Copy link
Contributor

@spalger spalger commented Jul 15, 2020

We're seeing fairly regular failures in the baseline capture jobs that we suspect are caused by a resource shortage.

Resource graph from a recently failed job:

image

That sharp spike in I/O Wait is probably because we're not using ramDisks for these workers, mostly because we're running on small instances which don't have the ram necessary. Hoping that a switch to a s-highmem instance + using a ramDisk will prevent that and make these jobs more stable.

@spalger spalger added Team:Operations Kibana-Operations Team v8.0.0 release_note:skip Skip the PR/issue when compiling release notes v7.10.0 v7.9.0 labels Jul 15, 2020
@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

✅ unchanged

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@spalger spalger marked this pull request as ready for review July 15, 2020 21:10
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

@spalger spalger requested a review from a team July 15, 2020 21:11
@spalger
Copy link
Contributor Author

spalger commented Jul 15, 2020

@spalger spalger merged commit b695d60 into elastic:master Jul 17, 2020
@spalger spalger deleted the implement/larger-baseline-capture-nodes branch July 17, 2020 00:13
spalger added a commit to spalger/kibana that referenced this pull request Jul 17, 2020
Co-authored-by: spalger <spalger@users.noreply.github.com>
spalger added a commit to spalger/kibana that referenced this pull request Jul 17, 2020
Co-authored-by: spalger <spalger@users.noreply.github.com>
spalger added a commit that referenced this pull request Jul 17, 2020
…#72210)

Co-authored-by: spalger <spalger@users.noreply.github.com>
spalger added a commit that referenced this pull request Jul 17, 2020
…#72209)

Co-authored-by: spalger <spalger@users.noreply.github.com>
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jul 17, 2020
* master: (214 commits)
  replacing hard coded links for ela.st (elastic#72240)
  skip flaky suite (elastic#60865)
  chore(NA): teardown dynamic dll plugin (elastic#72096)
  Register navLink actions for declared applications (elastic#72109)
  Fix value for process.hash.sha256 draggable (elastic#72142)
  Call setupIngest before fleet_install tests (elastic#72214)
  [Security Solution][Detections] Better toast errors (elastic#72205)
  skip flaky suite (elastic#64696)
  [Security Solution][Detections] Disable exceptions for Threshold and ML rules (elastic#72137)
  [Security Solution][Detections,Lists] Miscellaneous post-FF fixes (elastic#71990)
  [baseline/capture] use high-memory nodes with ramDisks (elastic#71894)
  skip flaky suite (elastic#77207)
  [Maps] Fix issue preventing TMS from rendering correctly (elastic#71946)
  using test_user with minimum privs (elastic#71988)
  Fixed Webhook connector doesn't retain added HTTP header settings (elastic#71924)
  [Ingest Manager] Do not show enrolling and unenrolling agents as online in agent counters (elastic#71921)
  [Maps] fix 'New Map' from getting added to recently accessed (elastic#72125)
  [Visualizations] Pass 'aggs' parameter to custom request handlers (elastic#71423)
  [Monitoring] Out of the box alert tweaks (elastic#71942)
  [ML] Fix datafeed start time is incorrect when the job has trailing empty buckets (elastic#71976)
  ...
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jul 17, 2020
* master: (55 commits)
  updates 'External alerts' tab text (elastic#72237)
  [Security Solution][Case] Fix connector's dropdown with conflicting requests (elastic#72037)
  replacing hard coded links for ela.st (elastic#72240)
  skip flaky suite (elastic#60865)
  chore(NA): teardown dynamic dll plugin (elastic#72096)
  Register navLink actions for declared applications (elastic#72109)
  Fix value for process.hash.sha256 draggable (elastic#72142)
  Call setupIngest before fleet_install tests (elastic#72214)
  [Security Solution][Detections] Better toast errors (elastic#72205)
  skip flaky suite (elastic#64696)
  [Security Solution][Detections] Disable exceptions for Threshold and ML rules (elastic#72137)
  [Security Solution][Detections,Lists] Miscellaneous post-FF fixes (elastic#71990)
  [baseline/capture] use high-memory nodes with ramDisks (elastic#71894)
  skip flaky suite (elastic#77207)
  [Maps] Fix issue preventing TMS from rendering correctly (elastic#71946)
  using test_user with minimum privs (elastic#71988)
  Fixed Webhook connector doesn't retain added HTTP header settings (elastic#71924)
  [Ingest Manager] Do not show enrolling and unenrolling agents as online in agent counters (elastic#71921)
  [Maps] fix 'New Map' from getting added to recently accessed (elastic#72125)
  [Visualizations] Pass 'aggs' parameter to custom request handlers (elastic#71423)
  ...
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jul 17, 2020
…feature-privileges

* alerting/consumer-based-rbac: (56 commits)
  take into account which features available in the active space
  updates 'External alerts' tab text (elastic#72237)
  [Security Solution][Case] Fix connector's dropdown with conflicting requests (elastic#72037)
  replacing hard coded links for ela.st (elastic#72240)
  skip flaky suite (elastic#60865)
  chore(NA): teardown dynamic dll plugin (elastic#72096)
  Register navLink actions for declared applications (elastic#72109)
  Fix value for process.hash.sha256 draggable (elastic#72142)
  Call setupIngest before fleet_install tests (elastic#72214)
  [Security Solution][Detections] Better toast errors (elastic#72205)
  skip flaky suite (elastic#64696)
  [Security Solution][Detections] Disable exceptions for Threshold and ML rules (elastic#72137)
  [Security Solution][Detections,Lists] Miscellaneous post-FF fixes (elastic#71990)
  [baseline/capture] use high-memory nodes with ramDisks (elastic#71894)
  skip flaky suite (elastic#77207)
  [Maps] Fix issue preventing TMS from rendering correctly (elastic#71946)
  using test_user with minimum privs (elastic#71988)
  Fixed Webhook connector doesn't retain added HTTP header settings (elastic#71924)
  [Ingest Manager] Do not show enrolling and unenrolling agents as online in agent counters (elastic#71921)
  [Maps] fix 'New Map' from getting added to recently accessed (elastic#72125)
  ...
@spalger spalger added the v7.8.1 label Jul 22, 2020
spalger added a commit to spalger/kibana that referenced this pull request Jul 22, 2020
Co-authored-by: spalger <spalger@users.noreply.github.com>
spalger added a commit that referenced this pull request Jul 22, 2020
… and [pipeline/commitStatus] update commit status in baseline-capture job (#72366) (#72981)

* [baseline/capture] use high-memory nodes with ramDisks (#71894)

Co-authored-by: spalger <spalger@users.noreply.github.com>

* [pipeline/commitStatus] update commit status in baseline-capture job (#72366)

Co-authored-by: spalger <spalger@users.noreply.github.com>
(cherry picked from commit a221e04)

Co-authored-by: spalger <spalger@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release_note:skip Skip the PR/issue when compiling release notes Team:Operations Kibana-Operations Team v7.8.1 v7.9.0 v7.10.0 v8.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants