[Obs AI Assistant] Fix re-deploy model timeout and status polling #220445

Merged

Conversation

viduni94
Contributor

@viduni94 viduni94 commented May 7, 2025

Closes https://github.com/elastic/obs-ai-assistant-team/issues/247
Closes #217912

Summary

Problems

  • The /warmup_model endpoint doesn't return immediately; it waits for the KB to be ready. If there are no ML nodes, or the ML node lacks sufficient capacity, the API can time out.
  • Because the endpoint doesn't return immediately, we don't poll for status continuously.
  • The knowledge base tab doesn't show Inspect if no ML nodes are available.

Solutions

  • Show Inspect information in the knowledge base tab.
  • Return from /warmup_model immediately (we don't need to wait for the model to be ready since we are polling), and start polling.
  • If the user refreshes the browser while kbState is DEPLOYING_MODEL, keep polling for status.
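
The polling behaviour described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: `KbStatus`, `shouldKeepPolling`, `pollStatus`, and every state name other than `DEPLOYING_MODEL` are assumptions made for the example.

```typescript
// Hypothetical state names; only DEPLOYING_MODEL appears in the PR description.
type KbState = 'NOT_INSTALLED' | 'DEPLOYING_MODEL' | 'READY' | 'ERROR';

interface KbStatus {
  kbState: KbState;
}

// Polling continues only while the model is still being deployed.
// Because this looks at the fetched state rather than at in-memory UI state,
// the same check also applies after a browser refresh.
function shouldKeepPolling(status: KbStatus): boolean {
  return status.kbState === 'DEPLOYING_MODEL';
}

interface PollOptions {
  intervalMs: number;
  maxAttempts: number;
  // Injectable for tests; defaults to a setTimeout-based delay.
  sleep?: (ms: number) => Promise<void>;
}

// Fetch the status repeatedly until the deployment leaves DEPLOYING_MODEL
// or the attempt budget is used up, then return the last status seen.
async function pollStatus(
  fetchStatus: () => Promise<KbStatus>,
  opts: PollOptions
): Promise<KbStatus> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let status = await fetchStatus();
  for (let attempt = 1; attempt < opts.maxAttempts && shouldKeepPolling(status); attempt++) {
    await sleep(opts.intervalMs);
    status = await fetchStatus();
  }
  return status;
}
```

On page load, a client could call the status fetcher once and start `pollStatus` only when `shouldKeepPolling` returns true, which covers the refresh case in the last bullet.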

Checklist

  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines

@viduni94 viduni94 self-assigned this May 7, 2025
@viduni94 viduni94 requested review from a team as code owners May 7, 2025 21:51
@viduni94 viduni94 added the release_note:skip (Skip the PR/issue when compiling release notes), Team:Obs AI Assistant (Observability AI Assistant), backport:version (Backport to applied version labels), v9.1.0, and v8.19.0 labels May 7, 2025
@elasticmachine
Contributor

Pinging @elastic/obs-ai-assistant (Team:Obs AI Assistant)

@botelastic botelastic bot added the ci:project-deploy-observability Create an Observability project label May 7, 2025
Contributor

github-actions bot commented May 7, 2025

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@@ -84,7 +84,7 @@ const warmupModelKnowledgeBase = createObservabilityAIAssistantServerRoute({
       requiredPrivileges: ['ai_assistant'],
     },
   },
-  handler: async (resources): Promise<void> => {
+  handler: async (resources): Promise<{ currentInferenceId: string }> => {
Member

currentInferenceId will always be the same as inferenceId, right? What's the use case for returning this to the client?

Contributor Author

@sorenlouv Yes, we don't need to return it. I updated the handler to return nothing instead of returning the inferenceId.
6ed092f
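
One way a handler can return before a long-running deployment finishes is to start the work without awaiting it and log any failure. The sketch below illustrates that pattern only; `warmupModel`, `deployModel`, and `Logger` are hypothetical names, and this is not the handler from the PR.

```typescript
// Hypothetical minimal logger shape used only for this example.
interface Logger {
  error: (message: string) => void;
}

// Kick off the deployment and return right away with no payload.
// The promise is deliberately not awaited; the .catch keeps a failed
// deployment from surfacing as an unhandled rejection.
function warmupModel(deployModel: () => Promise<void>, logger: Logger): void {
  deployModel().catch((error: unknown) => {
    logger.error(`Model warmup failed: ${String(error)}`);
  });
}
```

With this shape the caller learns nothing about success from the response, so it has to rely on status polling to find out when the model is ready.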

Member

@sorenlouv sorenlouv left a comment

lgtm. Just one question about a return value

@viduni94 viduni94 force-pushed the fix-reploy-model-timeout-and-status-polling branch from bdf985b to 082fbcb Compare May 8, 2025 16:02
@elasticmachine
Contributor

⏳ Build in-progress

  • Buildkite Build
  • Commit: a81ece5
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-220445-a81ece5b2295

cc @viduni94

@viduni94 viduni94 merged commit ff3822d into elastic:main May 8, 2025
9 checks passed
@kibanamachine
Contributor

Starting backport for target branches: 8.19

https://github.com/elastic/kibana/actions/runs/14916807930

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request May 8, 2025
(cherry picked from commit ff3822d)
@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
8.19

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request May 8, 2025
…ing (#220445) (#220591)

# Backport

This will backport the following commits from `main` to `8.19`:
- [[Obs AI Assistant] Fix re-deploy model timeout and status polling
(#220445)](#220445)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

Co-authored-by: Viduni Wickramarachchi <viduni.wickramarachchi@elastic.co>
kdelemme pushed a commit to kdelemme/kibana that referenced this pull request May 9, 2025
akowalska622 pushed a commit to akowalska622/kibana that referenced this pull request May 29, 2025
qn895 pushed a commit to qn895/kibana that referenced this pull request Jun 3, 2025
Labels
  • backport:version : Backport to applied version labels
  • ci:project-deploy-observability : Create an Observability project
  • release_note:skip : Skip the PR/issue when compiling release notes
  • Team:Obs AI Assistant : Observability AI Assistant
  • v8.19.0
  • v9.1.0
Development

Successfully merging this pull request may close these issues.

[AI Assistant] Assistant stuck in "setting up the knowledge base" phase if ML nodes are undersized
4 participants