[Core][Bugfix][Perf] Introduce `MQLLMEngine` to avoid `asyncio` OH #8157
Is the sleep needed? I thought the `await engine.make_client()` would only return once the client is connected to a healthy engine.
Would this behavior cause a race condition where the internal `LLMEngine` could finish a request, but before the response gets to the HTTP server, the connection times out and the HTTP server tries to abort the request, murdering the engine in the process?
I don't follow.
nit: Instead of triggering the bad abort task before generation starts and relying on the 2s timing interval, can we instead start the abort task once we get to the first iteration of this body loop? That should ensure that the abort happens after generation has started and make this test a lot faster
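A rough sketch of the control flow suggested here, using stand-ins rather than the actual test code: `fake_generate` and `bad_abort` are hypothetical names, and the sketch only shows how to trigger the abort from the first loop iteration. It assumes nothing about how the real engine process behaves.

```python
import asyncio


async def fake_generate(n_tokens: int):
    """Hypothetical stand-in for the generation iterator used in the test."""
    for i in range(n_tokens):
        await asyncio.sleep(0)  # yield control to the event loop, as a stream would
        yield f"token-{i}"


async def bad_abort(events: list) -> None:
    """Hypothetical stand-in for the abort call the test expects to misbehave."""
    events.append("abort")


async def run_test() -> list:
    events: list = []
    abort_task = None
    async for token in fake_generate(3):
        events.append(token)
        if abort_task is None:
            # Start the abort only once the first output has arrived,
            # instead of scheduling it up front and waiting a fixed interval.
            abort_task = asyncio.create_task(bad_abort(events))
    if abort_task is not None:
        await abort_task
    return events


events = asyncio.run(run_test())
```

With this shape the abort can never be issued before the client has seen at least one output, and the test no longer depends on a fixed sleep.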
No, the generation iterator is separate from the engine process.