feat: preload textual model #12729
Changes from 1 commit (base: main)
```diff
@@ -1,6 +1,6 @@
 import { ApiProperty } from '@nestjs/swagger';
 import { Type } from 'class-transformer';
-import { IsNotEmpty, IsNumber, IsString, Max, Min } from 'class-validator';
+import { IsNotEmpty, IsNumber, IsObject, IsString, Max, Min, ValidateNested } from 'class-validator';
 import { ValidateBoolean } from 'src/validation';

 export class TaskConfig {
@@ -14,7 +14,20 @@ export class ModelConfig extends TaskConfig {
   modelName!: string;
 }

-export class CLIPConfig extends ModelConfig {}
+export class LoadTextualModelOnConnection extends TaskConfig {
+  @IsNumber()
+  @Min(0)
+  @Type(() => Number)
+  @ApiProperty({ type: 'number', format: 'int64' })
+  ttl!: number;
+}
+
+export class CLIPConfig extends ModelConfig {
+  @Type(() => LoadTextualModelOnConnection)
+  @ValidateNested()
+  @IsObject()
+  loadTextualModelOnConnection!: LoadTextualModelOnConnection;
+}

 export class DuplicateDetectionConfig extends TaskConfig {
   @IsNumber()
```

Review comment (on `loadTextualModelOnConnection`): Can this be less wordy? Maybe

Also, I imagine there are a lot of sessions where nothing gets searched and the model is loaded unnecessarily. If this were instead done when the user clicks the search bar, it'd almost always be a positive and could be enabled by default.
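For context, the nested DTO added by this diff implies the ML settings payload gains an object along these lines. This is a sketch of the accepted shape, not code from the PR; the example values, the model name, and the TTL units (assumed seconds) are illustrative assumptions:

```typescript
// Hypothetical shape of the settings payload the new DTOs would accept.
// Field names follow the diff; the values are made up for illustration.
interface LoadTextualModelOnConnectionDto {
  // How long to keep the textual model loaded (units assumed to be seconds).
  // Must be a number >= 0, per @IsNumber() and @Min(0) in the diff.
  ttl: number;
}

interface ClipConfigDto {
  modelName: string;
  // Nested object validated via @ValidateNested() + @Type() in the diff.
  loadTextualModelOnConnection: LoadTextualModelOnConnectionDto;
}

const example: ClipConfigDto = {
  modelName: 'ViT-B-32__openai', // assumed model name, for illustration only
  loadTextualModelOnConnection: { ttl: 300 },
};
```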
Review comment: `InferenceEntry` is just a type for a dict, so this `__init__` shouldn't exist.

Also, I have some reservations about the `ttl` field here. TTL is currently set by an env and assumed to be the same for all models. At minimum, you would need to change the idle shutdown conditions to not rely on the env anymore and instead (synchronously) check if the model cache is empty.

Moreover, I'm not sure if this is something that the caller should be able to decide. The relationship between server and ML isn't necessarily one-to-one, so the settings should be designed with that in mind. If you have multiple servers that share a machine learning service, the effective TTL now depends on the order of requests. TTL is also relevant to whoever is deploying ML and may not be something they want a caller to be able to configure.
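The multi-server concern can be made concrete: if each caller may set the TTL on its own requests, the last request wins, so the TTL all servers actually experience depends on request order. A hypothetical sketch (not the actual ML service code):

```typescript
// Hypothetical ML service that lets each caller overwrite the model TTL.
class MlService {
  effectiveTtl = 0;

  // Each request carries the caller's preferred TTL; last writer wins,
  // so the effective TTL depends on request order.
  handleRequest(callerTtl: number): void {
    this.effectiveTtl = callerTtl;
  }
}

const ml = new MlService();
ml.handleRequest(300); // server A asks to keep the model loaded for 300s
ml.handleRequest(5);   // server B asks for 5s: A's model now unloads early
```

This is the order-dependence the reviewer describes: server A's expectation is silently overridden by server B's later request.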